Polling with retry
Integration tests routinely have to wait for asynchronous side-effects: a webhook to be delivered, a background job to finish, a verification email to arrive, a search index to update. The retry { ... } block is Tales’ polling primitive.
The pattern
```
step "http" "find_verification_email" {
  retry {
    attempts = 10
    interval = "100ms"
  }

  request {
    method = "GET"
    url = "${config.base_url}/mail/messages?to=${result.register.email}"
  }

  expect {
    status = 200
    json = { messages = is_array() }
  }

  capture {
    code = regex_find(response.body, "verification code is ([A-Z0-9]{6})", 1)
  }
}
```

What this does:
- Sends the HTTP request.
- If `expect` passes, succeed. Capture the value. Done.
- If `expect` fails, sleep `interval`, then retry.
- Bail out after `attempts` total runs and fail the step.
The total wall-clock budget for the step is roughly attempts × (request_time + interval).
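The loop described above can be sketched in a few lines of Python. This is an illustration of the semantics only, not Tales' implementation: `poll`, `attempt`, and the injected `sleep` are hypothetical names, and the sketch assumes no sleep happens after the final failed attempt, which the description above does not specify.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll(
    attempt: Callable[[], Optional[T]],
    attempts: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `attempt` up to `attempts` times, sleeping between failures.

    `attempt` stands in for "send the request and evaluate expect": it
    returns the captured value on success and None on failure.
    """
    for n in range(1, attempts + 1):
        captured = attempt()
        if captured is not None:
            return captured  # expect passed: capture the value and stop
        if n < attempts:
            sleep(interval_s)  # expect failed: wait, then try again
    raise TimeoutError(f"expect still failing after {attempts} attempts")


# Demo: a fake mailbox where the email only shows up on the third poll.
calls = []
slept = []


def fake_attempt() -> Optional[str]:
    calls.append(1)
    return "ABC123" if len(calls) == 3 else None


code = poll(fake_attempt, attempts=10, interval_s=0.1, sleep=slept.append)
print(code, len(calls), slept)  # ABC123 3 [0.1, 0.1]
```

Injecting `sleep` keeps the demo instant and makes the sleep count observable: two failed attempts produce two sleeps, and the third attempt returns immediately.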
Picking attempts and interval
A useful rule of thumb:
- Fast asynchronous events (in-process queue, local webhook): `attempts = 10, interval = "100ms"` (1s budget).
- Network-mediated events (real webhook, S3 upload): `attempts = 20, interval = "500ms"` (10s budget).
- Background jobs: `attempts = 30, interval = "2s"` (1min budget).
Match the budget to the worst-case time you’ve actually observed in production, with a safety margin. Wildly oversized budgets paper over real slowdowns.
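The arithmetic behind these budgets is easy to check by hand, but a small helper makes the trade-off explicit. The sketch below is hypothetical and not part of Tales: `parse_interval_ms`, `budget_ms`, and `attempts_for` are illustrative names, the parser only handles the whole-number `ms` and `s` forms that appear on this page, and the 1.5× safety margin is an assumed default rather than a recommendation from the tool.

```python
import math


def parse_interval_ms(text: str) -> int:
    """Parse "100ms" or "2s" into whole milliseconds (only these two forms)."""
    if text.endswith("ms"):  # check "ms" first, since it also ends in "s"
        return int(text[:-2])
    if text.endswith("s"):
        return int(text[:-1]) * 1000
    raise ValueError(f"unsupported interval: {text!r}")


def budget_ms(attempts: int, interval: str, request_ms: int = 0) -> int:
    """Approximate budget: attempts x (request_time + interval)."""
    return attempts * (request_ms + parse_interval_ms(interval))


def attempts_for(worst_case_ms: int, interval: str, margin: float = 1.5) -> int:
    """Attempts needed to cover an observed worst case plus a safety margin."""
    return math.ceil(worst_case_ms * margin / parse_interval_ms(interval))


# The three rule-of-thumb settings, ignoring request time.
print(budget_ms(10, "100ms"))  # 1000
print(budget_ms(20, "500ms"))  # 10000
print(budget_ms(30, "2s"))     # 60000

# A 4s observed worst case polled every 500ms, with the assumed 1.5x margin.
print(attempts_for(4000, "500ms"))  # 12
```

Passing a non-zero `request_ms` shows how quickly slow requests inflate the real budget: at 200ms per request, the "10s" setting is closer to 14s.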
When to use retry
- Polling for a state transition: order moves from `pending` to `processed`, user is marked `verified`, document finishes converting.
- Waiting for eventual consistency: write to primary, read from replica, expect the replica to catch up.
- Inbox-style assertions: verify an email landed, a webhook was received, a Slack message was posted.
When not to use retry
- As a band-aid for flaky tests. If your test passes 60% of the time and a `retry { attempts = 5 }` makes it pass 100% of the time, you’ve hidden the bug, not fixed it.
- As a substitute for explicit synchronisation. If your API gives you a “wait for completion” endpoint, call it. Don’t poll an unrelated read endpoint.
- To compensate for race conditions in the system under test. That’s a server bug: file it, don’t paper over it.
Combined with captures
The capture block runs only on the successful attempt. So the values you capture reflect the state at the moment polling succeeded:

```
step "http" "wait_for_processed" {
  retry { attempts = 30, interval = "1s" }

  request { ... }

  expect {
    status = 200
    json = { status = "processed" }
  }

  capture {
    finished_at = response.json.completed_at // wall-clock at success
    output_id = response.json.output_id
  }
}
```

This is the right idiom: the downstream step gets a known-good capture, not a pre-completion value.
Combined with --timeout
The global `--timeout` flag wraps the entire run. If a retry loop sits inside a step that runs near the deadline, the runner cancels the in-flight HTTP request (via `context.DeadlineExceeded`) and the step fails with a clear error. Polling does not silently exceed the suite-wide budget.
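To make the interaction concrete, here is a minimal Python sketch of a polling loop bounded by an outer deadline. It is a simplification under stated assumptions, not the runner's code: `poll_until`, `DeadlineExceeded`, and `FakeClock` are hypothetical names, and the sketch only checks the deadline between attempts and clamps the sleep, whereas the runner described above also cancels a request that is already in flight.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for the runner's deadline error."""


def poll_until(
    attempt: Callable[[], Optional[T]],
    attempts: int,
    interval_s: float,
    deadline: float,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Poll like a retry block, but never run past `deadline`."""
    for n in range(1, attempts + 1):
        if now() >= deadline:
            raise DeadlineExceeded(f"deadline hit before attempt {n}")
        captured = attempt()
        if captured is not None:
            return captured
        if n < attempts:
            remaining = deadline - now()
            if remaining <= 0:
                raise DeadlineExceeded(f"deadline hit after attempt {n}")
            sleep(min(interval_s, remaining))  # never sleep past the deadline
    raise TimeoutError(f"expect still failing after {attempts} attempts")


class FakeClock:
    """Deterministic clock so the demo runs instantly."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


# Demo: a 60s retry budget (30 x 2s) inside a run with only 5s left.
clock = FakeClock()
attempts_made = []


def never_ready() -> Optional[str]:
    attempts_made.append(clock.now())
    return None


try:
    poll_until(never_ready, attempts=30, interval_s=2.0, deadline=5.0,
               now=clock.now, sleep=clock.sleep)
    outcome = "succeeded"
except DeadlineExceeded:
    outcome = "deadline exceeded"

print(outcome, attempts_made, clock.t)  # deadline exceeded [0.0, 2.0, 4.0] 5.0
```

The retry block asked for up to 30 attempts, but only three ran: the outer deadline, not the retry budget, decided when the step failed.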
Worked example, full polling scenario
```
scenario "Verify email then claim" {
  step "http" "register" {
    request {
      method = "POST"
      url = "${config.base_url}/users"
      body {
        json = { email = generate("user_email"), password = "Sup3rS3cret!" }
      }
    }
    expect { status = 201 }
    capture {
      id = response.json.id
      email = response.json.email
    }
  }

  step "http" "find_verification_email" {
    retry { attempts = 20, interval = "200ms" }

    request {
      method = "GET"
      url = "${config.base_url}/mail/messages?to=${result.register.email}"
    }
    expect {
      status = 200
      json = { messages = is_array() }
    }
    capture {
      code = regex_find(response.body, "verification code is ([A-Z0-9]{6})", 1)
    }
  }

  step "http" "verify_email" {
    request {
      method = "POST"
      url = "${config.base_url}/verify-email"
      body {
        json = { user_id = result.register.id, code = result.find_verification_email.code }
      }
    }
    expect { status = 200, json = { verified = true } }
  }

  teardown {
    step "http" "delete_user" {
      when = can(result.register.id)
      request {
        method = "DELETE"
        url = "${config.base_url}/users/${result.register.id}"
      }
      expect { status = one_of([200, 204, 404]) }
    }
  }
}
```

Three steps, one retry block, one regex capture, one teardown, and the whole scenario reproduces deterministically with `--seed 1234`.