How One GitHub Actions File Replaced My Client’s $20/Month Uptime Monitor
You’re already paying for a distributed cron job with built-in alerting. You’re just not pointing it at the right URL.
A client of mine runs a small booking app. Last month it went down on a Saturday morning around 9 AM. He found out about it the way nobody wants to find out: a customer texted him. By the time he got back to a laptop and restarted the container, an hour and a half had passed and he had eight angry messages.
The next Monday he asked me to set up uptime monitoring. He’d been looking at Pingdom and Better Stack and asked if $19/month was a reasonable price. I told him to keep his nineteen dollars and gave him the same setup I use for my own apps.
Why I didn’t recommend a paid monitor
His whole stack is one small Go API talking to Postgres. Two containers on a single VPS. Three users on a good day. He doesn’t need a global probe network or a status page with a custom domain. He needs to know the moment the API stops responding, on a phone that’s not his.
Paid monitors are good products. They’re just sized for a different problem.
| Service | Cheapest plan with alerting | Annual cost |
|---|---|---|
| Pingdom | $15/mo | $180/yr |
| UptimeRobot Pro | $7/mo | $84/yr |
| Better Stack | $19/mo | $228/yr |
| GitHub Actions | $0 | $0 |
The GitHub row isn’t a trick. If you push code to GitHub, you already have a free, distributed cron runner sitting there with email alerts built in. Most people just don’t realise they’re allowed to point it at production.
The file
Here is the entire workflow. Drop it at .github/workflows/uptime.yml in any repo you own.
```yaml
name: Uptime check

on:
  schedule:
    - cron: "*/10 * * * *"
  workflow_dispatch:

jobs:
  ping:
    runs-on: ubuntu-latest
    timeout-minutes: 2
    steps:
      - name: Hit healthcheck
        run: |
          curl --fail --silent --show-error --max-time 10 \
            https://api.example.com/healthz
      - name: Notify on failure
        if: failure()
        run: |
          curl -X POST -H "Content-Type: application/json" \
            -d '{"content":"🚨 api.example.com healthcheck failed"}' \
            ${{ secrets.DISCORD_WEBHOOK }}
```
Twenty-two lines including blanks. Let me walk through the parts that matter.
cron: "*/10 * * * *" runs the job every ten minutes. You can go down to five minutes, but GitHub explicitly says scheduled workflows can be delayed during periods of high load. Ten minutes gives you a comfortable buffer where a one-cycle delay still means you hear about the outage within twenty minutes.
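curl --fail is the load-bearing flag. Without --fail, curl prints the error page and exits 0, the job goes green, you hear nothing, and you find out from a customer again. With it, any response of HTTP 400 or above makes curl exit non-zero and the job fails. One thing to watch: a 3xx redirect still counts as success, so point the check at the final URL rather than one that redirects.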
--max-time 10 means a hung server (TCP open, never responds) trips the alarm instead of timing out the entire job. Adjust if your healthcheck is slow.
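If you think more easily in Go than in curl flags, here is a rough sketch of what those two flags amount to. It is an illustration, not a replacement for the workflow step: the function names failOnStatus and check are mine, and the program expects the URL as its first argument.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// failOnStatus mirrors curl --fail: HTTP 400 and above is an error,
// everything below that (including 3xx redirects) passes.
func failOnStatus(code int) error {
	if code >= 400 {
		return fmt.Errorf("healthcheck returned HTTP %d", code)
	}
	return nil
}

// check performs one GET with an overall deadline, like --max-time.
func check(url string, timeout time.Duration) error {
	client := &http.Client{
		Timeout: timeout,
		// curl does not follow redirects unless given -L, so neither does this.
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return err // DNS failure, refused connection, or timeout
	}
	defer resp.Body.Close()
	return failOnStatus(resp.StatusCode)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: check <url>")
		return
	}
	if err := check(os.Args[1], 10*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "check failed:", err)
		os.Exit(1) // a non-zero exit is what turns the job red
	}
}
```

The non-zero exit code is the whole mechanism. GitHub Actions marks a step as failed when its command exits non-zero, and that is what triggers everything downstream.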
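if: failure() is the alert. Even without that step, GitHub emails you when a workflow fails, which is enough for most people. For scheduled workflows that email goes to whoever last edited the cron line, not necessarily the repo owner, so make sure that person is the one who should be woken up. If you want something louder, point the second step at a Discord or Slack webhook like I do above. Push notifications on your phone, no email-rule fiddling.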
workflow_dispatch lets you hit “Run workflow” in the Actions tab whenever you want to confirm the alert path still works. Test it once a quarter by pointing it at a URL that returns 500. It’s the only piece of monitoring infrastructure I trust, because I’ve actually seen it page me.
The healthcheck endpoint
A healthcheck that only returns 200 if the process is alive is worthless. The process being alive is what systemd already tells you. You want a healthcheck that pings the things the app actually depends on.
For the booking app this was Postgres. The handler itself is a dozen lines of Go:
```go
package main

import (
	"context"
	"database/sql"
	"net/http"
	"time"
)

// healthz returns 200 only if Postgres answers a ping within two seconds;
// otherwise it returns 503 so that curl --fail trips.
func healthz(db *sql.DB) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		if err := db.PingContext(ctx); err != nil {
			http.Error(w, "db unreachable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte("ok"))
	}
}
```
Same pattern in Node (await pool.query('SELECT 1')), Python (cursor.execute('SELECT 1')), or whatever you’re running. If the DB is the only critical dependency, that’s all you need. If you’ve got a queue or an external API the app can’t live without, add a quick check for those too. Don’t go overboard – every check is something that can give you a false positive at 4 AM.
The same GitHub Actions cron trick is what I used for the free lead scraper for a client. It turns out that “free scheduled job that runs your code and emails you when something happens” covers a surprising amount of what small businesses actually pay servers for.
What it can’t do
Honest list, because pretending otherwise is how you end up trusting the wrong tool.
- GitHub Actions cron is best-effort, not guaranteed. Scheduled workflows can be delayed by 15+ minutes during high load. For a hobby project this is fine. For a paid SLA, get a real monitor.
- No alerts if GitHub itself is down. Rare, but it happens. If GitHub goes dark at the same time as your app, you’ll hear about it from your customers.
- No status page, no incident timeline, no on-call rotation. This is a ping with an email. If you need any of the surrounding ceremony, you’re already past what this setup is for.
- No regional probes. You’re testing from wherever GitHub’s runner spun up. If you care whether your site is reachable from Asia specifically, this isn’t enough.
- Five minutes is the floor. That’s the shortest interval GitHub’s scheduler accepts, and even then it doesn’t promise the run starts on time. If you need sub-minute detection, you’re in different-tool territory.
If you don’t have any of those needs, you don’t need to pay for them.
What it costs to run
Six checks per hour, twenty-four hours, thirty days: ~4,320 runs per month. Each one finishes in about 15-30 seconds, but GitHub bills per job and rounds every job up to a whole minute. So the number that matters isn’t the 36 hours of actual compute (4,320 runs at 30 seconds each), it’s 4,320 billed minutes.
GitHub gives you 2,000 free minutes per month on private repos, so a ten-minute check in a private repo uses more than double the allowance. If the workflow lives in a public repo, it’s free with no cap, and that’s the simplest fix: the file contains a URL and a reference to a secret, nothing you need to hide. One catch: GitHub disables scheduled workflows in public repos after 60 days without repository activity, so keep it in a repo you still commit to.
If the workflow has to live in a private repo, drop the cron to every 30 minutes. Now you’re down to 1,440 minutes/month, about 72% of the allowance, and you still beat the hour and a half it took last time.
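To rerun the arithmetic for a different schedule, here is a small sketch. It assumes a 30-day month, a cron interval that divides 60 evenly, and that each run is a single job billed as one whole minute, since GitHub rounds every job up to the nearest minute. The function names are mine.

```go
package main

import "fmt"

// runsPerMonth returns how many scheduled runs fit in a 30-day month
// for a cron interval given in minutes (must divide 60 evenly).
func runsPerMonth(intervalMinutes int) int {
	return (60 / intervalMinutes) * 24 * 30
}

// billedMinutes assumes every run is one job that finishes in under a
// minute and is therefore billed as exactly one minute.
func billedMinutes(intervalMinutes int) int {
	return runsPerMonth(intervalMinutes) * 1
}

func main() {
	const freeMinutes = 2000 // private-repo allowance quoted above
	for _, interval := range []int{10, 15, 30} {
		m := billedMinutes(interval)
		fmt.Printf("every %d min: %d runs, %d billed minutes, fits in free tier: %v\n",
			interval, runsPerMonth(interval), m, m <= freeMinutes)
	}
}
```

Under those assumptions, only the 30-minute schedule stays inside the private-repo allowance. In a public repo the schedule doesn’t matter for cost.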
When to call me
I set this up for clients as part of my maintenance work, along with the rest of the boring-but-load-bearing stuff (backups that actually restore, log rotation that actually rotates, alerts that page the right person). If you’d rather not touch the YAML yourself, or you want someone to think through the healthcheck endpoint properly the first time, send me a message.
The setup itself is twenty-two lines and an afternoon. The hard part is knowing what to check, knowing what’s worth being woken up for, and trusting the alert when it finally fires. That part takes the time.
– Christian