WAL growth spikes: a field notebook
sysgen-cloudx.digital

2025-02-14 · Jonas Meyer

WAL growth spikes: a field notebook

WAL growth spikes: a field notebook

What we log first when archives balloon, and how to separate checkpoint noise from genuine write amplification.

When archives climb faster than forecasts, teams often jump to checkpoint tuning. That can help, but it can also hide a burstier problem: long transactions pinning old segments or an accidental full-table rewrite during business hours.

Start with a simple timeline: archive bytes per hour, checkpoint distance, and concurrent write sessions. Layer application deploy markers so you can correlate code releases with slope changes. In regulated environments we still favor plain language incident notes — finance readers care about customer impact, not jargon.

Finally, document what you chose not to do. Skipping a hasty max_wal_size bump can be the right call if it trades away predictability. Your future self will appreciate the rationale when the next spike arrives.

Tags: PostgreSQL, Operations, WAL

← All posts