On Fri, 24 Nov 2023, 17:12 Ron Johnson, <ronljohnso...@gmail.com> wrote:
> On Fri, Nov 24, 2023 at 11:00 AM Les <nagy...@gmail.com> wrote:
> [snip]
>
>> Writing of WAL files continued after we shut down all clients, and
>> restarted the primary PostgreSQL server.
>>
>> The order was:
>>
>> 1. shut down all clients
>> 2. stop the primary
>> 3. start the primary
>> 4. primary started to write like mad again
>> 5. removed replication slot
>> 6. primary stopped madness and deleted all WAL files (except for a few)
>>
>> How can the primary server generate more and more WAL files (writes)
>> after all clients have been shut down and the server was restarted? My only
>> bet was the autovacuum. But I ruled that out, because removing a
>> replication slot has no effect on the autovacuum (am I wrong?). Now you are
>> saying that this looks like a huge rollback. Does rolling back changes
>> require even more data to be written to the WAL after server restart? As
>> far as I know, if something was not written to the WAL, then it is not
>> something that can be rolled back. Does removing a replication slot lessen
>> the amount of data needed to be written for a rollback (or for anything
>> else)? It is a fact that the primary stopped writing at 1.5GB/sec the
>> moment we removed the slot.
>>
>> I'm not saying that you are wrong. Maybe there was a
>> crazy application. I'm just saying that a crazy application cannot be the
>> whole picture. It cannot explain this behaviour as a whole. Or maybe I have
>> a deep misunderstanding about how WAL files work. On the second occasion,
>> the primary was running for a few minutes when pg_wal started to increase.
>> We noticed that early, and shut down all clients, then restarted the
>> primary server. After the restart, the primary was writing out more WAL
>> files for many more minutes, until we dropped the slot again. E.g. it was
>> writing much more data after the restart than before the restart; and it
>> only stopped (exactly) when we removed the slot.
>>
>
> pg_stat_activity will tell you something about what's happening even after
> you think "all clients have been shut down".
>
> I'd crank up the logging to at least:
> log_error_verbosity = verbose
> log_statement = all
> track_activity_query_size = 10240
> client_min_messages = notice
> log_line_prefix = '%m\t%r\t%u\t%d\t%p\t%i\t%a\t%e\t'
>

I don't know if it makes any sense, but is there a relatively painless way to look into the produced WAL files to see what they are filled with? It might give some pointers to the source of the issue.

Regards,
Sándor
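
On the question of looking inside the WAL files: PostgreSQL ships a tool called pg_waldump that prints the records of a WAL segment in readable form, and its --stats option summarises them. Below is a rough sketch, not a tested recipe, of how its per-record output could be tallied by resource manager to see what kind of record dominates. The helper name wal_bytes_by_rmgr is made up for illustration, and the line format it parses is my assumption of what pg_waldump prints; it may differ between versions.

```python
import re
from collections import Counter

# Assumed shape of a pg_waldump record line (may vary between versions):
#   rmgr: Heap   len (rec/tot):   54/   54, tx:   745, lsn: 0/01A2B3C8, prev 0/01A2B390, desc: INSERT ...
RECORD_RE = re.compile(r"^rmgr:\s+(\S+)\s+len \(rec/tot\):\s*(\d+)/\s*(\d+)")


def wal_bytes_by_rmgr(lines):
    """Sum the total record length per resource manager (hypothetical helper).

    `lines` is any iterable of text lines, for example the captured
    output of pg_waldump run against one or more WAL segments.
    Lines that do not look like record lines are skipped.
    """
    totals = Counter()
    for line in lines:
        match = RECORD_RE.match(line)
        if match:
            rmgr = match.group(1)
            total_len = int(match.group(3))  # "tot" includes full-page images
            totals[rmgr] += total_len
    return totals
```

The lines could come from piping the output of pg_waldump on a segment in pg_wal into a small script that calls this function and prints totals.most_common(). If one resource manager accounts for nearly all of the bytes, that might hint at where the writes come from.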