ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can call GetOldestXmin and then the result will politely hold still while it considers what to do next. But in fact, whoever has the oldest xmin could exit their transaction, allowing the global minimum to advance. If a VACUUM process then inspects the ProcArray, it could compute an oldest xmin that is newer than the value that ProcessStandbyHSFeedbackMessage installs just afterwards. So much for keeping the data the standby wanted.
AFAICS we have to do all the logic about choosing the new value for MyProc->xmin while holding ProcArrayLock, which IMO means that it should go into a function in procarray.c. The fact that walsender.c is taking ProcArrayLock and whacking MyProc->xmin around is already a horrid violation of modularity, even if it weren't incorrect. Also, it seems like using GetOldestXmin is quite wrong here anyway; we don't really want a result modified by vacuum_defer_cleanup_age do we? It looks to me like that would result in vacuum_defer_cleanup_age being applied twice: the walsender process sets its xmin the defer age into the past, and then subsequent calls of GetOldestXmin compute a result that is the defer age further back. IOW this is an independent mechanism that also results in the computed global xmin going backwards. (Now that we have a standby feedback mechanism, I'm a bit tempted to propose getting rid of vacuum_defer_cleanup_age altogether, rather than hacking things to avoid the above.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers