Hi,
On Tue, Feb 11, 2025 at 02:11:10PM -0800, Masahiko Sawada wrote:
> I've updated the patch that includes comment updates and bug fixes.
Thanks!
> The main idea of changing WAL level online is to decouple two aspects:
> (1) the information included in WAL records and (2) the
> functionalities available at each WAL level. With that, we can change
> the WAL level gradually. For example, when increasing the WAL level
> from 'replica' to 'logical', we first switch the WAL level on the
> shared memory to a new higher level where we allow processes to write
> WAL records with additional information required by the logical
> decoding, while keeping the logical decoding unavailable. The new
> level is something between 'replica' and 'logical'. Once we confirm
> all processes have synchronized to the new level, we increase the WAL
> level further to 'logical', allowing us to start logical decoding. The
> patch supports all combinations of WAL level transitions. It makes
> sense to me to use a background worker to proceed with this transition
> work since we need to wait at some points, rather than delegating it
> to the checkpointer process.
The background worker being added is "wal_level control worker". I wonder if
it would make sense to create a more "generic" one instead (to whom we could
assign more "tasks" later on, as suggested in the past in [1]).
+ /*
+ * XXX: Perhaps it's not okay that we failed to launch a bgworker and give
+ * up wal_level change because we already reported that the change has
+ * been accepted. Do we need to use aux process instead for that purpose?
+ */
+ if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+ ereport(WARNING,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("out of background worker slots"),
+ errhint("You might need to increase \"%s\".",
"max_worker_processes")));
Not sure it has to be an aux process instead as it should be busy in rare
occasions.
Maybe we could add some mechanism for ensuring that a bgworker slot is available
when needed (as suggested in [2])?
Not saying it has to be done that way. I just thought that the "wal_level
control worker"
could be a perfect use case/starting point for a more generic one but I don't
want
to over complicate that thread though.
So maybe just rename "wal_level control worker" to say "custodian worker" and
we could also think about [2]? Feel free to consider all of this as Nits if you
feel it deviates too much from the initial intend of this thread.
[1]:
https://www.postgresql.org/message-id/flat/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com
[2]: https://www.postgresql.org/message-id/1058306.1680467858%40sss.pgh.pa.us
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com