On Mon, Jun 27, 2022 at 12:04:57AM -0700, Noah Misch wrote:
> For me, it reproduces consistently with a sleep just before the startup
> process exits:

Nice catch.

> One can adapt the test to the server behavior by having the test wait for the
> archiver to start, as attached.  This is sufficient to make check-world pass
> with the above sleep in place.  I think we should also modify the PostgresNode
> archive_command to log a message.  That lack of logging was an obstacle
> upthread (as seen in commit 3279cef) and again here.

          ? qq{copy "%p" "$path\\\\%f"}
-         : qq{cp "%p" "$path/%f"};
+         : qq{echo >&2 "ARCHIVE_COMMAND %p"; cp "%p" "$path/%f"};

This is a bit inelegant.  Perhaps it would be better to do this through
a Perl wrapper like cp_history_files?
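
To make the idea concrete, here is a rough sketch of the two steps such
a wrapper would bundle, written in shell only because that is what the
non-Windows archive_command in the hunk already uses.  The function name
archive_with_log and its argument order are made up for illustration;
nothing like it exists in PostgresNode, and a real wrapper would more
likely be a Perl script so that it also covers Windows.

```shell
# Hypothetical helper (not part of PostgresNode): log the file being
# archived, then copy it, mirroring the "+" line in the hunk above.
archive_with_log() {
    src=$1     # what archive_command passes as %p
    dest=$2    # the target, i.e. "$path/%f" in the hunk above

    # Write the marker to stderr, where the hunk's echo >&2 sends it.
    printf 'ARCHIVE_COMMAND %s\n' "$src" >&2

    # cp runs last, so its exit status becomes the function's status
    # and a failed copy is still reported as a failed archive attempt.
    cp "$src" "$dest"
}
```

As in the one-liner from the patch, the logging step cannot mask a copy
failure, because the copy is the last command executed.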

> An alternative would be to declare that the test is right and the server is
> wrong.  The postmaster knows how to start the checkpointer if the checkpointer
> is not running when the postmaster needs a shutdown checkpoint.  It could
> start the archiver around that same area:
> 
>                               /* Start the checkpointer if not running */
>                               if (CheckpointerPID == 0)
>                                       CheckpointerPID = StartCheckpointer();
>                               /* And tell it to shut down */
>                               if (CheckpointerPID != 0)
>                               {
>                                       signal_child(CheckpointerPID, SIGUSR2);
>                                       pmState = PM_SHUTDOWN;
>                               }
> 
> Any opinions between the change-test and change-server approaches?

The startup sequence can sometimes be tricky.  I don't have a specific
argument in mind, but I would stick to a fix in the test.
--
Michael
