On 2020-03-10 09:48:07 -0300, Alvaro Herrera wrote:
> On 2020-Mar-10, Kyotaro Horiguchi wrote:
>
> > At Mon, 9 Mar 2020 20:34:20 -0700, Andres Freund <and...@anarazel.de> wrote in
> > > On 2020-03-10 12:27:25 +0900, Kyotaro Horiguchi wrote:
> > > > That's true, but I have the same concern with Tom. The archive became
> > > > too-tightly linked with other processes than actual relation.
> > >
> > > What's the problem here? We have a number of helper processes
> > > (checkpointer, bgwriter) that are attached to shared memory, and it's
> > > not a problem.
> >
> > That theoretically raises the chance of server-crash by a small amount
> > of probability. But, yes, it's absurd to premise that archiver process
> > crashes.
>
> The case I'm worried about is a misconfigured archive_command that
> causes the archiver to misbehave (exit with a code other than 0); if
> that already doesn't happen, or we can make it not happen, then I'm okay
> with the changes to archiver.

Well, an exit(1) is also fine, afaict. No? The archive command can just
trigger either a FATAL or a LOG:

    rc = system(xlogarchcmd);
    if (rc != 0)
    {
        /*
         * If either the shell itself, or a called command, died on a signal,
         * abort the archiver.  We do this because system() ignores SIGINT and
         * SIGQUIT while waiting; so a signal is very likely something that
         * should have interrupted us too.  Also die if the shell got a hard
         * "command not found" type of error.  If we overreact it's no big
         * deal, the postmaster will just start the archiver again.
         */
        int         lev = wait_result_is_any_signal(rc, true) ? FATAL : LOG;

        if (WIFEXITED(rc))
        {
            ereport(lev,
                    (errmsg("archive command failed with exit code %d",
                            WEXITSTATUS(rc)),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
        }
        else if (WIFSIGNALED(rc))
        {
#if defined(WIN32)
            ereport(lev,
                    (errmsg("archive command was terminated by exception 0x%X",
                            WTERMSIG(rc)),
                     errhint("See C include file \"ntstatus.h\" for a description of the hexadecimal value."),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
#else
            ereport(lev,
                    (errmsg("archive command was terminated by signal %d: %s",
                            WTERMSIG(rc), pg_strsignal(WTERMSIG(rc))),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
#endif
        }
        else
        {
            ereport(lev,
                    (errmsg("archive command exited with unrecognized status %d",
                            rc),
                     errdetail("The failed archive command was: %s",
                               xlogarchcmd)));
        }

        snprintf(activitymsg, sizeof(activitymsg), "failed on %s", xlog);
        set_ps_display(activitymsg, false);

        return false;
    }

I.e. there are only normal ways to shut down the archiver due to a failing
archive command.

Greetings,

Andres Freund
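
For readers who want to try the FATAL-versus-LOG split outside the server, here is a minimal standalone sketch. It is not the PostgreSQL code: the `sketch_*` names and the `SKETCH_LOG`/`SKETCH_FATAL` enum are hypothetical, and the rule in `sketch_is_any_signal` is an assumption about what `wait_result_is_any_signal(rc, true)` checks (death by signal, or a shell exit status above 125, covering the "command not found" family). It assumes a POSIX system where `system()` runs the command through `sh`.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical stand-ins for the two report levels; not PostgreSQL's values. */
typedef enum
{
    SKETCH_LOG,     /* report the failure, archiver keeps running */
    SKETCH_FATAL    /* archiver exits, postmaster restarts it */
} sketch_level;

/*
 * Assumed approximation of wait_result_is_any_signal(rc, true):
 * true if the child died on a signal, or if the shell exited with a
 * status above 125 (126/127 are "cannot execute"/"not found", and
 * 128+N is how a shell reports a child killed by signal N).
 */
static bool
sketch_is_any_signal(int rc)
{
    if (WIFSIGNALED(rc))
        return true;
    if (WIFEXITED(rc) && WEXITSTATUS(rc) > 125)
        return true;
    return false;
}

/*
 * Pick the report level for a nonzero system() result. As in the excerpt
 * above, callers are expected to invoke this only when rc != 0.
 */
static sketch_level
sketch_archive_level(int rc)
{
    return sketch_is_any_signal(rc) ? SKETCH_FATAL : SKETCH_LOG;
}

/* Convenience wrapper: run a shell command and classify its wait status. */
static sketch_level
sketch_run(const char *cmd)
{
    return sketch_archive_level(system(cmd));
}
```

Under these assumptions, a misconfigured command that simply exits with status 1 is classified as `SKETCH_LOG`, while a missing binary (shell status 127) is classified as `SKETCH_FATAL`. Either way the outcome is an ordinary report or an ordinary process exit, not a crash.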