On 11.04.2018 00:00, Tom Lane wrote:
So we need a mechanism that's narrowly targeted
to reopening the logfile, without SIGHUP'ing the entire database.

We can send SIGUSR1 to the syslogger process. To make its pid easier to find out, it can be published in "$PGDATA/logging_collector.pid", as suggested by Grigory. The attached patch does this. It also adds a brief description of how to use this with logrotate.

Point 2: Depending on how you've got the log filenames configured,
setting rotation_requested may result in a change in log filename

If logrotate only needs the file to be reopened, syslogger's rotation does just than when using a static log file name. I imagine logrotate can be configured to do something useful with changing file names, too. It is a matter of keeping the configuration of syslogger and logrotate consistent.

BTW, another thing that needs to be considered is the interaction with
rotation_disabled.  Right now we automatically drop that on SIGHUP, but
I'm unclear on whether it should be different for logrotate requests.

The SIGUSR1 path is supposed to be used by automated tools. In a sense, it is an automatic rotation, the difference being that it originates from an external tool and not from syslogger itself. So, it sounds plausible that the rotation request shouldn't touch the rotation_disabled flag, and should be disabled by it, just like the automatic rotation.

Still, this leads us to a scenario where we can lose logs:
1. postgres is configured to use a static file name. logrotate is configured to move the file, send SIGUSR1 to postgres syslogger, gzip the file and delete it. 2. logrotate starts the rotation. It moves the file and signals postgres to reopen it. 3. postgres fails to reopen the file because there are too many files open (ENFILE/EMFILE), which is a normal occurrence on heavily loaded systems. Or it doesn't open the new file because the rotation_disable flag is set. It continues logging to the old file. 4. logrotate has no way to detect this failure, so it gzips the file and unlinks it. 5. postgres continues writing to the now unlinked file, and we lose an arbitrary amount of logs until the next successful rotation.

With dynamic file names, logrotate can be told to skip open files, so that it doesn't touch our log file if we haven't switched to the new one. With a static file name, the log file is always open, so this method doesn't work. I'm not sure how to make this work reliably.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 4a68ec3b40..91b96dc588 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -960,6 +960,23 @@ pg_ctl start | rotatelogs /var/log/pgsql_log 86400
   </para>
 
   <para>
+   The <application>logrotate</application> program can also be used to collect log files 
+   produced by the built-in logging collector. In this configuration, the location and
+   names of the log files are determined by logging collector, and 
+   <application>logrotate</application> periodically archives these files. There must be
+   some way for <application>logrotate</application> to inform the application that the
+   log file it is using is going to be archived, and that it must direct further output
+   to a new file. This is commonly done with a <literal>postrotate</literal> script that
+   sends a <literal>SIGHUP</literal> signal to the application, which then reopens the log
+   file. In <productname>PostgreSQL</productname>, this is achieved by sending a 
+   <literal>SIGUSR1</literal> signal to the logging collector process. The PID of this
+   process can be found in the <filename>$PGDATA/logging_collector.pid</filename> file.
+   When the logging collector receives the signal, it switches to the new log files
+   according to its configuration (see <xref linkend="runtime-config-logging-where"/>).
+   If it is configured to use a static file name, it will just reopen the file.
+  </para>
+
+  <para>
    On many systems, however, <application>syslog</application> is not very reliable,
    particularly with large log messages; it might truncate or drop messages
    just when you need them the most.  Also, on <productname>Linux</productname>,
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 58b759f305..99201727ec 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -138,6 +138,8 @@ static void flush_pipe_input(char *logbuffer, int *bytes_in_logbuffer);
 static void open_csvlogfile(void);
 static FILE *logfile_open(const char *filename, const char *mode,
 			 bool allow_errors);
+static void syslogger_on_exit(int code, Datum arg);
+static void syslogger_publish_pid(void);
 
 #ifdef WIN32
 static unsigned int __stdcall pipeThread(void *arg);
@@ -286,6 +288,9 @@ SysLoggerMain(int argc, char *argv[])
 	set_next_rotation_time();
 	update_metainfo_datafile();
 
+	/* publish my pid to pid file */
+	syslogger_publish_pid();
+
 	/* main worker loop */
 	for (;;)
 	{
@@ -1417,6 +1422,64 @@ update_metainfo_datafile(void)
 						LOG_METAINFO_DATAFILE_TMP, LOG_METAINFO_DATAFILE)));
 }
 
+/*
+ * The pid file of logging collector.
+ */
+#define PID_FILE "logging_collector.pid"
+
+/*
+ * Clean up the pid file of logging collector.
+ */
+static void
+syslogger_on_exit(int code, Datum arg)
+{
+	unlink(PID_FILE);
+}
+
+/*
+ * Publish the PID of logging collector to a pid file.
+ */
+static void
+syslogger_publish_pid()
+{
+	int fd;
+	char buffer[32];
+
+	/*
+	 * Postmaster switches working directory to data directory, so we
+	 * can use relative file name. If the pid file already exists, just
+	 * overwrite it. Concurrency with other postgres instances should
+	 * have been handled by postmaster.
+	 */
+	fd = open(PID_FILE, O_RDWR | O_CREAT | O_TRUNC, 0600);
+
+	if (fd < 0)
+	{
+		ereport(LOG,
+				(errcode_for_file_access(),
+				 errmsg("could not create log_collector PID file \"%s\": %m",
+						PID_FILE)));
+		return;
+	}
+
+	snprintf(buffer, sizeof(buffer), "%d\n", (int) getpid());
+
+	if (write(fd, buffer, strlen(buffer)) != strlen(buffer))
+	{
+		ereport(LOG,
+				(errcode_for_file_access(),
+				 errmsg("could not write log_collector PID file \"%s\": %m",
+						PID_FILE)));
+		close(fd);
+		unlink(PID_FILE);
+		return;
+	}
+
+	close(fd);
+
+	on_proc_exit(syslogger_on_exit, 0);
+}
+
 /* --------------------------------
  *		signal handler routines
  * --------------------------------

Reply via email to