On 24/06/18 13:22, Adrien Nayrat wrote: > Attached third version of the patch with documentation.
Hi. I'm reviewing this. > exceeded = (log_min_duration_statement == 0 || > (log_min_duration_statement > 0 && > (secs > log_min_duration_statement / > 1000 || > - secs * 1000 + msecs >= > log_min_duration_statement))); > + secs * 1000 + msecs >= > log_min_duration_statement))) && > + log_sample_rate != 0 && (log_sample_rate == > 1 || > + random() <= log_sample_rate * > MAX_RANDOM_VALUE); A few notes on this part, which is the crux of the patch. 1) Is there an off-by-one error here? drandom() has the formula (double) random() / ((double) MAX_RANDOM_VALUE + 1) but yours doesn't look like that. 2) I think the whole thing should be separated into its own expression instead of tagging along on an already hard to read expression. 3) Is it intentional to only sample with log_min_duration_statement and not also with log_duration? It seems like it should affect both. In both cases, the name is too generic. Something called "log_sample_rate" should sample **everything**. Otherwise I think it's good. Attached is a revised patch that fixes 2) as well as some wordsmithing throughout. It does not deal with the other issues I raised. Marked "Waiting on Author". -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7bfbc87109..90bbe6d423 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -5190,6 +5190,26 @@ local0.* /var/log/postgresql </listitem> </varlistentry> + <varlistentry id="guc-log-sample-rate" xreflabel="log_sample_rate"> + <term><varname>log_sample_rate</varname> (<type>real</type>) + <indexterm> + <primary><varname>log_sample_rate</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Causes logging only a fraction of the statements when <xref + linkend="log_min_duration_statement"/> is used. The default is + <literal>1</literal>, meaning log all statements longer than + <varname>log_min_duration_statement</varname>. Setting this to zero + disables logging, same as setting + <varname>log_min_duration_statement</varname> to + <literal>-1</literal>. Using <varname>log_sample_rate</varname> is + helpful when the traffic is too high to log all queries. + </para> + </listitem> + </varlistentry> + </variablelist> <para> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index f4133953be..5590f9a9d4 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -2115,7 +2115,8 @@ check_log_statement(List *stmt_list) /* * check_log_duration - * Determine whether current command's duration should be logged + * Determine whether current command's duration should be logged. + * If log_sample_rate < 1.0, log only a sample. * * Returns: * 0 if no logging is needed @@ -2137,6 +2138,7 @@ check_log_duration(char *msec_str, bool was_logged) int usecs; int msecs; bool exceeded; + bool in_sample; TimestampDifference(GetCurrentStatementStartTimestamp(), GetCurrentTimestamp(), @@ -2153,7 +2155,15 @@ check_log_duration(char *msec_str, bool was_logged) (secs > log_min_duration_statement / 1000 || secs * 1000 + msecs >= log_min_duration_statement))); - if (exceeded || log_duration) + /* + * Do not log if log_sample_rate = 0. + * Log a sample if log_sample_rate <= 1 and avoid unecessary random() + * call if log_sample_rate = 1. + */ + in_sample = log_sample_rate != 0 && + (log_sample_rate == 1 || random() <= log_sample_rate * MAX_RANDOM_VALUE); + + if (exceeded && in_sample || log_duration) { snprintf(msec_str, 32, "%ld.%03d", secs * 1000 + msecs, usecs % 1000); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 859ef931e7..e89944789e 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -455,6 +455,7 @@ int log_min_messages = WARNING; int client_min_messages = NOTICE; int log_min_duration_statement = -1; int log_temp_files = -1; +double log_sample_rate = 1.0; int trace_recovery_messages = LOG; int temp_file_limit = -1; @@ -3257,6 +3258,17 @@ static struct config_real ConfigureNamesReal[] = NULL, NULL, NULL }, + { + {"log_sample_rate", PGC_SUSET, LOGGING_WHEN, + gettext_noop("Fraction of statements to log."), + gettext_noop("1.0 logs all statements."), + NULL + }, + &log_sample_rate, + 1.0, 0.0, 1.0, + NULL, NULL, NULL + }, + /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 9e39baf466..3f9a016003 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -447,6 +447,8 @@ # statements running at least this number # of milliseconds +#log_sample_rate = 1 # Fraction of logged statements. 1 means log all + # statements. # - What to Log - diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h index f462eabe59..f50a0d1f1b 100644 --- a/src/include/utils/guc.h +++ b/src/include/utils/guc.h @@ -252,6 +252,7 @@ extern PGDLLIMPORT int log_min_messages; extern PGDLLIMPORT int client_min_messages; extern int log_min_duration_statement; extern int log_temp_files; +extern double log_sample_rate; extern int temp_file_limit;