On 06/27/2018 11:13 PM, Adrien Nayrat wrote: >> 3) Is it intentional to only sample with log_min_duration_statement and >> not also with log_duration? It seems like it should affect both. In >> both cases, the name is too generic. Something called "log_sample_rate" >> should sample **everything**. > I do not think it could be useful to sample other case such as log_duration. > > But yes, the GUC is confusing and I am not comfortable to introduce a new GUC > in > my initial patch. > > Maybe we should adapt current GUC with something like : > > log_min_duration_statement = <time>,<sample rate>> > This give : > > log_min_duration_statement = 0,0.1 > > Equivalent to : > log_min_duration_statement = 0 > log_sample_rate = 0.1 > > Thought? >
After reflection it seems a bad idea : * it breaks compatibility with external tools * it introduce a kind of "composite" GUC which may add complexity to use. For example in pg_settings view. What do you think of : log_min_duration_statement_sample ? Is it too long? I saw a few days ago this error on http://commitfest.cputube.org postgres.sgml:5202: element xref: validity error : IDREF attribute linkend references an unknown ID "log_min_duration_statement" Patch attached with fix on linkend marker Regards, -- Adrien
commit d388b1e31926c2f5f12aa5453cc4df6e7d9e5f60 Author: anayrat <adrien.nay...@anayrat.info> Date: Tue Jul 10 19:58:16 2018 +0200 Add a new GUC, log_sample_rate, to log a fraction of queries. diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e307bb4e8e..72791ad2a6 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -5190,6 +5190,26 @@ local0.* /var/log/postgresql </listitem> </varlistentry> + <varlistentry id="guc-log-sample-rate" xreflabel="log_sample_rate"> + <term><varname>log_sample_rate</varname> (<type>real</type>) + <indexterm> + <primary><varname>log_sample_rate</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Causes logging only a fraction of the statements when <xref + linkend="guc-log-min-duration-statement"/> is used. The default is + <literal>1</literal>, meaning log all statements longer than + <varname>log_min_duration_statement</varname>. Setting this to zero + disables logging, same as setting + <varname>log_min_duration_statement</varname> to + <literal>-1</literal>. Using <varname>log_sample_rate</varname> is + helpful when the traffic is too high to log all queries. + </para> + </listitem> + </varlistentry> + </variablelist> <para> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index f4133953be..5590f9a9d4 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -2115,7 +2115,8 @@ check_log_statement(List *stmt_list) /* * check_log_duration - * Determine whether current command's duration should be logged + * Determine whether current command's duration should be logged. + * If log_sample_rate < 1.0, log only a sample. * * Returns: * 0 if no logging is needed @@ -2137,6 +2138,7 @@ check_log_duration(char *msec_str, bool was_logged) int usecs; int msecs; bool exceeded; + bool in_sample; TimestampDifference(GetCurrentStatementStartTimestamp(), GetCurrentTimestamp(), @@ -2153,7 +2155,15 @@ check_log_duration(char *msec_str, bool was_logged) (secs > log_min_duration_statement / 1000 || secs * 1000 + msecs >= log_min_duration_statement))); - if (exceeded || log_duration) + /* + * Do not log if log_sample_rate = 0. + * Log a sample if log_sample_rate <= 1 and avoid unecessary random() + * call if log_sample_rate = 1. + */ + in_sample = log_sample_rate != 0 && + (log_sample_rate == 1 || random() <= log_sample_rate * MAX_RANDOM_VALUE); + + if (exceeded && in_sample || log_duration) { snprintf(msec_str, 32, "%ld.%03d", secs * 1000 + msecs, usecs % 1000); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 17292e04fe..daf1ae3af1 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -455,6 +455,7 @@ int log_min_messages = WARNING; int client_min_messages = NOTICE; int log_min_duration_statement = -1; int log_temp_files = -1; +double log_sample_rate = 1.0; int trace_recovery_messages = LOG; int temp_file_limit = -1; @@ -3257,6 +3258,17 @@ static struct config_real ConfigureNamesReal[] = NULL, NULL, NULL }, + { + {"log_sample_rate", PGC_SUSET, LOGGING_WHEN, + gettext_noop("Fraction of statements to log."), + gettext_noop("1.0 logs all statements."), + NULL + }, + &log_sample_rate, + 1.0, 0.0, 1.0, + NULL, NULL, NULL + }, + /* End-of-list marker */ { {NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 657c3f81f8..b8ebbe9f7f 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -446,6 +446,8 @@ # statements running at least this number # of milliseconds +#log_sample_rate = 1 # Fraction of logged statements. 1 means log all + # statements. # - What to Log - diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h index f462eabe59..f50a0d1f1b 100644 --- a/src/include/utils/guc.h +++ b/src/include/utils/guc.h @@ -252,6 +252,7 @@ extern PGDLLIMPORT int log_min_messages; extern PGDLLIMPORT int client_min_messages; extern int log_min_duration_statement; extern int log_temp_files; +extern double log_sample_rate; extern int temp_file_limit;
signature.asc
Description: OpenPGP digital signature