Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

Andres Freund Wed, 05 Apr 2017 12:18:45 -0700

Hi,


I'm somewhat inclined to think that this really would be better done in
an extension like pg_stat_statements.


On 2017-03-08 14:39:23 +1100, Haribabu Kommi wrote:
> On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier <[email protected]>

> +     <varlistentry id="guc-track-sql" xreflabel="track_sql">
> +      <term><varname>track_sql</varname> (<type>boolean</type>)
> +      <indexterm>
> +       <primary><varname>track_sql</> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Enables collection of different SQL statement statistics that are
> +        executed on the instance. This parameter is off by default. Only
> +        superusers can change this setting.
> +       </para>
> +      </listitem>
> +     </varlistentry>
> +

I don't like this name much, seems a bit too generic to
me. 'track_activities', 'track_io_timings' are at least a bit clearer.
How about track_statement_statistics + corresponding view/function renaming?


> +  <table id="pg-stat-sql-view" xreflabel="pg_stat_sql">
> +   <title><structname>pg_stat_sql</structname> View</title>
> +   <tgroup cols="3">
> +    <thead>
> +    <row>
> +      <entry>Column</entry>
> +      <entry>Type</entry>
> +      <entry>Description</entry>
> +     </row>
> +    </thead>
> +
> +   <tbody>
> +    <row>
> +     <entry><structfield>tag</></entry>
> +     <entry><type>text</></entry>
> +     <entry>Name of the SQL statement</entry>
> +    </row>

It's not really the "name" of a statement. Internally and IIRC elsewhere
in the docs we describe something like this as tag?


> +/* ----------
> + * pgstat_send_sqlstmt(void)
> + *
> + *           Send SQL statement statistics to the collector
> + * ----------
> + */
> +static void
> +pgstat_send_sqlstmt(void)
> +{
> +     PgStat_MsgSqlstmt msg;
> +     PgStat_SqlstmtEntry *entry;
> +     HASH_SEQ_STATUS hstat;
> +
> +     if (pgstat_backend_sql == NULL)
> +             return;
> +
> +     pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SQLSTMT);
> +     msg.m_nentries = 0;
> +
> +     hash_seq_init(&hstat, pgstat_backend_sql);
> +     while ((entry = (PgStat_SqlstmtEntry *) hash_seq_search(&hstat)) != 
> NULL)
> +     {
> +             PgStat_SqlstmtEntry *m_ent;
> +
> +             /* Skip it if no counts accumulated since last time */
> +             if (entry->count == 0)
> +                     continue;
> +
> +             /* need to convert format of time accumulators */
> +             m_ent = &msg.m_entry[msg.m_nentries];
> +             memcpy(m_ent, entry, sizeof(PgStat_SqlstmtEntry));
> +
> +             if (++msg.m_nentries >= PGSTAT_NUM_SQLSTMTS)
> +             {
> +                     pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, 
> m_entry[0]) +
> +                                             msg.m_nentries * 
> sizeof(PgStat_SqlstmtEntry));
> +                     msg.m_nentries = 0;
> +             }
> +
> +             /* reset the entry's count */
> +             entry->count = 0;
> +     }
> +
> +     if (msg.m_nentries > 0)
> +             pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> +                                     msg.m_nentries * 
> sizeof(PgStat_SqlstmtEntry));

Hm. So pgstat_backend_sql is never deleted, which'll mean that if a
backend lives long we'll continually iterate over a lot of 0 entries.  I
think performance evaluation of this feature should take that into
account.


> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view
> + */
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> +     PgStat_SqlstmtEntry *htabent;
> +     bool            found;
> +
> +     if (!pgstat_backend_sql)
> +     {
> +             /* First time through - initialize SQL statement stat table */
> +             HASHCTL         hash_ctl;
> +
> +             memset(&hash_ctl, 0, sizeof(hash_ctl));
> +             hash_ctl.keysize = NAMEDATALEN;
> +             hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> +             pgstat_backend_sql = hash_create("SQL statement stat entries",
> +                                                                             
>  PGSTAT_SQLSTMT_HASH_SIZE,
> +                                                                             
>  &hash_ctl,
> +                                                                             
>  HASH_ELEM | HASH_BLOBS);
> +     }
> +
> +     /* Get the stats entry for this SQL statement, create if necessary */
> +     htabent = hash_search(pgstat_backend_sql, commandTag,
> +                                               HASH_ENTER, &found);
> +     if (!found)
> +             htabent->count = 1;
> +     else
> +             htabent->count++;
> +}


That's not really something in this patch, but a lot of this would be
better if we didn't internally have command tags as strings, but as an
enum.  We should really only convert to a string with needed.  That
we're doing string comparisons on Portal->commandTag is just plain bad.

If so, we could also make this whole patch a lot cheaper - have a fixed
size array that has an entry for every possible tag (possibly even in
shared memory, and then use atomics there).

> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>  
>               PortalDrop(portal, false);
>  
> +             /*
> +              * Count SQL statement for pg_stat_sql view
> +              */
> +             if (pgstat_track_sql)
> +                     pgstat_count_sqlstmt(commandTag);
> +

Isn't doing this at the message level quite wrong?  What about
statements executed from functions and such?  Shouldn't this integrate
at the executor level instead?


>               if (IsA(parsetree->stmt, TransactionStmt))
>               {
>                       /*
> @@ -1991,6 +1997,29 @@ exec_execute_message(const char *portal_name, long 
> max_rows)
>  
>       (*receiver->rDestroy) (receiver);
>  
> +     /*
> +      * Count SQL Statement for pgx_stat_sql
> +      */
> +     if (pgstat_track_sql)
> +     {
> +             CachedPlanSource *psrc = NULL;
> +
> +             if (portal->prepStmtName)
> +             {
> +                     PreparedStatement *pstmt;
> +
> +                     pstmt = FetchPreparedStatement(portal->prepStmtName, 
> false);
> +                     if (pstmt)
> +                             psrc = pstmt->plansource;
> +             }
> +             else
> +                     psrc = unnamed_stmt_psrc;
> +
> +             /* psrc should not be NULL here */
> +             if (psrc && psrc->commandTag && !execute_is_fetch && 
> pgstat_track_sql)
> +                     pgstat_count_sqlstmt(psrc->commandTag);

Wait, we're re-fetching the statement here?  That doesn't sound
alright.


> +Datum
> +pg_stat_sql(PG_FUNCTION_ARGS)
> +{
> +     TupleDesc       tupdesc;
> +     Datum           values[2];
> +     bool            nulls[2];
> +     ReturnSetInfo *rsi;
> +     MemoryContext old_cxt;
> +     Tuplestorestate *tuple_store;
> +     PgStat_SqlstmtEntry *sqlstmtentry;
> +     HASH_SEQ_STATUS hstat;
> +
> +     /* Function call to let the backend read the stats file */
> +     pgstat_fetch_global();
> +
> +     /* Initialize values and nulls arrays */
> +     MemSet(values, 0, sizeof(values));
> +     MemSet(nulls, 0, sizeof(nulls));

why not instead declare values[2] = {0}?  Obviously not important.


> +     tuple_store =
> +             tuplestore_begin_heap(rsi->allowedModes & 
> SFRM_Materialize_Random,
> +                                                       false, work_mem);

Huh, why do we need random?



Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

Reply via email to