pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-01 Thread Andres Freund
Hi,

Since

commit 960869da0803427d14335bba24393f414b476e2c
Author: Magnus Hagander 
Date:   2021-01-17 13:34:09 +0100

Add pg_stat_database counters for sessions and session time

pgstat_report_stat() does another timestamp computation via
pgstat_send_connstats(), despite typically having computed one just a few
lines before.

Given that timestamp computation isn't all that cheap, that's not great. Even
more, that additional timestamp computation makes things *less* accurate:

void
pgstat_report_stat(bool disconnect)
...
/*
 * Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL
 * msec since we last sent one, or the backend is about to exit.
 */
now = GetCurrentTransactionStopTimestamp();
if (!disconnect &&
!TimestampDifferenceExceeds(last_report, now, 
PGSTAT_STAT_INTERVAL))
return;

/* for backends, send connection statistics */
if (MyBackendType == B_BACKEND)
pgstat_send_connstats(disconnect, last_report);

last_report = now;

and then pgstat_send_connstats() does:
/* session time since the last report */
TimestampDifference(((last_report == 0) ? MyStartTimestamp : 
last_report),
GetCurrentTimestamp(),
&secs, &usecs);
msg.m_session_time = secs * 100 + usecs;

We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.


I'm also not all that happy with sending yet another UDP packet for this. This
doubles the UDP packets to the stats collector in the common cases (unless
more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
been accessed). We already send plenty of "summary" information via
PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
separate message for connection stats?


Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c.  That'd give us *more*
information for less costs, because we then could see idle/active times for
individual connections.

That'd likely be more change than what we would want to do at this point in
the release cycle though. But I think we ought to do something about the
increased overhead...

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Andrew Dunstan


On 9/8/21 12:11 AM, Laurenz Albe wrote:
> On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote:
>> Here's how I think that would look like. While writing up this draft, I found
>> two more issues:
>>
>> - On windows / 32 bit systems, the session time would overflow if idle for
>>   longer than ~4300s. long is only 32 bit. Easy to fix obviously.
>> - Right now walsenders, including database connected walsenders, are not
>>   reported in connection stats. That doesn't seem quite right to me.
>>
>> In the patch I made the message for connecting an explicitly reported 
>> message,
>> that seems cleaner, because it then happens at a clearly defined point. I
>> didn't do the same for disconnecting, but perhaps that would be better? Then
>> we could get rid of the whole pgStatSessionEndCause variable.
> I have gone over your patch and made the following changes:
>
> - cache the last report time in a static variable pgLastSessionReportTime
> - add a comment to explain why we only track normal backends
> - pgindent
> - an attempt at a commit message
>

Hi,


this is an open item for release 14. Is someone going to commit?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Andres Freund
On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote:
> this is an open item for release 14. Is someone going to commit?

Will do. Although I do wish the original committer would have chimed in at
some point...




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Michael Paquier
On Wed, Sep 15, 2021 at 01:48:09PM -0700, Andres Freund wrote:
> Will do. Although I do wish the original committer would have chimed in at
> some point...

Thanks, Andres.
--
Michael


signature.asc
Description: PGP signature


Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-16 Thread Andres Freund
Hi,

On 2021-09-08 06:11:56 +0200, Laurenz Albe wrote:
> I have gone over your patch and made the following changes:
> 
> - cache the last report time in a static variable pgLastSessionReportTime
> - add a comment to explain why we only track normal backends
> - pgindent
> - an attempt at a commit message

Thanks!

I pushed this. The only substantive change I made is that I moved the
MyBackendType == B_BACKEND check into a new pgstat_should_report_connstat(),
and called that from pgstat_report_connect() and
pgstat_report_disconnect(). Otherwise we'd report connect/disconnect for
walsenders, but not the session time, which seems even weirder.

Closed the open item.

I'm not convinced that ignorign walsenders is a good policy decision, but I
think that's a fair judgement call, and we can evolve that in future
releases...


One thing that'd be nice to have is a few regression tests in stats.sql. Of
course we couldn't test precise counts, but just verifying that some of the
counters increase seems like it would provide a decent increase in "actual"
coverage. One of the more painful things in working on the shared memory stats
patch is that nearly all stats features have no tests, making it very easy to
silently break things.

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-16 Thread Laurenz Albe
On Thu, 2021-09-16 at 02:22 -0700, Andres Freund wrote:
> I pushed this. The only substantive change I made is that I moved the
> MyBackendType == B_BACKEND check into a new pgstat_should_report_connstat(),
> and called that from pgstat_report_connect() and
> pgstat_report_disconnect(). Otherwise we'd report connect/disconnect for
> walsenders, but not the session time, which seems even weirder.

Oh, right, I didn't think of that.

Thanks for all the work you put into this.

> I'm not convinced that ignorign walsenders is a good policy decision, but I
> think that's a fair judgement call, and we can evolve that in future
> releases...

The use case that is on my mind is to figure out the percentage of time
when transactions are idle to be able to estimate the connection pool size.
walsenders, for which you typically don't want to use a connection pool,
would skew the statistics.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-16 Thread Magnus Hagander
On Wed, Sep 15, 2021 at 10:48 PM Andres Freund  wrote:
>
> On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote:
> > this is an open item for release 14. Is someone going to commit?
>
> Will do. Although I do wish the original committer would have chimed in at
> some point...

Crap. My apologies for this -- I had somehow managed to press the
"mute" button in gmail on tihs thread when I cleaned things up after
returning from my vacation, so I hadn't spotted anything in it until I
saw your commit happening a short wihle ago. I have no idea how I
managed to do that, but clearly I did. Sorry!

AIUI the commit has taken care of the urgent things for 14, correct? I
see there are some discussions points left and I'll review the thread
and see if I have any input on those as well a bit later, of course,
but just so I'm not missing something else that's urgent?

Again, my apologies for completely missing this!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-16 Thread Andres Freund
Hi,

On 2021-09-16 16:38:04 +0200, Magnus Hagander wrote:
> On Wed, Sep 15, 2021 at 10:48 PM Andres Freund  wrote:
> >
> > On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote:
> > > this is an open item for release 14. Is someone going to commit?
> >
> > Will do. Although I do wish the original committer would have chimed in at
> > some point...
> 
> Crap. My apologies for this -- I had somehow managed to press the
> "mute" button in gmail on tihs thread when I cleaned things up after
> returning from my vacation, so I hadn't spotted anything in it until I
> saw your commit happening a short wihle ago. I have no idea how I
> managed to do that, but clearly I did. Sorry!

Hah, that explains that ;)


> AIUI the commit has taken care of the urgent things for 14, correct? I
> see there are some discussions points left and I'll review the thread
> and see if I have any input on those as well a bit later, of course,
> but just so I'm not missing something else that's urgent?

I think so.

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-17 Thread Laurenz Albe
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
> Since
> 
> commit 960869da0803427d14335bba24393f414b476e2c
> Author: Magnus Hagander 
> Date:   2021-01-17 13:34:09 +0100
> 
> Add pg_stat_database counters for sessions and session time
> 
> pgstat_report_stat() does another timestamp computation via
> pgstat_send_connstats(), despite typically having computed one just a few
> lines before.
> 
> Given that timestamp computation isn't all that cheap, that's not great. Even
> more, that additional timestamp computation makes things *less* accurate:
> 
> void
> pgstat_report_stat(bool disconnect)
> ...
>   /*
>* Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL
>* msec since we last sent one, or the backend is about to exit.
>*/
>   now = GetCurrentTransactionStopTimestamp();
>   if (!disconnect &&
>   !TimestampDifferenceExceeds(last_report, now, 
> PGSTAT_STAT_INTERVAL))
>   return;
> 
>   /* for backends, send connection statistics */
>   if (MyBackendType == B_BACKEND)
>   pgstat_send_connstats(disconnect, last_report);
> 
>   last_report = now;
> 
> and then pgstat_send_connstats() does:
>   /* session time since the last report */
>   TimestampDifference(((last_report == 0) ? MyStartTimestamp : 
> last_report),
>   GetCurrentTimestamp(),
>   &secs, &usecs);
>   msg.m_session_time = secs * 100 + usecs;
> 
> We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
> a separate timestamp in pgstat_send_connstats() to compute the difference from
> last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
> return value.

Makes sense to me.  How about passing "now", which was just initialized from
GetCurrentTransactionStopTimestamp(), as additional parameter to
pgstat_send_connstats() and use that value instead of taking the current time?

> I'm also not all that happy with sending yet another UDP packet for this. This
> doubles the UDP packets to the stats collector in the common cases (unless
> more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
> been accessed). We already send plenty of "summary" information via
> PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
> separate message for connection stats?

Are you suggesting that connection statistics should be shoehorned into
some other statistics message?  That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.

> Alternatively we could just not send an update to connection stats every 500ms
> for every active connection, but only do so at connection end. The database
> stats would only contain stats for disconnected sessions, while the stats for
> "live" connections are maintained via backend_status.c.  That'd give us *more*
> information for less costs, because we then could see idle/active times for
> individual connections.

That was my original take, but if I remember right, Magnus convinced me that
it would be more useful to have statistics for open sessions as well.
With a connection pool, connections can stay open for a very long time,
and the accuracy and usefulness of the statistics would become questionable.

> That'd likely be more change than what we would want to do at this point in
> the release cycle though. But I think we ought to do something about the
> increased overhead...

If you are talking about the extra GetCurrentTimestamp() call, and my first
suggestion is acceptable, that should be simple.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-17 Thread Andres Freund
Hi,

On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
> On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
> > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then 
> > use
> > a separate timestamp in pgstat_send_connstats() to compute the difference 
> > from
> > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
> > return value.
> 
> Makes sense to me.  How about passing "now", which was just initialized from
> GetCurrentTransactionStopTimestamp(), as additional parameter to
> pgstat_send_connstats() and use that value instead of taking the current time?

Yes.


> > I'm also not all that happy with sending yet another UDP packet for this. 
> > This
> > doubles the UDP packets to the stats collector in the common cases (unless
> > more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
> > been accessed). We already send plenty of "summary" information via
> > PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
> > separate message for connection stats?
> 
> Are you suggesting that connection statistics should be shoehorned into
> some other statistics message?  That would reduce the number of UDP packets,
> but it sounds ugly and confusing to me.

That ship already has sailed. Look at struct PgStat_MsgTabstat

typedef struct PgStat_MsgTabstat
{
PgStat_MsgHdr m_hdr;
Oid m_databaseid;
int m_nentries;
int m_xact_commit;
int m_xact_rollback;
PgStat_Counter m_block_read_time;   /* times in microseconds */
PgStat_Counter m_block_write_time;
PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
} PgStat_MsgTabstat;

Given that we transport number of commits/commits, block read/write time
adding the time the connection was active/inactive doesn't really seem like it
makes things meaningfully worse?


> > Alternatively we could just not send an update to connection stats every 
> > 500ms
> > for every active connection, but only do so at connection end. The database
> > stats would only contain stats for disconnected sessions, while the stats 
> > for
> > "live" connections are maintained via backend_status.c.  That'd give us 
> > *more*
> > information for less costs, because we then could see idle/active times for
> > individual connections.
> 
> That was my original take, but if I remember right, Magnus convinced me that
> it would be more useful to have statistics for open sessions as well.
> With a connection pool, connections can stay open for a very long time,
> and the accuracy and usefulness of the statistics would become questionable.

That's not a contradiction to what I propose. Having the data available via
backend_status.c allows to sum up the data for active connections and the data
for past connections.

I think it's also just cleaner to not constantly report changing preliminary
data as pgstat_send_connstats() does.


> > That'd likely be more change than what we would want to do at this point in
> > the release cycle though. But I think we ought to do something about the
> > increased overhead...
> 
> If you are talking about the extra GetCurrentTimestamp() call, and my first
> suggestion is acceptable, that should be simple.

Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14. It increases the likelihood of
dropping stats messages on busy systems, which can have downstream impacts.

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-17 Thread Laurenz Albe
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
> On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
> > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
> > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then 
> > > use
> > > a separate timestamp in pgstat_send_connstats() to compute the difference 
> > > from
> > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
> > > return value.
> > 
> > Makes sense to me.  How about passing "now", which was just initialized from
> > GetCurrentTransactionStopTimestamp(), as additional parameter to
> > pgstat_send_connstats() and use that value instead of taking the current 
> > time?
> 
> Yes.

Here is a patch for that.

> > > I'm also not all that happy with sending yet another UDP packet for this.
> > 
> > Are you suggesting that connection statistics should be shoehorned into
> > some other statistics message?  That would reduce the number of UDP packets,
> > but it sounds ugly and confusing to me.
> 
> That ship already has sailed. Look at struct PgStat_MsgTabstat
> 
> Given that we transport number of commits/commits, block read/write time
> adding the time the connection was active/inactive doesn't really seem like it
> makes things meaningfully worse?

Point taken.

I looked at the other statistics sent in pgstat_report_stat(), and I see
none that are sent unconditionally.  Are you thinking of this:

/*
 * Send partial messages.  Make sure that any pending xact commit/abort
 * gets counted, even if there are no table stats to send.
 */
if (regular_msg.m_nentries > 0 ||
pgStatXactCommit > 0 || pgStatXactRollback > 0)
pgstat_send_tabstat(®ular_msg);
if (shared_msg.m_nentries > 0)
pgstat_send_tabstat(&shared_msg);

I can't think of a way to hack this up that wouldn't make my stomach turn.

> > > Alternatively we could just not send an update to connection stats every 
> > > 500ms
> > > for every active connection, but only do so at connection end. The 
> > > database
> > > stats would only contain stats for disconnected sessions, while the stats 
> > > for
> > > "live" connections are maintained via backend_status.c.
> > 
> > That was my original take, but if I remember right, Magnus convinced me that
> > it would be more useful to have statistics for open sessions as well.
> > With a connection pool, connections can stay open for a very long time,
> > and the accuracy and usefulness of the statistics would become questionable.
> 
> That's not a contradiction to what I propose. Having the data available via
> backend_status.c allows to sum up the data for active connections and the data
> for past connections.
> 
> I think it's also just cleaner to not constantly report changing preliminary
> data as pgstat_send_connstats() does.

Currently, the data are kept in static variables in the backend process.
That would have to change for such an approach, right?

> Doubling the number of UDP messages in common workloads seems also problematic
> enough that it should be addressed for 14.

Ok, but I don't know how to go about it.

Yours,
Laurenz Albe
From 951820a8e312584f283ef4b5bde006c7f41e0d9d Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 18 Aug 2021 04:52:57 +0200
Subject: [PATCH] Improve performance and accuracy of session statistics

Rather than calling GetCurrentTimestamp() in pgstat_send_connstats(),
which causes an additional system call, reuse the value just
calculated in pgstat_report_stat().

Since the "now" value in pgstat_report_stat() will become the next
"last_report" value, this will also improve the accuracy of the
statistics, since the timestamp difference calculated in
pgstat_send_connstats() will become the correct one.

This patch does not address the issue that yet another UDP packet
is sent to the statistics collector for session statistics.

Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de
---
 src/backend/postmaster/pgstat.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index cecc30..77d734a0ba 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -324,7 +324,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
-static void pgstat_send_connstats(bool disconnect, TimestampTz last_report);
+static void pgstat_send_connstats(bool disconnect, TimestampTz last_report, TimestampTz now);
 
 static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared);
 
@@ -879,7 +879,7 @@ pgstat_report_stat(bool disconnect)
 
 	/* for backends, send connection statistics */
 	if (MyBackendType == B_BACKEND)
-		pgstat_send_connstats(disconnect, last_report);
+		pgstat_send_connstats(disconnect, last_report, now);
 
 	la

Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-20 Thread Justin Pryzby
On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote:
> Doubling the number of UDP messages in common workloads seems also problematic
> enough that it should be addressed for 14. It increases the likelihood of
> dropping stats messages on busy systems, which can have downstream impacts.

I think by "common workloads" you mean one with many, shortlived sessions.

That does sounds like a concern, and I added this as an Opened Item.

-- 
Justin




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-23 Thread Kyotaro Horiguchi
At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe  
wrote in 
> On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
> > On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
> > > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
> > > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but 
> > > > then use
> > > > a separate timestamp in pgstat_send_connstats() to compute the 
> > > > difference from
> > > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
> > > > return value.
> > > 
> > > Makes sense to me.  How about passing "now", which was just initialized 
> > > from
> > > GetCurrentTransactionStopTimestamp(), as additional parameter to
> > > pgstat_send_connstats() and use that value instead of taking the current 
> > > time?
> > 
> > Yes.
> 
> Here is a patch for that.

FWIW, looks good to me.

> > > > I'm also not all that happy with sending yet another UDP packet for 
> > > > this.
> > > 
> > > Are you suggesting that connection statistics should be shoehorned into
> > > some other statistics message?  That would reduce the number of UDP 
> > > packets,
> > > but it sounds ugly and confusing to me.
> > 
> > That ship already has sailed. Look at struct PgStat_MsgTabstat
> > 
> > Given that we transport number of commits/commits, block read/write time
> > adding the time the connection was active/inactive doesn't really seem like 
> > it
> > makes things meaningfully worse?
> 
> Point taken.
> 
> I looked at the other statistics sent in pgstat_report_stat(), and I see
> none that are sent unconditionally.  Are you thinking of this:

IIUC, that means that pg_stat_report sends at least one
PgStat_MsgTabstat struct for the database stats purpose if any stats
are sent.  So the connection stats can piggy-back on the packet.

> /*
>  * Send partial messages.  Make sure that any pending xact commit/abort
>  * gets counted, even if there are no table stats to send.
>  */
> if (regular_msg.m_nentries > 0 ||
> pgStatXactCommit > 0 || pgStatXactRollback > 0)
> pgstat_send_tabstat(®ular_msg);
> if (shared_msg.m_nentries > 0)
> pgstat_send_tabstat(&shared_msg);
> 
> I can't think of a way to hack this up that wouldn't make my stomach turn.

No need to change the condition. It's sufficient that the connection
stats are sent at the same time with transaction stats are sent.

> > > > Alternatively we could just not send an update to connection stats 
> > > > every 500ms
> > > > for every active connection, but only do so at connection end. The 
> > > > database
> > > > stats would only contain stats for disconnected sessions, while the 
> > > > stats for
> > > > "live" connections are maintained via backend_status.c.
> > > 
> > > That was my original take, but if I remember right, Magnus convinced me 
> > > that
> > > it would be more useful to have statistics for open sessions as well.
> > > With a connection pool, connections can stay open for a very long time,
> > > and the accuracy and usefulness of the statistics would become 
> > > questionable.
> > 
> > That's not a contradiction to what I propose. Having the data available via
> > backend_status.c allows to sum up the data for active connections and the 
> > data
> > for past connections.
> > 
> > I think it's also just cleaner to not constantly report changing preliminary
> > data as pgstat_send_connstats() does.
> 
> Currently, the data are kept in static variables in the backend process.
> That would have to change for such an approach, right?

Total session time can be summarized from beentry any time, but I'm
not sure how we can summarize active/idle time.. Anyway it's not
needed if the attached works.

> > Doubling the number of UDP messages in common workloads seems also 
> > problematic
> > enough that it should be addressed for 14.
> 
> Ok, but I don't know how to go about it.

The attached is a heavy-WIP on:

- remove redundant gettimeofday().
- avoid sending dedicate UCP packet for connection stats.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index a3c35bdf60..f48ee80fc7 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -329,11 +329,11 @@ static bool pgstat_db_requested(Oid databaseid);
 static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
 static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
 
-static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
+static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, bool disconnect,
+TimestampTz last_report, TimestampTz now);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
-static void pgstat_send_connstats(bool disconnect, TimestampTz last_report);
 
 static PgStat_TableStatus *get_tabstat_entr

Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-24 Thread Laurenz Albe
On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote:
> At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe  
> wrote in 
> > On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
> > > > > I'm also not all that happy with sending yet another UDP packet for 
> > > > > this.
> > > > 
> > > > Are you suggesting that connection statistics should be shoehorned into
> > > > some other statistics message?  That would reduce the number of UDP 
> > > > packets,
> > > > but it sounds ugly and confusing to me.
> > > 
> > > That ship already has sailed. Look at struct PgStat_MsgTabstat
> > > 
> > > Given that we transport number of commits/commits, block read/write time
> > > adding the time the connection was active/inactive doesn't really seem 
> > > like it
> > > makes things meaningfully worse?
> > 
> > Point taken.
> > 
> > I looked at the other statistics sent in pgstat_report_stat(), and I see
> > none that are sent unconditionally.
> 
> IIUC, that means that pg_stat_report sends at least one
> PgStat_MsgTabstat struct for the database stats purpose if any stats
> are sent.  So the connection stats can piggy-back on the packet.
> 
> No need to change the condition. It's sufficient that the connection
> stats are sent at the same time with transaction stats are sent.

> > > Doubling the number of UDP messages in common workloads seems also 
> > > problematicenough that it should be addressed for 14.
> > 
> > Ok, but I don't know how to go about it.
> 
> The attached is a heavy-WIP on:
> 
> - remove redundant gettimeofday().
> - avoid sending dedicate UCP packet for connection stats.

Thank you.

Perhaps I misread that, but doesn't that mean that the session statistics
could be sent several times?  "pgstat_send_tabstat()" could be called more than
once, right?

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-24 Thread Kyotaro Horiguchi
At Tue, 24 Aug 2021 12:34:25 +0200, Laurenz Albe  
wrote in 
> On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote:
> > 
> > - remove redundant gettimeofday().
> > - avoid sending dedicate UCP packet for connection stats.
> 
> Thank you.
> 
> Perhaps I misread that, but doesn't that mean that the session statistics
> could be sent several times?  "pgstat_send_tabstat()" could be called more 
> than
> once, right?

Yes, it can be called two or more times for a call to
pgstat_report_stat, but the patch causes only the first of them convey
effective connection stats numbers. This is the same way as how
transaction stats are treated.  If no table activities have taken
place at all, pgStatXactCommit/Rollback are not consumed then the
following condition:

>   /*
>* Send partial messages.  Make sure that any pending xact commit/abort
>* gets counted, even if there are no table stats to send.
>*/
>   if (regular_msg.m_nentries > 0 ||
>   pgStatXactCommit > 0 || pgStatXactRollback > 0)

leads to a call to pgstat_send_tabstat() and it sends a tabstat
message without a table stats, which is a "partial message".

In this logic the condition term (pgStatXactCommit > 0 ||
pgStatXactRollback > 0) acts as a single-shot trigger.

So we can piggy-back on the condition to send something only once.

The patch sets "0" = (DISCONNECT_NOT_YET) to m_disconnect as the
"not-effective" number, but it would be better to add
DISCONNECT_something to express that state.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-24 Thread Michael Paquier
On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote:
> Yes, it can be called two or more times for a call to
> pgstat_report_stat, but the patch causes only the first of them convey
> effective connection stats numbers. This is the same way as how
> transaction stats are treated.  If no table activities have taken
> place at all, pgStatXactCommit/Rollback are not consumed then the
> following condition:

I was looking at this WIP patch, and plugging in the connection
statistics to the table-access statistics looks like the wrong
abstraction to me.  I find much cleaner the approach of HEAD to use a
separate API to report this information, as of
pgstat_send_connstats().

As of the two problems discussed on this thread, aka the increased
number of UDP packages and the extra timestamp computations, it seems
to me that we had better combine the following ideas for HEAD and 14,
for now:
- Avoid the extra timestamp computation as proposed by Laurenz in [1]
- Throttle the frequency where the connection stat packages are sent,
as of [2].

Magnus, this open item is assigned to you as the committer of
960869d.  Could you comment on those issues?

[1]: 
https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at
[2]: 
https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de
--
Michael


signature.asc
Description: PGP signature


Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-24 Thread Kyotaro Horiguchi
At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier  wrote 
in 
> On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote:
> > Yes, it can be called two or more times for a call to
> > pgstat_report_stat, but the patch causes only the first of them convey
> > effective connection stats numbers. This is the same way as how
> > transaction stats are treated.  If no table activities have taken
> > place at all, pgStatXactCommit/Rollback are not consumed then the
> > following condition:
> 
> I was looking at this WIP patch, and plugging in the connection
> statistics to the table-access statistics looks like the wrong
> abstraction to me.  I find much cleaner the approach of HEAD to use a
> separate API to report this information, as of
> pgstat_send_connstats().
> 
> As of the two problems discussed on this thread, aka the increased
> number of UDP packages and the extra timestamp computations, it seems
> to me that we had better combine the following ideas for HEAD and 14,
> for now:
> - Avoid the extra timestamp computation as proposed by Laurenz in [1]
> - Throttle the frequency where the connection stat packages are sent,
> as of [2].
> 
> Magnus, this open item is assigned to you as the committer of
> 960869d.  Could you comment on those issues?
> 
> [1]: 
> https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at
> [2]: 
> https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de

About [2], we need to maintain session active/idel times on additional
menbers in backend status.  Letting gpgstat_report_activity to
directly scribble on backend status array would work?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-24 Thread Kyotaro Horiguchi
At Wed, 25 Aug 2021 13:21:52 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier  
> wrote in 
> > - Throttle the frequency where the connection stat packages are sent,
> > as of [2].
> > 
> > Magnus, this open item is assigned to you as the committer of
> > 960869d.  Could you comment on those issues?
> > 
> > [1]: 
> > https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at
> > [2]: 
> > https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de
> 
> About [2], we need to maintain session active/idel times on additional
> menbers in backend status.  Letting gpgstat_report_activity to
> directly scribble on backend status array would work?

So the attached is roughly that (just a PoC, of course).

- accumulate active and idle time on backend status array.
  (pgstat_report_activity)

- pgstat_get_db_session_time() and friends read pgstat file then sum
  up relevant members in backend status array. (So it scans on the
  array for every number of every database X().

- The function pgstat_send_connstats is called only at the end of a
  connection.  It reads backend status element then send the numbers
  to stats collector.  pgstat_shutdown_hook needs to be moved from
  on_shmem_exit to before_shmem_exit to read MyBEEntry.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 7fcc3f6ded..47973f1e30 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -244,8 +244,6 @@ static int  pgStatXactCommit = 0;
 static int pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
-PgStat_Counter pgStatActiveTime = 0;
-PgStat_Counter pgStatTransactionIdleTime = 0;
 SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL;
 
 /* Record that's written to 2PC state file when pgstat state is persisted */
@@ -323,7 +321,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
-static void pgstat_send_connstats(bool disconnect, TimestampTz last_report);
+static void pgstat_send_connstats(TimestampTz now);
 
 static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared);
 
@@ -876,10 +874,8 @@ pgstat_report_stat(bool disconnect)
return;
 
/* for backends, send connection statistics */
-   if (MyBackendType == B_BACKEND)
-   pgstat_send_connstats(disconnect, last_report);
-
-   last_report = now;
+   if (MyBackendType == B_BACKEND && disconnect)
+   pgstat_send_connstats(now);
 
/*
 * Destroy pgStatTabHash before we start invalidating PgStat_TableEntry
@@ -1368,39 +1364,41 @@ pgstat_drop_relation(Oid relid)
  * pgstat_send_connstats() -
  *
  * Tell the collector about session statistics.
- * The parameter "disconnect" will be true when the backend exits.
- * "last_report" is the last time we were called (0 if never).
  * --
  */
 static void
-pgstat_send_connstats(bool disconnect, TimestampTz last_report)
+pgstat_send_connstats(TimestampTz now)
 {
PgStat_MsgConn msg;
-   longsecs;
-   int usecs;
+   volatile PgBackendStatus *beentry = MyBEEntry;
 
if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts)
return;
 
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_CONNECTION);
msg.m_databaseid = MyDatabaseId;
+   msg.m_disconnect = pgStatSessionEndCause;
 
-   /* session time since the last report */
-   TimestampDifference(((last_report == 0) ? MyStartTimestamp : 
last_report),
-   GetCurrentTimestamp(),
-   &secs, &usecs);
-   msg.m_session_time = secs * 100 + usecs;
+   msg.m_count = 1;
 
-   msg.m_disconnect = disconnect ? pgStatSessionEndCause : 
DISCONNECT_NOT_YET;
+   if (pgstat_track_activities && beentry)
+   {
+   msg.m_session_time =
+   (PgStat_Counter)(now - 
beentry->st_proc_start_timestamp) * 1000;
+   msg.m_active_time = beentry->st_session_active_time;
+   msg.m_idle_in_xact_time = beentry->st_session_idle_time;
 
-   msg.m_active_time = pgStatActiveTime;
-   pgStatActiveTime = 0;
-
-   msg.m_idle_in_xact_time = pgStatTransactionIdleTime;
-   pgStatTransactionIdleTime = 0;
-
-   /* report a new session only the first time */
-   msg.m_count = (last_report == 0) ? 1 : 0;
+   PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
+   beentry->st_session_active_time = 0;
+   beentry->st_session_idle_time = 0;
+   PGSTAT_END_WRITE_ACTIVITY(beentry);
+ 

Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-25 Thread Andres Freund
Hi,

On 2021-08-20 14:27:20 -0500, Justin Pryzby wrote:
> On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote:
> > Doubling the number of UDP messages in common workloads seems also 
> > problematic
> > enough that it should be addressed for 14. It increases the likelihood of
> > dropping stats messages on busy systems, which can have downstream impacts.
> 
> I think by "common workloads" you mean one with many, shortlived sessions.

You don't need short-lived sessions. You just need sessions that don't
process queries all the time (so that there's only one or a few queries
within each PGSTAT_STAT_INTERVAL). The connection stats aren't sent once
per session, they're sent once per PGSTAT_STAT_INTERVAL.

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-25 Thread Andres Freund
Hi,

On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
> I was looking at this WIP patch, and plugging in the connection
> statistics to the table-access statistics looks like the wrong
> abstraction to me.  I find much cleaner the approach of HEAD to use a
> separate API to report this information, as of
> pgstat_send_connstats().

As I said before, this ship has long sailed:

typedef struct PgStat_MsgTabstat
{
PgStat_MsgHdr m_hdr;
Oid m_databaseid;
int m_nentries;
int m_xact_commit;
int m_xact_rollback;
PgStat_Counter m_block_read_time;   /* times in microseconds */
PgStat_Counter m_block_write_time;
PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
} PgStat_MsgTabstat;


> As of the two problems discussed on this thread, aka the increased
> number of UDP packages and the extra timestamp computations, it seems
> to me that we had better combine the following ideas for HEAD and 14,
> for now:
> - Avoid the extra timestamp computation as proposed by Laurenz in [1]
> - Throttle the frequency where the connection stat packages are sent,
> as of [2].

I think in that case we'd have to do the bigger redesign and move "live"
connection stats to backend_status.c...

Greetings,

Andres Freund




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-26 Thread Michael Paquier
On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote:
> On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
> As I said before, this ship has long sailed:
> 
> typedef struct PgStat_MsgTabstat
> {
>   PgStat_MsgHdr m_hdr;
>   Oid m_databaseid;
>   int m_nentries;
>   int m_xact_commit;
>   int m_xact_rollback;
>   PgStat_Counter m_block_read_time;   /* times in microseconds */
>   PgStat_Counter m_block_write_time;
>   PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
> } PgStat_MsgTabstat;

Well, I kind of misread what you meant upthread then.
PgStat_MsgTabstat has a name a bit misleading, especially if you
assign connection stats to it.

>> As of the two problems discussed on this thread, aka the increased
>> number of UDP packages and the extra timestamp computations, it seems
>> to me that we had better combine the following ideas for HEAD and 14,
>> for now:
>> - Avoid the extra timestamp computation as proposed by Laurenz in [1]
>> - Throttle the frequency where the connection stat packages are sent,
>> as of [2].
> 
> I think in that case we'd have to do the bigger redesign and move "live"
> connection stats to backend_status.c...

Hmm.  A redesign is not really an option for 14 at this stage.  And I
am not really comfortable with the latest proposal from upthread to
plug in that to pgstat_send_tabstat to report things once per
transaction, either.  It really looks like this needs more thoughts,
and it would mean that a revert may be the most appropriate choice
for the moment.  That's the last-resort option, surely, but we are
post-beta3 so there is no much margin left.
--
Michael


signature.asc
Description: PGP signature


Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-30 Thread Laurenz Albe
On Fri, 2021-08-27 at 13:57 +0900, Michael Paquier wrote:
> > I think in that case we'd have to do the bigger redesign and move "live"
> > connection stats to backend_status.c...
> 
> Hmm.  A redesign is not really an option for 14 at this stage.  And I
> am not really comfortable with the latest proposal from upthread to
> plug in that to pgstat_send_tabstat to report things once per
> transaction, either.  It really looks like this needs more thoughts,
> and it would mean that a revert may be the most appropriate choice
> for the moment.  That's the last-resort option, surely, but we are
> post-beta3 so there is no much margin left.

In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds?  That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem.  And it should even out the increase
in statistics messages, except in the case of lots of short-lived
sessions.  But in that scenario you cannot have session statistics
without lots of extra messages, and such a workload has enough performance
problems as it is, so I don't think we have to specifically worry about it.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Michael Paquier
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
> In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> milliseconds?  That would mean slightly less up-to-date statistics, but
> I doubt that that will be a problem.  And it should even out the increase
> in statistics messages, except in the case of lots of short-lived
> sessions.  But in that scenario you cannot have session statistics
> without lots of extra messages, and such a workload has enough performance
> problems as it is, so I don't think we have to specifically worry about it.

Perhaps we could do that.  Now, increasing an interval for the sake of
balancing the extra load created by a feature while impacting the
whole set of stats is not really appealing.
--
Michael


signature.asc
Description: PGP signature


Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Andres Freund
Hi,

On August 31, 2021 6:33:15 PM PDT, Michael Paquier  wrote:
>On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
>> In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
>> milliseconds?  That would mean slightly less up-to-date statistics, but
>> I doubt that that will be a problem.  And it should even out the increase
>> in statistics messages, except in the case of lots of short-lived
>> sessions.  But in that scenario you cannot have session statistics
>> without lots of extra messages, and such a workload has enough performance
>> problems as it is, so I don't think we have to specifically worry about it.
>
>Perhaps we could do that.  Now, increasing an interval for the sake of
>balancing the extra load created by a feature while impacting the
>whole set of stats is not really appealing.

I think it's not helpful. Still increases the number of messages substantially 
in workloads with a lot of connections doing occasional queries. Which is 
common.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Laurenz Albe
On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the view 
> > > of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > 
> > > milliseconds?  That would mean slightly less up-to-date statistics, but
> > > I doubt that that will be a problem.
>
> I think it's not helpful. Still increases the number of messages 
> substantially in workloads
> with a lot of connections doing occasional queries. Which is common.

How come?  If originally you send table statistics every 500ms, and now you send
table statistics and session statistics every second, that should amount to the
same thing.  Where is my misunderstanding?

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Laurenz Albe
On Wed, 2021-09-01 at 10:33 +0900, Michael Paquier wrote:
> On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
> > In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > milliseconds?
> 
> Perhaps we could do that.  Now, increasing an interval for the sake of
> balancing the extra load created by a feature while impacting the
> whole set of stats is not really appealing.

I agree.  But if the best fix is too invasive at this point, the
alternatives are reverting the patch or choosing a less appealing
solution.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Andres Freund
On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the 
> > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > 
> > > > milliseconds?  That would mean slightly less up-to-date statistics, but
> > > > I doubt that that will be a problem.
> >
> > I think it's not helpful. Still increases the number of messages 
> > substantially in workloads
> > with a lot of connections doing occasional queries. Which is common.
> 
> How come?  If originally you send table statistics every 500ms, and now you 
> send
> table statistics and session statistics every second, that should amount to 
> the
> same thing.  Where is my misunderstanding?

Consider the case of one query a second.




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-08-31 Thread Andres Freund
On 2021-08-27 13:57:45 +0900, Michael Paquier wrote:
> On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote:
> > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
> > As I said before, this ship has long sailed:
> > 
> > typedef struct PgStat_MsgTabstat
> > {
> > PgStat_MsgHdr m_hdr;
> > Oid m_databaseid;
> > int m_nentries;
> > int m_xact_commit;
> > int m_xact_rollback;
> > PgStat_Counter m_block_read_time;   /* times in microseconds */
> > PgStat_Counter m_block_write_time;
> > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
> > } PgStat_MsgTabstat;
> 
> Well, I kind of misread what you meant upthread then.
> PgStat_MsgTabstat has a name a bit misleading, especially if you
> assign connection stats to it.

ISTM we should just do this fairly obvious change. Given that we already
transport commit / rollback / IO stats, I don't see why the connection stats
change anything to a meaningful degree. I'm fairly baffled why that's not the
obvious thing to do for v14.




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-01 Thread Laurenz Albe
On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote:
> On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the
> > > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > > > milliseconds?  That would mean slightly less up-to-date statistics, 
> > > > > but
> > > > > I doubt that that will be a problem.
> > > 
> > > I think it's not helpful. Still increases the number of messages 
> > > substantially in workloads
> > > with a lot of connections doing occasional queries. Which is common.
> > 
> > How come?  If originally you send table statistics every 500ms, and now you 
> > send
> > table statistics and session statistics every second, that should amount to 
> > the
> > same thing.  Where is my misunderstanding?
> 
> Consider the case of one query a second.

I guess I am too stupid.  I don't see it.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-02 Thread Laurenz Albe
On Wed, 2021-09-01 at 10:56 +0200, Laurenz Albe wrote:
> On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote:
> > On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> > > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the
> > > > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > > > > milliseconds?  That would mean slightly less up-to-date statistics, 
> > > > > > but
> > > > > > I doubt that that will be a problem.
> > > > 
> > > > I think it's not helpful. Still increases the number of messages 
> > > > substantially in workloads
> > > > with a lot of connections doing occasional queries. Which is common.
> > > 
> > > How come?  If originally you send table statistics every 500ms, and now 
> > > you send
> > > table statistics and session statistics every second, that should amount 
> > > to the
> > > same thing.  Where is my misunderstanding?
> > 
> > Consider the case of one query a second.
> 
> I guess I am too stupid.  I don't see it.

Finally got it.  That would send a message every second, and with connection 
statistics,
twice as many.

Here is my next suggestion for a band-aid to mitigate this problem:
Introduce a second, much longer interval for reporting session statistics.

--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -77,6 +77,8 @@
 #define PGSTAT_STAT_INTERVAL   500 /* Minimum time between stats file
 * 
updates; in milliseconds. */
 
+#define PGSTAT_CONSTAT_INTERVAL6   /* interval to report 
connection statistics */
+
 #define PGSTAT_RETRY_DELAY 10  /* How long to wait between 
checks for a
 * new 
file; in milliseconds. */
 
@@ -889,8 +891,13 @@ pgstat_report_stat(bool disconnect)
!TimestampDifferenceExceeds(last_report, now, 
PGSTAT_STAT_INTERVAL))
return;
 
-   /* for backends, send connection statistics */
-   if (MyBackendType == B_BACKEND)
+   /*
+* For backends, send connection statistics, but only every
+* PGSTAT_CONSTAT_INTERVAL or when the backend terminates.
+*/
+   if (MyBackendType == B_BACKEND &&
+   (TimestampDifferenceExceeds(last_report, now, 
PGSTAT_CONSTAT_INTERVAL) ||
+disconnect))
pgstat_send_connstats(disconnect, last_report, now);
 
last_report = now;

That should keep the extra load moderate, except for workloads with lots of 
tiny connections
(for which this may be the least of their problems).

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-03 Thread Andres Freund
Hi,

On 2021-08-31 21:56:50 -0700, Andres Freund wrote:
> On 2021-08-27 13:57:45 +0900, Michael Paquier wrote:
> > On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote:
> > > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
> > > As I said before, this ship has long sailed:
> > >
> > > typedef struct PgStat_MsgTabstat
> > > {
> > >   PgStat_MsgHdr m_hdr;
> > >   Oid m_databaseid;
> > >   int m_nentries;
> > >   int m_xact_commit;
> > >   int m_xact_rollback;
> > >   PgStat_Counter m_block_read_time;   /* times in microseconds */
> > >   PgStat_Counter m_block_write_time;
> > >   PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
> > > } PgStat_MsgTabstat;
> >
> > Well, I kind of misread what you meant upthread then.
> > PgStat_MsgTabstat has a name a bit misleading, especially if you
> > assign connection stats to it.
>
> ISTM we should just do this fairly obvious change. Given that we already
> transport commit / rollback / IO stats, I don't see why the connection stats
> change anything to a meaningful degree. I'm fairly baffled why that's not the
> obvious thing to do for v14.

Here's how I think that would look like. While writing up this draft, I found
two more issues:

- On windows / 32 bit systems, the session time would overflow if idle for
  longer than ~4300s. long is only 32 bit. Easy to fix obviously.
- Right now walsenders, including database connected walsenders, are not
  reported in connection stats. That doesn't seem quite right to me.

In the patch I made the message for connecting an explicitly reported message,
that seems cleaner, because it then happens at a clearly defined point. I
didn't do the same for disconnecting, but perhaps that would be better? Then
we could get rid of the whole pgStatSessionEndCause variable.

Greetings,

Andres Freund
>From 2b3cf32bd02ea4b73157db023d6475826e32c64a Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 3 Sep 2021 17:02:15 -0700
Subject: [PATCH] wip: Reduce overhead of "pg_stat_database counters for
 sessions and session time"

"Fixes" 960869da0803

Author: Andres Freund 
Reviewed-By:
Discussion: https://postgr.es/m/20210901045650.fz4he2b3wx4p7...@alap3.anarazel.de
Backpatch:
---
 src/include/pgstat.h|  34 ++--
 src/backend/postmaster/pgstat.c | 171 
 src/backend/tcop/postgres.c |   2 +
 src/backend/utils/activity/backend_status.c |   4 +-
 4 files changed, 132 insertions(+), 79 deletions(-)

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 509849c7ff4..a7b386821f6 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -81,7 +81,8 @@ typedef enum StatMsgType
 	PGSTAT_MTYPE_DEADLOCK,
 	PGSTAT_MTYPE_CHECKSUMFAILURE,
 	PGSTAT_MTYPE_REPLSLOT,
-	PGSTAT_MTYPE_CONNECTION,
+	PGSTAT_MTYPE_CONNECT,
+	PGSTAT_MTYPE_DISCONNECT,
 } StatMsgType;
 
 /* --
@@ -279,7 +280,7 @@ typedef struct PgStat_TableEntry
  * --
  */
 #define PGSTAT_NUM_TABENTRIES  \
-	((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - 3 * sizeof(int) - 2 * sizeof(PgStat_Counter))	\
+	((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - 3 * sizeof(int) - 5 * sizeof(PgStat_Counter)) \
 	 / sizeof(PgStat_TableEntry))
 
 typedef struct PgStat_MsgTabstat
@@ -291,6 +292,9 @@ typedef struct PgStat_MsgTabstat
 	int			m_xact_rollback;
 	PgStat_Counter m_block_read_time;	/* times in microseconds */
 	PgStat_Counter m_block_write_time;
+	PgStat_Counter m_session_time;
+	PgStat_Counter m_active_time;
+	PgStat_Counter m_idle_in_xact_time;
 	PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
 } PgStat_MsgTabstat;
 
@@ -653,20 +657,26 @@ typedef struct PgStat_MsgChecksumFailure
 } PgStat_MsgChecksumFailure;
 
 /* --
- * PgStat_MsgConn			Sent by the backend to update connection statistics.
+ * PgStat_MsgConnect			Sent by the backend upon connection
+ *establishment
  * --
  */
-typedef struct PgStat_MsgConn
+typedef struct PgStat_MsgConnect
 {
 	PgStat_MsgHdr m_hdr;
 	Oid			m_databaseid;
-	PgStat_Counter m_count;
-	PgStat_Counter m_session_time;
-	PgStat_Counter m_active_time;
-	PgStat_Counter m_idle_in_xact_time;
-	SessionEndType m_disconnect;
-} PgStat_MsgConn;
+} PgStat_MsgConnect;
 
+/* --
+ * PgStat_MsgDisconnect			Sent by the backend when disconnecting
+ * --
+ */
+typedef struct PgStat_MsgDisconnect
+{
+	PgStat_MsgHdr m_hdr;
+	Oid			m_databaseid;
+	SessionEndType m_cause;
+} PgStat_MsgDisconnect;
 
 /* --
  * PgStat_Msg	Union over all possible messages.
@@ -700,7 +710,8 @@ typedef union PgStat_Msg
 	PgStat_MsgTempFile msg_tempfile;
 	PgStat_MsgChecksumFailure msg_checksumfailure;
 	PgStat_MsgReplSlot msg_replslot;
-	PgStat_MsgConn msg_conn;
+	PgStat_MsgConnect msg_connect;
+	PgStat_MsgDisconnect msg_disconnect;
 } PgStat_Msg;
 
 
@@ -1010,6 +1021,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t
 extern void pgstat_reset_slru_counter(con

Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-06 Thread Laurenz Albe
On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote:
> Hi,
> 
> On 2021-08-31 21:56:50 -0700, Andres Freund wrote:
> > On 2021-08-27 13:57:45 +0900, Michael Paquier wrote:
> > > On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote:
> > > > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
> > > > As I said before, this ship has long sailed:
> > > > 
> > > > typedef struct PgStat_MsgTabstat
> > > > {
> > > > PgStat_MsgHdr m_hdr;
> > > > Oid m_databaseid;
> > > > int m_nentries;
> > > > int m_xact_commit;
> > > > int m_xact_rollback;
> > > > PgStat_Counter m_block_read_time;   /* times in 
> > > > microseconds */
> > > > PgStat_Counter m_block_write_time;
> > > > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
> > > > } PgStat_MsgTabstat;
> > > 
> > > Well, I kind of misread what you meant upthread then.
> > > PgStat_MsgTabstat has a name a bit misleading, especially if you
> > > assign connection stats to it.
> > 
> > ISTM we should just do this fairly obvious change. Given that we already
> > transport commit / rollback / IO stats, I don't see why the connection stats
> > change anything to a meaningful degree. I'm fairly baffled why that's not 
> > the
> > obvious thing to do for v14.
> 
> Here's how I think that would look like.

Thank you!

> While writing up this draft, I found
> two more issues:
> 
> - On windows / 32 bit systems, the session time would overflow if idle for
>   longer than ~4300s. long is only 32 bit. Easy to fix obviously.

Oops, yes.  Thanks for spotting that.

> - Right now walsenders, including database connected walsenders, are not
>   reported in connection stats. That doesn't seem quite right to me.

I think that walsenders not only use a different protocol, but often have
session characteristics quite different from normal backends.
For example, they are always "active", even when they are doing nothing.
Therefore, I think it is confusing to report them together with normal
database sessions.

If at all, walsender statistics should be reported separately.

> In the patch I made the message for connecting an explicitly reported message,
> that seems cleaner, because it then happens at a clearly defined point. I
> didn't do the same for disconnecting, but perhaps that would be better? Then
> we could get rid of the whole pgStatSessionEndCause variable.

I see your point, but I am not certain if it is worth adding yet another message
for a small thing like that.  I have no strong opinion on that though.


Reading your patch, I am still confused about the following:
There are potentially several calls to "pgstat_send_tabstat" in 
"pgstat_report_stat".
It seems to me that if it were called more than once, session statistics would
be reported and counted several times, which would be wrong.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-06 Thread Andres Freund
Hi,

On 2021-09-06 09:12:58 +0200, Laurenz Albe wrote:
> Reading your patch, I am still confused about the following:
> There are potentially several calls to "pgstat_send_tabstat" in 
> "pgstat_report_stat".
> It seems to me that if it were called more than once, session statistics would
> be reported and counted several times, which would be wrong.

The accumulator variables are set to zero after being sent. So no data should
be sent multiple times as far as I can tell.

Regards,

Andres




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-06 Thread Laurenz Albe
On Mon, 2021-09-06 at 00:22 -0700, Andres Freund wrote:
> On 2021-09-06 09:12:58 +0200, Laurenz Albe wrote:
> > Reading your patch, I am still confused about the following:
> > There are potentially several calls to "pgstat_send_tabstat" in 
> > "pgstat_report_stat".
> > It seems to me that if it were called more than once, session statistics 
> > would
> > be reported and counted several times, which would be wrong.
> 
> The accumulator variables are set to zero after being sent. So no data should
> be sent multiple times as far as I can tell.

Ah, I missed that.  Then I see no problem with that approach.

Yours,
Laurenz Albe





Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-07 Thread Laurenz Albe
On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote:
> Here's how I think that would look like. While writing up this draft, I found
> two more issues:
> 
> - On windows / 32 bit systems, the session time would overflow if idle for
>   longer than ~4300s. long is only 32 bit. Easy to fix obviously.
> - Right now walsenders, including database connected walsenders, are not
>   reported in connection stats. That doesn't seem quite right to me.
> 
> In the patch I made the message for connecting an explicitly reported message,
> that seems cleaner, because it then happens at a clearly defined point. I
> didn't do the same for disconnecting, but perhaps that would be better? Then
> we could get rid of the whole pgStatSessionEndCause variable.

I have gone over your patch and made the following changes:

- cache the last report time in a static variable pgLastSessionReportTime
- add a comment to explain why we only track normal backends
- pgindent
- an attempt at a commit message

Yours,
Laurenz Albe
From 1fdfac528ec4e75d91b9cf3868eda66a72a41c8f Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 8 Sep 2021 06:06:46 +0200
Subject: [PATCH] Fix performance regression from session statistics

Session statistics, as introduced by 960869da08, had several
shortcomings:

- an additional GetCurrentTimestamp() call that also impaired
  the accuracy of the data collected

  This can be avoided by passing the current timestamp we already
  have in pgstat_report_stat().

- an additional statistics UDP packet sent every 500ms

  This is solved by adding the new statistics to PgStat_MsgTabstat.
  This is conceptually ugly, because session statistics are not
  table statistics.  But the struct already contains data unrelated
  to tables, so there is not much damage done.

  Connection and disconnection are reported in separate messages,
  which reduces the total overhead to two messages per session.

The time of the last session statistics report is now cached
in the static variable pgLastSessionReportTime.

Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de
---
 src/backend/postmaster/pgstat.c | 177 +---
 src/backend/tcop/postgres.c |   2 +
 src/backend/utils/activity/backend_status.c |   4 +-
 src/include/pgstat.h|  34 ++--
 src/tools/pgindent/typedefs.list|   3 +-
 5 files changed, 141 insertions(+), 79 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3450a10129..7aaf928315 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -246,6 +246,7 @@ static int	pgStatXactCommit = 0;
 static int	pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
+static PgStat_Counter pgLastSessionReportTime = 0;
 PgStat_Counter pgStatActiveTime = 0;
 PgStat_Counter pgStatTransactionIdleTime = 0;
 SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL;
@@ -330,11 +331,11 @@ static bool pgstat_db_requested(Oid databaseid);
 static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
 static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);
 
-static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
+static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
-static void pgstat_send_connstats(bool disconnect, TimestampTz last_report);
+static void pgstat_report_disconnect(Oid dboid);
 
 static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared);
 
@@ -366,7 +367,8 @@ static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len);
 static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len);
 static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len);
 static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len);
-static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len);
+static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len);
+static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len);
 static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len);
 static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
 
@@ -890,12 +892,11 @@ pgstat_report_stat(bool disconnect)
 		!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
 		return;
 
-	/* for backends, send connection statistics */
-	if (MyBackendType == B_BACKEND)
-		pgstat_send_connstats(disconnect, last_report);
-
 	last_report = now;
 
+	if (disconnect)
+		pgstat_report_disconnect(MyDatabaseId);
+
 	/*
 	 * Destroy pgStatTabHash before we start invalidating PgStat_TableEntry
 	 * entries it points to.  (Should we fail partway through the loop below,
@@ -947,7 +948,7 @@ pgstat_report_stat