pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, Since commit 960869da0803427d14335bba24393f414b476e2c Author: Magnus Hagander Date: 2021-01-17 13:34:09 +0100 Add pg_stat_database counters for sessions and session time pgstat_report_stat() does another timestamp computation via pgstat_send_connstats(), despite typically having computed one just a few lines before. Given that timestamp computation isn't all that cheap, that's not great. Even more, that additional timestamp computation makes things *less* accurate: void pgstat_report_stat(bool disconnect) ... /* * Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL * msec since we last sent one, or the backend is about to exit. */ now = GetCurrentTransactionStopTimestamp(); if (!disconnect && !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) return; /* for backends, send connection statistics */ if (MyBackendType == B_BACKEND) pgstat_send_connstats(disconnect, last_report); last_report = now; and then pgstat_send_connstats() does: /* session time since the last report */ TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report), GetCurrentTimestamp(), &secs, &usecs); msg.m_session_time = secs * 100 + usecs; We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use a separate timestamp in pgstat_send_connstats() to compute the difference from last_report, which is then set to GetCurrentTransactionStopTimestamp()'s return value. I'm also not all that happy with sending yet another UDP packet for this. This doubles the UDP packets to the stats collector in the common cases (unless more than TABSTAT_QUANTUM tables have stats to report, or shared tables have been accessed). We already send plenty of "summary" information via PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a separate message for connection stats? Alternatively we could just not send an update to connection stats every 500ms for every active connection, but only do so at connection end. The database stats would only contain stats for disconnected sessions, while the stats for "live" connections are maintained via backend_status.c. That'd give us *more* information for less costs, because we then could see idle/active times for individual connections. That'd likely be more change than what we would want to do at this point in the release cycle though. But I think we ought to do something about the increased overhead... Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On 9/8/21 12:11 AM, Laurenz Albe wrote: > On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote: >> Here's how I think that would look like. While writing up this draft, I found >> two more issues: >> >> - On windows / 32 bit systems, the session time would overflow if idle for >> longer than ~4300s. long is only 32 bit. Easy to fix obviously. >> - Right now walsenders, including database connected walsenders, are not >> reported in connection stats. That doesn't seem quite right to me. >> >> In the patch I made the message for connecting an explicitly reported >> message, >> that seems cleaner, because it then happens at a clearly defined point. I >> didn't do the same for disconnecting, but perhaps that would be better? Then >> we could get rid of the whole pgStatSessionEndCause variable. > I have gone over your patch and made the following changes: > > - cache the last report time in a static variable pgLastSessionReportTime > - add a comment to explain why we only track normal backends > - pgindent > - an attempt at a commit message > Hi, this is an open item for release 14. Is someone going to commit? cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote: > this is an open item for release 14. Is someone going to commit? Will do. Although I do wish the original committer would have chimed in at some point...
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, Sep 15, 2021 at 01:48:09PM -0700, Andres Freund wrote: > Will do. Although I do wish the original committer would have chimed in at > some point... Thanks, Andres. -- Michael signature.asc Description: PGP signature
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-09-08 06:11:56 +0200, Laurenz Albe wrote: > I have gone over your patch and made the following changes: > > - cache the last report time in a static variable pgLastSessionReportTime > - add a comment to explain why we only track normal backends > - pgindent > - an attempt at a commit message Thanks! I pushed this. The only substantive change I made is that I moved the MyBackendType == B_BACKEND check into a new pgstat_should_report_connstat(), and called that from pgstat_report_connect() and pgstat_report_disconnect(). Otherwise we'd report connect/disconnect for walsenders, but not the session time, which seems even weirder. Closed the open item. I'm not convinced that ignorign walsenders is a good policy decision, but I think that's a fair judgement call, and we can evolve that in future releases... One thing that'd be nice to have is a few regression tests in stats.sql. Of course we couldn't test precise counts, but just verifying that some of the counters increase seems like it would provide a decent increase in "actual" coverage. One of the more painful things in working on the shared memory stats patch is that nearly all stats features have no tests, making it very easy to silently break things. Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Thu, 2021-09-16 at 02:22 -0700, Andres Freund wrote: > I pushed this. The only substantive change I made is that I moved the > MyBackendType == B_BACKEND check into a new pgstat_should_report_connstat(), > and called that from pgstat_report_connect() and > pgstat_report_disconnect(). Otherwise we'd report connect/disconnect for > walsenders, but not the session time, which seems even weirder. Oh, right, I didn't think of that. Thanks for all the work you put into this. > I'm not convinced that ignorign walsenders is a good policy decision, but I > think that's a fair judgement call, and we can evolve that in future > releases... The use case that is on my mind is to figure out the percentage of time when transactions are idle to be able to estimate the connection pool size. walsenders, for which you typically don't want to use a connection pool, would skew the statistics. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, Sep 15, 2021 at 10:48 PM Andres Freund wrote: > > On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote: > > this is an open item for release 14. Is someone going to commit? > > Will do. Although I do wish the original committer would have chimed in at > some point... Crap. My apologies for this -- I had somehow managed to press the "mute" button in gmail on tihs thread when I cleaned things up after returning from my vacation, so I hadn't spotted anything in it until I saw your commit happening a short wihle ago. I have no idea how I managed to do that, but clearly I did. Sorry! AIUI the commit has taken care of the urgent things for 14, correct? I see there are some discussions points left and I'll review the thread and see if I have any input on those as well a bit later, of course, but just so I'm not missing something else that's urgent? Again, my apologies for completely missing this! -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-09-16 16:38:04 +0200, Magnus Hagander wrote: > On Wed, Sep 15, 2021 at 10:48 PM Andres Freund wrote: > > > > On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote: > > > this is an open item for release 14. Is someone going to commit? > > > > Will do. Although I do wish the original committer would have chimed in at > > some point... > > Crap. My apologies for this -- I had somehow managed to press the > "mute" button in gmail on tihs thread when I cleaned things up after > returning from my vacation, so I hadn't spotted anything in it until I > saw your commit happening a short wihle ago. I have no idea how I > managed to do that, but clearly I did. Sorry! Hah, that explains that ;) > AIUI the commit has taken care of the urgent things for 14, correct? I > see there are some discussions points left and I'll review the thread > and see if I have any input on those as well a bit later, of course, > but just so I'm not missing something else that's urgent? I think so. Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote: > Since > > commit 960869da0803427d14335bba24393f414b476e2c > Author: Magnus Hagander > Date: 2021-01-17 13:34:09 +0100 > > Add pg_stat_database counters for sessions and session time > > pgstat_report_stat() does another timestamp computation via > pgstat_send_connstats(), despite typically having computed one just a few > lines before. > > Given that timestamp computation isn't all that cheap, that's not great. Even > more, that additional timestamp computation makes things *less* accurate: > > void > pgstat_report_stat(bool disconnect) > ... > /* >* Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL >* msec since we last sent one, or the backend is about to exit. >*/ > now = GetCurrentTransactionStopTimestamp(); > if (!disconnect && > !TimestampDifferenceExceeds(last_report, now, > PGSTAT_STAT_INTERVAL)) > return; > > /* for backends, send connection statistics */ > if (MyBackendType == B_BACKEND) > pgstat_send_connstats(disconnect, last_report); > > last_report = now; > > and then pgstat_send_connstats() does: > /* session time since the last report */ > TimestampDifference(((last_report == 0) ? MyStartTimestamp : > last_report), > GetCurrentTimestamp(), > &secs, &usecs); > msg.m_session_time = secs * 100 + usecs; > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use > a separate timestamp in pgstat_send_connstats() to compute the difference from > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s > return value. Makes sense to me. How about passing "now", which was just initialized from GetCurrentTransactionStopTimestamp(), as additional parameter to pgstat_send_connstats() and use that value instead of taking the current time? > I'm also not all that happy with sending yet another UDP packet for this. This > doubles the UDP packets to the stats collector in the common cases (unless > more than TABSTAT_QUANTUM tables have stats to report, or shared tables have > been accessed). We already send plenty of "summary" information via > PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a > separate message for connection stats? Are you suggesting that connection statistics should be shoehorned into some other statistics message? That would reduce the number of UDP packets, but it sounds ugly and confusing to me. > Alternatively we could just not send an update to connection stats every 500ms > for every active connection, but only do so at connection end. The database > stats would only contain stats for disconnected sessions, while the stats for > "live" connections are maintained via backend_status.c. That'd give us *more* > information for less costs, because we then could see idle/active times for > individual connections. That was my original take, but if I remember right, Magnus convinced me that it would be more useful to have statistics for open sessions as well. With a connection pool, connections can stay open for a very long time, and the accuracy and usefulness of the statistics would become questionable. > That'd likely be more change than what we would want to do at this point in > the release cycle though. But I think we ought to do something about the > increased overhead... If you are talking about the extra GetCurrentTimestamp() call, and my first suggestion is acceptable, that should be simple. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote: > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote: > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then > > use > > a separate timestamp in pgstat_send_connstats() to compute the difference > > from > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s > > return value. > > Makes sense to me. How about passing "now", which was just initialized from > GetCurrentTransactionStopTimestamp(), as additional parameter to > pgstat_send_connstats() and use that value instead of taking the current time? Yes. > > I'm also not all that happy with sending yet another UDP packet for this. > > This > > doubles the UDP packets to the stats collector in the common cases (unless > > more than TABSTAT_QUANTUM tables have stats to report, or shared tables have > > been accessed). We already send plenty of "summary" information via > > PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a > > separate message for connection stats? > > Are you suggesting that connection statistics should be shoehorned into > some other statistics message? That would reduce the number of UDP packets, > but it sounds ugly and confusing to me. That ship already has sailed. Look at struct PgStat_MsgTabstat typedef struct PgStat_MsgTabstat { PgStat_MsgHdr m_hdr; Oid m_databaseid; int m_nentries; int m_xact_commit; int m_xact_rollback; PgStat_Counter m_block_read_time; /* times in microseconds */ PgStat_Counter m_block_write_time; PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; } PgStat_MsgTabstat; Given that we transport number of commits/commits, block read/write time adding the time the connection was active/inactive doesn't really seem like it makes things meaningfully worse? > > Alternatively we could just not send an update to connection stats every > > 500ms > > for every active connection, but only do so at connection end. The database > > stats would only contain stats for disconnected sessions, while the stats > > for > > "live" connections are maintained via backend_status.c. That'd give us > > *more* > > information for less costs, because we then could see idle/active times for > > individual connections. > > That was my original take, but if I remember right, Magnus convinced me that > it would be more useful to have statistics for open sessions as well. > With a connection pool, connections can stay open for a very long time, > and the accuracy and usefulness of the statistics would become questionable. That's not a contradiction to what I propose. Having the data available via backend_status.c allows to sum up the data for active connections and the data for past connections. I think it's also just cleaner to not constantly report changing preliminary data as pgstat_send_connstats() does. > > That'd likely be more change than what we would want to do at this point in > > the release cycle though. But I think we ought to do something about the > > increased overhead... > > If you are talking about the extra GetCurrentTimestamp() call, and my first > suggestion is acceptable, that should be simple. Doubling the number of UDP messages in common workloads seems also problematic enough that it should be addressed for 14. It increases the likelihood of dropping stats messages on busy systems, which can have downstream impacts. Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote: > On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote: > > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote: > > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but then > > > use > > > a separate timestamp in pgstat_send_connstats() to compute the difference > > > from > > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s > > > return value. > > > > Makes sense to me. How about passing "now", which was just initialized from > > GetCurrentTransactionStopTimestamp(), as additional parameter to > > pgstat_send_connstats() and use that value instead of taking the current > > time? > > Yes. Here is a patch for that. > > > I'm also not all that happy with sending yet another UDP packet for this. > > > > Are you suggesting that connection statistics should be shoehorned into > > some other statistics message? That would reduce the number of UDP packets, > > but it sounds ugly and confusing to me. > > That ship already has sailed. Look at struct PgStat_MsgTabstat > > Given that we transport number of commits/commits, block read/write time > adding the time the connection was active/inactive doesn't really seem like it > makes things meaningfully worse? Point taken. I looked at the other statistics sent in pgstat_report_stat(), and I see none that are sent unconditionally. Are you thinking of this: /* * Send partial messages. Make sure that any pending xact commit/abort * gets counted, even if there are no table stats to send. */ if (regular_msg.m_nentries > 0 || pgStatXactCommit > 0 || pgStatXactRollback > 0) pgstat_send_tabstat(®ular_msg); if (shared_msg.m_nentries > 0) pgstat_send_tabstat(&shared_msg); I can't think of a way to hack this up that wouldn't make my stomach turn. > > > Alternatively we could just not send an update to connection stats every > > > 500ms > > > for every active connection, but only do so at connection end. The > > > database > > > stats would only contain stats for disconnected sessions, while the stats > > > for > > > "live" connections are maintained via backend_status.c. > > > > That was my original take, but if I remember right, Magnus convinced me that > > it would be more useful to have statistics for open sessions as well. > > With a connection pool, connections can stay open for a very long time, > > and the accuracy and usefulness of the statistics would become questionable. > > That's not a contradiction to what I propose. Having the data available via > backend_status.c allows to sum up the data for active connections and the data > for past connections. > > I think it's also just cleaner to not constantly report changing preliminary > data as pgstat_send_connstats() does. Currently, the data are kept in static variables in the backend process. That would have to change for such an approach, right? > Doubling the number of UDP messages in common workloads seems also problematic > enough that it should be addressed for 14. Ok, but I don't know how to go about it. Yours, Laurenz Albe From 951820a8e312584f283ef4b5bde006c7f41e0d9d Mon Sep 17 00:00:00 2001 From: Laurenz Albe Date: Wed, 18 Aug 2021 04:52:57 +0200 Subject: [PATCH] Improve performance and accuracy of session statistics Rather than calling GetCurrentTimestamp() in pgstat_send_connstats(), which causes an additional system call, reuse the value just calculated in pgstat_report_stat(). Since the "now" value in pgstat_report_stat() will become the next "last_report" value, this will also improve the accuracy of the statistics, since the timestamp difference calculated in pgstat_send_connstats() will become the correct one. This patch does not address the issue that yet another UDP packet is sent to the statistics collector for session statistics. Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de --- src/backend/postmaster/pgstat.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index cecc30..77d734a0ba 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -324,7 +324,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); static void pgstat_send_funcstats(void); static void pgstat_send_slru(void); static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid); -static void pgstat_send_connstats(bool disconnect, TimestampTz last_report); +static void pgstat_send_connstats(bool disconnect, TimestampTz last_report, TimestampTz now); static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared); @@ -879,7 +879,7 @@ pgstat_report_stat(bool disconnect) /* for backends, send connection statistics */ if (MyBackendType == B_BACKEND) - pgstat_send_connstats(disconnect, last_report); + pgstat_send_connstats(disconnect, last_report, now); la
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote: > Doubling the number of UDP messages in common workloads seems also problematic > enough that it should be addressed for 14. It increases the likelihood of > dropping stats messages on busy systems, which can have downstream impacts. I think by "common workloads" you mean one with many, shortlived sessions. That does sounds like a concern, and I added this as an Opened Item. -- Justin
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe wrote in > On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote: > > On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote: > > > On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote: > > > > We maintain last_report as GetCurrentTransactionStopTimestamp(), but > > > > then use > > > > a separate timestamp in pgstat_send_connstats() to compute the > > > > difference from > > > > last_report, which is then set to GetCurrentTransactionStopTimestamp()'s > > > > return value. > > > > > > Makes sense to me. How about passing "now", which was just initialized > > > from > > > GetCurrentTransactionStopTimestamp(), as additional parameter to > > > pgstat_send_connstats() and use that value instead of taking the current > > > time? > > > > Yes. > > Here is a patch for that. FWIW, looks good to me. > > > > I'm also not all that happy with sending yet another UDP packet for > > > > this. > > > > > > Are you suggesting that connection statistics should be shoehorned into > > > some other statistics message? That would reduce the number of UDP > > > packets, > > > but it sounds ugly and confusing to me. > > > > That ship already has sailed. Look at struct PgStat_MsgTabstat > > > > Given that we transport number of commits/commits, block read/write time > > adding the time the connection was active/inactive doesn't really seem like > > it > > makes things meaningfully worse? > > Point taken. > > I looked at the other statistics sent in pgstat_report_stat(), and I see > none that are sent unconditionally. Are you thinking of this: IIUC, that means that pg_stat_report sends at least one PgStat_MsgTabstat struct for the database stats purpose if any stats are sent. So the connection stats can piggy-back on the packet. > /* > * Send partial messages. Make sure that any pending xact commit/abort > * gets counted, even if there are no table stats to send. > */ > if (regular_msg.m_nentries > 0 || > pgStatXactCommit > 0 || pgStatXactRollback > 0) > pgstat_send_tabstat(®ular_msg); > if (shared_msg.m_nentries > 0) > pgstat_send_tabstat(&shared_msg); > > I can't think of a way to hack this up that wouldn't make my stomach turn. No need to change the condition. It's sufficient that the connection stats are sent at the same time with transaction stats are sent. > > > > Alternatively we could just not send an update to connection stats > > > > every 500ms > > > > for every active connection, but only do so at connection end. The > > > > database > > > > stats would only contain stats for disconnected sessions, while the > > > > stats for > > > > "live" connections are maintained via backend_status.c. > > > > > > That was my original take, but if I remember right, Magnus convinced me > > > that > > > it would be more useful to have statistics for open sessions as well. > > > With a connection pool, connections can stay open for a very long time, > > > and the accuracy and usefulness of the statistics would become > > > questionable. > > > > That's not a contradiction to what I propose. Having the data available via > > backend_status.c allows to sum up the data for active connections and the > > data > > for past connections. > > > > I think it's also just cleaner to not constantly report changing preliminary > > data as pgstat_send_connstats() does. > > Currently, the data are kept in static variables in the backend process. > That would have to change for such an approach, right? Total session time can be summarized from beentry any time, but I'm not sure how we can summarize active/idle time.. Anyway it's not needed if the attached works. > > Doubling the number of UDP messages in common workloads seems also > > problematic > > enough that it should be addressed for 14. > > Ok, but I don't know how to go about it. The attached is a heavy-WIP on: - remove redundant gettimeofday(). - avoid sending dedicate UCP packet for connection stats. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index a3c35bdf60..f48ee80fc7 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -329,11 +329,11 @@ static bool pgstat_db_requested(Oid databaseid); static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it); static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts); -static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); +static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, bool disconnect, +TimestampTz last_report, TimestampTz now); static void pgstat_send_funcstats(void); static void pgstat_send_slru(void); static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid); -static void pgstat_send_connstats(bool disconnect, TimestampTz last_report); static PgStat_TableStatus *get_tabstat_entr
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote: > At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe > wrote in > > On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote: > > > > > I'm also not all that happy with sending yet another UDP packet for > > > > > this. > > > > > > > > Are you suggesting that connection statistics should be shoehorned into > > > > some other statistics message? That would reduce the number of UDP > > > > packets, > > > > but it sounds ugly and confusing to me. > > > > > > That ship already has sailed. Look at struct PgStat_MsgTabstat > > > > > > Given that we transport number of commits/commits, block read/write time > > > adding the time the connection was active/inactive doesn't really seem > > > like it > > > makes things meaningfully worse? > > > > Point taken. > > > > I looked at the other statistics sent in pgstat_report_stat(), and I see > > none that are sent unconditionally. > > IIUC, that means that pg_stat_report sends at least one > PgStat_MsgTabstat struct for the database stats purpose if any stats > are sent. So the connection stats can piggy-back on the packet. > > No need to change the condition. It's sufficient that the connection > stats are sent at the same time with transaction stats are sent. > > > Doubling the number of UDP messages in common workloads seems also > > > problematicenough that it should be addressed for 14. > > > > Ok, but I don't know how to go about it. > > The attached is a heavy-WIP on: > > - remove redundant gettimeofday(). > - avoid sending dedicate UCP packet for connection stats. Thank you. Perhaps I misread that, but doesn't that mean that the session statistics could be sent several times? "pgstat_send_tabstat()" could be called more than once, right? Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
At Tue, 24 Aug 2021 12:34:25 +0200, Laurenz Albe wrote in > On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote: > > > > - remove redundant gettimeofday(). > > - avoid sending dedicate UCP packet for connection stats. > > Thank you. > > Perhaps I misread that, but doesn't that mean that the session statistics > could be sent several times? "pgstat_send_tabstat()" could be called more > than > once, right? Yes, it can be called two or more times for a call to pgstat_report_stat, but the patch causes only the first of them convey effective connection stats numbers. This is the same way as how transaction stats are treated. If no table activities have taken place at all, pgStatXactCommit/Rollback are not consumed then the following condition: > /* >* Send partial messages. Make sure that any pending xact commit/abort >* gets counted, even if there are no table stats to send. >*/ > if (regular_msg.m_nentries > 0 || > pgStatXactCommit > 0 || pgStatXactRollback > 0) leads to a call to pgstat_send_tabstat() and it sends a tabstat message without a table stats, which is a "partial message". In this logic the condition term (pgStatXactCommit > 0 || pgStatXactRollback > 0) acts as a single-shot trigger. So we can piggy-back on the condition to send something only once. The patch sets "0" = (DISCONNECT_NOT_YET) to m_disconnect as the "not-effective" number, but it would be better to add DISCONNECT_something to express that state. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote: > Yes, it can be called two or more times for a call to > pgstat_report_stat, but the patch causes only the first of them convey > effective connection stats numbers. This is the same way as how > transaction stats are treated. If no table activities have taken > place at all, pgStatXactCommit/Rollback are not consumed then the > following condition: I was looking at this WIP patch, and plugging in the connection statistics to the table-access statistics looks like the wrong abstraction to me. I find much cleaner the approach of HEAD to use a separate API to report this information, as of pgstat_send_connstats(). As of the two problems discussed on this thread, aka the increased number of UDP packages and the extra timestamp computations, it seems to me that we had better combine the following ideas for HEAD and 14, for now: - Avoid the extra timestamp computation as proposed by Laurenz in [1] - Throttle the frequency where the connection stat packages are sent, as of [2]. Magnus, this open item is assigned to you as the committer of 960869d. Could you comment on those issues? [1]: https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at [2]: https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de -- Michael signature.asc Description: PGP signature
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier wrote in > On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote: > > Yes, it can be called two or more times for a call to > > pgstat_report_stat, but the patch causes only the first of them convey > > effective connection stats numbers. This is the same way as how > > transaction stats are treated. If no table activities have taken > > place at all, pgStatXactCommit/Rollback are not consumed then the > > following condition: > > I was looking at this WIP patch, and plugging in the connection > statistics to the table-access statistics looks like the wrong > abstraction to me. I find much cleaner the approach of HEAD to use a > separate API to report this information, as of > pgstat_send_connstats(). > > As of the two problems discussed on this thread, aka the increased > number of UDP packages and the extra timestamp computations, it seems > to me that we had better combine the following ideas for HEAD and 14, > for now: > - Avoid the extra timestamp computation as proposed by Laurenz in [1] > - Throttle the frequency where the connection stat packages are sent, > as of [2]. > > Magnus, this open item is assigned to you as the committer of > 960869d. Could you comment on those issues? > > [1]: > https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at > [2]: > https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de About [2], we need to maintain session active/idel times on additional menbers in backend status. Letting gpgstat_report_activity to directly scribble on backend status array would work? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
At Wed, 25 Aug 2021 13:21:52 +0900 (JST), Kyotaro Horiguchi wrote in > At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier > wrote in > > - Throttle the frequency where the connection stat packages are sent, > > as of [2]. > > > > Magnus, this open item is assigned to you as the committer of > > 960869d. Could you comment on those issues? > > > > [1]: > > https://www.postgresql.org/message-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.ca...@cybertec.at > > [2]: > > https://www.postgresql.org/message-id/20210801205501.nyxzxoelqoo4x...@alap3.anarazel.de > > About [2], we need to maintain session active/idel times on additional > menbers in backend status. Letting gpgstat_report_activity to > directly scribble on backend status array would work? So the attached is roughly that (just a PoC, of course). - accumulate active and idle time on backend status array. (pgstat_report_activity) - pgstat_get_db_session_time() and friends read pgstat file then sum up relevant members in backend status array. (So it scans on the array for every number of every database X(). - The function pgstat_send_connstats is called only at the end of a connection. It reads backend status element then send the numbers to stats collector. pgstat_shutdown_hook needs to be moved from on_shmem_exit to before_shmem_exit to read MyBEEntry. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 7fcc3f6ded..47973f1e30 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -244,8 +244,6 @@ static int pgStatXactCommit = 0; static int pgStatXactRollback = 0; PgStat_Counter pgStatBlockReadTime = 0; PgStat_Counter pgStatBlockWriteTime = 0; -PgStat_Counter pgStatActiveTime = 0; -PgStat_Counter pgStatTransactionIdleTime = 0; SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL; /* Record that's written to 2PC state file when pgstat state is persisted */ @@ -323,7 +321,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); static void pgstat_send_funcstats(void); static void pgstat_send_slru(void); static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid); -static void pgstat_send_connstats(bool disconnect, TimestampTz last_report); +static void pgstat_send_connstats(TimestampTz now); static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared); @@ -876,10 +874,8 @@ pgstat_report_stat(bool disconnect) return; /* for backends, send connection statistics */ - if (MyBackendType == B_BACKEND) - pgstat_send_connstats(disconnect, last_report); - - last_report = now; + if (MyBackendType == B_BACKEND && disconnect) + pgstat_send_connstats(now); /* * Destroy pgStatTabHash before we start invalidating PgStat_TableEntry @@ -1368,39 +1364,41 @@ pgstat_drop_relation(Oid relid) * pgstat_send_connstats() - * * Tell the collector about session statistics. - * The parameter "disconnect" will be true when the backend exits. - * "last_report" is the last time we were called (0 if never). * -- */ static void -pgstat_send_connstats(bool disconnect, TimestampTz last_report) +pgstat_send_connstats(TimestampTz now) { PgStat_MsgConn msg; - longsecs; - int usecs; + volatile PgBackendStatus *beentry = MyBEEntry; if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts) return; pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_CONNECTION); msg.m_databaseid = MyDatabaseId; + msg.m_disconnect = pgStatSessionEndCause; - /* session time since the last report */ - TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report), - GetCurrentTimestamp(), - &secs, &usecs); - msg.m_session_time = secs * 100 + usecs; + msg.m_count = 1; - msg.m_disconnect = disconnect ? pgStatSessionEndCause : DISCONNECT_NOT_YET; + if (pgstat_track_activities && beentry) + { + msg.m_session_time = + (PgStat_Counter)(now - beentry->st_proc_start_timestamp) * 1000; + msg.m_active_time = beentry->st_session_active_time; + msg.m_idle_in_xact_time = beentry->st_session_idle_time; - msg.m_active_time = pgStatActiveTime; - pgStatActiveTime = 0; - - msg.m_idle_in_xact_time = pgStatTransactionIdleTime; - pgStatTransactionIdleTime = 0; - - /* report a new session only the first time */ - msg.m_count = (last_report == 0) ? 1 : 0; + PGSTAT_BEGIN_WRITE_ACTIVITY(beentry); + beentry->st_session_active_time = 0; + beentry->st_session_idle_time = 0; + PGSTAT_END_WRITE_ACTIVITY(beentry); +
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-08-20 14:27:20 -0500, Justin Pryzby wrote: > On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote: > > Doubling the number of UDP messages in common workloads seems also > > problematic > > enough that it should be addressed for 14. It increases the likelihood of > > dropping stats messages on busy systems, which can have downstream impacts. > > I think by "common workloads" you mean one with many, shortlived sessions. You don't need short-lived sessions. You just need sessions that don't process queries all the time (so that there's only one or a few queries within each PGSTAT_STAT_INTERVAL). The connection stats aren't sent once per session, they're sent once per PGSTAT_STAT_INTERVAL. Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-08-25 12:51:58 +0900, Michael Paquier wrote: > I was looking at this WIP patch, and plugging in the connection > statistics to the table-access statistics looks like the wrong > abstraction to me. I find much cleaner the approach of HEAD to use a > separate API to report this information, as of > pgstat_send_connstats(). As I said before, this ship has long sailed: typedef struct PgStat_MsgTabstat { PgStat_MsgHdr m_hdr; Oid m_databaseid; int m_nentries; int m_xact_commit; int m_xact_rollback; PgStat_Counter m_block_read_time; /* times in microseconds */ PgStat_Counter m_block_write_time; PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; } PgStat_MsgTabstat; > As of the two problems discussed on this thread, aka the increased > number of UDP packages and the extra timestamp computations, it seems > to me that we had better combine the following ideas for HEAD and 14, > for now: > - Avoid the extra timestamp computation as proposed by Laurenz in [1] > - Throttle the frequency where the connection stat packages are sent, > as of [2]. I think in that case we'd have to do the bigger redesign and move "live" connection stats to backend_status.c... Greetings, Andres Freund
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote: > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote: > As I said before, this ship has long sailed: > > typedef struct PgStat_MsgTabstat > { > PgStat_MsgHdr m_hdr; > Oid m_databaseid; > int m_nentries; > int m_xact_commit; > int m_xact_rollback; > PgStat_Counter m_block_read_time; /* times in microseconds */ > PgStat_Counter m_block_write_time; > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; > } PgStat_MsgTabstat; Well, I kind of misread what you meant upthread then. PgStat_MsgTabstat has a name a bit misleading, especially if you assign connection stats to it. >> As of the two problems discussed on this thread, aka the increased >> number of UDP packages and the extra timestamp computations, it seems >> to me that we had better combine the following ideas for HEAD and 14, >> for now: >> - Avoid the extra timestamp computation as proposed by Laurenz in [1] >> - Throttle the frequency where the connection stat packages are sent, >> as of [2]. > > I think in that case we'd have to do the bigger redesign and move "live" > connection stats to backend_status.c... Hmm. A redesign is not really an option for 14 at this stage. And I am not really comfortable with the latest proposal from upthread to plug in that to pgstat_send_tabstat to report things once per transaction, either. It really looks like this needs more thoughts, and it would mean that a revert may be the most appropriate choice for the moment. That's the last-resort option, surely, but we are post-beta3 so there is no much margin left. -- Michael signature.asc Description: PGP signature
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Fri, 2021-08-27 at 13:57 +0900, Michael Paquier wrote: > > I think in that case we'd have to do the bigger redesign and move "live" > > connection stats to backend_status.c... > > Hmm. A redesign is not really an option for 14 at this stage. And I > am not really comfortable with the latest proposal from upthread to > plug in that to pgstat_send_tabstat to report things once per > transaction, either. It really looks like this needs more thoughts, > and it would mean that a revert may be the most appropriate choice > for the moment. That's the last-resort option, surely, but we are > post-beta3 so there is no much margin left. In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 milliseconds? That would mean slightly less up-to-date statistics, but I doubt that that will be a problem. And it should even out the increase in statistics messages, except in the case of lots of short-lived sessions. But in that scenario you cannot have session statistics without lots of extra messages, and such a workload has enough performance problems as it is, so I don't think we have to specifically worry about it. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote: > In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > milliseconds? That would mean slightly less up-to-date statistics, but > I doubt that that will be a problem. And it should even out the increase > in statistics messages, except in the case of lots of short-lived > sessions. But in that scenario you cannot have session statistics > without lots of extra messages, and such a workload has enough performance > problems as it is, so I don't think we have to specifically worry about it. Perhaps we could do that. Now, increasing an interval for the sake of balancing the extra load created by a feature while impacting the whole set of stats is not really appealing. -- Michael signature.asc Description: PGP signature
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On August 31, 2021 6:33:15 PM PDT, Michael Paquier wrote: >On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote: >> In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 >> milliseconds? That would mean slightly less up-to-date statistics, but >> I doubt that that will be a problem. And it should even out the increase >> in statistics messages, except in the case of lots of short-lived >> sessions. But in that scenario you cannot have session statistics >> without lots of extra messages, and such a workload has enough performance >> problems as it is, so I don't think we have to specifically worry about it. > >Perhaps we could do that. Now, increasing an interval for the sake of >balancing the extra load created by a feature while impacting the >whole set of stats is not really appealing. I think it's not helpful. Still increases the number of messages substantially in workloads with a lot of connections doing occasional queries. Which is common. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote: > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the view > > > of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > > > > > milliseconds? That would mean slightly less up-to-date statistics, but > > > I doubt that that will be a problem. > > I think it's not helpful. Still increases the number of messages > substantially in workloads > with a lot of connections doing occasional queries. Which is common. How come? If originally you send table statistics every 500ms, and now you send table statistics and session statistics every second, that should amount to the same thing. Where is my misunderstanding? Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, 2021-09-01 at 10:33 +0900, Michael Paquier wrote: > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote: > > In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > > milliseconds? > > Perhaps we could do that. Now, increasing an interval for the sake of > balancing the extra load created by a feature while impacting the > whole set of stats is not really appealing. I agree. But if the best fix is too invasive at this point, the alternatives are reverting the patch or choosing a less appealing solution. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote: > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote: > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the > > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > > > > > > > milliseconds? That would mean slightly less up-to-date statistics, but > > > > I doubt that that will be a problem. > > > > I think it's not helpful. Still increases the number of messages > > substantially in workloads > > with a lot of connections doing occasional queries. Which is common. > > How come? If originally you send table statistics every 500ms, and now you > send > table statistics and session statistics every second, that should amount to > the > same thing. Where is my misunderstanding? Consider the case of one query a second.
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On 2021-08-27 13:57:45 +0900, Michael Paquier wrote: > On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote: > > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote: > > As I said before, this ship has long sailed: > > > > typedef struct PgStat_MsgTabstat > > { > > PgStat_MsgHdr m_hdr; > > Oid m_databaseid; > > int m_nentries; > > int m_xact_commit; > > int m_xact_rollback; > > PgStat_Counter m_block_read_time; /* times in microseconds */ > > PgStat_Counter m_block_write_time; > > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; > > } PgStat_MsgTabstat; > > Well, I kind of misread what you meant upthread then. > PgStat_MsgTabstat has a name a bit misleading, especially if you > assign connection stats to it. ISTM we should just do this fairly obvious change. Given that we already transport commit / rollback / IO stats, I don't see why the connection stats change anything to a meaningful degree. I'm fairly baffled why that's not the obvious thing to do for v14.
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote: > On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote: > > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote: > > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the > > > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > > > > > milliseconds? That would mean slightly less up-to-date statistics, > > > > > but > > > > > I doubt that that will be a problem. > > > > > > I think it's not helpful. Still increases the number of messages > > > substantially in workloads > > > with a lot of connections doing occasional queries. Which is common. > > > > How come? If originally you send table statistics every 500ms, and now you > > send > > table statistics and session statistics every second, that should amount to > > the > > same thing. Where is my misunderstanding? > > Consider the case of one query a second. I guess I am too stupid. I don't see it. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Wed, 2021-09-01 at 10:56 +0200, Laurenz Albe wrote: > On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote: > > On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote: > > > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote: > > > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the > > > > > > view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000 > > > > > > milliseconds? That would mean slightly less up-to-date statistics, > > > > > > but > > > > > > I doubt that that will be a problem. > > > > > > > > I think it's not helpful. Still increases the number of messages > > > > substantially in workloads > > > > with a lot of connections doing occasional queries. Which is common. > > > > > > How come? If originally you send table statistics every 500ms, and now > > > you send > > > table statistics and session statistics every second, that should amount > > > to the > > > same thing. Where is my misunderstanding? > > > > Consider the case of one query a second. > > I guess I am too stupid. I don't see it. Finally got it. That would send a message every second, and with connection statistics, twice as many. Here is my next suggestion for a band-aid to mitigate this problem: Introduce a second, much longer interval for reporting session statistics. --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -77,6 +77,8 @@ #define PGSTAT_STAT_INTERVAL 500 /* Minimum time between stats file * updates; in milliseconds. */ +#define PGSTAT_CONSTAT_INTERVAL6 /* interval to report connection statistics */ + #define PGSTAT_RETRY_DELAY 10 /* How long to wait between checks for a * new file; in milliseconds. */ @@ -889,8 +891,13 @@ pgstat_report_stat(bool disconnect) !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) return; - /* for backends, send connection statistics */ - if (MyBackendType == B_BACKEND) + /* +* For backends, send connection statistics, but only every +* PGSTAT_CONSTAT_INTERVAL or when the backend terminates. +*/ + if (MyBackendType == B_BACKEND && + (TimestampDifferenceExceeds(last_report, now, PGSTAT_CONSTAT_INTERVAL) || +disconnect)) pgstat_send_connstats(disconnect, last_report, now); last_report = now; That should keep the extra load moderate, except for workloads with lots of tiny connections (for which this may be the least of their problems). Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-08-31 21:56:50 -0700, Andres Freund wrote: > On 2021-08-27 13:57:45 +0900, Michael Paquier wrote: > > On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote: > > > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote: > > > As I said before, this ship has long sailed: > > > > > > typedef struct PgStat_MsgTabstat > > > { > > > PgStat_MsgHdr m_hdr; > > > Oid m_databaseid; > > > int m_nentries; > > > int m_xact_commit; > > > int m_xact_rollback; > > > PgStat_Counter m_block_read_time; /* times in microseconds */ > > > PgStat_Counter m_block_write_time; > > > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; > > > } PgStat_MsgTabstat; > > > > Well, I kind of misread what you meant upthread then. > > PgStat_MsgTabstat has a name a bit misleading, especially if you > > assign connection stats to it. > > ISTM we should just do this fairly obvious change. Given that we already > transport commit / rollback / IO stats, I don't see why the connection stats > change anything to a meaningful degree. I'm fairly baffled why that's not the > obvious thing to do for v14. Here's how I think that would look like. While writing up this draft, I found two more issues: - On windows / 32 bit systems, the session time would overflow if idle for longer than ~4300s. long is only 32 bit. Easy to fix obviously. - Right now walsenders, including database connected walsenders, are not reported in connection stats. That doesn't seem quite right to me. In the patch I made the message for connecting an explicitly reported message, that seems cleaner, because it then happens at a clearly defined point. I didn't do the same for disconnecting, but perhaps that would be better? Then we could get rid of the whole pgStatSessionEndCause variable. Greetings, Andres Freund >From 2b3cf32bd02ea4b73157db023d6475826e32c64a Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Fri, 3 Sep 2021 17:02:15 -0700 Subject: [PATCH] wip: Reduce overhead of "pg_stat_database counters for sessions and session time" "Fixes" 960869da0803 Author: Andres Freund Reviewed-By: Discussion: https://postgr.es/m/20210901045650.fz4he2b3wx4p7...@alap3.anarazel.de Backpatch: --- src/include/pgstat.h| 34 ++-- src/backend/postmaster/pgstat.c | 171 src/backend/tcop/postgres.c | 2 + src/backend/utils/activity/backend_status.c | 4 +- 4 files changed, 132 insertions(+), 79 deletions(-) diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 509849c7ff4..a7b386821f6 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -81,7 +81,8 @@ typedef enum StatMsgType PGSTAT_MTYPE_DEADLOCK, PGSTAT_MTYPE_CHECKSUMFAILURE, PGSTAT_MTYPE_REPLSLOT, - PGSTAT_MTYPE_CONNECTION, + PGSTAT_MTYPE_CONNECT, + PGSTAT_MTYPE_DISCONNECT, } StatMsgType; /* -- @@ -279,7 +280,7 @@ typedef struct PgStat_TableEntry * -- */ #define PGSTAT_NUM_TABENTRIES \ - ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - 3 * sizeof(int) - 2 * sizeof(PgStat_Counter)) \ + ((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - 3 * sizeof(int) - 5 * sizeof(PgStat_Counter)) \ / sizeof(PgStat_TableEntry)) typedef struct PgStat_MsgTabstat @@ -291,6 +292,9 @@ typedef struct PgStat_MsgTabstat int m_xact_rollback; PgStat_Counter m_block_read_time; /* times in microseconds */ PgStat_Counter m_block_write_time; + PgStat_Counter m_session_time; + PgStat_Counter m_active_time; + PgStat_Counter m_idle_in_xact_time; PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; } PgStat_MsgTabstat; @@ -653,20 +657,26 @@ typedef struct PgStat_MsgChecksumFailure } PgStat_MsgChecksumFailure; /* -- - * PgStat_MsgConn Sent by the backend to update connection statistics. + * PgStat_MsgConnect Sent by the backend upon connection + *establishment * -- */ -typedef struct PgStat_MsgConn +typedef struct PgStat_MsgConnect { PgStat_MsgHdr m_hdr; Oid m_databaseid; - PgStat_Counter m_count; - PgStat_Counter m_session_time; - PgStat_Counter m_active_time; - PgStat_Counter m_idle_in_xact_time; - SessionEndType m_disconnect; -} PgStat_MsgConn; +} PgStat_MsgConnect; +/* -- + * PgStat_MsgDisconnect Sent by the backend when disconnecting + * -- + */ +typedef struct PgStat_MsgDisconnect +{ + PgStat_MsgHdr m_hdr; + Oid m_databaseid; + SessionEndType m_cause; +} PgStat_MsgDisconnect; /* -- * PgStat_Msg Union over all possible messages. @@ -700,7 +710,8 @@ typedef union PgStat_Msg PgStat_MsgTempFile msg_tempfile; PgStat_MsgChecksumFailure msg_checksumfailure; PgStat_MsgReplSlot msg_replslot; - PgStat_MsgConn msg_conn; + PgStat_MsgConnect msg_connect; + PgStat_MsgDisconnect msg_disconnect; } PgStat_Msg; @@ -1010,6 +1021,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t extern void pgstat_reset_slru_counter(con
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote: > Hi, > > On 2021-08-31 21:56:50 -0700, Andres Freund wrote: > > On 2021-08-27 13:57:45 +0900, Michael Paquier wrote: > > > On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote: > > > > On 2021-08-25 12:51:58 +0900, Michael Paquier wrote: > > > > As I said before, this ship has long sailed: > > > > > > > > typedef struct PgStat_MsgTabstat > > > > { > > > > PgStat_MsgHdr m_hdr; > > > > Oid m_databaseid; > > > > int m_nentries; > > > > int m_xact_commit; > > > > int m_xact_rollback; > > > > PgStat_Counter m_block_read_time; /* times in > > > > microseconds */ > > > > PgStat_Counter m_block_write_time; > > > > PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES]; > > > > } PgStat_MsgTabstat; > > > > > > Well, I kind of misread what you meant upthread then. > > > PgStat_MsgTabstat has a name a bit misleading, especially if you > > > assign connection stats to it. > > > > ISTM we should just do this fairly obvious change. Given that we already > > transport commit / rollback / IO stats, I don't see why the connection stats > > change anything to a meaningful degree. I'm fairly baffled why that's not > > the > > obvious thing to do for v14. > > Here's how I think that would look like. Thank you! > While writing up this draft, I found > two more issues: > > - On windows / 32 bit systems, the session time would overflow if idle for > longer than ~4300s. long is only 32 bit. Easy to fix obviously. Oops, yes. Thanks for spotting that. > - Right now walsenders, including database connected walsenders, are not > reported in connection stats. That doesn't seem quite right to me. I think that walsenders not only use a different protocol, but often have session characteristics quite different from normal backends. For example, they are always "active", even when they are doing nothing. Therefore, I think it is confusing to report them together with normal database sessions. If at all, walsender statistics should be reported separately. > In the patch I made the message for connecting an explicitly reported message, > that seems cleaner, because it then happens at a clearly defined point. I > didn't do the same for disconnecting, but perhaps that would be better? Then > we could get rid of the whole pgStatSessionEndCause variable. I see your point, but I am not certain if it is worth adding yet another message for a small thing like that. I have no strong opinion on that though. Reading your patch, I am still confused about the following: There are potentially several calls to "pgstat_send_tabstat" in "pgstat_report_stat". It seems to me that if it were called more than once, session statistics would be reported and counted several times, which would be wrong. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi, On 2021-09-06 09:12:58 +0200, Laurenz Albe wrote: > Reading your patch, I am still confused about the following: > There are potentially several calls to "pgstat_send_tabstat" in > "pgstat_report_stat". > It seems to me that if it were called more than once, session statistics would > be reported and counted several times, which would be wrong. The accumulator variables are set to zero after being sent. So no data should be sent multiple times as far as I can tell. Regards, Andres
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Mon, 2021-09-06 at 00:22 -0700, Andres Freund wrote: > On 2021-09-06 09:12:58 +0200, Laurenz Albe wrote: > > Reading your patch, I am still confused about the following: > > There are potentially several calls to "pgstat_send_tabstat" in > > "pgstat_report_stat". > > It seems to me that if it were called more than once, session statistics > > would > > be reported and counted several times, which would be wrong. > > The accumulator variables are set to zero after being sent. So no data should > be sent multiple times as far as I can tell. Ah, I missed that. Then I see no problem with that approach. Yours, Laurenz Albe
Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote: > Here's how I think that would look like. While writing up this draft, I found > two more issues: > > - On windows / 32 bit systems, the session time would overflow if idle for > longer than ~4300s. long is only 32 bit. Easy to fix obviously. > - Right now walsenders, including database connected walsenders, are not > reported in connection stats. That doesn't seem quite right to me. > > In the patch I made the message for connecting an explicitly reported message, > that seems cleaner, because it then happens at a clearly defined point. I > didn't do the same for disconnecting, but perhaps that would be better? Then > we could get rid of the whole pgStatSessionEndCause variable. I have gone over your patch and made the following changes: - cache the last report time in a static variable pgLastSessionReportTime - add a comment to explain why we only track normal backends - pgindent - an attempt at a commit message Yours, Laurenz Albe From 1fdfac528ec4e75d91b9cf3868eda66a72a41c8f Mon Sep 17 00:00:00 2001 From: Laurenz Albe Date: Wed, 8 Sep 2021 06:06:46 +0200 Subject: [PATCH] Fix performance regression from session statistics Session statistics, as introduced by 960869da08, had several shortcomings: - an additional GetCurrentTimestamp() call that also impaired the accuracy of the data collected This can be avoided by passing the current timestamp we already have in pgstat_report_stat(). - an additional statistics UDP packet sent every 500ms This is solved by adding the new statistics to PgStat_MsgTabstat. This is conceptually ugly, because session statistics are not table statistics. But the struct already contains data unrelated to tables, so there is not much damage done. Connection and disconnection are reported in separate messages, which reduces the total overhead to two messages per session. The time of the last session statistics report is now cached in the static variable pgLastSessionReportTime. Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de --- src/backend/postmaster/pgstat.c | 177 +--- src/backend/tcop/postgres.c | 2 + src/backend/utils/activity/backend_status.c | 4 +- src/include/pgstat.h| 34 ++-- src/tools/pgindent/typedefs.list| 3 +- 5 files changed, 141 insertions(+), 79 deletions(-) diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 3450a10129..7aaf928315 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -246,6 +246,7 @@ static int pgStatXactCommit = 0; static int pgStatXactRollback = 0; PgStat_Counter pgStatBlockReadTime = 0; PgStat_Counter pgStatBlockWriteTime = 0; +static PgStat_Counter pgLastSessionReportTime = 0; PgStat_Counter pgStatActiveTime = 0; PgStat_Counter pgStatTransactionIdleTime = 0; SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL; @@ -330,11 +331,11 @@ static bool pgstat_db_requested(Oid databaseid); static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it); static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts); -static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); +static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now); static void pgstat_send_funcstats(void); static void pgstat_send_slru(void); static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid); -static void pgstat_send_connstats(bool disconnect, TimestampTz last_report); +static void pgstat_report_disconnect(Oid dboid); static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared); @@ -366,7 +367,8 @@ static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len); static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int len); static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len); static void pgstat_recv_checksum_failure(PgStat_MsgChecksumFailure *msg, int len); -static void pgstat_recv_connstat(PgStat_MsgConn *msg, int len); +static void pgstat_recv_connect(PgStat_MsgConnect *msg, int len); +static void pgstat_recv_disconnect(PgStat_MsgDisconnect *msg, int len); static void pgstat_recv_replslot(PgStat_MsgReplSlot *msg, int len); static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len); @@ -890,12 +892,11 @@ pgstat_report_stat(bool disconnect) !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) return; - /* for backends, send connection statistics */ - if (MyBackendType == B_BACKEND) - pgstat_send_connstats(disconnect, last_report); - last_report = now; + if (disconnect) + pgstat_report_disconnect(MyDatabaseId); + /* * Destroy pgStatTabHash before we start invalidating PgStat_TableEntry * entries it points to. (Should we fail partway through the loop below, @@ -947,7 +948,7 @@ pgstat_report_stat