Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
> > Well, if it's possible to deduce what is the meaning in given line, > > I can add the logic to do the deduction to parser. > > Also, I want to say that I appreciate being looped in the discussion. > I lost sight of this thread, so my apologies for the delay. The patch > to fix the description of the EXPLAIN field has now been applied to > v15 and v16. Thanks. Will do my best to update the parser soon. Best regards, depesz
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Tue, Oct 31, 2023 at 03:11:03PM +0100, hubert depesz lubaczewski wrote: > On Tue, Oct 31, 2023 at 08:17:52AM +0900, Michael Paquier wrote: >> Thanks for the input. I was looking yesterday if this code was >> available somewhere, but couldn't find it.. Until this morning: >> https://gitlab.com/depesz/explain.depesz.com.git > > Well, the parser itself is https://gitlab.com/depesz/Pg--Explain/ :) That was close enough ;) > Well, if it's possible to deduce what is the meaning in given line, > I can add the logic to do the deduction to parser. > > Also, I want to say that I appreciate being looped in the discussion. I lost sight of this thread, so my apologies for the delay. The patch to fix the description of the EXPLAIN field has now been applied to v15 and v16. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Tue, Oct 31, 2023 at 08:17:52AM +0900, Michael Paquier wrote: > Thanks for the input. I was looking yesterday if this code was > available somewhere, but couldn't find it.. Until this morning: > https://gitlab.com/depesz/explain.depesz.com.git Well, the parser itself is https://gitlab.com/depesz/Pg--Explain/ :) > And.. It looks like things would become better if we change > "shared/local" to "shared", because the parsing code seems to have an > issue once you add a '/'. All the fields in I/O Timings are > considered as part of a Node, and they're just included in the output. > Now, pasting a plan that includes "shared/local" drops entirely the > string from the output result, so some information is lost. In short, > imagine that we have the following string in a node: > I/O Timings: shared/local write=23.77 > > This would show up like that, meaning that the context where the > write/read timings happened is lost: > I/O Timings: write=23.77 > > If we switch back to "shared", the context would be kept around. Of > course, this does not count for all the parsers that may be out > there, but at least that's something. Well, if it's possible to deduce what is the meaning in given line, I can add the logic to do the deduction to parser. Also, I want to say that I appreciate being looped in the discussion. Best regards, depesz
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Mon, Oct 30, 2023 at 03:14:16PM +0100, hubert depesz lubaczewski wrote: > some things will definitely break, but that's 100% OK. The change seems > needed, and I can update my parser to deal with it :) Thanks for the input. I was looking yesterday if this code was available somewhere, but couldn't find it.. Until this morning: https://gitlab.com/depesz/explain.depesz.com.git And.. It looks like things would become better if we change "shared/local" to "shared", because the parsing code seems to have an issue once you add a '/'. All the fields in I/O Timings are considered as part of a Node, and they're just included in the output. Now, pasting a plan that includes "shared/local" drops entirely the string from the output result, so some information is lost. In short, imagine that we have the following string in a node: I/O Timings: shared/local write=23.77 This would show up like that, meaning that the context where the write/read timings happened is lost: I/O Timings: write=23.77 If we switch back to "shared", the context would be kept around. Of course, this does not count for all the parsers that may be out there, but at least that's something. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Mon, Oct 30, 2023 at 10:45:05AM +0900, Michael Paquier wrote: > On Fri, Oct 27, 2023 at 04:58:20PM +0300, Nazir Bilal Yavuz wrote: > > I think switching it to 'shared' makes sense. That shouldn't confuse > > existing monitoring queries much as the numbers won't change, right? > > Also, if we keep 'shared/local' there could be similar complaints to > > this thread in the future; so, at least adding comments can be > > helpful. > > The problem is that it may impact existing tools that do explain > output deparsing. One of them is https://explain.depesz.com/ that > Hubert Depesz Lubaczewski has implemented, and it would be sad to > break anything related to it. > > I am adding Hubert in CC for comments about changing this > "shared/local" to "shared" on a released branch. Knowing that > "shared" and "local" will need to be handled as separate terms in 17~ > anyway, perhaps that's not a big deal, but let's be sure. Hi, some things will definitely break, but that's 100% OK. The change seems needed, and I can update my parser to deal with it :) Best regards, depesz
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Fri, Oct 27, 2023 at 04:58:20PM +0300, Nazir Bilal Yavuz wrote: > I think switching it to 'shared' makes sense. That shouldn't confuse > existing monitoring queries much as the numbers won't change, right? > Also, if we keep 'shared/local' there could be similar complaints to > this thread in the future; so, at least adding comments can be > helpful. The problem is that it may impact existing tools that do explain output deparsing. One of them is https://explain.depesz.com/ that Hubert Depesz Lubaczewski has implemented, and it would be sad to break anything related to it. I am adding Hubert in CC for comments about changing this "shared/local" to "shared" on a released branch. Knowing that "shared" and "local" will need to be handled as separate terms in 17~ anyway, perhaps that's not a big deal, but let's be sure. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
Hi, On Thu, 19 Oct 2023 at 08:26, Michael Paquier wrote: > > On Wed, Oct 18, 2023 at 02:56:42PM +0900, Michael Paquier wrote: > > Thanks for the new versions. I have applied 0001 and backpatched it > > for now. 0002 and 0003 look in much cleaner shape than previously. > > 0002 and 0003 have now been applied. I have split 0003 into two parts > at the end, mainly on clarity grounds: one for the counters with > EXPLAIN and a second for pg_stat_statements. > > There were a few things in the patch set. Per my notes: > - Some incorrect indentation. > - The additions of show_buffer_usage() did not handle correctly the > addition of a comma before/after the local timing block. The code > area for has_local_timing needs to check for has_temp_timing, while > the area of has_shared_timing needs to check for (has_local_timing || > has_temp_timing). > - explain.sgml was missing an update for the information related to > the read/write timings of the local blocks. Thanks for the changes, push and feedback! > > Remains what we should do about the "shared/local" string in > show_buffer_usage() for v16 and v15, as "local" is unrelated to that. > Perhaps we should just switch to "shared" in this case or just remove > the string entirely? Still that implies changing the output of > EXPLAIN on a stable branch in this case, so there could be an argument > for leaving this stuff alone. I think switching it to 'shared' makes sense. That shouldn't confuse existing monitoring queries much as the numbers won't change, right? Also, if we keep 'shared/local' there could be similar complaints to this thread in the future; so, at least adding comments can be helpful. Regards, Nazir Bilal Yavuz Microsoft
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Wed, Oct 18, 2023 at 02:56:42PM +0900, Michael Paquier wrote: > Thanks for the new versions. I have applied 0001 and backpatched it > for now. 0002 and 0003 look in much cleaner shape than previously. 0002 and 0003 have now been applied. I have split 0003 into two parts at the end, mainly on clarity grounds: one for the counters with EXPLAIN and a second for pg_stat_statements. There were a few things in the patch set. Per my notes: - Some incorrect indentation. - The additions of show_buffer_usage() did not handle correctly the addition of a comma before/after the local timing block. The code area for has_local_timing needs to check for has_temp_timing, while the area of has_shared_timing needs to check for (has_local_timing || has_temp_timing). - explain.sgml was missing an update for the information related to the read/write timings of the local blocks. Remains what we should do about the "shared/local" string in show_buffer_usage() for v16 and v15, as "local" is unrelated to that. Perhaps we should just switch to "shared" in this case or just remove the string entirely? Still that implies changing the output of EXPLAIN on a stable branch in this case, so there could be an argument for leaving this stuff alone. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Tue, Oct 03, 2023 at 12:44:36PM -0400, Robert Haas wrote: > My first thought was to wonder whether this was even a bug. I > remembered that EXPLAIN treats shared, local, and temp buffers as > three separate categories of things. But it seems that someone decided > to conflate two of them for I/O timing purposes: > > if (has_timing) > { > appendStringInfoString(es->str, " > shared/local"); > > Notice this bit in particular. I was reviewing the whole, and this is an oversight specific to efb0ef909f60, because we've never incremented the write/read counters for local buffers, even with this commit applied, for both the EXPLAIN reports and anything stored in pg_stat_statement. It seems to me that the origin of the confusion comes down to pg_stat_database where blk_{read|write}_time increments on both local and shared blocks, but on EXPLAIN this stuff only reflects data about shared buffers. So the "shared" part of the string is right, but the "local" part is not in v15 and v16. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Tue, Oct 17, 2023 at 04:44:25PM +0300, Nazir Bilal Yavuz wrote: > I updated patches based on your comments. v4 is attached. Thanks for the new versions. I have applied 0001 and backpatched it for now. 0002 and 0003 look in much cleaner shape than previously. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
Hi, Thanks for the review! On Tue, 17 Oct 2023 at 11:40, Michael Paquier wrote: > > On Mon, Oct 16, 2023 at 01:07:07PM +0300, Nazir Bilal Yavuz wrote: > > Yes, that could be a better solution. Also, having more detailed stats > > for shared and local buffers is helpful. I updated patches in line > > with that: > > > > 0001: Counts extends same way as a write. > > It can change existing query results on an already-released branch, > but we already count the number of blocks when doing a relation > extension, so counting the write time is something I'd rather fix in > v16. If you have any objections, let me know. I agree. > > > 0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time. > > Note that `git diff --check` complains here. > > --- a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql > +++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql > @@ -30,8 +30,8 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean, > OUT local_blks_written int8, > OUT temp_blks_read int8, > OUT temp_blks_written int8, > -OUT blk_read_time float8, > -OUT blk_write_time float8 > +OUT shared_blk_read_time float8, > +OUT shared_blk_write_time float8 > > Doing that in an extension upgrade script is incorrect. These should > not be touched. > > - Total time the statement spent reading data file blocks, in milliseconds > + Total time the statement spent reading shared data file blocks, in > milliseconds > > Or just shared blocks? That's what we use elsewhere for > pg_stat_statements. "shared data file blocks" sounds a bit confusing > for relation file blocks read/written from/to shared buffers. > > > 0003: Add new local_blk_{read|write}_time variables. > > DATA = pg_stat_statements--1.4.sql \ > + pg_stat_statements--1.11--1.12.sql \ > pg_stat_statements--1.10--1.11.sql \ > > There is no need to bump again pg_stat_statements, as it has already > been bumped to 1.11 on HEAD per the recent commit 5a3423ad8ee1 from > Daniel. So the new changes can just be added to 1.11. I updated patches based on your comments. v4 is attached. Regards, Nazir Bilal Yavuz Microsoft From 4b744c0b4e2cfdee4e95779ee19ac214b85aa150 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 13 Oct 2023 17:35:00 +0300 Subject: [PATCH v4 1/3] Include IOOp IOOP_EXTENDs while calculating block write times Extends are counted as writes in block context, so include IOOP_EXTENDs while calculating block write times. --- src/backend/utils/activity/pgstat_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c index eb7d35d422..8ec8670199 100644 --- a/src/backend/utils/activity/pgstat_io.c +++ b/src/backend/utils/activity/pgstat_io.c @@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op, INSTR_TIME_SET_CURRENT(io_time); INSTR_TIME_SUBTRACT(io_time, start_time); - if (io_op == IOOP_WRITE) + if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND) { pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time)); if (io_object == IOOBJECT_RELATION) -- 2.42.0 From 5f299ee59d443a53bf6f3a8e18467270f5ee8318 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 13 Oct 2023 16:28:03 +0300 Subject: [PATCH v4 2/3] Rename I/O timing statistics columns to shared_blk_{read|write}_time Since only shared blocks' timings are counted in blk_{read|write}_time, rename them as shared_blk_{read|write}_time --- .../expected/oldextversions.out | 4 +- .../pg_stat_statements--1.10--1.11.sql| 4 +- .../pg_stat_statements/pg_stat_statements.c | 14 ++- doc/src/sgml/pgstatstatements.sgml| 8 +- src/backend/commands/explain.c| 29 ++--- src/backend/executor/instrument.c | 12 +- src/backend/utils/activity/pgstat_io.c| 4 +- src/include/executor/instrument.h | 4 +- src/test/regress/expected/explain.out | 108 +- 9 files changed, 95 insertions(+), 92 deletions(-) diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out index 64982aad60..c2e0140a47 100644 --- a/contrib/pg_stat_statements/expected/oldextversions.out +++ b/contrib/pg_stat_statements/expected/oldextversions.out @@ -284,8 +284,8 @@ AlTER EXTENSION pg_stat_statements UPDATE TO '1.11'; local_blks_written | bigint | | | temp_blks_read | bigint | | | temp_blks_written | bigint | | | - blk_read_time | double precision | | | - blk_write_time | double precision | | | + shared_blk_read_time | double precision | | | + shared_blk_write_time | double precision | | | temp_blk_read
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Mon, Oct 16, 2023 at 01:07:07PM +0300, Nazir Bilal Yavuz wrote: > Yes, that could be a better solution. Also, having more detailed stats > for shared and local buffers is helpful. I updated patches in line > with that: > > 0001: Counts extends same way as a write. It can change existing query results on an already-released branch, but we already count the number of blocks when doing a relation extension, so counting the write time is something I'd rather fix in v16. If you have any objections, let me know. > 0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time. Note that `git diff --check` complains here. --- a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql +++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql @@ -30,8 +30,8 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean, OUT local_blks_written int8, OUT temp_blks_read int8, OUT temp_blks_written int8, -OUT blk_read_time float8, -OUT blk_write_time float8 +OUT shared_blk_read_time float8, +OUT shared_blk_write_time float8 Doing that in an extension upgrade script is incorrect. These should not be touched. - Total time the statement spent reading data file blocks, in milliseconds + Total time the statement spent reading shared data file blocks, in milliseconds Or just shared blocks? That's what we use elsewhere for pg_stat_statements. "shared data file blocks" sounds a bit confusing for relation file blocks read/written from/to shared buffers. > 0003: Add new local_blk_{read|write}_time variables. DATA = pg_stat_statements--1.4.sql \ + pg_stat_statements--1.11--1.12.sql \ pg_stat_statements--1.10--1.11.sql \ There is no need to bump again pg_stat_statements, as it has already been bumped to 1.11 on HEAD per the recent commit 5a3423ad8ee1 from Daniel. So the new changes can just be added to 1.11. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
Hi, On Tue, 10 Oct 2023 at 03:54, Michael Paquier wrote: > > In ~14, as far as I can see blk_write_time is only incremented for > shared buffers. FWIW, I agree that we should improve these stats for > local buffers but I am not on board with a solution where we'd use the > same counter for local and shared buffers while we've historically > only counted the former, because that could confuse existing > monitoring queries. It seems to me that the right solution is to do > the same separation as temp blocks with two separate counters, without > a backpatch. I'd like to go as far as renaming blk_read_time and > blk_write_time to respectively shared_blk_read_time and > shared_blk_write_time to know exactly what the type of block dealt > with is when querying this data, particularly for pg_stat_statements's > sake. Yes, that could be a better solution. Also, having more detailed stats for shared and local buffers is helpful. I updated patches in line with that: 0001: Counts extends same way as a write. 0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time. 0003: Add new local_blk_{read|write}_time variables. Regards, Nazir Bilal Yavuz Microsoft From acf5a2781760f284aef770da0f64acf8685b6445 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 13 Oct 2023 17:35:00 +0300 Subject: [PATCH v3 1/3] Include IOOp IOOP_EXTENDs while calculating block write times Extends are counted as writes in block context, so include IOOP_EXTENDs while calculating block write times. --- src/backend/utils/activity/pgstat_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c index eb7d35d422..8ec8670199 100644 --- a/src/backend/utils/activity/pgstat_io.c +++ b/src/backend/utils/activity/pgstat_io.c @@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op, INSTR_TIME_SET_CURRENT(io_time); INSTR_TIME_SUBTRACT(io_time, start_time); - if (io_op == IOOP_WRITE) + if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND) { pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time)); if (io_object == IOOBJECT_RELATION) -- 2.42.0 From 1f0933c7057e20c074f6b9355f6d91a9277a8c09 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Thu, 12 Oct 2023 14:40:23 +0300 Subject: [PATCH v3 3/3] Add local_blk_{read|write}_time I/O timing statistics There was no I/O timing statistics for counting local blocks' timings. So, add local_blk_{read|write}_time variables for counting these. These counters are exposed in EXPLAIN and pg_stat_statements. pg_stat_statements is bumped to 1.12. --- contrib/pg_stat_statements/Makefile | 1 + .../expected/oldextversions.out | 60 contrib/pg_stat_statements/meson.build| 1 + .../pg_stat_statements--1.11--1.12.sql| 71 +++ .../pg_stat_statements/pg_stat_statements.c | 33 - .../pg_stat_statements.control| 2 +- .../pg_stat_statements/sql/oldextversions.sql | 5 ++ doc/src/sgml/pgstatstatements.sgml| 20 ++ src/backend/commands/explain.c| 23 +- src/backend/executor/instrument.c | 6 ++ src/backend/utils/activity/pgstat_io.c| 4 ++ src/include/executor/instrument.h | 2 + src/test/regress/expected/explain.out | 4 ++ 13 files changed, 228 insertions(+), 4 deletions(-) create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.11--1.12.sql diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile index eba4a95d91..b7d12bc7fb 100644 --- a/contrib/pg_stat_statements/Makefile +++ b/contrib/pg_stat_statements/Makefile @@ -7,6 +7,7 @@ OBJS = \ EXTENSION = pg_stat_statements DATA = pg_stat_statements--1.4.sql \ + pg_stat_statements--1.11--1.12.sql \ pg_stat_statements--1.10--1.11.sql \ pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \ pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \ diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out index 9e18fe2e47..b84e0c8484 100644 --- a/contrib/pg_stat_statements/expected/oldextversions.out +++ b/contrib/pg_stat_statements/expected/oldextversions.out @@ -308,4 +308,64 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements; t (1 row) +-- New views for pg_stat_statements in 1.12 +AlTER EXTENSION pg_stat_statements UPDATE TO '1.12'; +\d pg_stat_statements + View "public.pg_stat_statements" + Column | Type | Collation | Nullable | Default ++--+---+--+- + userid | oid | | | + dbid | oid | | | + toplevel | boolean | |
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Thu, Oct 05, 2023 at 08:51:40AM -0400, Robert Haas wrote: > On Thu, Oct 5, 2023 at 6:25 AM Nazir Bilal Yavuz wrote: >> What do you think about the second patch, counting extend calls' >> timings in blk_write_time? In my opinion if something increments >> {shared|local}_blks_written, then it needs to be counted in >> blk_write_time too. I am not sure why it is decided like that. > > I agree that an extend should be counted the same way as a write. But > I'm suspicious that here too we have confusion about whether > blk_write_time is supposed to be covering shared buffers and local > buffers or just shared buffers. Agreed. In ~14, as far as I can see blk_write_time is only incremented for shared buffers. FWIW, I agree that we should improve these stats for local buffers but I am not on board with a solution where we'd use the same counter for local and shared buffers while we've historically only counted the former, because that could confuse existing monitoring queries. It seems to me that the right solution is to do the same separation as temp blocks with two separate counters, without a backpatch. I'd like to go as far as renaming blk_read_time and blk_write_time to respectively shared_blk_read_time and shared_blk_write_time to know exactly what the type of block dealt with is when querying this data, particularly for pg_stat_statements's sake. -- Michael signature.asc Description: PGP signature
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Thu, Oct 5, 2023 at 6:25 AM Nazir Bilal Yavuz wrote: > What do you think about the second patch, counting extend calls' > timings in blk_write_time? In my opinion if something increments > {shared|local}_blks_written, then it needs to be counted in > blk_write_time too. I am not sure why it is decided like that. I agree that an extend should be counted the same way as a write. But I'm suspicious that here too we have confusion about whether blk_write_time is supposed to be covering shared buffers and local buffers or just shared buffers. -- Robert Haas EDB: http://www.enterprisedb.com
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
Hi, On Tue, 3 Oct 2023 at 19:44, Robert Haas wrote: > > Given that, I'm inclined to agree that this is a bug. But we might > need to go through and make sure all of the code that deals with these > counters is on the same page about what the values represent. Maybe > there is code lurking somewhere that thinks these counters are only > supposed to include "shared" rather than, as the fragment above > suggests, "shared/local". Thank you for the guidance. What do you think about the second patch, counting extend calls' timings in blk_write_time? In my opinion if something increments {shared|local}_blks_written, then it needs to be counted in blk_write_time too. I am not sure why it is decided like that. Regards, Nazir Bilal Yavuz Microsoft
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Fri, Sep 15, 2023 at 12:34 PM Melanie Plageman wrote: > On Fri, Sep 15, 2023 at 9:24 AM Nazir Bilal Yavuz wrote: > > I found that pgBufferUsage.blk_{read|write}_time are zero although there > > are pgBufferUsage.local_blks_{read|written} > > Yes, good catch. This is a bug. I will note that at least in 15 and > likely before, pgBufferUsage.local_blks_written is incremented for > local buffers but pgBufferUsage.blk_write_time is only added to for > shared buffers (in FlushBuffer()). I think it makes sense to propose a > bug fix to stable branches counting blk_write_time for local buffers > as well. My first thought was to wonder whether this was even a bug. I remembered that EXPLAIN treats shared, local, and temp buffers as three separate categories of things. But it seems that someone decided to conflate two of them for I/O timing purposes: if (has_timing) { appendStringInfoString(es->str, " shared/local"); Notice this bit in particular. if (!INSTR_TIME_IS_ZERO(usage->blk_read_time)) appendStringInfo(es->str, " read=%0.3f", INSTR_TIME_GET_MILLISEC(usage->blk_read_time)); if (!INSTR_TIME_IS_ZERO(usage->blk_write_time)) appendStringInfo(es->str, " write=%0.3f", INSTR_TIME_GET_MILLISEC(usage->blk_write_time)); if (has_temp_timing) appendStringInfoChar(es->str, ','); } if (has_temp_timing) { appendStringInfoString(es->str, " temp"); if (!INSTR_TIME_IS_ZERO(usage->temp_blk_read_time)) appendStringInfo(es->str, " read=%0.3f", INSTR_TIME_GET_MILLISEC(usage->temp_blk_read_time)); if (!INSTR_TIME_IS_ZERO(usage->temp_blk_write_time)) appendStringInfo(es->str, " write=%0.3f", INSTR_TIME_GET_MILLISEC(usage->temp_blk_write_time)); } Given that, I'm inclined to agree that this is a bug. But we might need to go through and make sure all of the code that deals with these counters is on the same page about what the values represent. Maybe there is code lurking somewhere that thinks these counters are only supposed to include "shared" rather than, as the fragment above suggests, "shared/local". -- Robert Haas EDB: http://www.enterprisedb.com
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
Hi, On Fri, 15 Sept 2023 at 16:30, Melanie Plageman wrote: > > Yes, good catch. This is a bug. I will note that at least in 15 and > likely before, pgBufferUsage.local_blks_written is incremented for > local buffers but pgBufferUsage.blk_write_time is only added to for > shared buffers (in FlushBuffer()). I think it makes sense to propose a > bug fix to stable branches counting blk_write_time for local buffers > as well. I attached the PG16+ (after pg_stat_io) and PG15- (before pg_stat_io) versions of the same patch. Regards, Nazir Bilal Yavuz Microsoft From b4ea504497708b1017e8cfda5f1ac366a9e34386 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 15 Sep 2023 11:55:43 +0300 Subject: [PATCH v2 2/2] [PG16+] Add pgBufferUsage.blk_write_time to for IOOp IOOP_EXTENDs --- src/backend/utils/activity/pgstat_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c index c8058b57962..56051fc6072 100644 --- a/src/backend/utils/activity/pgstat_io.c +++ b/src/backend/utils/activity/pgstat_io.c @@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op, INSTR_TIME_SET_CURRENT(io_time); INSTR_TIME_SUBTRACT(io_time, start_time); - if (io_op == IOOP_WRITE) + if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND) { pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time)); if (io_object == IOOBJECT_RELATION -- 2.42.0 From 46aaa1f76f02f7697f520e6509a3aeb970a862ae Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Fri, 15 Sep 2023 12:35:01 +0300 Subject: [PATCH v2 1/2] [PG16+] Increase pgBufferUsage.blk_{read|write}_time when IOObject is IOOBJECT_TEMP_RELATION --- src/backend/utils/activity/pgstat_io.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c index eb7d35d4225..c8058b57962 100644 --- a/src/backend/utils/activity/pgstat_io.c +++ b/src/backend/utils/activity/pgstat_io.c @@ -122,13 +122,15 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op, if (io_op == IOOP_WRITE) { pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time)); - if (io_object == IOOBJECT_RELATION) + if (io_object == IOOBJECT_RELATION + || io_object == IOOBJECT_TEMP_RELATION) INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time); } else if (io_op == IOOP_READ) { pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time)); - if (io_object == IOOBJECT_RELATION) + if (io_object == IOOBJECT_RELATION + || io_object == IOOBJECT_TEMP_RELATION) INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time); } -- 2.42.0 From 1e3af6e9466924713488b374b5d7bfb2c3bd5983 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Mon, 2 Oct 2023 16:31:46 +0300 Subject: [PATCH v2 2/2] [PG15-] Add pgBufferUsage.blk_write_time to for extends --- src/backend/storage/buffer/bufmgr.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 9fcb3d6e194..a4e9fd3317b 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -825,6 +825,8 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum, bool found; bool isExtend; bool isLocalBuf = SmgrIsTemp(smgr); + instr_time io_start, +io_time; *hit = false; @@ -992,9 +994,21 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum, { /* new buffers are zero-filled */ MemSet((char *) bufBlock, 0, BLCKSZ); + + if (track_io_timing) + INSTR_TIME_SET_CURRENT(io_start); + /* don't set checksum for all-zero page */ smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false); + if (track_io_timing) + { + INSTR_TIME_SET_CURRENT(io_time); + INSTR_TIME_SUBTRACT(io_time, io_start); + pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time)); + INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time); + } + /* * NB: we're *not* doing a ScheduleBufferTagForWriteback here; * although we're essentially performing a write. At least on linux @@ -1012,9 +1026,6 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum, MemSet((char *) bufBlock, 0, BLCKSZ); else { - instr_time io_start, - io_time; - if (track_io_timing) INSTR_TIME_SET_CURRENT(io_start); -- 2.42.0 From c7fdf1fa2a1f878a96a71b04e9fc65b1f2a0a402 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Mon, 2 Oct 2023 16:29:16 +0300 Subject: [PATCH v2 1/2] [PG15-] Add pgBufferUsage.blk_write_time to for local buffers --- src/backend/storage/buffer/localbuf.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c index e71f95ac1f
Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}
On Fri, Sep 15, 2023 at 9:24 AM Nazir Bilal Yavuz wrote: > I found that pgBufferUsage.blk_{read|write}_time are zero although there are > pgBufferUsage.local_blks_{read|written} Yes, good catch. This is a bug. I will note that at least in 15 and likely before, pgBufferUsage.local_blks_written is incremented for local buffers but pgBufferUsage.blk_write_time is only added to for shared buffers (in FlushBuffer()). I think it makes sense to propose a bug fix to stable branches counting blk_write_time for local buffers as well. - Melanie