Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-12-15 Thread hubert depesz lubaczewski
> > Well, if it's possible to deduce what is the meaning in given line,
> > I can add the logic to do the deduction to parser.
> > Also, I want to say that I appreciate being looped in the discussion.
> I lost sight of this thread, so my apologies for the delay.  The patch
> to fix the description of the EXPLAIN field has now been applied to
> v15 and v16.

Thanks. Will do my best to update the parser soon.

Best regards,

depesz





Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-12-14 Thread Michael Paquier
On Tue, Oct 31, 2023 at 03:11:03PM +0100, hubert depesz lubaczewski wrote:
> On Tue, Oct 31, 2023 at 08:17:52AM +0900, Michael Paquier wrote:
>> Thanks for the input.  I was looking yesterday if this code was
>> available somewhere, but couldn't find it..  Until this morning:
>> https://gitlab.com/depesz/explain.depesz.com.git
> 
> Well, the parser itself is https://gitlab.com/depesz/Pg--Explain/ :)

That was close enough ;)

> Well, if it's possible to deduce what is the meaning in given line,
> I can add the logic to do the deduction to parser.
> 
> Also, I want to say that I appreciate being looped in the discussion.

I lost sight of this thread, so my apologies for the delay.  The patch
to fix the description of the EXPLAIN field has now been applied to
v15 and v16.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-31 Thread hubert depesz lubaczewski
On Tue, Oct 31, 2023 at 08:17:52AM +0900, Michael Paquier wrote:
> Thanks for the input.  I was looking yesterday if this code was
> available somewhere, but couldn't find it..  Until this morning:
> https://gitlab.com/depesz/explain.depesz.com.git

Well, the parser itself is https://gitlab.com/depesz/Pg--Explain/ :)

> And..  It looks like things would become better if we change
> "shared/local" to "shared", because the parsing code seems to have an
> issue once you add a '/'.  All the fields in I/O Timings are
> considered as part of a Node, and they're just included in the output.
> Now, pasting a plan that includes "shared/local" drops entirely the
> string from the output result, so some information is lost.  In short,
> imagine that we have the following string in a node:
> I/O Timings: shared/local write=23.77
> 
> This would show up like that, meaning that the context where the
> write/read timings happened is lost:
> I/O Timings: write=23.77
> 
> If we switch back to "shared", the context would be kept around.  Of
> course, this does not count for all the parsers that may be out
> there, but at least that's something.

Well, if it's possible to deduce what is the meaning in given line,
I can add the logic to do the deduction to parser.

Also, I want to say that I appreciate being looped in the discussion.

Best regards,

depesz





Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-30 Thread Michael Paquier
On Mon, Oct 30, 2023 at 03:14:16PM +0100, hubert depesz lubaczewski wrote:
> some things will definitely break, but that's 100% OK. The change seems
> needed, and I can update my parser to deal with it :)

Thanks for the input.  I was looking yesterday if this code was
available somewhere, but couldn't find it..  Until this morning:
https://gitlab.com/depesz/explain.depesz.com.git

And..  It looks like things would become better if we change
"shared/local" to "shared", because the parsing code seems to have an
issue once you add a '/'.  All the fields in I/O Timings are
considered as part of a Node, and they're just included in the output.
Now, pasting a plan that includes "shared/local" drops entirely the
string from the output result, so some information is lost.  In short,
imagine that we have the following string in a node:
I/O Timings: shared/local write=23.77

This would show up like that, meaning that the context where the
write/read timings happened is lost:
I/O Timings: write=23.77

If we switch back to "shared", the context would be kept around.  Of
course, this does not count for all the parsers that may be out
there, but at least that's something.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-30 Thread hubert depesz lubaczewski
On Mon, Oct 30, 2023 at 10:45:05AM +0900, Michael Paquier wrote:
> On Fri, Oct 27, 2023 at 04:58:20PM +0300, Nazir Bilal Yavuz wrote:
> > I think switching it to 'shared' makes sense. That shouldn't confuse
> > existing monitoring queries much as the numbers won't change, right?
> > Also, if we keep 'shared/local' there could be similar complaints to
> > this thread in the future; so, at least adding comments can be
> > helpful.
> 
> The problem is that it may impact existing tools that do explain
> output deparsing.  One of them is https://explain.depesz.com/ that
> Hubert Depesz Lubaczewski has implemented, and it would be sad to
> break anything related to it.
> 
> I am adding Hubert in CC for comments about changing this
> "shared/local" to "shared" on a released branch.  Knowing that
> "shared" and "local" will need to be handled as separate terms in 17~
> anyway, perhaps that's not a big deal, but let's be sure.

Hi,
some things will definitely break, but that's 100% OK. The change seems
needed, and I can update my parser to deal with it :)

Best regards,

depesz





Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-29 Thread Michael Paquier
On Fri, Oct 27, 2023 at 04:58:20PM +0300, Nazir Bilal Yavuz wrote:
> I think switching it to 'shared' makes sense. That shouldn't confuse
> existing monitoring queries much as the numbers won't change, right?
> Also, if we keep 'shared/local' there could be similar complaints to
> this thread in the future; so, at least adding comments can be
> helpful.

The problem is that it may impact existing tools that do explain
output deparsing.  One of them is https://explain.depesz.com/ that
Hubert Depesz Lubaczewski has implemented, and it would be sad to
break anything related to it.

I am adding Hubert in CC for comments about changing this
"shared/local" to "shared" on a released branch.  Knowing that
"shared" and "local" will need to be handled as separate terms in 17~
anyway, perhaps that's not a big deal, but let's be sure.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-27 Thread Nazir Bilal Yavuz
Hi,

On Thu, 19 Oct 2023 at 08:26, Michael Paquier  wrote:
>
> On Wed, Oct 18, 2023 at 02:56:42PM +0900, Michael Paquier wrote:
> > Thanks for the new versions.  I have applied 0001 and backpatched it
> > for now.  0002 and 0003 look in much cleaner shape than previously.
>
> 0002 and 0003 have now been applied.  I have split 0003 into two parts
> at the end, mainly on clarity grounds: one for the counters with
> EXPLAIN and a second for pg_stat_statements.
>
> There were a few things in the patch set.  Per my notes:
> - Some incorrect indentation.
> - The additions of show_buffer_usage() did not handle correctly the
> addition of a comma before/after the local timing block.  The code
> area for has_local_timing needs to check for has_temp_timing, while
> the area of has_shared_timing needs to check for (has_local_timing ||
> has_temp_timing).
> - explain.sgml was missing an update for the information related to
> the read/write timings of the local blocks.

Thanks for the changes, push and feedback!

>
> Remains what we should do about the "shared/local" string in
> show_buffer_usage() for v16 and v15, as "local" is unrelated to that.
> Perhaps we should just switch to "shared" in this case or just remove
> the string entirely?  Still that implies changing the output of
> EXPLAIN on a stable branch in this case, so there could be an argument
> for leaving this stuff alone.

I think switching it to 'shared' makes sense. That shouldn't confuse
existing monitoring queries much as the numbers won't change, right?
Also, if we keep 'shared/local' there could be similar complaints to
this thread in the future; so, at least adding comments can be
helpful.

Regards,
Nazir Bilal Yavuz
Microsoft




Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-18 Thread Michael Paquier
On Wed, Oct 18, 2023 at 02:56:42PM +0900, Michael Paquier wrote:
> Thanks for the new versions.  I have applied 0001 and backpatched it
> for now.  0002 and 0003 look in much cleaner shape than previously.

0002 and 0003 have now been applied.  I have split 0003 into two parts
at the end, mainly on clarity grounds: one for the counters with
EXPLAIN and a second for pg_stat_statements.

There were a few things in the patch set.  Per my notes:
- Some incorrect indentation.
- The additions of show_buffer_usage() did not handle correctly the
addition of a comma before/after the local timing block.  The code
area for has_local_timing needs to check for has_temp_timing, while
the area of has_shared_timing needs to check for (has_local_timing ||
has_temp_timing).
- explain.sgml was missing an update for the information related to
the read/write timings of the local blocks.

Remains what we should do about the "shared/local" string in
show_buffer_usage() for v16 and v15, as "local" is unrelated to that.
Perhaps we should just switch to "shared" in this case or just remove
the string entirely?  Still that implies changing the output of
EXPLAIN on a stable branch in this case, so there could be an argument
for leaving this stuff alone.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-18 Thread Michael Paquier
On Tue, Oct 03, 2023 at 12:44:36PM -0400, Robert Haas wrote:
> My first thought was to wonder whether this was even a bug. I
> remembered that EXPLAIN treats shared, local, and temp buffers as
> three separate categories of things. But it seems that someone decided
> to conflate two of them for I/O timing purposes:
> 
> if (has_timing)
> {
> appendStringInfoString(es->str, "
> shared/local");
> 
>  Notice this bit in particular.

I was reviewing the whole, and this is an oversight specific to
efb0ef909f60, because we've never incremented the write/read counters
for local buffers, even with this commit applied, for both the EXPLAIN
reports and anything stored in pg_stat_statement.  It seems to me that
the origin of the confusion comes down to pg_stat_database where
blk_{read|write}_time increments on both local and shared blocks, but
on EXPLAIN this stuff only reflects data about shared buffers.  So the
"shared" part of the string is right, but the "local" part is not in
v15 and v16.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-17 Thread Michael Paquier
On Tue, Oct 17, 2023 at 04:44:25PM +0300, Nazir Bilal Yavuz wrote:
> I updated patches based on your comments. v4 is attached.

Thanks for the new versions.  I have applied 0001 and backpatched it
for now.  0002 and 0003 look in much cleaner shape than previously.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-17 Thread Nazir Bilal Yavuz
Hi,

Thanks for the review!

On Tue, 17 Oct 2023 at 11:40, Michael Paquier  wrote:
>
> On Mon, Oct 16, 2023 at 01:07:07PM +0300, Nazir Bilal Yavuz wrote:
> > Yes, that could be a better solution. Also, having more detailed stats
> > for shared and local buffers is helpful. I updated patches in line
> > with that:
> >
> > 0001: Counts extends same way as a write.
>
> It can change existing query results on an already-released branch,
> but we already count the number of blocks when doing a relation
> extension, so counting the write time is something I'd rather fix in
> v16.  If you have any objections, let me know.

I agree.

>
> > 0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time.
>
> Note that `git diff --check` complains here.
>
> --- a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
> +++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
> @@ -30,8 +30,8 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean,
>  OUT local_blks_written int8,
>  OUT temp_blks_read int8,
>  OUT temp_blks_written int8,
> -OUT blk_read_time float8,
> -OUT blk_write_time float8
> +OUT shared_blk_read_time float8,
> +OUT shared_blk_write_time float8
>
> Doing that in an extension upgrade script is incorrect.  These should
> not be touched.
>
> - Total time the statement spent reading data file blocks, in milliseconds
> + Total time the statement spent reading shared data file blocks, in 
> milliseconds
>
> Or just shared blocks?  That's what we use elsewhere for
> pg_stat_statements.  "shared data file blocks" sounds a bit confusing
> for relation file blocks read/written from/to shared buffers.
>
> > 0003: Add new local_blk_{read|write}_time variables.
>
>  DATA = pg_stat_statements--1.4.sql \
> +   pg_stat_statements--1.11--1.12.sql \
> pg_stat_statements--1.10--1.11.sql \
>
> There is no need to bump again pg_stat_statements, as it has already
> been bumped to 1.11 on HEAD per the recent commit 5a3423ad8ee1 from
> Daniel.  So the new changes can just be added to 1.11.

I updated patches based on your comments. v4 is attached.

Regards,
Nazir Bilal Yavuz
Microsoft
From 4b744c0b4e2cfdee4e95779ee19ac214b85aa150 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 13 Oct 2023 17:35:00 +0300
Subject: [PATCH v4 1/3] Include IOOp IOOP_EXTENDs while calculating block
 write times

Extends are counted as writes in block context, so include IOOP_EXTENDs
while calculating block write times.
---
 src/backend/utils/activity/pgstat_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb7d35d422..8ec8670199 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		INSTR_TIME_SET_CURRENT(io_time);
 		INSTR_TIME_SUBTRACT(io_time, start_time);
 
-		if (io_op == IOOP_WRITE)
+		if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
 			if (io_object == IOOBJECT_RELATION)
-- 
2.42.0

From 5f299ee59d443a53bf6f3a8e18467270f5ee8318 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 13 Oct 2023 16:28:03 +0300
Subject: [PATCH v4 2/3] Rename I/O timing statistics columns to
 shared_blk_{read|write}_time

Since only shared blocks' timings are counted in blk_{read|write}_time,
rename them as shared_blk_{read|write}_time
---
 .../expected/oldextversions.out   |   4 +-
 .../pg_stat_statements--1.10--1.11.sql|   4 +-
 .../pg_stat_statements/pg_stat_statements.c   |  14 ++-
 doc/src/sgml/pgstatstatements.sgml|   8 +-
 src/backend/commands/explain.c|  29 ++---
 src/backend/executor/instrument.c |  12 +-
 src/backend/utils/activity/pgstat_io.c|   4 +-
 src/include/executor/instrument.h |   4 +-
 src/test/regress/expected/explain.out | 108 +-
 9 files changed, 95 insertions(+), 92 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out
index 64982aad60..c2e0140a47 100644
--- a/contrib/pg_stat_statements/expected/oldextversions.out
+++ b/contrib/pg_stat_statements/expected/oldextversions.out
@@ -284,8 +284,8 @@ AlTER EXTENSION pg_stat_statements UPDATE TO '1.11';
  local_blks_written | bigint   |   |  | 
  temp_blks_read | bigint   |   |  | 
  temp_blks_written  | bigint   |   |  | 
- blk_read_time  | double precision |   |  | 
- blk_write_time | double precision |   |  | 
+ shared_blk_read_time   | double precision |   |  | 
+ shared_blk_write_time  | double precision |   |  | 
  

Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-17 Thread Michael Paquier
On Mon, Oct 16, 2023 at 01:07:07PM +0300, Nazir Bilal Yavuz wrote:
> Yes, that could be a better solution. Also, having more detailed stats
> for shared and local buffers is helpful. I updated patches in line
> with that:
> 
> 0001: Counts extends same way as a write.

It can change existing query results on an already-released branch,
but we already count the number of blocks when doing a relation
extension, so counting the write time is something I'd rather fix in
v16.  If you have any objections, let me know.

> 0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time.

Note that `git diff --check` complains here.

--- a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql
@@ -30,8 +30,8 @@ CREATE FUNCTION pg_stat_statements(IN showtext boolean,
 OUT local_blks_written int8,
 OUT temp_blks_read int8,
 OUT temp_blks_written int8,
-OUT blk_read_time float8,
-OUT blk_write_time float8
+OUT shared_blk_read_time float8,
+OUT shared_blk_write_time float8

Doing that in an extension upgrade script is incorrect.  These should
not be touched. 

- Total time the statement spent reading data file blocks, in milliseconds
+ Total time the statement spent reading shared data file blocks, in 
milliseconds

Or just shared blocks?  That's what we use elsewhere for
pg_stat_statements.  "shared data file blocks" sounds a bit confusing
for relation file blocks read/written from/to shared buffers.

> 0003: Add new local_blk_{read|write}_time variables.

 DATA = pg_stat_statements--1.4.sql \
+   pg_stat_statements--1.11--1.12.sql \
pg_stat_statements--1.10--1.11.sql \

There is no need to bump again pg_stat_statements, as it has already
been bumped to 1.11 on HEAD per the recent commit 5a3423ad8ee1 from
Daniel.  So the new changes can just be added to 1.11.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-16 Thread Nazir Bilal Yavuz
Hi,

On Tue, 10 Oct 2023 at 03:54, Michael Paquier  wrote:
>
> In ~14, as far as I can see blk_write_time is only incremented for
> shared buffers.  FWIW, I agree that we should improve these stats for
> local buffers but I am not on board with a solution where we'd use the
> same counter for local and shared buffers while we've historically
> only counted the former, because that could confuse existing
> monitoring queries.  It seems to me that the right solution is to do
> the same separation as temp blocks with two separate counters, without
> a backpatch.  I'd like to go as far as renaming blk_read_time and
> blk_write_time to respectively shared_blk_read_time and
> shared_blk_write_time to know exactly what the type of block dealt
> with is when querying this data, particularly for pg_stat_statements's
> sake.

Yes, that could be a better solution. Also, having more detailed stats
for shared and local buffers is helpful. I updated patches in line
with that:

0001: Counts extends same way as a write.
0002: Rename blk_{read|write}_time as shared_blk_{read|write}_time.
0003: Add new local_blk_{read|write}_time variables.

Regards,
Nazir Bilal Yavuz
Microsoft
From acf5a2781760f284aef770da0f64acf8685b6445 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 13 Oct 2023 17:35:00 +0300
Subject: [PATCH v3 1/3] Include IOOp IOOP_EXTENDs while calculating block
 write times

Extends are counted as writes in block context, so include IOOP_EXTENDs
while calculating block write times.
---
 src/backend/utils/activity/pgstat_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb7d35d422..8ec8670199 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		INSTR_TIME_SET_CURRENT(io_time);
 		INSTR_TIME_SUBTRACT(io_time, start_time);
 
-		if (io_op == IOOP_WRITE)
+		if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
 			if (io_object == IOOBJECT_RELATION)
-- 
2.42.0

From 1f0933c7057e20c074f6b9355f6d91a9277a8c09 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Thu, 12 Oct 2023 14:40:23 +0300
Subject: [PATCH v3 3/3] Add local_blk_{read|write}_time I/O timing statistics

There was no I/O timing statistics for counting local blocks' timings.
So, add local_blk_{read|write}_time variables for counting these. These
counters are exposed in EXPLAIN and pg_stat_statements.
pg_stat_statements is bumped to 1.12.
---
 contrib/pg_stat_statements/Makefile   |  1 +
 .../expected/oldextversions.out   | 60 
 contrib/pg_stat_statements/meson.build|  1 +
 .../pg_stat_statements--1.11--1.12.sql| 71 +++
 .../pg_stat_statements/pg_stat_statements.c   | 33 -
 .../pg_stat_statements.control|  2 +-
 .../pg_stat_statements/sql/oldextversions.sql |  5 ++
 doc/src/sgml/pgstatstatements.sgml| 20 ++
 src/backend/commands/explain.c| 23 +-
 src/backend/executor/instrument.c |  6 ++
 src/backend/utils/activity/pgstat_io.c|  4 ++
 src/include/executor/instrument.h |  2 +
 src/test/regress/expected/explain.out |  4 ++
 13 files changed, 228 insertions(+), 4 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.11--1.12.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index eba4a95d91..b7d12bc7fb 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -7,6 +7,7 @@ OBJS = \
 
 EXTENSION = pg_stat_statements
 DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.11--1.12.sql \
 	pg_stat_statements--1.10--1.11.sql \
 	pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \
 	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out
index 9e18fe2e47..b84e0c8484 100644
--- a/contrib/pg_stat_statements/expected/oldextversions.out
+++ b/contrib/pg_stat_statements/expected/oldextversions.out
@@ -308,4 +308,64 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements;
  t
 (1 row)
 
+-- New views for pg_stat_statements in 1.12
+AlTER EXTENSION pg_stat_statements UPDATE TO '1.12';
+\d pg_stat_statements
+  View "public.pg_stat_statements"
+ Column |   Type   | Collation | Nullable | Default 
++--+---+--+-
+ userid | oid  |   |  | 
+ dbid   | oid  |   |  | 
+ toplevel   | boolean  |   | 

Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-09 Thread Michael Paquier
On Thu, Oct 05, 2023 at 08:51:40AM -0400, Robert Haas wrote:
> On Thu, Oct 5, 2023 at 6:25 AM Nazir Bilal Yavuz  wrote:
>> What do you think about the second patch, counting extend calls'
>> timings in blk_write_time? In my opinion if something increments
>> {shared|local}_blks_written, then it needs to be counted in
>> blk_write_time too. I am not sure why it is decided like that.
> 
> I agree that an extend should be counted the same way as a write. But
> I'm suspicious that here too we have confusion about whether
> blk_write_time is supposed to be covering shared buffers and local
> buffers or just shared buffers.

Agreed.

In ~14, as far as I can see blk_write_time is only incremented for
shared buffers.  FWIW, I agree that we should improve these stats for
local buffers but I am not on board with a solution where we'd use the
same counter for local and shared buffers while we've historically
only counted the former, because that could confuse existing
monitoring queries.  It seems to me that the right solution is to do
the same separation as temp blocks with two separate counters, without
a backpatch.  I'd like to go as far as renaming blk_read_time and
blk_write_time to respectively shared_blk_read_time and
shared_blk_write_time to know exactly what the type of block dealt
with is when querying this data, particularly for pg_stat_statements's
sake.
--
Michael


signature.asc
Description: PGP signature


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-05 Thread Robert Haas
On Thu, Oct 5, 2023 at 6:25 AM Nazir Bilal Yavuz  wrote:
> What do you think about the second patch, counting extend calls'
> timings in blk_write_time? In my opinion if something increments
> {shared|local}_blks_written, then it needs to be counted in
> blk_write_time too. I am not sure why it is decided like that.

I agree that an extend should be counted the same way as a write. But
I'm suspicious that here too we have confusion about whether
blk_write_time is supposed to be covering shared buffers and local
buffers or just shared buffers.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-05 Thread Nazir Bilal Yavuz
Hi,

On Tue, 3 Oct 2023 at 19:44, Robert Haas  wrote:
>
> Given that, I'm inclined to agree that this is a bug. But we might
> need to go through and make sure all of the code that deals with these
> counters is on the same page about what the values represent. Maybe
> there is code lurking somewhere that thinks these counters are only
> supposed to include "shared" rather than, as the fragment above
> suggests, "shared/local".

Thank you for the guidance.

What do you think about the second patch, counting extend calls'
timings in blk_write_time? In my opinion if something increments
{shared|local}_blks_written, then it needs to be counted in
blk_write_time too. I am not sure why it is decided like that.

Regards,
Nazir Bilal Yavuz
Microsoft




Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-03 Thread Robert Haas
On Fri, Sep 15, 2023 at 12:34 PM Melanie Plageman
 wrote:
> On Fri, Sep 15, 2023 at 9:24 AM Nazir Bilal Yavuz  wrote:
> > I found that pgBufferUsage.blk_{read|write}_time are zero although there 
> > are pgBufferUsage.local_blks_{read|written}
>
> Yes, good catch. This is a bug. I will note that at least in 15 and
> likely before, pgBufferUsage.local_blks_written is incremented for
> local buffers but pgBufferUsage.blk_write_time is only added to for
> shared buffers (in FlushBuffer()). I think it makes sense to propose a
> bug fix to stable branches counting blk_write_time for local buffers
> as well.

My first thought was to wonder whether this was even a bug. I
remembered that EXPLAIN treats shared, local, and temp buffers as
three separate categories of things. But it seems that someone decided
to conflate two of them for I/O timing purposes:

if (has_timing)
{
appendStringInfoString(es->str, "
shared/local");

 Notice this bit in particular.

if (!INSTR_TIME_IS_ZERO(usage->blk_read_time))
appendStringInfo(es->str, " read=%0.3f",

  INSTR_TIME_GET_MILLISEC(usage->blk_read_time));
if (!INSTR_TIME_IS_ZERO(usage->blk_write_time))
appendStringInfo(es->str, "
write=%0.3f",

  INSTR_TIME_GET_MILLISEC(usage->blk_write_time));
if (has_temp_timing)
appendStringInfoChar(es->str, ',');
}
if (has_temp_timing)
{
appendStringInfoString(es->str, " temp");
if
(!INSTR_TIME_IS_ZERO(usage->temp_blk_read_time))
appendStringInfo(es->str, " read=%0.3f",

  INSTR_TIME_GET_MILLISEC(usage->temp_blk_read_time));
if
(!INSTR_TIME_IS_ZERO(usage->temp_blk_write_time))
appendStringInfo(es->str, "
write=%0.3f",

  INSTR_TIME_GET_MILLISEC(usage->temp_blk_write_time));
}

Given that, I'm inclined to agree that this is a bug. But we might
need to go through and make sure all of the code that deals with these
counters is on the same page about what the values represent. Maybe
there is code lurking somewhere that thinks these counters are only
supposed to include "shared" rather than, as the fragment above
suggests, "shared/local".

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-03 Thread Nazir Bilal Yavuz
Hi,

On Fri, 15 Sept 2023 at 16:30, Melanie Plageman
 wrote:
>
> Yes, good catch. This is a bug. I will note that at least in 15 and
> likely before, pgBufferUsage.local_blks_written is incremented for
> local buffers but pgBufferUsage.blk_write_time is only added to for
> shared buffers (in FlushBuffer()). I think it makes sense to propose a
> bug fix to stable branches counting blk_write_time for local buffers
> as well.

I attached the PG16+ (after pg_stat_io) and PG15- (before pg_stat_io)
versions of the same patch.

Regards,
Nazir Bilal Yavuz
Microsoft
From b4ea504497708b1017e8cfda5f1ac366a9e34386 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 15 Sep 2023 11:55:43 +0300
Subject: [PATCH v2 2/2] [PG16+] Add pgBufferUsage.blk_write_time to for IOOp
 IOOP_EXTENDs

---
 src/backend/utils/activity/pgstat_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index c8058b57962..56051fc6072 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		INSTR_TIME_SET_CURRENT(io_time);
 		INSTR_TIME_SUBTRACT(io_time, start_time);
 
-		if (io_op == IOOP_WRITE)
+		if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
 			if (io_object == IOOBJECT_RELATION
-- 
2.42.0

From 46aaa1f76f02f7697f520e6509a3aeb970a862ae Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 15 Sep 2023 12:35:01 +0300
Subject: [PATCH v2 1/2] [PG16+] Increase pgBufferUsage.blk_{read|write}_time
 when IOObject is IOOBJECT_TEMP_RELATION

---
 src/backend/utils/activity/pgstat_io.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb7d35d4225..c8058b57962 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -122,13 +122,15 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		if (io_op == IOOP_WRITE)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
-			if (io_object == IOOBJECT_RELATION)
+			if (io_object == IOOBJECT_RELATION
+			|| io_object == IOOBJECT_TEMP_RELATION)
 INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time);
 		}
 		else if (io_op == IOOP_READ)
 		{
 			pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
-			if (io_object == IOOBJECT_RELATION)
+			if (io_object == IOOBJECT_RELATION
+			|| io_object == IOOBJECT_TEMP_RELATION)
 INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
 		}
 
-- 
2.42.0

From 1e3af6e9466924713488b374b5d7bfb2c3bd5983 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Mon, 2 Oct 2023 16:31:46 +0300
Subject: [PATCH v2 2/2] [PG15-] Add pgBufferUsage.blk_write_time to for
 extends

---
 src/backend/storage/buffer/bufmgr.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 9fcb3d6e194..a4e9fd3317b 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -825,6 +825,8 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	bool		found;
 	bool		isExtend;
 	bool		isLocalBuf = SmgrIsTemp(smgr);
+	instr_time	io_start,
+io_time;
 
 	*hit = false;
 
@@ -992,9 +994,21 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	{
 		/* new buffers are zero-filled */
 		MemSet((char *) bufBlock, 0, BLCKSZ);
+
+		if (track_io_timing)
+			INSTR_TIME_SET_CURRENT(io_start);
+
 		/* don't set checksum for all-zero page */
 		smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);
 
+		if (track_io_timing)
+		{
+			INSTR_TIME_SET_CURRENT(io_time);
+			INSTR_TIME_SUBTRACT(io_time, io_start);
+			pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
+			INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time);
+		}
+
 		/*
 		 * NB: we're *not* doing a ScheduleBufferTagForWriteback here;
 		 * although we're essentially performing a write. At least on linux
@@ -1012,9 +1026,6 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			MemSet((char *) bufBlock, 0, BLCKSZ);
 		else
 		{
-			instr_time	io_start,
-		io_time;
-
 			if (track_io_timing)
 INSTR_TIME_SET_CURRENT(io_start);
 
-- 
2.42.0

From c7fdf1fa2a1f878a96a71b04e9fc65b1f2a0a402 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Mon, 2 Oct 2023 16:29:16 +0300
Subject: [PATCH v2 1/2] [PG15-] Add pgBufferUsage.blk_write_time to for local
 buffers

---
 src/backend/storage/buffer/localbuf.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 

Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-09-15 Thread Melanie Plageman
On Fri, Sep 15, 2023 at 9:24 AM Nazir Bilal Yavuz  wrote:
> I found that pgBufferUsage.blk_{read|write}_time are zero although there are 
> pgBufferUsage.local_blks_{read|written}

Yes, good catch. This is a bug. I will note that at least in 15 and
likely before, pgBufferUsage.local_blks_written is incremented for
local buffers but pgBufferUsage.blk_write_time is only added to for
shared buffers (in FlushBuffer()). I think it makes sense to propose a
bug fix to stable branches counting blk_write_time for local buffers
as well.

- Melanie




pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-09-15 Thread Nazir Bilal Yavuz
Hi,

I found that pgBufferUsage.blk_{read|write}_time are zero although there
are pgBufferUsage.local_blks_{read|written}. For example, when you run
(track_io_timing should be on):

CREATE EXTENSION pg_stat_statements;
CREATE TEMP TABLE example_table (id serial PRIMARY KEY, data text);
INSERT INTO example_table (data) SELECT 'Some data'
FROM generate_series(1, 10);
UPDATE example_table SET data = 'Updated data';
SELECT query, local_blks_read, local_blks_written,
blk_read_time, blk_write_time FROM pg_stat_statements
WHERE query like '%UPDATE%';

on master:

query  | UPDATE example_table SET data = $1
local_blks_read| 467
local_blks_written | 2087
blk_read_time  | 0
blk_write_time | 0

There are two reasons of that:

1- When local_blks_{read|written} are incremented,
pgstat_count_io_op_time() is called with IOOBJECT_TEMP_RELATION. But in
pgstat_count_io_op_time(), pgBufferUsage.blk_{read|write}_time increments
are called only when io_object is IOOBJECT_RELATION. The first patch
attached fixes that.

2- in ExtendBufferedRelLocal() and in ExtendBufferedRelShared(), extend
calls increment local_blks_written and shared_blks_written respectively.
But these extends are not counted while calculating the blk_write_time. If
there is no specific reason to not do that, I think these extends needs to
be counted in blk_write_time. The second patch attached does that.

Results after applying first patch:

query  | UPDATE example_table SET data = $1
local_blks_read| 467
local_blks_written | 2087
blk_read_time  | 0.30085
blk_write_time | 1.475123

Results after applying both patches:

query  | UPDATE example_table SET data = $1
local_blks_read| 467
local_blks_written | 2087
blk_read_time  | 0.329597
blk_write_time | 4.050305

Any kind of feedback would be appreciated.

Regards,
Nazir Bilal Yavuz
Microsoft
From 96f7b82e2ec5e8f68b509ea58131fba42deac7c0 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 15 Sep 2023 11:55:43 +0300
Subject: [PATCH v1 2/2] Include IOOp IOOP_EXTENDs while calculating block
 write times

---
 src/backend/utils/activity/pgstat_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index c8058b57962..56051fc6072 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -119,7 +119,7 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		INSTR_TIME_SET_CURRENT(io_time);
 		INSTR_TIME_SUBTRACT(io_time, start_time);
 
-		if (io_op == IOOP_WRITE)
+		if (io_op == IOOP_WRITE || io_op == IOOP_EXTEND)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
 			if (io_object == IOOBJECT_RELATION
-- 
2.40.1

From de0f429280b24c42c951d86a468118ed09183a6e Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Fri, 15 Sep 2023 12:35:01 +0300
Subject: [PATCH v1 1/2] Increase pgBufferUsage.blk_{read|write}_time when
 IOObject is IOOBJECT_TEMP_RELATION

---
 src/backend/utils/activity/pgstat_io.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb7d35d4225..c8058b57962 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -122,13 +122,15 @@ pgstat_count_io_op_time(IOObject io_object, IOContext io_context, IOOp io_op,
 		if (io_op == IOOP_WRITE)
 		{
 			pgstat_count_buffer_write_time(INSTR_TIME_GET_MICROSEC(io_time));
-			if (io_object == IOOBJECT_RELATION)
+			if (io_object == IOOBJECT_RELATION
+			|| io_object == IOOBJECT_TEMP_RELATION)
 INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time);
 		}
 		else if (io_op == IOOP_READ)
 		{
 			pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
-			if (io_object == IOOBJECT_RELATION)
+			if (io_object == IOOBJECT_RELATION
+			|| io_object == IOOBJECT_TEMP_RELATION)
 INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
 		}
 
-- 
2.40.1