subject:"\[HACKERS\] EXPLAIN BUFFERS"

Re: [HACKERS] Explain buffers display units.

2010-02-16 Thread Greg Stark

On Tue, Feb 16, 2010 at 3:54 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Greg Stark escribió:
>>> Oops. Well, I would like to know if I'm in the minority and have to
>>> roll this back before I fix that.
>
>> My personal opinion is that displaying number of blocks in all EXPLAIN
>> formats is more consistent.
>
> FWIW, I vote for number of blocks too.  I tend to see those numbers as
> more indicative of number of I/O requests than amount of memory used.

Ok, that's 3:1 against.

I suspect we'll revisit this once you see all the other
instrumentation I plan for 9.1. It will be much easier to make sense
of all the numbers in consistent units. But we'll see then.

I won't be able to do the rollback until about 11pm EST again today.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-16 Thread Tom Lane

Alvaro Herrera  writes:
> Greg Stark escribió:
>> Oops. Well, I would like to know if I'm in the minority and have to
>> roll this back before I fix that.

> My personal opinion is that displaying number of blocks in all EXPLAIN
> formats is more consistent.

FWIW, I vote for number of blocks too.  I tend to see those numbers as
more indicative of number of I/O requests than amount of memory used.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-16 Thread Alvaro Herrera

Greg Stark escribió:
> On Tue, Feb 16, 2010 at 2:48 AM, Robert Haas  wrote:

> > Upon further review, I also notice that this patch seems to have
> > falsified the EXPLAIN documentation - both the description of the
> > BUFFERS option and the description of the FORMAT option are no longer
> > accurate
> 
> Oops. Well, I would like to know if I'm in the minority and have to
> roll this back before I fix that.

My personal opinion is that displaying number of blocks in all EXPLAIN
formats is more consistent.  What are you going to do with YAML output
anyway, which is machine readable yet some people prefer over our legacy
text format?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-16 Thread Greg Stark

On Tue, Feb 16, 2010 at 2:48 AM, Robert Haas  wrote:
> Multiplying by the block size makes it sound as if all the
> memory was read or used, which is simply not the case - especially for
> things like buffer hits, which don't actually read or allocate any
> memory at all.

In which case it represents how much data would have had to have been
read if it wasn't in the buffer cache which is a perfectly reasonable
measurement. It's exactly what a cache profiler should be measuring.
These are figures that users have to compare with their buffer cache
size and with the output of iostat or other tools. Presenting them in
arbitrary internal units makes that difficult.

> We certainly do that for GUCs, and in that context it seems to me to
> make sense.  If you set your shared buffers to a gigabyte, PG will use
> an additional GB of memory.  But if you hit a "gigabyte" of shared
> buffers, you may be examining anywhere from one 8K block over and over
> again all the way up to a full GB of memory.  Block hits and reads
> just don't add in the same way that actual memory allocations do.

Accessing the same 8kB of memory 100,1000 times is 1GB of memory
bandwidth. The output of explain doesn't give you enough information
to distinguish that from accessing 1GB of different data which is too
bad but there's a limit to how much information we can fit in a
reasonable amount of space. But 1GB of memory bandwidth is still an
interesting figure even if it's the same 8kB a hundred thousand times.
I think it's a lot more meaningful for a human reader than "131072".

> And at any rate, what we DON'T do for GUCs is produce differing output
> format for the same parameter based on the magnitude of the output
> value, as you've done here.

No, that's *exactly* what we do:

postgres=# set work_mem = 64;
SET
postgres=# show work_mem;
 work_mem
--
 64kB
(1 row)

postgres=# set work_mem = 1024;
SET
postgres=# show work_mem;
 work_mem
--
 1MB
(1 row)

postgres=# set work_mem = 1048576;
SET
postgres=# show work_mem;
 work_mem
--
 1GB
(1 row)

> We accept input in several different
> formats, but there is only one canonical output formal for any
> particular GUC, which is furthermore always chosen in such a way that
> the exact value of the setting is preserved (again, unlike what you've
> done here).

I don't think the use case for GUCs is the same as for empirical
measurements. Empirical results are never going to come out as a round
number of megabytes so only using larger units in that case would be
useless. In the case of GUCs I assume the argument was that someone
should be able to copy the output into another postgresql.conf and get
the same value, something which is irrelevant for empirical
measurements.

In any case the machine-readable form of GUC settings is not this one
canonical format you describe for SHOW:

postgres=# select name,setting,unit,min_val,max_val,boot_val,reset_val
from pg_settings where name = 'work_mem';
   name   | setting | unit | min_val | max_val | boot_val | reset_val
--+-+--+-+-+--+---
 work_mem | 1048576 | kB   | 64  | 2097151 | 1024 | 1024
(1 row)

This is similar to how I think the XML output should work. It should
have the raw internal values with enough meta data in it that a tool
can figure out how to display it or work with it.

> So, you're saying we shouldn't look at the way that the pg_stat
> functions format the output because somebody might write a view over
> it that formats it in some different way that may or may not match
> what you've done for the EXPLAIN output?  What makes you think that
> people don't just look at the raw numbers?  I certainly have, and
> there's no suggestion in the documentation that users should do
> anything else.

I'm not sure users need suggestions that they should format the data
in whatever way they want. We still have to document the programmatic
interface they use to get the raw data.

> pg_stat_statements doesn't do what you're suggesting either; it, too,
> presents raw numbers, and lets the user make of it what they will.
> They might, for example, want to compute a hit ratio, as in the
> example provided in the docs.  In the case of EXPLAIN of an index
> scan, they might want to estimate the number of seeks, on the theory
> that an inner-indexscan is going to be all random IO.

You can compute the hit ratio just fine from measurements with units.
And if you're doing it in an automated way you'll want to use
machine-readable output, rather than parsing the formatted text.

> This doesn't seem to be a very carefully thought out proposal, because
> you haven't explained how it would work for JSON or YAML output.  A
> format-neutral solution which we've already used for sort and hash
> information (and for GUCs) is to include the unit designator in the
> output..  But I generally think that trying to make the EXPLAIN output
> self-documenting to the point where programs don'

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Robert Haas

On Mon, Feb 15, 2010 at 6:44 PM, Greg Stark  wrote:
> I did respond to it. The whole point is that the text output is for a
> human to read. It should be printed in human-readable units. Not some
> arbitrary internal unit of accounting that they then have to do
> arithmetic on to make sense of.

Well, I disagree with your statement the previous output was not
printed in human-readable units: it was printed in blocks, which I
find to be a perfectly good unit.  It's true that the basic unit of
blocks can be converted into kilobytes, but so what?  We aren't really
measuring kilobytes; we're measuring blocks.  We could equally well
convert the sort and hash output from kilobytes into blocks, but it
would be equally wrong: the sort and hash statistics are measuring
memory usage by adding up actual memory allocations.  The buffer
statistics are simply counting the number of blocks that are read or
written.  Multiplying by the block size makes it sound as if all the
memory was read or used, which is simply not the case - especially for
things like buffer hits, which don't actually read or allocate any
memory at all.

> We do *not* display raw block numbers anywhere else. Generally I think
> we should have a policy of outputing human-readable standard units of
> memory whenever displaying a memory quantity. Actually I thought we
> already had that policy, hence things like:
>
> postgres=# show shared_buffers;
>  shared_buffers
> 
>  28MB
> (1 row)
>
> postgres=# show checkpoint_timeout;
>  checkpoint_timeout
> 
>  5min
> (1 row)

We certainly do that for GUCs, and in that context it seems to me to
make sense.  If you set your shared buffers to a gigabyte, PG will use
an additional GB of memory.  But if you hit a "gigabyte" of shared
buffers, you may be examining anywhere from one 8K block over and over
again all the way up to a full GB of memory.  Block hits and reads
just don't add in the same way that actual memory allocations do.

And at any rate, what we DON'T do for GUCs is produce differing output
format for the same parameter based on the magnitude of the output
value, as you've done here.  We accept input in several different
formats, but there is only one canonical output formal for any
particular GUC, which is furthermore always chosen in such a way that
the exact value of the setting is preserved (again, unlike what you've
done here).

> The other examples you name are all internal or machine-readable
> fomats which have to be formatted somehow using sql queries or tools
> if you want to inspect the values directly. The user is free to format
> the output of the pg_stat* functions using pg_size_pretty() though
> it's annoying that it's not in the same base unit that
> pg_relation_size() outputs  but these are the only interface to these
> internal counters so there's no way to know if they're being used for
> human-readable output or for gathering raw data for statistics or
> other purposes.

So, you're saying we shouldn't look at the way that the pg_stat
functions format the output because somebody might write a view over
it that formats it in some different way that may or may not match
what you've done for the EXPLAIN output?  What makes you think that
people don't just look at the raw numbers?  I certainly have, and
there's no suggestion in the documentation that users should do
anything else.

pg_stat_statements doesn't do what you're suggesting either; it, too,
presents raw numbers, and lets the user make of it what they will.
They might, for example, want to compute a hit ratio, as in the
example provided in the docs.  In the case of EXPLAIN of an index
scan, they might want to estimate the number of seeks, on the theory
that an inner-indexscan is going to be all random IO.

>> I think this is a really terrible idea.  You've got a lot of very
>> specific formatting code in explain.c which anyone who wants to use
>> the JSON and XML output will very possibly need to reimplement.  I
>> have worked really hard to keep the text format in sync with all the
>> others, and up until now they have been.
>
> You're assuming the JSON and XML program is planning to display the
> measurements? They might not be. They might be gathering them for
> charting or for alerts or all kinds of other things.  Even if they do
> plan to output them they'll want to format it in way that makes sense
> for the context it's used in which might include more or fewer digits
> or plug into some widget which requires raw values and does the
> formatting automatically.

Yes, they might want to write their own formatting code, but they also
might not.  They might want to calculate hit ratios, or they might
want to alter the number of decimal places, or they might just want to
output the exact same information as the text format, but in a GUI
format rather than using ASCII art.

> Whereas the human-readable format should display values in a form
> humans can parse, the machine-readable output sho

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Greg Smith


Greg Stark wrote:

We do *not* display raw block numbers anywhere else. Generally I think
we should have a policy of outputing human-readable standard units of
memory whenever displaying a memory quantity. Actually I thought we
already had that policy, hence things like...
  


The first counter example I thought of is log_checkpoints which looks 
like this:


LOG: checkpoint complete: wrote 133795 buffers (25.5%); 0 transaction 
log file(s) added, 0 removed, 98 recycled; write=112.281 s, sync=108.809 
s, total=221.166 s



 Probably the XML schema should include the units as an attribute for
each tag so tools don't have to hard-code knowledge about what unit
each tag is in.
  


I don't know if it's practical at this point, but it might be helpful 
for the truly machine-targeted output formats to include specifically 
BLCKSZ somewhere in their header--just so there's a universal way to 
interpret the output even if the user tuned that.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Greg Stark

On Mon, Feb 15, 2010 at 7:58 PM, Robert Haas  wrote:
>>>  To me, buffers seem like discrete (and unitless)
>>> entities, and we handle them that way elsewhere in the system (see,
>>> e.g. pg_stat_database, pg_statio_all_tables).  I don't know that it's
>>> a good idea to display that same information here in a different
>>> format.
>
> This seems like an important point that you need to respond to.  Why
> should we print out this information in kB here when we display it as
> raw numbers elsewhere?  I can't see any reason at all.

I did respond to it. The whole point is that the text output is for a
human to read. It should be printed in human-readable units. Not some
arbitrary internal unit of accounting that they then have to do
arithmetic on to make sense of.

We do *not* display raw block numbers anywhere else. Generally I think
we should have a policy of outputing human-readable standard units of
memory whenever displaying a memory quantity. Actually I thought we
already had that policy, hence things like:

postgres=# show shared_buffers;
 shared_buffers

 28MB
(1 row)

postgres=# show checkpoint_timeout;
 checkpoint_timeout

 5min
(1 row)

The other examples you name are all internal or machine-readable
fomats which have to be formatted somehow using sql queries or tools
if you want to inspect the values directly. The user is free to format
the output of the pg_stat* functions using pg_size_pretty() though
it's annoying that it's not in the same base unit that
pg_relation_size() outputs. but these are the only interface to these
internal counters so there's no way to know if they're being used for
human-readable output or for gathering raw data for statistics or
other purposes.

>>> I definitely do not want to do anything that loses accuracy.  This is
>>> probably accurate enough for most uses, but it's still not as accurate
>>> as just printing the raw numbers.
>>
>> I left the XML/JSON output in terms of blocks on the theory that tools
>> reading this data can look up the block size and convert all it wants.
>
> I think this is a really terrible idea.  You've got a lot of very
> specific formatting code in explain.c which anyone who wants to use
> the JSON and XML output will very possibly need to reimplement.  I
> have worked really hard to keep the text format in sync with all the
> others, and up until now they have been.

You're assuming the JSON and XML program is planning to display the
measurements? They might not be. They might be gathering them for
charting or for alerts or all kinds of other things. Even if they do
plan to output them they'll want to format it in way that makes sense
for the context it's used in which might include more or fewer digits
or plug into some widget which requires raw values and does the
formatting automatically.

Whereas the human-readable format should display values in a form
humans can parse, the machine-readable output should include the raw
measurements with enough information for the tool to make sense of it.
 Probably the XML schema should include the units as an attribute for
each tag so tools don't have to hard-code knowledge about what unit
each tag is in.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Robert Haas

On Mon, Feb 15, 2010 at 1:29 PM, Greg Stark  wrote:
> On Mon, Feb 15, 2010 at 6:05 PM, Robert Haas  wrote:
>>> Well there was a 30+ message thread almost a week ago where there
>>> seemed to be some contention over the issue of whether the numbers
>>> should be averages or totals. But were there was no dispute over the
>>> idea of printing in memory units instead of blocks.
>>
>> Hmm yeah, I guess it wasn't discussed.  I'm still not sure it's an
>> improvement.  If a query hit one buffer, is that really the same as
>> saying it hit 8kB?
>
> Well you can always convert between them. The only time it would make
> a difference is if you're sure it's random i/o and you're concerned
> with the number of iops. However it's impossible to tell from this
> output how many of these buffers are read sequentially and how many
> randomly. Even if it's sequential you don't know how much it read
> between interruptions to handle the inner side of a join or whether
> the cached blocks were interspersed throughout the file or were all at
> the beginning or end.

All true, although "you can always converted between them" assumes you
know the block size.  I don't imagine many people change that, but...

> I think we should provide better tools to measure these things
> directly rather than force users to make deductions from buffer
> counts. I'm still excited about using dtrace to get real counts of
> iops, seeks, etc.

Sure.

>>  To me, buffers seem like discrete (and unitless)
>> entities, and we handle them that way elsewhere in the system (see,
>> e.g. pg_stat_database, pg_statio_all_tables).  I don't know that it's
>> a good idea to display that same information here in a different
>> format.

This seems like an important point that you need to respond to.  Why
should we print out this information in kB here when we display it as
raw numbers elsewhere?  I can't see any reason at all.

>> I definitely do not want to do anything that loses accuracy.  This is
>> probably accurate enough for most uses, but it's still not as accurate
>> as just printing the raw numbers.
>
> I left the XML/JSON output in terms of blocks on the theory that tools
> reading this data can look up the block size and convert all it wants.

I think this is a really terrible idea.  You've got a lot of very
specific formatting code in explain.c which anyone who wants to use
the JSON and XML output will very possibly need to reimplement.  I
have worked really hard to keep the text format in sync with all the
others, and up until now they have been.

> Incidentally looking at the pg_size_pretty() functions reminds me that
> these counters are all 32-bit. That means they'll do funny things if
> you have a query which accesses over 16TB of data... I suspect this
> should probably be changed though I'm feeling lazy about it unless
> someone else wants to push me to do it now.

Well that will require fixing a whole lot of bits in the stats
infrastructure that are only minimally related to this patch.  That is
certainly 9.1 material.

Basically, I think this whole change is a bad idea and should be
reverted.  You've made the text format EXPLAIN inconsistent with both
the non-text formats and with the rest of the buffer statistics stuff
for absolutely no benefit that I can see.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Greg Stark

On Mon, Feb 15, 2010 at 6:05 PM, Robert Haas  wrote:
>> Well there was a 30+ message thread almost a week ago where there
>> seemed to be some contention over the issue of whether the numbers
>> should be averages or totals. But were there was no dispute over the
>> idea of printing in memory units instead of blocks.
>
> Hmm yeah, I guess it wasn't discussed.  I'm still not sure it's an
> improvement.  If a query hit one buffer, is that really the same as
> saying it hit 8kB?

Well you can always convert between them. The only time it would make
a difference is if you're sure it's random i/o and you're concerned
with the number of iops. However it's impossible to tell from this
output how many of these buffers are read sequentially and how many
randomly. Even if it's sequential you don't know how much it read
between interruptions to handle the inner side of a join or whether
the cached blocks were interspersed throughout the file or were all at
the beginning or end.

I think we should provide better tools to measure these things
directly rather than force users to make deductions from buffer
counts. I'm still excited about using dtrace to get real counts of
iops, seeks, etc.

>  To me, buffers seem like discrete (and unitless)
> entities, and we handle them that way elsewhere in the system (see,
> e.g. pg_stat_database, pg_statio_all_tables).  I don't know that it's
> a good idea to display that same information here in a different
> format.
>...
> I definitely do not want to do anything that loses accuracy.  This is
> probably accurate enough for most uses, but it's still not as accurate
> as just printing the raw numbers.

I left the XML/JSON output in terms of blocks on the theory that tools
reading this data can look up the block size and convert all it wants.
Likewise the pg_stat* functions are for extracting raw data. Any tool
or query that extracts this data can present it in any friendly form
it wants.

Incidentally looking at the pg_size_pretty() functions reminds me that
these counters are all 32-bit. That means they'll do funny things if
you have a query which accesses over 16TB of data... I suspect this
should probably be changed though I'm feeling lazy about it unless
someone else wants to push me to do it now.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Robert Haas

On Mon, Feb 15, 2010 at 9:55 AM, Greg Stark  wrote:
> On Mon, Feb 15, 2010 at 2:22 PM, Robert Haas  wrote:
>>> a) Changed the line description to "Total Buffer Usage" which at least
>>> hints that it's something more akin to the "Total runtime" listed at
>>> the bottom than the "actual time".
>>>
>>> b) Used units of memory -- I formatted them with 3 significant digits
>>> (unless the unit is bytes or kB where that would be silly). It's just
>>> what looked best to my eye.
>>
>> I wasn't aware we had consensus on making this change, which I see you
>> committed less than an hour after posting this.
>
> Well there was a 30+ message thread almost a week ago where there
> seemed to be some contention over the issue of whether the numbers
> should be averages or totals. But were there was no dispute over the
> idea of printing in memory units instead of blocks.

Hmm yeah, I guess it wasn't discussed.  I'm still not sure it's an
improvement.  If a query hit one buffer, is that really the same as
saying it hit 8kB?  To me, buffers seem like discrete (and unitless)
entities, and we handle them that way elsewhere in the system (see,
e.g. pg_stat_database, pg_statio_all_tables).  I don't know that it's
a good idea to display that same information here in a different
format.

> We can always continue tweak the details of the format such as adding
> spaces before the units to make it similar to the pg_size_pretty().
> I'm not sure I like the idea of making it exactly equivalent because
> pg_size_pretty() doesn't print any decimals so it's pretty imprecise
> for smaller values.

I definitely do not want to do anything that loses accuracy.  This is
probably accurate enough for most uses, but it's still not as accurate
as just printing the raw numbers.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Greg Smith


Greg Stark wrote:

We can always continue tweak the details of the format such as adding
spaces before the units to make it similar to the pg_size_pretty().
I'm not sure I like the idea of making it exactly equivalent because
pg_size_pretty() doesn't print any decimals so it's pretty imprecise
for smaller values.
  


That's a reasonable position; I'd be fine with upgrading the 
requirements for a text scraping app to handle either "8 kB" or "1.356 
kB" if it wanted to share some code to consume either type of info, if 
all you did was throw a space in there.  I'd suggest either removing the 
PB units support from your implementation, or adding it to 
pg_size_pretty, just to keep those two routines more like one another in 
terms of what they might produce as output given the same scale of input.


Also, a quick comment in the new code explaining what you just said 
above might be helpful, just to preempt a similar "how is this different 
from pg_size_pretty?" question from popping up again one day.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Greg Stark

On Mon, Feb 15, 2010 at 2:22 PM, Robert Haas  wrote:
>> a) Changed the line description to "Total Buffer Usage" which at least
>> hints that it's something more akin to the "Total runtime" listed at
>> the bottom than the "actual time".
>>
>> b) Used units of memory -- I formatted them with 3 significant digits
>> (unless the unit is bytes or kB where that would be silly). It's just
>> what looked best to my eye.
>
> I wasn't aware we had consensus on making this change, which I see you
> committed less than an hour after posting this.

Well there was a 30+ message thread almost a week ago where there
seemed to be some contention over the issue of whether the numbers
should be averages or totals. But were there was no dispute over the
idea of printing in memory units instead of blocks.

Given the controversy over whether to display averages or totals and
given the issues raised towards the end of the thread that there are
no comparable estimated values printed so there's no particular need
to average them I opted for the minimal change of just labelling it
"Total Buffer Usage". It didn't seem there was consensus to change it
to averages per loop or to change the whole plan output to display
totals. And I didn't see anyone argue that saying calling out that it
was a total was a bad idea.

We can always continue tweak the details of the format such as adding
spaces before the units to make it similar to the pg_size_pretty().
I'm not sure I like the idea of making it exactly equivalent because
pg_size_pretty() doesn't print any decimals so it's pretty imprecise
for smaller values.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-15 Thread Robert Haas

On Sun, Feb 14, 2010 at 8:25 PM, Greg Stark  wrote:
> So this is what I did about my two complaints earlier about the
> explain buffer patch.
>
> a) Changed the line description to "Total Buffer Usage" which at least
> hints that it's something more akin to the "Total runtime" listed at
> the bottom than the "actual time".
>
> b) Used units of memory -- I formatted them with 3 significant digits
> (unless the unit is bytes or kB where that would be silly). It's just
> what looked best to my eye.

I wasn't aware we had consensus on making this change, which I see you
committed less than an hour after posting this.

> I'm finding "hit" and "read" kind of confusing myself but don't really
> have any better idea. It's not entirely clear whether read is the
> total accesses out of which some are cache hits or if they're two
> disjoint sets.

Keep in mind these terms are taken from other parts of the system
where they existed prior to this patch.  We probably want to stick
with them at this point for consistency, but in any case it's
certainly a separate discussion.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Explain buffers display units.

2010-02-14 Thread Greg Smith


Greg Stark wrote:

b) Used units of memory -- I formatted them with 3 significant digits
(unless the unit is bytes or kB where that would be silly). It's just
what looked best to my eye.
  


How does this compare with what comes out of pg_size_pretty 
(src/backend/utils/adt/dbsize.c)? I already have code floating around 
that parses the output from pg_size_pretty when I'm slurping in things 
from PostgreSQL, and it's not immediately obvious to me what having a 
format that's similar to but not quite the same as that one is buying here.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Explain buffers display units.

2010-02-14 Thread Greg Stark

So this is what I did about my two complaints earlier about the
explain buffer patch.

a) Changed the line description to "Total Buffer Usage" which at least
hints that it's something more akin to the "Total runtime" listed at
the bottom than the "actual time".

b) Used units of memory -- I formatted them with 3 significant digits
(unless the unit is bytes or kB where that would be silly). It's just
what looked best to my eye.

I'm finding "hit" and "read" kind of confusing myself but don't really
have any better idea. It's not entirely clear whether read is the
total accesses out of which some are cache hits or if they're two
disjoint sets.

postgres=# explain (analyze,buffers) select * from x limit 1;
   QUERY PLAN
-
 Limit  (cost=0.00..266.68 rows=1 width=105) (actual
time=0.023..53.964 rows=1 loops=1)
   Total Buffer Usage: shared hit=8kB read=1.30MB
   ->  Seq Scan on x  (cost=0.00..10667.00 rows=40 width=105)
(actual time=0.019..20.311 rows=1 loops=1)
 Total Buffer Usage: shared hit=8kB read=1.30MB
 Total runtime: 71.074 ms
(5 rows)

-- 
greg
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 98,104  static void ExplainJSONLineEnding(ExplainState *es);
  static void ExplainYAMLLineStarting(ExplainState *es);
  static void escape_json(StringInfo buf, const char *str);
  static void escape_yaml(StringInfo buf, const char *str);
! 
  
  /*
   * ExplainQuery -
--- 98,104 
  static void ExplainYAMLLineStarting(ExplainState *es);
  static void escape_json(StringInfo buf, const char *str);
  static void escape_yaml(StringInfo buf, const char *str);
! static double normalize_memory(double amount, char **unit, int *precision);
  
  /*
   * ExplainQuery -
***
*** 1081,1127  ExplainNode(Plan *plan, PlanState *planstate,
  			if (has_shared || has_local || has_temp)
  			{
  appendStringInfoSpaces(es->str, es->indent * 2);
! appendStringInfoString(es->str, "Buffers:");
  
  if (has_shared)
  {
  	appendStringInfoString(es->str, " shared");
! 	if (usage->shared_blks_hit > 0)
! 		appendStringInfo(es->str, " hit=%ld",
! 			usage->shared_blks_hit);
  	if (usage->shared_blks_read > 0)
! 		appendStringInfo(es->str, " read=%ld",
! 			usage->shared_blks_read);
  	if (usage->shared_blks_written > 0)
! 		appendStringInfo(es->str, " written=%ld",
! 			usage->shared_blks_written);
  	if (has_local || has_temp)
  		appendStringInfoChar(es->str, ',');
  }
  if (has_local)
  {
! 	appendStringInfoString(es->str, " local");
! 	if (usage->local_blks_hit > 0)
! 		appendStringInfo(es->str, " hit=%ld",
! 			usage->local_blks_hit);
! 	if (usage->local_blks_read > 0)
! 		appendStringInfo(es->str, " read=%ld",
! 			usage->local_blks_read);
! 	if (usage->local_blks_written > 0)
! 		appendStringInfo(es->str, " written=%ld",
! 			usage->local_blks_written);
  	if (has_temp)
  		appendStringInfoChar(es->str, ',');
  }
  if (has_temp)
  {
  	appendStringInfoString(es->str, " temp");
  	if (usage->temp_blks_read > 0)
! 		appendStringInfo(es->str, " read=%ld",
! 			usage->temp_blks_read);
! 	if (usage->temp_blks_written > 0)
! 		appendStringInfo(es->str, " written=%ld",
! 			usage->temp_blks_written);
  }
  appendStringInfoChar(es->str, '\n');
  			}
--- 1081,1143 
  			if (has_shared || has_local || has_temp)
  			{
  appendStringInfoSpaces(es->str, es->indent * 2);
! appendStringInfoString(es->str, "Total Buffer Usage:");
  
  if (has_shared)
  {
+ 	char *hit_unit, *read_unit, *written_unit;
+ 	int   hit_prec,  read_prec,  written_prec;
+ 	double hit_mem  = normalize_memory((double)usage->shared_blks_hit  * BLCKSZ, &hit_unit,  &hit_prec);
+ 	double read_mem = normalize_memory((double)usage->shared_blks_read * BLCKSZ, &read_unit, &read_prec);
+ 	double written_mem  = normalize_memory((double)usage->shared_blks_written  * BLCKSZ, &written_unit,  &written_prec);
+ 
  	appendStringInfoString(es->str, " shared");
! 		appendStringInfo(es->str, " hit=%.*f%s", 
! 		 hit_prec, hit_mem, hit_unit);
  	if (usage->shared_blks_read > 0)
! 		appendStringInfo(es->str, " read=%.*f%s",
! 		 read_prec, read_mem, read_unit);
  	if (usage->shared_blks_written > 0)
! 		appendStringInfo(es->str, " written=%.*f%s",
! 		 written_prec, written_mem, written_unit);
  	if (has_local || has_temp)
  		appendStringInfoChar(es->str, ',');
  }
  if (has_local)
  {
! 	char *hit_unit, *read_unit, *written_unit;
! 	int   hit_prec,  read_prec,  written_prec;
! 	double hit_mem  = normalize_memory((double)usage->loc

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-14 Thread Robert Haas

On Sun, Dec 13, 2009 at 11:49 PM, Takahiro Itagaki
 wrote:
> The attached patch [...]

Committed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-14 Thread Tom Lane

Takahiro Itagaki  writes:
> Tom Lane  wrote:
>> Pushing extra arguments around would create overhead of its own ...
>> overhead that would be paid even when not using EXPLAIN at all.

> I cannot understand what you mean... The additional argument should
> not be a performance overhead because the code path is run only once
> per execution.

Hmm, maybe, but still: once you have two flags you're likely to need
more.  I concur with turning doInstrument into a bitmask as per Robert's
suggestion downthread.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Robert Haas

On Sun, Dec 13, 2009 at 11:49 PM, Takahiro Itagaki
 wrote:
>
> Robert Haas  wrote:
>
>> Well, I think we need to do something.  I don't really want to tack
>> another 5-6% overhead onto EXPLAIN ANALYZE.  Maybe we could recast the
>> doInstrument argument as a set of OR'd flags?
>
> I'm thinking the same thing (OR'd flags) right now.
>
> The attached patch adds INSTRUMENT_TIMER and INSTRUMENT_BUFFERS flags.
> The types of QueryDesc.doInstrument (renamed to instrument_options) and
> EState.es_instrument are changed from bool to int, and they store
> OR of InstrumentOption flags. INSTRUMENT_TIMER is always enabled when
> instrumetations are initialized, but INSTRUMENT_BUFFERS is enabled only if
> we use EXPLAIN BUFFERS. I think the flag options are not so bad idea because
> of extensibility. For example, we could support EXPLAIN CPU_USAGE someday.
>
> One issue is in the top-level instrumentation (queryDesc->totaltime).
> Since the field might be used by multiple plugins, the first initializer
> need to initialize the counter with all options. I used INSTRUMENT_ALL
> for it in the patch.
>
> =# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
>                                                           QUERY PLAN
> 
>  Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
> (actual time=0.003..572.126 rows=1000 loops=1)
>  Total runtime: 897.729 ms
>
> =# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pgbench_accounts;
>                                                           QUERY PLAN
> 
>  Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
> (actual time=0.002..580.642 rows=1000 loops=1)
>   Buffers: shared hit=163935
>  Total runtime: 955.744 ms

That seems very promising, but it's almost midnight here so I have to
turn in for now.  I'll take another look at this tomorrow.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Takahiro Itagaki

Robert Haas  wrote:

> Well, I think we need to do something.  I don't really want to tack
> another 5-6% overhead onto EXPLAIN ANALYZE.  Maybe we could recast the
> doInstrument argument as a set of OR'd flags?

I'm thinking the same thing (OR'd flags) right now.

The attached patch adds INSTRUMENT_TIMER and INSTRUMENT_BUFFERS flags.
The types of QueryDesc.doInstrument (renamed to instrument_options) and
EState.es_instrument are changed from bool to int, and they store
OR of InstrumentOption flags. INSTRUMENT_TIMER is always enabled when
instrumetations are initialized, but INSTRUMENT_BUFFERS is enabled only if
we use EXPLAIN BUFFERS. I think the flag options are not so bad idea because
of extensibility. For example, we could support EXPLAIN CPU_USAGE someday.

One issue is in the top-level instrumentation (queryDesc->totaltime).
Since the field might be used by multiple plugins, the first initializer
need to initialize the counter with all options. I used INSTRUMENT_ALL
for it in the patch.

=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
   QUERY PLAN

 Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
(actual time=0.003..572.126 rows=1000 loops=1)
 Total runtime: 897.729 ms

=# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pgbench_accounts;
   QUERY PLAN

 Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
(actual time=0.002..580.642 rows=1000 loops=1)
   Buffers: shared hit=163935
 Total runtime: 955.744 ms

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

explain_buffers_20091214.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Robert Haas

On Sun, Dec 13, 2009 at 10:00 PM, Takahiro Itagaki
 wrote:
>> Two other thoughts:
>>
>> 1. It doesn't appear that there is any provision to ever zero
>> pgBufferUsage.  Shouldn't we do this, say, once per explain, just to
>> avoid the possibility of overflowing the counters?
>
> I think the overflowing will not be a problem because we only use
> the differences of values. The delta is always corrent unless we use
> 2^32 buffer accesses during one execution of a node.

Hmm... you might be right.  I'm not savvy enough to know whether there
are any portability concerns here.

Anyone else know?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Robert Haas

On Sun, Dec 13, 2009 at 10:15 PM, Tom Lane  wrote:
> Takahiro Itagaki  writes:
>> Should I add countBufferUsage boolean arguments to all places
>> doInstrument booleans are currently used? This requires several
>> minor modifications of codes in many places.
>
> Pushing extra arguments around would create overhead of its own ...
> overhead that would be paid even when not using EXPLAIN at all.

Well, I think we need to do something.  I don't really want to tack
another 5-6% overhead onto EXPLAIN ANALYZE.  Maybe we could recast the
doInstrument argument as a set of OR'd flags?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Takahiro Itagaki

Tom Lane  wrote:

> Takahiro Itagaki  writes:
> > Should I add countBufferUsage boolean arguments to all places
> > doInstrument booleans are currently used? This requires several
> > minor modifications of codes in many places.
> 
> Pushing extra arguments around would create overhead of its own ...
> overhead that would be paid even when not using EXPLAIN at all.

I cannot understand what you mean... The additional argument should
not be a performance overhead because the code path is run only once
per execution. Instrumentation structures are still not allocated
in normal or EXPLAIN queries; allocated only in "EXPLAIN ANALYZE".

Or, are you suggesting to separate buffer counters with Instrumentation
structure? It still requires extra arguments, but it could minimize the
overhead when we use EXPLAIN ANALYZE without BUFFERS. However, we need
additional codes around InstrStartNode/InstrStopNode calls.

Or, are you complaining about non-performance overheads,
something like overheads of code maintenance?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Tom Lane

Takahiro Itagaki  writes:
> Should I add countBufferUsage boolean arguments to all places
> doInstrument booleans are currently used? This requires several
> minor modifications of codes in many places.

Pushing extra arguments around would create overhead of its own ...
overhead that would be paid even when not using EXPLAIN at all.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Takahiro Itagaki


Robert Haas  wrote:

> I have a question about the comment in InstrStopNode(), which reads:
> "Adds delta of buffer usage to node's count and resets counter to
> start so that the counters are not double counted by parent nodes."
> It then calls BufferUsageAccumDiff(), but that function doesn't
> actually "reset" anything, so it seems like the comment is wrong.

Oops, it's wrong. It just does "Adds delta of buffer usage to node's count."

> Two other thoughts:
> 
> 1. It doesn't appear that there is any provision to ever zero
> pgBufferUsage.  Shouldn't we do this, say, once per explain, just to
> avoid the possibility of overflowing the counters?

I think the overflowing will not be a problem because we only use
the differences of values. The delta is always corrent unless we use
2^32 buffer accesses during one execution of a node.

> 2. We seem to do all the work associated with pgBufferUsage even when
> the "buffers" option is not passed to explain.  The overhead of
> incrementing the counters is probably negligible (and we were paying
> the equivalent overhead before anyway) but I'm not sure whether saving
> the starting counters and accumulating the deltas might be enough to
> slow down EXPLAIN ANALYZE.  That's sorta slow already so I'd hate to
> whack it any more - have you benchmarked this at all?

There are 5% of overheads in the worst cases. The difference will be
little if we have more complex operations or some disk I/Os.

Adding Instrumentation->count_bufusage flag could reduce the overheads.
if (instr->count_bufusage)
BufferUsageAccumDiff(...)

Should I add countBufferUsage boolean arguments to all places
doInstrument booleans are currently used? This requires several
minor modifications of codes in many places.

[without patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
   QUERY PLAN

 Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
(actual time=0.003..571.794 rows=1000 loops=1)
 Total runtime: 899.427 ms

[with patch]
=# EXPLAIN (ANALYZE) SELECT * FROM pgbench_accounts;
   QUERY PLAN

 Seq Scan on pgbench_accounts  (cost=0.00..263935.00 rows=1000 width=97) 
(actual time=0.003..585.885 rows=1000 loops=1)
 Total runtime: 955.280 ms

- shared_buffers = 1500MB
- pgbench -i -s100
- Read all pages in the pgbench_accounts into shared buffers before runs.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Robert Haas

On Sun, Dec 13, 2009 at 7:55 PM, Takahiro Itagaki
 wrote:
>
> Robert Haas  wrote:
>
>> OK, done, see attached.  I also noticed when looking through this that
>> the documentation says that auto_explain.log_buffers is ignored unless
>> auto_explain.log_analyze is set.  That is true and seems right to me,
>> but for some reason explain_ExecutorEnd() had been changed to set
>> es.analyze if either log_analyze or log_buffers was set.
>
> Thanks. It was my bug.
>
> Could you apply the patch? Or, may I do by myself?

Sorry, I've been meaning to look at this a little more and have gotten
distracted.

I have a question about the comment in InstrStopNode(), which reads:
"Adds delta of buffer usage to node's count and resets counter to
start so that the counters are not double counted by parent nodes."
It then calls BufferUsageAccumDiff(), but that function doesn't
actually "reset" anything, so it seems like the comment is wrong.

Two other thoughts:

1. It doesn't appear that there is any provision to ever zero
pgBufferUsage.  Shouldn't we do this, say, once per explain, just to
avoid the possibility of overflowing the counters?

2. We seem to do all the work associated with pgBufferUsage even when
the "buffers" option is not passed to explain.  The overhead of
incrementing the counters is probably negligible (and we were paying
the equivalent overhead before anyway) but I'm not sure whether saving
the starting counters and accumulating the deltas might be enough to
slow down EXPLAIN ANALYZE.  That's sorta slow already so I'd hate to
whack it any more - have you benchmarked this at all?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-13 Thread Takahiro Itagaki


Robert Haas  wrote:

> OK, done, see attached.  I also noticed when looking through this that
> the documentation says that auto_explain.log_buffers is ignored unless
> auto_explain.log_analyze is set.  That is true and seems right to me,
> but for some reason explain_ExecutorEnd() had been changed to set
> es.analyze if either log_analyze or log_buffers was set.

Thanks. It was my bug.

Could you apply the patch? Or, may I do by myself?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-11 Thread Robert Haas

On Fri, Dec 11, 2009 at 11:36 AM, Euler Taveira de Oliveira
 wrote:
> Robert Haas escreveu:
>> On Thu, Dec 10, 2009 at 9:35 PM, Takahiro Itagaki
>>  wrote:
>>> Anyway, a revised patch according to the comments is attached.
>>> The new text format is:
>>>  Buffers: shared hit=675 read=968, temp read=1443 written=1443
>>>    * Zero values are omitted. (Non-text formats could have zero values.)
>>>    * Rename "Blocks:" to "Buffers:".
>>>    * Remove parentheses and add a comma between shared, local and temp.
>>
>> I did a bit of copy-editing of your doc changes to make the English a
>> bit more correct and idiomatic.  Slightly revised patch attached for
>> your consideration.  The output format looks really nice (thanks for
>> bearing with me), and the functionality is great.
>>
> Please, document that zero values are omitted in the text format. It seems
> intuitive but could be surprise because zero values are in non-text formats.

OK, done, see attached.  I also noticed when looking through this that
the documentation says that auto_explain.log_buffers is ignored unless
auto_explain.log_analyze is set.  That is true and seems right to me,
but for some reason explain_ExecutorEnd() had been changed to set
es.analyze if either log_analyze or log_buffers was set.  It actually
didn't have any effect unless log_analyze was set, but only because
explain_ExecutorStart doesn't set queryDesc->doInstrument in that
case.  So I've reverted that here for clarity.

...Robert
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index f0d907d..491f479 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -22,6 +22,7 @@ PG_MODULE_MAGIC;
 static int	auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
+static bool auto_explain_log_buffers = false;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
 
@@ -93,6 +94,16 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomBoolVariable("auto_explain.log_buffers",
+			 "Log buffers usage.",
+			 NULL,
+			 &auto_explain_log_buffers,
+			 false,
+			 PGC_SUSET,
+			 0,
+			 NULL,
+			 NULL);
+
 	DefineCustomEnumVariable("auto_explain.log_format",
 			 "EXPLAIN format to be used for plan logging.",
 			 NULL,
@@ -221,6 +232,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 			ExplainInitState(&es);
 			es.analyze = (queryDesc->doInstrument && auto_explain_log_analyze);
 			es.verbose = auto_explain_log_verbose;
+			es.buffers = (es.analyze && auto_explain_log_buffers);
 			es.format = auto_explain_log_format;
 
 			ExplainBeginOutput(&es);
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index dd3f3fd..1b9d4d9 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -104,6 +104,25 @@ LOAD 'auto_explain';
 

 
+ auto_explain.log_buffers (boolean)
+
+
+ auto_explain.log_buffers configuration parameter
+
+
+ 
+  auto_explain.log_buffers causes EXPLAIN
+  (ANALYZE, BUFFERS) output, rather than just EXPLAIN 
+  output, to be printed when an execution plan is logged. This parameter is 
+  off by default. Only superusers can change this setting. This
+  parameter has no effect unless auto_explain.log_analyze
+  parameter is set.
+ 
+
+   
+
+   
+
  auto_explain.log_format (enum)
 
 
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 0d03469..6c68afd 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -31,7 +31,7 @@ PostgreSQL documentation
 
  
 
-EXPLAIN [ ( { ANALYZE boolean | VERBOSE boolean | COSTS boolean | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] statement
+EXPLAIN [ ( { ANALYZE boolean | VERBOSE boolean | COSTS boolean | BUFFERS boolean | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] statement
 EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 
  
@@ -140,6 +140,24 @@ ROLLBACK;

 

+BUFFERS
+
+ 
+  Include information on buffer usage. Specifically, include the number of
+  shared blocks hits, reads, and writes, the number of local blocks hits,
+  reads, and writes, and the number of temp blocks reads and writes.
+  Shared blocks, local blocks, and temp blocks contain tables and indexes,
+  temporary tables and temporary indexes, and disk blocks used in sort and
+  materialized plans, respectively. The number of blocks shown for an
+  upper-level node includes those used by all its child nodes.  In text
+  format, only non-zero values are printed.  This parameter may only be
+  used with ANALYZE parameter.  It defaults to
+  FALSE.
+ 
+
+   
+
+   
 FORMAT
 
  
diff --git a/src/backend/commands/explain.c b/src/backend/commands

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-11 Thread Euler Taveira de Oliveira

Robert Haas escreveu:
> On Thu, Dec 10, 2009 at 9:35 PM, Takahiro Itagaki
>  wrote:
>> Anyway, a revised patch according to the comments is attached.
>> The new text format is:
>>  Buffers: shared hit=675 read=968, temp read=1443 written=1443
>>* Zero values are omitted. (Non-text formats could have zero values.)
>>* Rename "Blocks:" to "Buffers:".
>>* Remove parentheses and add a comma between shared, local and temp.
> 
> I did a bit of copy-editing of your doc changes to make the English a
> bit more correct and idiomatic.  Slightly revised patch attached for
> your consideration.  The output format looks really nice (thanks for
> bearing with me), and the functionality is great.
> 
Please, document that zero values are omitted in the text format. It seems
intuitive but could be surprise because zero values are in non-text formats.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Robert Haas

On Thu, Dec 10, 2009 at 9:35 PM, Takahiro Itagaki
 wrote:
> Anyway, a revised patch according to the comments is attached.
> The new text format is:
>  Buffers: shared hit=675 read=968, temp read=1443 written=1443
>    * Zero values are omitted. (Non-text formats could have zero values.)
>    * Rename "Blocks:" to "Buffers:".
>    * Remove parentheses and add a comma between shared, local and temp.

I did a bit of copy-editing of your doc changes to make the English a
bit more correct and idiomatic.  Slightly revised patch attached for
your consideration.  The output format looks really nice (thanks for
bearing with me), and the functionality is great.

...Robert
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 75ac9ca..88c33c0 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -22,6 +22,7 @@ PG_MODULE_MAGIC;
 static int	auto_explain_log_min_duration = -1; /* msec or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
+static bool auto_explain_log_buffers = false;
 static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;
 static bool auto_explain_log_nested_statements = false;
 
@@ -93,6 +94,16 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	DefineCustomBoolVariable("auto_explain.log_buffers",
+			 "Log buffers usage.",
+			 NULL,
+			 &auto_explain_log_buffers,
+			 false,
+			 PGC_SUSET,
+			 0,
+			 NULL,
+			 NULL);
+
 	DefineCustomEnumVariable("auto_explain.log_format",
 			 "EXPLAIN format to be used for plan logging.",
 			 NULL,
@@ -219,8 +230,10 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 			ExplainState	es;
 
 			ExplainInitState(&es);
-			es.analyze = (queryDesc->doInstrument && auto_explain_log_analyze);
+			es.analyze = (queryDesc->doInstrument &&
+(auto_explain_log_analyze || auto_explain_log_buffers));
 			es.verbose = auto_explain_log_verbose;
+			es.buffers = (es.analyze && auto_explain_log_buffers);
 			es.format = auto_explain_log_format;
 
 			ExplainPrintPlan(&es, queryDesc);
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index dd3f3fd..1b9d4d9 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -104,6 +104,25 @@ LOAD 'auto_explain';
 

 
+ auto_explain.log_buffers (boolean)
+
+
+ auto_explain.log_buffers configuration parameter
+
+
+ 
+  auto_explain.log_buffers causes EXPLAIN
+  (ANALYZE, BUFFERS) output, rather than just EXPLAIN 
+  output, to be printed when an execution plan is logged. This parameter is 
+  off by default. Only superusers can change this setting. This
+  parameter has no effect unless auto_explain.log_analyze
+  parameter is set.
+ 
+
+   
+
+   
+
  auto_explain.log_format (enum)
 
 
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 0d03469..c90a028 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -31,7 +31,7 @@ PostgreSQL documentation
 
  
 
-EXPLAIN [ ( { ANALYZE boolean | VERBOSE boolean | COSTS boolean | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] statement
+EXPLAIN [ ( { ANALYZE boolean | VERBOSE boolean | COSTS boolean | BUFFERS boolean | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] statement
 EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 
  
@@ -140,6 +140,23 @@ ROLLBACK;

 

+BUFFERS
+
+ 
+  Include information on buffer usage. Specifically, include the number of
+  shared blocks hits, reads, and writes, the number of local blocks hits,
+  reads, and writes, and the number of temp blocks reads and writes.
+  Shared blocks, local blocks, and temp blocks contain tables and indexes,
+  temporary tables and temporary indexes, and disk blocks used in sort and
+  materialized plans, respectively. The number of blocks shown for an
+  upper-level node includes those used by all its child nodes. This
+  parameter may only be used with ANALYZE parameter.
+  It defaults to FALSE.
+ 
+
+   
+
+   
 FORMAT
 
  
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 0437ffa..0aba2a7 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -127,6 +127,8 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 			es.verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "costs") == 0)
 			es.costs = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "buffers") == 0)
+			es.buffers = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "format") == 0)
 		{
 			char   *p = defGetString(opt);
@@ -152,6 +154,11 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
 			opt->defname)));
 	}
 
+	if (es.buffers && !es.analyze)
+		ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("EXPLAIN option BUFFERS re

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Takahiro Itagaki


Robert Haas  wrote:

> On Thu, Dec 10, 2009 at 10:52 AM, Greg Smith  wrote:
> >> I don't think IO is a terrible name for an option but I like BUFFERS
> >> better. ?I don't think the BUFFERS/BLOCKS confusion is too bad, but
> >> perhaps we could use BUFFERS in both places.
> >
> > I don't know how "blocks" got into here in the first place--this concept is
> > "buffers" just about everywhere else already, right?
> 
> I think we have some places already in the system where we bounce back
> and forth between those terms.  I expect that's the reason.

The "blocks" comes from pg_statio_all_tables.heap_blks_{read|hit},
but "buffers" might be easy to understand. One matter for concern
is that "buffer read" might not be clear whether it is a memory access
or a disk read.

Anyway, a revised patch according to the comments is attached.
The new text format is:
  Buffers: shared hit=675 read=968, temp read=1443 written=1443
* Zero values are omitted. (Non-text formats could have zero values.)
* Rename "Blocks:" to "Buffers:".
* Remove parentheses and add a comma between shared, local and temp.

=# EXPLAIN (BUFFERS, ANALYZE) SELECT *
 FROM pgbench_accounts a, pgbench_branches b
WHERE a.bid = b.bid AND abalance >= 0 ORDER BY abalance;
  QUERY PLAN
---
 Sort  (cost=54151.83..54401.83 rows=10 width=461) (actual 
time=92.551..109.646 rows=10 loops=1)
   Sort Key: a.abalance
   Sort Method:  external sort  Disk: 11544kB
   Buffers: shared hit=675 read=968, temp read=1443 written=1443
   ->  Nested Loop  (cost=0.00..4141.01 rows=10 width=461) (actual 
time=0.048..42.190 rows=10 loops=1)
 Join Filter: (a.bid = b.bid)
 Buffers: shared hit=673 read=968
 ->  Seq Scan on pgbench_branches b  (cost=0.00..1.01 rows=1 width=364) 
(actual time=0.003..0.004 rows=1 loops=1)
   Buffers: shared hit=1
 ->  Seq Scan on pgbench_accounts a  (cost=0.00..2890.00 rows=10 
width=97) (actual time=0.038..22.912 rows=10 loops=1)
   Filter: (a.abalance >= 0)
   Buffers: shared hit=672 read=968
 Total runtime: 116.058 ms
(13 rows)

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



explain_buffers_20091211.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Euler Taveira de Oliveira

Robert Haas escreveu:
> The only reason anyone is even thinking that they need parentheses
> here is because they're trying to put three separate groups of
> buffer-related statistics - a total of 8 values - on the same output
> line.  If this were split up over three output lines, no one would
> even be suggesting parentheses.
> 
That's the point. I'm afraid 3 new lines per node is too verbose.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Robert Haas

On Thu, Dec 10, 2009 at 10:50 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Dec 10, 2009 at 9:03 AM, Euler Taveira de Oliveira
>>  wrote:
>>> Why? If you want this information for all of your queries, you can always 
>>> set
>>> auto_explain.log_min_duration to 0. But if you're suggesting that we should
>>> maintain log_statement_stats (that was not I understand from Tom's email 
>>> [1]),
>>> it's not that difficult to a change ShowBufferUsage().
>
>> Mmm, OK, if Tom thinks we should rip it out, I'm not going to second-guess 
>> him.
>
> Feel free to question that.  But it's ancient code and I'm not convinced
> it still has a reason to live.  If you want to investigate the I/O
> behavior of a particular query, you'll use EXPLAIN.  If you want to get
> an idea of the system-wide behavior, you'll use the stats collector.
> What use case is left for the backend-local counters?

Beats me.  Tracing just your session without having to EXPLAIN each
query (and therefore not get the output rows)?  OK, I'm reaching.  I
tend to be very conservative about ripping things out that someone
might want unless they're actually getting in the way of doing some
new thing that we want to do - but so are you, and you know the
history of this code better than I do.  I'm happy to save my
questioning for a more important issue.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Robert Haas

On Thu, Dec 10, 2009 at 10:52 AM, Greg Smith  wrote:
> Robert Haas wrote:
>>
>> I don't think IO is a terrible name for an option but I like BUFFERS
>> better.  I don't think the BUFFERS/BLOCKS confusion is too bad, but
>> perhaps we could use BUFFERS in both places.
>
> I don't know how "blocks" got into here in the first place--this concept is
> "buffers" just about everywhere else already, right?

I think we have some places already in the system where we bounce back
and forth between those terms.  I expect that's the reason.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Greg Smith


Robert Haas wrote:

I don't think IO is a terrible name for an option but I like BUFFERS
better.  I don't think the BUFFERS/BLOCKS confusion is too bad, but
perhaps we could use BUFFERS in both places.
  
I don't know how "blocks" got into here in the first place--this concept 
is "buffers" just about everywhere else already, right?


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Tom Lane

Robert Haas  writes:
> On Thu, Dec 10, 2009 at 9:03 AM, Euler Taveira de Oliveira
>  wrote:
>> Why? If you want this information for all of your queries, you can always set
>> auto_explain.log_min_duration to 0. But if you're suggesting that we should
>> maintain log_statement_stats (that was not I understand from Tom's email 
>> [1]),
>> it's not that difficult to a change ShowBufferUsage().

> Mmm, OK, if Tom thinks we should rip it out, I'm not going to second-guess 
> him.

Feel free to question that.  But it's ancient code and I'm not convinced
it still has a reason to live.  If you want to investigate the I/O
behavior of a particular query, you'll use EXPLAIN.  If you want to get
an idea of the system-wide behavior, you'll use the stats collector.
What use case is left for the backend-local counters?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Robert Haas

On Thu, Dec 10, 2009 at 10:44 AM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Takahiro Itagaki escribió:
>>> Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 written=0) 
>>> (temp read=0 written=0)
>
>> Maybe I missed part of this discussion, but it seems a bit weird to have
>> an option named "buffers" turn on a line that specifies numbers of
>> "blocks".
>
> Agreed, and I have to agree also with the people who have been
> criticizing the output format.  If we were trying to put the block
> counts onto the same line as everything else then maybe parentheses
> would be helpful, but here they're just clutter.
>
> Perhaps
>
>        I/O: shared hit=96 read=1544 written=0 local hit=0 read=0 written=0 
> temp read=0 written=0
>
> (although this would suggest making the option name "io" which is
> probably a poor choice)
>
> I also suggest that dropping out zeroes might help --- a large fraction
> of EXPLAIN work is done with SELECTs that aren't ever going to write
> anything.  Then the above becomes
>
>        I/O: shared hit=96 read=1544
>
> which is vastly more readable.  You wouldn't want that to happen in
> machine-readable formats of course, but I think we no longer care about
> whether the text format is easy for programs to parse.

Oooh, that's a nice idea, though I think you should throw in some
commas if there is, say, both shared and local stuff:

shared hit=96 read=1544, local read=19

I don't think IO is a terrible name for an option but I like BUFFERS
better.  I don't think the BUFFERS/BLOCKS confusion is too bad, but
perhaps we could use BUFFERS in both places.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Tom Lane

Alvaro Herrera  writes:
> Takahiro Itagaki escribió:
>> Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 written=0) 
>> (temp read=0 written=0)

> Maybe I missed part of this discussion, but it seems a bit weird to have
> an option named "buffers" turn on a line that specifies numbers of
> "blocks".

Agreed, and I have to agree also with the people who have been
criticizing the output format.  If we were trying to put the block
counts onto the same line as everything else then maybe parentheses
would be helpful, but here they're just clutter.

Perhaps

I/O: shared hit=96 read=1544 written=0 local hit=0 read=0 written=0 
temp read=0 written=0

(although this would suggest making the option name "io" which is
probably a poor choice)

I also suggest that dropping out zeroes might help --- a large fraction
of EXPLAIN work is done with SELECTs that aren't ever going to write
anything.  Then the above becomes

I/O: shared hit=96 read=1544

which is vastly more readable.  You wouldn't want that to happen in
machine-readable formats of course, but I think we no longer care about
whether the text format is easy for programs to parse.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Alvaro Herrera

Takahiro Itagaki escribió:

> =# EXPLAIN (BUFFERS, ANALYZE) SELECT *
>   FROM pgbench_accounts a, pgbench_branches b
>  WHERE a.bid = b.bid AND abalance > 0 ORDER BY abalance;
>   QUERY PLAN
> --
>  Sort  (cost=2891.03..2891.04 rows=1 width=461) (actual time=22.494..22.494 
> rows=0 loops=1)
>Sort Key: a.abalance
>Sort Method:  quicksort  Memory: 25kB
>Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 written=0) 
> (temp read=0 written=0)

Maybe I missed part of this discussion, but it seems a bit weird to have
an option named "buffers" turn on a line that specifies numbers of
"blocks".  I kept looking for where you were specifying the BLOCKS
option to EXPLAIN in the command ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Robert Haas

On Thu, Dec 10, 2009 at 9:03 AM, Euler Taveira de Oliveira
 wrote:
> Robert Haas escreveu:
>> I'm not sure whether this is a good idea or not.  Let me read the
>> patch.  I'm not sure an EXPLAIN option is really an adequate
>> substitute for log_statement_stats - the latter will let you get stats
>> for all of your queries automatically, I believe, and might still be
>> useful as a quick and dirty tool.
>>
> Why? If you want this information for all of your queries, you can always set
> auto_explain.log_min_duration to 0. But if you're suggesting that we should
> maintain log_statement_stats (that was not I understand from Tom's email [1]),
> it's not that difficult to a change ShowBufferUsage().

Mmm, OK, if Tom thinks we should rip it out, I'm not going to second-guess him.

>> I still think this is a bad format.  Instead of putting "(" and ")"
>> around each phrase, can't we just separate them with a "," or ";"?
>>
> We already use ( and ) to group things. I don't remember us using , or ; in
> any output node. The suggested output is intuitive and similar to other nodes
> patterns.

It isn't.  In the other cases where we output multiple distinct values
on the same output row - like the sort instrumentation when ANALYZE is
turned on - they are separated with copious amounts of whitespace.
Costs are an exception, but those aren't done the same way as this
either.

The only reason anyone is even thinking that they need parentheses
here is because they're trying to put three separate groups of
buffer-related statistics - a total of 8 values - on the same output
line.  If this were split up over three output lines, no one would
even be suggesting parentheses.  Maybe that's a saner way to go.  If
not, fine, but I don't believe for a minute that the suggested format
is either correct or parallel to what has been done elsewhere.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-10 Thread Euler Taveira de Oliveira

Robert Haas escreveu:
> I'm not sure whether this is a good idea or not.  Let me read the
> patch.  I'm not sure an EXPLAIN option is really an adequate
> substitute for log_statement_stats - the latter will let you get stats
> for all of your queries automatically, I believe, and might still be
> useful as a quick and dirty tool.
> 
Why? If you want this information for all of your queries, you can always set
auto_explain.log_min_duration to 0. But if you're suggesting that we should
maintain log_statement_stats (that was not I understand from Tom's email [1]),
it's not that difficult to a change ShowBufferUsage().

> We certainly should NOT count on dtrace as a substitute for anything.
> It's not available on Windows, or all other platforms either.
> 
But we can always count on EXPLAIN BUFFERS. Remember that some monitoring
tasks are _only_ available via DTrace.

> I still think this is a bad format.  Instead of putting "(" and ")"
> around each phrase, can't we just separate them with a "," or ";"?
> 
We already use ( and ) to group things. I don't remember us using , or ; in
any output node. The suggested output is intuitive and similar to other nodes
patterns.


[1] http://archives.postgresql.org/pgsql-hackers/2009-10/msg00718.php


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-09 Thread Robert Haas

On Wed, Dec 9, 2009 at 12:36 AM, Takahiro Itagaki
 wrote:
> Note that the patch also removes buffer counters from log_statement_stats,
> but we only have brief descriptions about the options. Our documentation
> say nothing about buffer counters, so I didn't modify those lines in sgml.
> http://developer.postgresql.org/pgdocs/postgres/runtime-config-statistics.html#RUNTIME-CONFIG-STATISTICS-MONITOR

I'm not sure whether this is a good idea or not.  Let me read the
patch.  I'm not sure an EXPLAIN option is really an adequate
substitute for log_statement_stats - the latter will let you get stats
for all of your queries automatically, I believe, and might still be
useful as a quick and dirty tool.

> IMHO, we could remove those options completely because we can use
> EXPLAIN BUFFERS and DTrace probes instead of them.

We certainly should NOT count on dtrace as a substitute for anything.
It's not available on Windows, or all other platforms either.

> =# EXPLAIN (BUFFERS, ANALYZE) SELECT *
>      FROM pgbench_accounts a, pgbench_branches b
>     WHERE a.bid = b.bid AND abalance > 0 ORDER BY abalance;
>                                                          QUERY PLAN
> --
>  Sort  (cost=2891.03..2891.04 rows=1 width=461) (actual time=22.494..22.494 
> rows=0 loops=1)
>   Sort Key: a.abalance
>   Sort Method:  quicksort  Memory: 25kB
>   Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 written=0) 
> (temp read=0 written=0)
>   ->  Nested Loop  (cost=0.00..2891.02 rows=1 width=461) (actual 
> time=22.488..22.488 rows=0 loops=1)
>         Join Filter: (a.bid = b.bid)
>         Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 
> written=0) (temp read=0 written=0)
>         ->  Seq Scan on pgbench_accounts a  (cost=0.00..2890.00 rows=1 
> width=97) (actual time=22.486..22.486 rows=0 loops=1)
>               Filter: (abalance > 0)
>               Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 
> written=0) (temp read=0 written=0)
>         ->  Seq Scan on pgbench_branches b  (cost=0.00..1.01 rows=1 
> width=364) (never executed)
>               Blocks: (shared hit=0 read=0 written=0) (local hit=0 read=0 
> written=0) (temp read=0 written=0)
>  Total runtime: 22.546 ms
> (13 rows)

I still think this is a bad format.  Instead of putting "(" and ")"
around each phrase, can't we just separate them with a "," or ";"?
The filter uses parentheses in a mathematical way, for grouping
related items.  Not all filters are surrounded by parentheses
(consider a filter like "WHERE x", x being a boolean column) and some
will have multiple sets, if there are ANDs and ORs in there.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-09 Thread Euler Taveira de Oliveira

Takahiro Itagaki escreveu:
> Sure, I should have merge all of the comments. Patch attached.
> 
Thanks for your effort. Looks sane to me.

> - Updated the output format as follows. I think this format is the most
>   similar to existing lines. ("actual" by ANALYZE and "Filter:").
> 
If people object to it, we can always change it later.

> IMHO, we could remove those options completely because we can use
> EXPLAIN BUFFERS and DTrace probes instead of them.
> 
+1. But we need to propose some replacement options.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-08 Thread Takahiro Itagaki


Greg Smith  wrote:

> I just executed that.  Note that there are two bits of subjective 
> tweaking possible to do with this one when it's committed:  slimming 
> down the width of the display, and Euler's suggestion's for rewording.  
> I linked to both of those messages in the CF app, labeled as notes the 
> committer might want to consider, but that the patch hasn't been updated 
> to include yet.

Sure, I should have merge all of the comments. Patch attached.

- Updated documentation as Euler's suggestion, but I replaced
  the "It" of the second last sentence to "This parameter".
- Updated the output format as follows. I think this format is the most
  similar to existing lines. ("actual" by ANALYZE and "Filter:").

Note that the patch also removes buffer counters from log_statement_stats,
but we only have brief descriptions about the options. Our documentation
say nothing about buffer counters, so I didn't modify those lines in sgml.
http://developer.postgresql.org/pgdocs/postgres/runtime-config-statistics.html#RUNTIME-CONFIG-STATISTICS-MONITOR
IMHO, we could remove those options completely because we can use
EXPLAIN BUFFERS and DTrace probes instead of them.


=# EXPLAIN (BUFFERS, ANALYZE) SELECT *
  FROM pgbench_accounts a, pgbench_branches b
 WHERE a.bid = b.bid AND abalance > 0 ORDER BY abalance;
  QUERY PLAN
--
 Sort  (cost=2891.03..2891.04 rows=1 width=461) (actual time=22.494..22.494 
rows=0 loops=1)
   Sort Key: a.abalance
   Sort Method:  quicksort  Memory: 25kB
   Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 written=0) 
(temp read=0 written=0)
   ->  Nested Loop  (cost=0.00..2891.02 rows=1 width=461) (actual 
time=22.488..22.488 rows=0 loops=1)
 Join Filter: (a.bid = b.bid)
 Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 
written=0) (temp read=0 written=0)
 ->  Seq Scan on pgbench_accounts a  (cost=0.00..2890.00 rows=1 
width=97) (actual time=22.486..22.486 rows=0 loops=1)
   Filter: (abalance > 0)
   Blocks: (shared hit=96 read=1544 written=0) (local hit=0 read=0 
written=0) (temp read=0 written=0)
 ->  Seq Scan on pgbench_branches b  (cost=0.00..1.01 rows=1 width=364) 
(never executed)
   Blocks: (shared hit=0 read=0 written=0) (local hit=0 read=0 
written=0) (temp read=0 written=0)
 Total runtime: 22.546 ms
(13 rows)

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



explain_buffers_20091209.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-08 Thread Greg Smith


Euler Taveira de Oliveira wrote:

If there is no more objections, I'll flag the patch 'ready for committer'
  
I just executed that.  Note that there are two bits of subjective 
tweaking possible to do with this one when it's committed:  slimming 
down the width of the display, and Euler's suggestion's for rewording.  
I linked to both of those messages in the CF app, labeled as notes the 
committer might want to consider, but that the patch hasn't been updated 
to include yet.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-08 Thread Robert Haas


On Dec 8, 2009, at 12:05 AM, Greg Smith  wrote:


Robert Haas wrote:

I could live with the equals signs, but the use of parentheses seems
weird and inconsistent with normal english usage (which permits
parentheses as a means of making parenthetical comments).

But it is consistent with people seeing:

Seq Scan on foo (cost=0.00..155.00 rows=1 width=4)

Which seems to be what was being emulated here.  I though that was  
pretty reasonable given this is a related feature.


It's not the same at all. The essence of a parenthetical phrase is  
that it can be omitted without turning what's left into nonsense - and  
in fact we have COSTS OFF, which does just that. Omitting only the  
parenthesized portions of the proposed output would not be sensible.


...Robert 


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Euler Taveira de Oliveira

Itagaki Takahiro escreveu:
> Here is an updated patch per discussion.
> 
>   * Counters are accumulative. They contain I/Os by child nodes.
>   * Text format shows all counters.
>   * Add "shared_" prefix to variables representing shared buffers/blocks.
> 
Nice. Despite of the other opinions, I'm satisfied with your text format
sentence. It is: (i) clear (I don't have to think or check the docs to know
what information is that.) and (ii) not so verbose (There are nodes that are
even longer than that.).

The only thing that needs some fix is:

+ Include information on the buffers. Specifically, include the number of
+ hits/reads/writes of shared blocks and local blocks, and number of reads
+ and writes of temp blocks. Shared blocks contain global tables and
+ indexes, local blocks contain temporary tables and indexes, and temp
+ blocks contain disk blocks used in sort and materialized plans.
+ The value of a parent node contains values of its children.
+ This parameter should be used with ANALYZE parameter.
+ This parameter defaults to FALSE.

That could be:

Include information on the buffers. Specifically, include the number of shared
blocks hits, reads, and writes, the number of local blocks hits, reads, and
writes, and the number of temp blocks reads and writes. Shared blocks, local
blocks, and temp blocks contain tables and indexes, temporary tables and
temporary indexes, and disk blocks used in sort and materialized plans,
respectively. The number of blocks of an upper-level node includes the blocks
of all its child nodes. It should be used with ANALYZE
parameter. This parameter defaults to FALSE.

If there is no more objections, I'll flag the patch 'ready for committer' (you 
:).


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Greg Smith


Robert Haas wrote:

I could live with the equals signs, but the use of parentheses seems
weird and inconsistent with normal english usage (which permits
parentheses as a means of making parenthetical comments).

But it is consistent with people seeing:

Seq Scan on foo (cost=0.00..155.00 rows=1 width=4)

Which seems to be what was being emulated here.  I though that was 
pretty reasonable given this is a related feature.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Takahiro Itagaki


Robert Haas  wrote:

> On Mon, Dec 7, 2009 at 11:09 PM, Greg Smith  wrote:
> > (1) Blocks Shared: (hit=2 read=1641 written=0) Local: (hit=0 read=0
> > written=0) Temp: (read=1443 written=1443)

> I could live with the equals signs, but the use of parentheses seems
> weird and inconsistent with normal english usage (which permits
> parentheses as a means of making parenthetical comments).  (You can
> also put an entire sentence in parentheses.)  But you can't: (do
> this).

+1 for (1) personally, if we could think it is not in English but just
a symbol. I have another idea to make it alike with ANALYZE output.

(4) Blocks (shared hit=2 read=1641 ...) (local hit=0 ...) (temp read=0 ...)

Which is the best? I think it's a matter of preference.
  (0) 109 characters - Shared Blocks: hit 2 read 1641 wrote 0, ...
  (1) 105 characters - Blocks Shared: (hit=2 ...
  (2)  96 characters - Blocks Shared:hit=2 ...
  (3)  82 characters - Blocks Shared:(hit=2 ...
  (4)  82 characters - Blocks (shared hit=2 ...


BTW, I found text is a *bad* format because it requires much discussion ;)
We have a little choice in XML, JSON and YAML formats.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Robert Haas

On Mon, Dec 7, 2009 at 11:09 PM, Greg Smith  wrote:
> Robert Haas wrote:
>
> On Mon, Dec 7, 2009 at 9:58 PM, Itagaki Takahiro
>  wrote:
>
>
> Obviously I should not hide any information only in the text format.
> The new output will be: (in one line)
>  Shared Blocks: (hit=2 read=1641 written=0) Local Blocks: (hit=0 read=0
> written=0) Temp Blocks: (read=1443 written=1443)
>
>
> Hmm, that's a little awkward.  I think we could drop some of the
> punctuation.
>
> Shared Blocks: hit 2 read 1641 wrote 0, Local Blocks: hit 0 read 0
> wrote 0, Temp Blocks: read 1443 wrote 1443
>
>
> Having written more things to parse log files that should have been saved in
> a better format after the fact than I'd like, I'd say that replacing the "="
> signs used as a delimiter with a space is a step backwards.  That doesn't
> save any space anyway, and drifts too far from what one gets out of regular
> EXPLAIN I think.
>
> I was perfectly happy with proposed text format as being a reasonable
> trade-off between *possible* to parse if all you have is the text format,
> while still being readable.  If you want to compress horizontally, get rid
> of "Blocks" after the first usage is the first thing to do:
>
> (1) Blocks Shared: (hit=2 read=1641 written=0) Local: (hit=0 read=0
> written=0) Temp: (read=1443 written=1443)
>
> That's already at about the same length as what you suggested at 105
> characters, without losing any useful formatting.
>
> If further compression is required, you could just remove all the
> parentheses:
>
> (2) Blocks Shared:hit=2 read=1641 written=0 Local:hit=0 read=0 written=0
> Temp:read=1443 written=1443
>
> I don't really like this though.  Instead you could abbreviate the rest of
> the repeated text and reduce the number of spaces:
>
> (3) Blocks Shared:(hit=2 read=1641 written=0) Local:(h=0 r=0 w=0)
> Temp:(r=1443 w=1443)
>
> And now we're at the smallest result yet without any real loss in
> readability--I'd argue it's faster to read in fact.  This has a good balance
> of fitting on a reasonably wide console (the above is down to 82
> characters), still being readable to anyone, and being possible to machine
> parse in a pinch if all you have are text logs around (split on : then =).
> It might be too compressed down for some tastes though.

I could live with the equals signs, but the use of parentheses seems
weird and inconsistent with normal english usage (which permits
parentheses as a means of making parenthetical comments).  (You can
also put an entire sentence in parentheses.)  But you can't: (do
this).

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Greg Smith


Robert Haas wrote:

On Mon, Dec 7, 2009 at 9:58 PM, Itagaki Takahiro
 wrote:
  

Obviously I should not hide any information only in the text format.
The new output will be: (in one line)
 Shared Blocks: (hit=2 read=1641 written=0) Local Blocks: (hit=0 read=0 
written=0) Temp Blocks: (read=1443 written=1443)



Hmm, that's a little awkward.  I think we could drop some of the punctuation.

Shared Blocks: hit 2 read 1641 wrote 0, Local Blocks: hit 0 read 0
wrote 0, Temp Blocks: read 1443 wrote 1443
  
Having written more things to parse log files that should have been 
saved in a better format after the fact than I'd like, I'd say that 
replacing the "=" signs used as a delimiter with a space is a step 
backwards.  That doesn't save any space anyway, and drifts too far from 
what one gets out of regular EXPLAIN I think.


I was perfectly happy with proposed text format as being a reasonable 
trade-off between *possible* to parse if all you have is the text 
format, while still being readable.  If you want to compress 
horizontally, get rid of "Blocks" after the first usage is the first 
thing to do:


(1) Blocks Shared: (hit=2 read=1641 written=0) Local: (hit=0 read=0 
written=0) Temp: (read=1443 written=1443)


That's already at about the same length as what you suggested at 105 
characters, without losing any useful formatting. 

If further compression is required, you could just remove all the 
parentheses:


(2) Blocks Shared:hit=2 read=1641 written=0 Local:hit=0 read=0 written=0 
Temp:read=1443 written=1443


I don't really like this though.  Instead you could abbreviate the rest 
of the repeated text and reduce the number of spaces:


(3) Blocks Shared:(hit=2 read=1641 written=0) Local:(h=0 r=0 w=0) 
Temp:(r=1443 w=1443)


And now we're at the smallest result yet without any real loss in 
readability--I'd argue it's faster to read in fact.  This has a good 
balance of fitting on a reasonably wide console (the above is down to 82 
characters), still being readable to anyone, and being possible to 
machine parse in a pinch if all you have are text logs around (split on 
: then =).  It might be too compressed down for some tastes though.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Robert Haas

On Mon, Dec 7, 2009 at 9:58 PM, Itagaki Takahiro
 wrote:
> Here is an updated patch per discussion.
>
>  * Counters are accumulative. They contain I/Os by child nodes.
>  * Text format shows all counters.
>  * Add "shared_" prefix to variables representing shared buffers/blocks.
>
> Euler Taveira de Oliveira  wrote:
>
>> Itagaki Takahiro escreveu:
>> > I think the current output is enough and useful in normal use.
>> > We can use XML or JSON format for more details.
>> >
>> I don't think it is a good idea to have different information in different
>> formats. I'm with Robert; *don't* do that.
>
> I'm afraid of the human-unreadability of the text format, that is discussed
> in the YAML format thread. ...but I found we say the following in the docs.
>
>  XML or JSON output contains the same information as the text output format
>  http://developer.postgresql.org/pgdocs/postgres/sql-explain.html
>
> Obviously I should not hide any information only in the text format.
> The new output will be: (in one line)
>  Shared Blocks: (hit=2 read=1641 written=0) Local Blocks: (hit=0 read=0 
> written=0) Temp Blocks: (read=1443 written=1443)

Hmm, that's a little awkward.  I think we could drop some of the punctuation.

Shared Blocks: hit 2 read 1641 wrote 0, Local Blocks: hit 0 read 0
wrote 0, Temp Blocks: read 1443 wrote 1443

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Itagaki Takahiro

Here is an updated patch per discussion.

  * Counters are accumulative. They contain I/Os by child nodes.
  * Text format shows all counters.
  * Add "shared_" prefix to variables representing shared buffers/blocks.

Euler Taveira de Oliveira  wrote:

> Itagaki Takahiro escreveu:
> > I think the current output is enough and useful in normal use.
> > We can use XML or JSON format for more details.
> > 
> I don't think it is a good idea to have different information in different
> formats. I'm with Robert; *don't* do that.

I'm afraid of the human-unreadability of the text format, that is discussed
in the YAML format thread. ...but I found we say the following in the docs.

  XML or JSON output contains the same information as the text output format
  http://developer.postgresql.org/pgdocs/postgres/sql-explain.html

Obviously I should not hide any information only in the text format.
The new output will be: (in one line)
  Shared Blocks: (hit=2 read=1641 written=0) Local Blocks: (hit=0 read=0 
written=0) Temp Blocks: (read=1443 written=1443)

> There are nodes that don't read or write blocks.

This will be impossible now because we re-defined the meaning of counters
as "accumulated number of I/O". Even if the node never read or write blocks,
it might contain some child nodes with I/O.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

explain_buffers_20091208.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Euler Taveira de Oliveira

Itagaki Takahiro escreveu:
> I think the current output is enough and useful in normal use.
> We can use XML or JSON format for more details.
> 
I don't think it is a good idea to have different information in different
formats. I'm with Robert; *don't* do that. If you want to suppress the other
ones in text format, do it in the others too. One idea is to add them only if
we prefer the VERBOSE output.

> I think
> Blocks Hit: 1641  Read: 0  Temp Read: 1443
> means
> Blocks (Hit: 1641  Read: 0  Temp Read: 1443)
> i.e., Blocks of hit, blocks of reads, and Blocks of temp reads.
> 
But the latter is more clear than the former.

> I cannot understand what you mean -- should I suppress the lines when they
> have all-zero values?
> 
There are nodes that don't read or write blocks. If we go this way, we need to
document this behavior.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-07 Thread Robert Haas

On Mon, Dec 7, 2009 at 1:28 AM, Itagaki Takahiro
 wrote:
>> (ii) format: why does text output format have less counters than the other 
>> ones?
>
> That's because lines will be too long for text format. I think the
> three values in it are the most important and useful ones.

I disagree.  I objected to showing only part of the information when I
looked at this patch last CommitFest, and I object again now.  I do
*NOT* want to have to use JSON or XML to get at the counters you've
arbitrarily decided are not interesting to me.  I think with a little
creativity we can certainly get these into the text format, and I'm
willing to help with that.  In fact, if you want, I'll pick up this
patch and make it my first commit, though since you're now a committer
as well perhaps you'd prefer to do it yourself.

>> (v) accumulative: i don't remember if we discussed it but is there a reason
>> the number of buffers isn't accumulative? We already have cost and time that
>> are both accumulative. I saw BufferUsageAccumDiff() function but didn't try 
>> to
>> figure out why it isn't accumulating or passing the counters to parent nodes.
>
> It's reasonable. I'll change so if no objections.

+1 to change it.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-06 Thread Itagaki Takahiro

Euler Taveira de Oliveira  wrote:

> I'm looking at your patch now... It is almost there but has some issues.
> 
> (i) documentation: you have more than three counters and they could be
> mentioned in docs too.

I'll add documentation for all variables.

> (ii) format: why does text output format have less counters than the other 
> ones?

That's because lines will be too long for text format. I think the
three values in it are the most important and useful ones.

> (iii) string: i don't like the string in text format
> (1) it is not concise (only the first item has the word 'Blocks'),
> (2) what block is it about? Shared, Local, or Temp?

The format was suggested here and no objections then.
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00268.php
I think the current output is enough and useful in normal use.
We can use XML or JSON format for more details.

I think
Blocks Hit: 1641  Read: 0  Temp Read: 1443
means
Blocks (Hit: 1641  Read: 0  Temp Read: 1443)
i.e., Blocks of hit, blocks of reads, and Blocks of temp reads.

> (3) why don't you include the other ones?, and
> (4) why don't you include the written counters?
> (iv) text format: i don't have a good suggestion but here are some ideas. The
> former is too long and the latter is too verbose.

Their reasons are the same as (ii).

> (v) accumulative: i don't remember if we discussed it but is there a reason
> the number of buffers isn't accumulative? We already have cost and time that
> are both accumulative. I saw BufferUsageAccumDiff() function but didn't try to
> figure out why it isn't accumulating or passing the counters to parent nodes.

It's reasonable. I'll change so if no objections.

> (vi) comment: the 'at start' is superfluous. Please, remove it.

Ooops, I'll remove them.

> (vii) all nodes: I'm thinking if we need this information in all nodes (even
> in those nodes that don't read or write). It would be less verbose but it
> could complicate some parser's life. Of course, if we suppress this
> information, we need to include it on the top node even if we don't read or
> write in it.

I cannot understand what you mean -- should I suppress the lines when they
have all-zero values?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-12-06 Thread Euler Taveira de Oliveira

Itagaki Takahiro escreveu:
> The attached patch is rebased to current CVS.
> 
I'm looking at your patch now... It is almost there but has some issues.

(i) documentation: you have more than three counters and they could be
mentioned in docs too.

+Include information on the buffers. Specifically, include the number of
+buffer hits, number of disk blocks read, and number of local buffer read.


(ii) format: why does text output format have less counters than the other ones?

+   if (es->format == EXPLAIN_FORMAT_TEXT)
+   {
+   appendStringInfoSpaces(es->str, es->indent * 2);
+   appendStringInfo(es->str, "Blocks Hit: %ld  Read: %ld  Temp Read:
%ld\n",
+   usage->blks_hit, usage->blks_read, usage->temp_blks_read);
+   }
+   else
+   {
+   ExplainPropertyLong("Hit Blocks", usage->blks_hit, es);
+   ExplainPropertyLong("Read Blocks", usage->blks_read, es);
+   ExplainPropertyLong("Written Blocks", usage->blks_written, es);
+   ExplainPropertyLong("Local Hit Blocks", usage->local_blks_hit, es);
+   ExplainPropertyLong("Local Read Blocks", usage->local_blks_read, 
es);
+   ExplainPropertyLong("Local Written Blocks",
usage->local_blks_written, es);
+   ExplainPropertyLong("Temp Read Blocks", usage->temp_blks_read, es);
+   ExplainPropertyLong("Temp Written Blocks",
usage->temp_blks_written, es);
+   }

(iii) string: i don't like the string in text format because (1) it is not
concise (only the first item has the word 'Blocks'), (2) what block is it
about? Shared, Local, or Temp?, (3) why don't you include the other ones?, and
(4) why don't you include the written counters?

 ->  Seq Scan on pg_namespace nc  (cost=0.00..1.07 rows=4 width=68) (actual
time=0.015..0.165 rows=4 loops=1)
   Filter: (NOT pg_is_other_temp_schema(oid))
   Blocks Hit: 11  Read: 0  Temp Read: 0

(iv) text format: i don't have a good suggestion but here are some ideas. The
former is too long and the latter is too verbose. :( Another option is to
suppress words hit, read, and written; and just document it.

Shared Blocks (11 hit, 5 read, 0 written); Local Blocks (5 hit, 0 read, 0
written); Temp Blocks (0 read, 0 written)

or

Shared Blocks: 11 hit, 5 read, 0 written
Local Blocks: 5 hit, 0 read, 0 written
Temp Blocks: 0 read, 0 written

(v) accumulative: i don't remember if we discussed it but is there a reason
the number of buffers isn't accumulative? We already have cost and time that
are both accumulative. I saw BufferUsageAccumDiff() function but didn't try to
figure out why it isn't accumulating or passing the counters to parent nodes.

euler=# explain (analyze true, buffers true) select * from pgbench_branches
inner join pgbench_history using (bid) where bid > 100;
   QUERY PLAN


 Hash Join  (cost=1.02..18.62 rows=3 width=476) (actual time=0.136..0.136
rows=0 loops=1)
   Hash Cond: (pgbench_history.bid = pgbench_branches.bid)
   Blocks Hit: 2  Read: 0  Temp Read: 0
   ->  Seq Scan on pgbench_history  (cost=0.00..15.50 rows=550 width=116)
(actual time=0.034..0.034 rows=1 loops=1)
 Blocks Hit: 1  Read: 0  Temp Read: 0
   ->  Hash  (cost=1.01..1.01 rows=1 width=364) (actual time=0.022..0.022
rows=0 loops=1)
 Blocks Hit: 0  Read: 0  Temp Read: 0
 ->  Seq Scan on pgbench_branches  (cost=0.00..1.01 rows=1 width=364)
(actual time=0.019..0.019 rows=0 loops=1)
   Filter: (bid > 100)
   Blocks Hit: 1  Read: 0  Temp Read: 0
 Total runtime: 0.531 ms
(11 rows)

(vi) comment: the 'at start' is superfluous. Please, remove it.

+   longblks_hit;   /* # of buffer hits at start */
+   longblks_read;  /* # of disk blocks read at start */

(vii) all nodes: I'm thinking if we need this information in all nodes (even
in those nodes that don't read or write). It would be less verbose but it
could complicate some parser's life. Of course, if we suppress this
information, we need to include it on the top node even if we don't read or
write in it.

I didn't have time to adjust your patch per comments above but if you can
address all of those issues I certainly could check your patch again.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-11-23 Thread Itagaki Takahiro


Jeff Janes  wrote:

> Just a quick note:  this patch does not apply cleanly to HEAD due to
> the subsequent removal from explain.c of the near-by lines:

Thanks for reporting.
The attached patch is rebased to current CVS.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



explain_buffers_20091124.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-11-23 Thread Jeff Janes

On Thu, Oct 15, 2009 at 3:29 AM, Itagaki Takahiro
 wrote:
>
> Robert Haas  wrote:
>
>> In this case, I think that the auto_explain changes out to be part of
>> the same patch as the core EXPLAIN changes
>
> Here is a rewritten patch to add EXPLAIN (ANALYZE, BUFFERS) and
> support for it by contrib/auto_explain. I removed pg_stat_statements
> support from the patch for now.

Just a quick note:  this patch does not apply cleanly to HEAD due to
the subsequent removal from explain.c of the near-by lines:

/* Convert parameter type data to the form parser wants */
getParamListTypes(params, ¶m_types, &num_params);

I think it is merely a text conflict and not a functional one.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-10-15 Thread Robert Haas

On Thu, Oct 15, 2009 at 11:06 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Oct 15, 2009 at 7:29 AM, Itagaki Takahiro
>>  wrote:
>>> EXPLAIN BUFFERS only shows 'hit', 'read' and 'temp read' in text format
>>> to fit in display, but xml or json format contains all of them.
>
>> I was very careful when I submitted the machine-readable explain patch
>> to make sure that the choice of which information was displayed was
>> independent of the format, and I think that we should stick with that.
>
> I thought one of the main purposes of adding the machine-readable
> formats was to allow inclusion of information that we thought too
> verbose for the human-readable format.  Whether this info falls into
> that category remains to be seen, but I don't agree with the premise
> that the information content must always be exactly the same.

Hmm.  I thought that the purpose of having a generalized options
syntax was that people could have the information they wanted,
independently of the format they chose.  Even with a lot of extra
information, the human readable format is still far shorter and more
easily readable than either of the others.  If we had gone with the
idea of just dumping everything in the world into the XML format,
you'd be right: but for various reasons we decided against that, which
I'm very happy about.

> FWIW, the patch's output as it stood a few days ago (one extra line per
> node, conditional on a BUFFERS option) did seem perfectly reasonable to
> me, and I don't see the reason to change that format now.

Yep, agreed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-10-15 Thread Tom Lane

Robert Haas  writes:
> On Thu, Oct 15, 2009 at 7:29 AM, Itagaki Takahiro
>  wrote:
>> EXPLAIN BUFFERS only shows 'hit', 'read' and 'temp read' in text format
>> to fit in display, but xml or json format contains all of them.

> I was very careful when I submitted the machine-readable explain patch
> to make sure that the choice of which information was displayed was
> independent of the format, and I think that we should stick with that.

I thought one of the main purposes of adding the machine-readable
formats was to allow inclusion of information that we thought too
verbose for the human-readable format.  Whether this info falls into
that category remains to be seen, but I don't agree with the premise
that the information content must always be exactly the same.

FWIW, the patch's output as it stood a few days ago (one extra line per
node, conditional on a BUFFERS option) did seem perfectly reasonable to
me, and I don't see the reason to change that format now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] EXPLAIN BUFFERS

2009-10-15 Thread Robert Haas

On Thu, Oct 15, 2009 at 7:29 AM, Itagaki Takahiro
 wrote:
> EXPLAIN BUFFERS only shows 'hit', 'read' and 'temp read' in text format
> to fit in display, but xml or json format contains all of them.

I was very careful when I submitted the machine-readable explain patch
to make sure that the choice of which information was displayed was
independent of the format, and I think that we should stick with that.
 If you want we could have 'buffers terse' and 'buffers detail' but I
don't think we should force JSON/XML on people who want to see that
information.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] EXPLAIN BUFFERS

2009-10-15 Thread Itagaki Takahiro

Robert Haas  wrote:

> In this case, I think that the auto_explain changes out to be part of
> the same patch as the core EXPLAIN changes

Here is a rewritten patch to add EXPLAIN (ANALYZE, BUFFERS) and
support for it by contrib/auto_explain. I removed pg_stat_statements
support from the patch for now.

I modifed heavily in buffer statistics conters; These counters are
put all together into struct BufferUsage. The struct is also used in
struct Instrumentation. The global buffer usage counters are saved
into 'bufusage_start' field at the InstrStartNode(), and accumulated
into 'bufusage' field and global counters are reset at InstrStopNode().

EXPLAIN BUFFERS only shows 'hit', 'read' and 'temp read' in text format
to fit in display, but xml or json format contains all of them.

I removed ShowBufferUsage() because we can retrieve the same information
from xml or json explain output, but the patch does not drop
log_statement_stats variable families nor ShowUsage() functions.
We could also remove all of them if no one use them at all.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

explain_buffers_20091015.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

62 matches

Mail list logo