Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Fujii Masao




On 2021/07/26 12:32, Julien Rouhaud wrote:

On Sun, Jul 25, 2021 at 11:26:02PM -0400, Tom Lane wrote:

Julien Rouhaud  writes:

I attach a patch that splits the test and add a comment explaining the
boundaries for the new query.
Checked with and without forced invalidations.


Pushed with a little cosmetic fooling-about.


Thanks!


Thanks a lot!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Julien Rouhaud
On Sun, Jul 25, 2021 at 11:26:02PM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > I attach a patch that splits the test and add a comment explaining the
> > boundaries for the new query.
> > Checked with and without forced invalidations.
> 
> Pushed with a little cosmetic fooling-about.

Thanks!




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Tom Lane
Julien Rouhaud  writes:
> I attach a patch that splits the test and add a comment explaining the
> boundaries for the new query.
> Checked with and without forced invalidations.

Pushed with a little cosmetic fooling-about.

regards, tom lane




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Julien Rouhaud
On Mon, Jul 26, 2021 at 01:08:08AM +0800, Julien Rouhaud wrote:
> Le lun. 26 juil. 2021 à 00:59, Tom Lane  a écrit :
> 
> > Julien Rouhaud  writes:
> > > On Sun, Jul 25, 2021 at 12:03:25PM -0400, Tom Lane wrote:
> >
> 
> > > Would it be worth to split the query for the prepared statement row vs
> > the rest
> > > to keep the full "plans" coverage when possible?
> >
> > +1, the same thought occurred to me later.  Also, if we're making
> > it specific to the one PREPARE example, we could get away with
> > checking "plans >= 2 AND plans <= calls", with a comment like
> > "we expect at least one replan event, but there could be more".
> 
> 
> > Do you want to prepare a patch?
> >
> 
> Sure, I will work on that tomorrow!

I attach a patch that splits the test and add a comment explaining the
boundaries for the new query.

Checked with and without forced invalidations.
>From 225632c54ff734f7f2b5a2b0d45127e425327973 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 26 Jul 2021 09:23:57 +0800
Subject: [PATCH v1] Make pg_stat_statements tests immune to prepared
 statements invalidation.

The tests for the number of planifications have been unstable since their
introduction, but got unnoticed until now as buildfarm animals don't run them
for now.

Discussion: https://postgr.es/m/42557.1627229...@sss.pgh.pa.us
---
 .../expected/pg_stat_statements.out   | 27 ---
 .../sql/pg_stat_statements.sql|  5 +++-
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 40b5109b55..ea2f8cf77f 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -850,16 +850,23 @@ SELECT 42;
42
 (1 row)
 
-SELECT query, plans, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-query| plans | calls | rows 
--+---+---+--
- ALTER TABLE test ADD COLUMN x int   | 0 | 1 |0
- CREATE TABLE test ()| 0 | 1 |0
- PREPARE prep1 AS SELECT COUNT(*) FROM test  | 2 | 4 |4
- SELECT $1   | 3 | 3 |3
- SELECT pg_stat_statements_reset()   | 0 | 1 |1
- SELECT query, plans, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" | 1 | 0 |0
-(6 rows)
+SELECT query, plans, calls, rows FROM pg_stat_statements WHERE query NOT LIKE 'PREPARE%' ORDER BY query COLLATE "C";
+query| plans | calls | rows 
+-+---+---+--
+ ALTER TABLE test ADD COLUMN x int   | 0 | 1 |0
+ CREATE TABLE test ()| 0 | 1 |0
+ SELECT $1   | 3 | 3 |3
+ SELECT pg_stat_statements_reset()   | 0 | 1 |1
+ SELECT query, plans, calls, rows FROM pg_stat_statements WHERE query NOT LIKE $1 ORDER BY query COLLATE "C" | 1 | 0 |0
+(5 rows)
+
+-- for the prepared statemennt we expect at least one replan, but cache
+-- invalidations could force more
+SELECT query, plans >=2 AND plans<=calls AS plans_ok, rows FROM pg_stat_statements WHERE query LIKE 'PREPARE%' ORDER BY query COLLATE "C";
+   query| plans_ok | rows 
++--+--
+ PREPARE prep1 AS SELECT COUNT(*) FROM test | t|4
+(1 row)
 
 --
 -- access to pg_stat_statements_info view
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index bc3b6493e6..3606a5f581 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -356,7 +356,10 @@ EXECUTE prep1;
 SELECT 42;
 SELECT 42;
 SELECT 42;
-SELECT query, plans, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT query, plans, calls, rows FROM pg_stat_statements WHERE query NOT LIKE 'PREPARE%' ORDER BY query COLLATE "C";
+-- for the prepared statemennt we expect at 

Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Tom Lane
Andrew Dunstan  writes:
> On 7/25/21 12:03 PM, Tom Lane wrote:
>> So AFAICS this test is inherently unstable and there is no code bug
>> to be fixed.  We could drop the "plans" column from this query, or
>> print something approximate like "plans > 0 AND plans <= calls".
>> Thoughts?

> Is that likely to tell us anything very useful?

The variant suggested downthread ("plans >= 2 AND plans <= calls" for the
PREPARE entry only) seems like it's still reasonably useful.  At least it
can verify that a replan has occurred and been counted.

regards, tom lane




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Andrew Dunstan


On 7/25/21 12:03 PM, Tom Lane wrote:
>
> So AFAICS this test is inherently unstable and there is no code bug
> to be fixed.  We could drop the "plans" column from this query, or
> print something approximate like "plans > 0 AND plans <= calls".
> Thoughts?
>

Is that likely to tell us anything very useful? I suppose it's really
just a check against insane values. Since the test is unstable it's hard
to do more than that.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Julien Rouhaud
Le lun. 26 juil. 2021 à 00:59, Tom Lane  a écrit :

> Julien Rouhaud  writes:
> > On Sun, Jul 25, 2021 at 12:03:25PM -0400, Tom Lane wrote:
>

> > Would it be worth to split the query for the prepared statement row vs
> the rest
> > to keep the full "plans" coverage when possible?
>
> +1, the same thought occurred to me later.  Also, if we're making
> it specific to the one PREPARE example, we could get away with
> checking "plans >= 2 AND plans <= calls", with a comment like
> "we expect at least one replan event, but there could be more".


> Do you want to prepare a patch?
>

Sure, I will work on that tomorrow!

>


Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Tom Lane
Julien Rouhaud  writes:
> On Sun, Jul 25, 2021 at 12:03:25PM -0400, Tom Lane wrote:
>> So AFAICS this test is inherently unstable and there is no code bug
>> to be fixed.  We could drop the "plans" column from this query, or
>> print something approximate like "plans > 0 AND plans <= calls".
>> Thoughts?

> I think we should go with the latter.  Checking for a legit value, even if 
> it's
> a bit imprecise is still better than nothing.

> Would it be worth to split the query for the prepared statement row vs the 
> rest
> to keep the full "plans" coverage when possible?

+1, the same thought occurred to me later.  Also, if we're making
it specific to the one PREPARE example, we could get away with
checking "plans >= 2 AND plans <= calls", with a comment like
"we expect at least one replan event, but there could be more".

Do you want to prepare a patch?

regards, tom lane




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Julien Rouhaud
On Sun, Jul 25, 2021 at 12:03:25PM -0400, Tom Lane wrote:

> The cause of the failure of course is that cache clobbering includes
> plan cache clobbering, so that the prepared statement's plan is
> remade each time it's used, not only twice as the test expects.
> However, remembering that cache flushes can happen for other reasons,
> it's my guess that this test case would prove unstable in the buildfarm
> even without considering the CLOBBER_CACHE_ALWAYS members.  For example,
> a background autovacuum hitting the "test" table at just the right time
> would result in extra planning.  We haven't seen that because the
> buildfarm's not running this test, but that's about to change.

Indeed.

> So AFAICS this test is inherently unstable and there is no code bug
> to be fixed.  We could drop the "plans" column from this query, or
> print something approximate like "plans > 0 AND plans <= calls".
> Thoughts?

I think we should go with the latter.  Checking for a legit value, even if it's
a bit imprecise is still better than nothing.

Would it be worth to split the query for the prepared statement row vs the rest
to keep the full "plans" coverage when possible?




Re: Planning counters in pg_stat_statements (using pgss_store)

2021-07-25 Thread Tom Lane
[ blast from the past department ]

Fujii Masao  writes:
> Finally I pushed the patch!
> Many thanks for all involved in this patch!

It turns out that the regression test outputs from this patch are
unstable under debug_discard_caches (nee CLOBBER_CACHE_ALWAYS).
You can easily check this in HEAD or v14, with something along
the lines of

$ cd ~/pgsql/contrib/pg_stat_statements
$ echo "debug_discard_caches = 1" >/tmp/temp_config
$ TEMP_CONFIG=/tmp/temp_config make check

and what you will get is a diff like this:

 SELECT query, plans, calls, rows FROM pg_stat_statements ORDER BY query 
COLLATE "C";
query | plans | calls | rows 
...
- PREPARE prep1 AS SELECT COUNT(*) FROM test  | 2 | 4 |4
+ PREPARE prep1 AS SELECT COUNT(*) FROM test  | 4 | 4 |4

The reason we didn't detect this long since is that the buildfarm
client script fails to run "make check" for contrib modules that
are marked NO_INSTALLCHECK, so that pg_stat_statements (among
others) has received precisely zero buildfarm testing.  Buildfarm
member sifaka is running an unreleased version of the script that
fixes that oversight, and when I experimented with turning on
debug_discard_caches, I got this failure, as shown at [1].

The cause of the failure of course is that cache clobbering includes
plan cache clobbering, so that the prepared statement's plan is
remade each time it's used, not only twice as the test expects.
However, remembering that cache flushes can happen for other reasons,
it's my guess that this test case would prove unstable in the buildfarm
even without considering the CLOBBER_CACHE_ALWAYS members.  For example,
a background autovacuum hitting the "test" table at just the right time
would result in extra planning.  We haven't seen that because the
buildfarm's not running this test, but that's about to change.

So AFAICS this test is inherently unstable and there is no code bug
to be fixed.  We could drop the "plans" column from this query, or
print something approximate like "plans > 0 AND plans <= calls".
Thoughts?

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka=2021-07-24%2023%3A53%3A52




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-26 Thread Andy Fan
On Fri, May 22, 2020 at 9:51 PM Fujii Masao 
wrote:

>
>
> On 2020/05/22 15:10, Andy Fan wrote:
> >
> >
> > On Thu, May 21, 2020 at 3:49 PM Julien Rouhaud  > wrote:
> >
> > Le jeu. 21 mai 2020 à 09:17, Michael Paquier  > a écrit :
> >
> > On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
> >  > On Tue, May 19, 2020 at 4:29 AM Andy Fan <
> zhihui.fan1...@gmail.com > wrote:
> >  >> Thanks for the excellent extension. I want to add 5 more
> fields to satisfy the
> >  >> following requirements.
> >  >>
> >  >> int   subplan; /* No. of subplan in this query */
> >  >> int   subquery; /* No. of subquery */
> >  >> int   joincnt; /* How many relations are joined */
> >  >> bool hasagg; /* if we have agg function in this query */
> >  >> bool hasgroup; /* has group clause */
> >  >
> >  > Most of those fields can be computed using the raw sql
> satements.  Why
> >  > not adding functions like query_has_agg(querytext) to get the
> >  > information from pgss stored query text instead?
> >
> > Yeah I personally find concepts related only to the query string
> > itself not something that needs to be tied to pg_stat_statements.
> > ...
> >
> >
> > Indeed cte will bring additional concerns about the fields
> semantics. That's another good reason to go with external functions so you
> can add extra parameters for that if needed.
> >
> >
> > There are something more we can't get from query string easily. like:
> > 1. view involved.   2. subquery are pulled up so there is not subquery
> > indeed.  3. sublink are pull-up or become as an InitPlan rather than
> subPlan.
> > 4. joins are removed by remove_useless_joins.
>
> If we can store the plan for each statement, e.g., like pg_store_plans
> extension [1] does, rather than such partial information, which would
> be enough for your cases?
>
> That would be helpful if I can search the interested data from it. Oracle
has
v$sql_plan,  where every node in the plan has its own record, so it is easy
to search.

-- 
Best Regards
Andy Fan


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-23 Thread Julien Rouhaud
On Fri, May 22, 2020 at 9:27 PM legrand legrand
 wrote:
>
> >> If we can store the plan for each statement, e.g., like pg_store_plans
> >> extension [1] does, rather than such partial information, which would
> >> be enough for your cases?
>
> > That'd definitely address way more use cases.  Do you know if some
> > benchmark were done to see how much overhead such an extension adds?
>
> Hi Julien,
> Did you asked about how overhead Auto Explain adds ?

Well, yes but on the other hand auto_explain is by design definitely
not something intended to trace all queries in an OLTP environment,
but rather configured to catch only some long running queries, so in
such cases the overhead is quite negligible.

> The only extension that was proposing to store plans with a decent planid
> calculation was pg_stat_plans that is not compatible any more with recent
> pg versions for years.

Ah I see.  AFAICT it's mainly missing the new node changes, but the
approach should otherwise still work smoothly.

Did you do some benchmark to compare this extension with the other
alternatives? Assuming that there's postgres version compatible with
all the extensions of course.

> We all know here that pg_store_plans, pg_show_plans, (my) pg_stat_sql_plans
> use ExplainPrintPlan through Executor Hook, and that Explain is slow ...
>
> Explain is slow because it was not designed for performances:
> 1/ colname_is_unique
> see
> https://www.postgresql-archive.org/Re-Explain-is-slow-with-tables-having-many-columns-td6047284.html
>
> 2/ hash_create from set_rtable_names
> Look with perf top about
>do $$ declare i int; begin for i in 1..100 loop execute 'explain
> select 1'; end loop end; $$;
>
> I may propose a "minimal" explain that only display explain's backbone and
> is much faster
> see
> https://github.com/legrandlegrand/pg_stat_sql_plans/blob/perf-explain/pgssp_explain.c
>
> 3/ All those extensions rebuild the explain output even with cached plan
> queries ...
>  a way to optimize this would be to build a planid during planning (using
> associated hook)
>
> 4/ All thoses extensions try to rebuild the explain plan even for trivial
> queries/plans
> like "select 1" or " insert into t values (,,,)" and that's not great for
> high transactional
> applications ...
>
> So yes, pg_store_plans is one of the short term answers to Andy Fan needs,
> the answer for the long term would be to help extensions to build planid and
> store plans,
> by **adding a planid field in plannedstmt memory structure ** and/or
> optimizing explain command;o)

I'd be in favor of adding a planid and using the same approach as
pg_store_plans.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread legrand legrand
>> If we can store the plan for each statement, e.g., like pg_store_plans
>> extension [1] does, rather than such partial information, which would
>> be enough for your cases?

> That'd definitely address way more use cases.  Do you know if some
> benchmark were done to see how much overhead such an extension adds?

Hi Julien,
Did you asked about how overhead Auto Explain adds ?

The only extension that was proposing to store plans with a decent planid 
calculation was pg_stat_plans that is not compatible any more with recent 
pg versions for years.

We all know here that pg_store_plans, pg_show_plans, (my) pg_stat_sql_plans
use ExplainPrintPlan through Executor Hook, and that Explain is slow ...

Explain is slow because it was not designed for performances:
1/ colname_is_unique
see
https://www.postgresql-archive.org/Re-Explain-is-slow-with-tables-having-many-columns-td6047284.html

2/ hash_create from set_rtable_names
Look with perf top about
   do $$ declare i int; begin for i in 1..100 loop execute 'explain
select 1'; end loop end; $$;

I may propose a "minimal" explain that only display explain's backbone and
is much faster
see
https://github.com/legrandlegrand/pg_stat_sql_plans/blob/perf-explain/pgssp_explain.c
 
3/ All those extensions rebuild the explain output even with cached plan
queries ...
 a way to optimize this would be to build a planid during planning (using
associated hook)

4/ All thoses extensions try to rebuild the explain plan even for trivial
queries/plans 
like "select 1" or " insert into t values (,,,)" and that's not great for
high transactional 
applications ...

So yes, pg_store_plans is one of the short term answers to Andy Fan needs, 
the answer for the long term would be to help extensions to build planid and
store plans, 
by **adding a planid field in plannedstmt memory structure ** and/or 
optimizing explain command;o)

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread Julien Rouhaud
On Fri, May 22, 2020 at 3:51 PM Fujii Masao  wrote:
>
> On 2020/05/22 15:10, Andy Fan wrote:
> >
> >
> > On Thu, May 21, 2020 at 3:49 PM Julien Rouhaud  > > wrote:
> >
> > Le jeu. 21 mai 2020 à 09:17, Michael Paquier  > > a écrit :
> >
> > On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
> >  > On Tue, May 19, 2020 at 4:29 AM Andy Fan 
> > mailto:zhihui.fan1...@gmail.com>> wrote:
> >  >> Thanks for the excellent extension. I want to add 5 more fields 
> > to satisfy the
> >  >> following requirements.
> >  >>
> >  >> int   subplan; /* No. of subplan in this query */
> >  >> int   subquery; /* No. of subquery */
> >  >> int   joincnt; /* How many relations are joined */
> >  >> bool hasagg; /* if we have agg function in this query */
> >  >> bool hasgroup; /* has group clause */
> >  >
> >  > Most of those fields can be computed using the raw sql 
> > satements.  Why
> >  > not adding functions like query_has_agg(querytext) to get the
> >  > information from pgss stored query text instead?
> >
> > Yeah I personally find concepts related only to the query string
> > itself not something that needs to be tied to pg_stat_statements.
> > ...
> >
> >
> > Indeed cte will bring additional concerns about the fields semantics. 
> > That's another good reason to go with external functions so you can add 
> > extra parameters for that if needed.
> >
> >
> > There are something more we can't get from query string easily. like:
> > 1. view involved.   2. subquery are pulled up so there is not subquery
> > indeed.  3. sublink are pull-up or become as an InitPlan rather than 
> > subPlan.
> > 4. joins are removed by remove_useless_joins.
>
> If we can store the plan for each statement, e.g., like pg_store_plans
> extension [1] does, rather than such partial information, which would
> be enough for your cases?

That'd definitely address way more use cases.  Do you know if some
benchmark were done to see how much overhead such an extension adds?




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread Fujii Masao




On 2020/05/22 15:10, Andy Fan wrote:



On Thu, May 21, 2020 at 3:49 PM Julien Rouhaud mailto:rjuju...@gmail.com>> wrote:

Le jeu. 21 mai 2020 à 09:17, Michael Paquier mailto:mich...@paquier.xyz>> a écrit :

On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
 > On Tue, May 19, 2020 at 4:29 AM Andy Fan mailto:zhihui.fan1...@gmail.com>> wrote:
 >> Thanks for the excellent extension. I want to add 5 more fields to 
satisfy the
 >> following requirements.
 >>
 >> int   subplan; /* No. of subplan in this query */
 >> int   subquery; /* No. of subquery */
 >> int   joincnt; /* How many relations are joined */
 >> bool hasagg; /* if we have agg function in this query */
 >> bool hasgroup; /* has group clause */
 >
 > Most of those fields can be computed using the raw sql satements.  
Why
 > not adding functions like query_has_agg(querytext) to get the
 > information from pgss stored query text instead?

Yeah I personally find concepts related only to the query string
itself not something that needs to be tied to pg_stat_statements.
...


Indeed cte will bring additional concerns about the fields semantics. 
That's another good reason to go with external functions so you can add extra 
parameters for that if needed.


There are something more we can't get from query string easily. like:
1. view involved.   2. subquery are pulled up so there is not subquery
indeed.  3. sublink are pull-up or become as an InitPlan rather than subPlan.
4. joins are removed by remove_useless_joins.


If we can store the plan for each statement, e.g., like pg_store_plans
extension [1] does, rather than such partial information, which would
be enough for your cases?

Regards,

[1]
http://pgstoreplans.osdn.jp/pg_store_plans.html
https://github.com/ossc-db/pg_store_plans

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread Andy Fan
On Thu, May 21, 2020 at 3:49 PM Julien Rouhaud  wrote:

> Le jeu. 21 mai 2020 à 09:17, Michael Paquier  a
> écrit :
>
>> On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
>> > On Tue, May 19, 2020 at 4:29 AM Andy Fan 
>> wrote:
>> >> Thanks for the excellent extension. I want to add 5 more fields to
>> satisfy the
>> >> following requirements.
>> >>
>> >> int   subplan; /* No. of subplan in this query */
>> >> int   subquery; /* No. of subquery */
>> >> int   joincnt; /* How many relations are joined */
>> >> bool hasagg; /* if we have agg function in this query */
>> >> bool hasgroup; /* has group clause */
>> >
>> > Most of those fields can be computed using the raw sql satements.  Why
>> > not adding functions like query_has_agg(querytext) to get the
>> > information from pgss stored query text instead?
>>
>> Yeah I personally find concepts related only to the query string
>> itself not something that needs to be tied to pg_stat_statements.
>> ...
>>
>
> Indeed cte will bring additional concerns about the fields semantics.
> That's another good reason to go with external functions so you can add
> extra parameters for that if needed.
>
>>
There are something more we can't get from query string easily. like:
1. view involved.   2. subquery are pulled up so there is not subquery
indeed.  3. sublink are pull-up or become as an InitPlan rather than
subPlan.
4. joins are removed by remove_useless_joins.

-- 
Best Regards
Andy Fan


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-22 Thread Andy Fan
On Thu, May 21, 2020 at 3:17 PM Michael Paquier  wrote:

> On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
> > On Tue, May 19, 2020 at 4:29 AM Andy Fan 
> wrote:
> >> Thanks for the excellent extension. I want to add 5 more fields to
> satisfy the
> >> following requirements.
> >>
> >> int   subplan; /* No. of subplan in this query */
> >> int   subquery; /* No. of subquery */
> >> int   joincnt; /* How many relations are joined */
> >> bool hasagg; /* if we have agg function in this query */
> >> bool hasgroup; /* has group clause */
> >
> > Most of those fields can be computed using the raw sql satements.  Why
> > not adding functions like query_has_agg(querytext) to get the
> > information from pgss stored query text instead?
>
> Yeah I personally find concepts related only to the query string
> itself not something that needs to be tied to pg_stat_statements.
> While reading about those five new fields, I am also wondering how
> this stuff would work with CTEs.  Particularly, should the hasagg or
> hasgroup flags be set only if the most outer query satisfies a
> condition? What if an inner query satisfies a condition but not an

outer query?  Should joincnt just be the sum of all the joins done in
> all queries, including subqueries?
>


The semantics is for overall query not for most outer query. see codes
like this for example:

query_characters.hasagg |= parse->hasAggs;
query_characters.hasgroup |= parse->groupClause != NIL;


> Most of those fields can be computed using the raw sql satements.  Why
> not adding functions like query_has_agg(querytext) to get the
> information from pgss stored query text instead?

That mainly because I don't want to reparse the query again.

-- 
Best Regards
Andy Fan


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-21 Thread Julien Rouhaud
Le jeu. 21 mai 2020 à 09:17, Michael Paquier  a écrit :

> On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
> > On Tue, May 19, 2020 at 4:29 AM Andy Fan 
> wrote:
> >> Thanks for the excellent extension. I want to add 5 more fields to
> satisfy the
> >> following requirements.
> >>
> >> int   subplan; /* No. of subplan in this query */
> >> int   subquery; /* No. of subquery */
> >> int   joincnt; /* How many relations are joined */
> >> bool hasagg; /* if we have agg function in this query */
> >> bool hasgroup; /* has group clause */
> >
> > Most of those fields can be computed using the raw sql satements.  Why
> > not adding functions like query_has_agg(querytext) to get the
> > information from pgss stored query text instead?
>
> Yeah I personally find concepts related only to the query string
> itself not something that needs to be tied to pg_stat_statements.
> While reading about those five new fields, I am also wondering how
> this stuff would work with CTEs.  Particularly, should the hasagg or
> hasgroup flags be set only if the most outer query satisfies a
> condition?  What if an inner query satisfies a condition but not an
> outer query?  Should joincnt just be the sum of all the joins done in
> all queries, including subqueries?
>

Indeed cte will bring additional concerns about the fields semantics.
That's another good reason to go with external functions so you can add
extra parameters for that if needed.

>


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-21 Thread Michael Paquier
On Thu, May 21, 2020 at 08:49:53AM +0200, Julien Rouhaud wrote:
> On Tue, May 19, 2020 at 4:29 AM Andy Fan  wrote:
>> Thanks for the excellent extension. I want to add 5 more fields to satisfy 
>> the
>> following requirements.
>>
>> int   subplan; /* No. of subplan in this query */
>> int   subquery; /* No. of subquery */
>> int   joincnt; /* How many relations are joined */
>> bool hasagg; /* if we have agg function in this query */
>> bool hasgroup; /* has group clause */
>
> Most of those fields can be computed using the raw sql satements.  Why
> not adding functions like query_has_agg(querytext) to get the
> information from pgss stored query text instead?

Yeah I personally find concepts related only to the query string
itself not something that needs to be tied to pg_stat_statements.
While reading about those five new fields, I am also wondering how
this stuff would work with CTEs.  Particularly, should the hasagg or
hasgroup flags be set only if the most outer query satisfies a
condition?  What if an inner query satisfies a condition but not an
outer query?  Should joincnt just be the sum of all the joins done in
all queries, including subqueries?
--
Michael


signature.asc
Description: PGP signature


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-21 Thread Julien Rouhaud
On Tue, May 19, 2020 at 4:29 AM Andy Fan  wrote:
>
> Thanks for the excellent extension. I want to add 5 more fields to satisfy the
> following requirements.
>
> int   subplan; /* No. of subplan in this query */
> int   subquery; /* No. of subquery */
> int   joincnt; /* How many relations are joined */
> bool hasagg; /* if we have agg function in this query */
> bool hasgroup; /* has group clause */
>
>
> 1. Usually I want to check total_exec_time / rows to see if the query is 
> missing
>index, however aggregation/groupby case makes this rule doesn't work. so
>hasagg/hasgroup should be a good rule to filter out these queries.
>
> 2. subplan is also a important clue to find out the query to turning. when we
>check the slow queries with pg_stat_statements, such information maybe
>helpful as well.
>
> 3. As for subquery / joincnt,  actually it is just helpful for optimizer
>developer to understand the query character is running most, it doesn't 
> help
>much for user.
>
>
> The attached is a PoC, that is far from perfect since 1). It maintain a
> per-backend global variable query_character which is only used in
> pg_stat_statements extension.  2). The 5 fields is impossible to change no
> matter how many times it runs, so it can't be treat as Counter in nature.
> However I don't think the above 2 will cause big issues.
>
> I added the columns to V1_8 rather than adding a new version. this can be
> changed at final patch.
>
> Any suggestions?

Most of those fields can be computed using the raw sql satements.  Why
not adding functions like query_has_agg(querytext) to get the
information from pgss stored query text instead?




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-05-18 Thread Andy Fan
Thanks for the excellent extension. I want to add 5 more fields to satisfy
the
following requirements.

int   subplan; /* No. of subplan in this query */
int   subquery; /* No. of subquery */
int   joincnt; /* How many relations are joined */
bool hasagg; /* if we have agg function in this query */
bool hasgroup; /* has group clause */


1. Usually I want to check total_exec_time / rows to see if the query is
missing
   index, however aggregation/groupby case makes this rule doesn't work. so
   hasagg/hasgroup should be a good rule to filter out these queries.

2. subplan is also a important clue to find out the query to turning. when
we
   check the slow queries with pg_stat_statements, such information maybe
   helpful as well.

3. As for subquery / joincnt,  actually it is just helpful for optimizer
   developer to understand the query character is running most, it doesn't
help
   much for user.


The attached is a PoC, that is far from perfect since 1). It maintain a
per-backend global variable query_character which is only used in
pg_stat_statements extension.  2). The 5 fields is impossible to change no
matter how many times it runs, so it can't be treat as Counter in nature.
However I don't think the above 2 will cause big issues.

I added the columns to V1_8 rather than adding a new version. this can be
changed at final patch.

Any suggestions?


Best Regards
Andy Fan


v1-0001-Add-query-characters-information-to-pg_stat_state.patch
Description: Binary data


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-09 Thread Fujii Masao




On 2020/04/09 22:31, Julien Rouhaud wrote:

On Thu, Apr 9, 2020 at 5:59 AM Fujii Masao  wrote:




On 2020/04/08 18:31, Julien Rouhaud wrote:

On Wed, Apr 08, 2020 at 05:37:27PM +0900, Fujii Masao wrote:



On 2020/04/03 16:26, Julien Rouhaud wrote:

On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:

Fujii Masao-4 wrote

On 2020/04/01 18:19, Fujii Masao wrote:

Finally I pushed the patch!
Many thanks for all involved in this patch!

As a remaining TODO item, I'm thinking that the document would need to
be improved. For example, previously the query was not stored in pgss
when it failed. But, in v13, if pgss_planning is enabled, such a query is
stored because the planning succeeds. Without the explanation about
that behavior in the document, I'm afraid that users will get confused.
Thought?


Thank you all for this work and especially to Julian for its major
contribution !



Thanks a lot to everyone!  This was quite a long journey.



Regarding the TODO point: Yes I agree that it can be improved.
My proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successfull operations.
For exemple executions counters of a long running SELECT query,
will be updated at the execution end, without showing any progress
report in the interval.
Other exemple, if the statement is successfully planned but fails in
the execution phase, only its planning statistics are stored.
This may give uncorrelated plans vs calls informations."


Thanks for the proposal!


There are numerous reasons for lack of correlation between number of planning
and number of execution, so I'm afraid that this will give users the false
impression that only failed execution can lead to that.

Here's some enhancement on your proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example the execution counters of a long running query
will only be updated at the execution end, without showing any progress
report before that.


Probably since this is not the example for explaining the relationship of
planning and execution stats, it's better to explain this separately or just
drop it?

What about the attached patch based on your proposals?



Thanks Fuji-san, it looks perfect to me!


Thanks for the check! Pushed!


Thanks a lot Fuji-san!

For the record, the commit is available, but I didn't receive the
usual mail, and it's also not present in the archives apparently:
https://www.postgresql.org/list/pgsql-committers/since/20200409/
(although Amit's latest commit was delivered as expected).


Yes.


Given your previous discussion with Magnus, I'm assuming that your
address is now allowed to post for a year. I'm not sure what went
wrong here, so I'm adding Magnus in Cc.


Thanks! I also reported the issue in pgsql-www.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-09 Thread Julien Rouhaud
On Thu, Apr 9, 2020 at 5:59 AM Fujii Masao  wrote:
>
>
>
> On 2020/04/08 18:31, Julien Rouhaud wrote:
> > On Wed, Apr 08, 2020 at 05:37:27PM +0900, Fujii Masao wrote:
> >>
> >>
> >> On 2020/04/03 16:26, Julien Rouhaud wrote:
> >>> On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:
>  Fujii Masao-4 wrote
> > On 2020/04/01 18:19, Fujii Masao wrote:
> >
> > Finally I pushed the patch!
> > Many thanks for all involved in this patch!
> >
> > As a remaining TODO item, I'm thinking that the document would need to
> > be improved. For example, previously the query was not stored in pgss
> > when it failed. But, in v13, if pgss_planning is enabled, such a query 
> > is
> > stored because the planning succeeds. Without the explanation about
> > that behavior in the document, I'm afraid that users will get confused.
> > Thought?
> 
>  Thank you all for this work and especially to Julian for its major
>  contribution !
> >>>
> >>>
> >>> Thanks a lot to everyone!  This was quite a long journey.
> >>>
> >>>
>  Regarding the TODO point: Yes I agree that it can be improved.
>  My proposal:
> 
>  "Note that planning and execution statistics are updated only at their
>  respective end phase, and only for successfull operations.
>  For exemple executions counters of a long running SELECT query,
>  will be updated at the execution end, without showing any progress
>  report in the interval.
>  Other exemple, if the statement is successfully planned but fails in
>  the execution phase, only its planning statistics are stored.
>  This may give uncorrelated plans vs calls informations."
> >>
> >> Thanks for the proposal!
> >>
> >>> There are numerous reasons for lack of correlation between number of 
> >>> planning
> >>> and number of execution, so I'm afraid that this will give users the false
> >>> impression that only failed execution can lead to that.
> >>>
> >>> Here's some enhancement on your proposal:
> >>>
> >>> "Note that planning and execution statistics are updated only at their
> >>> respective end phase, and only for successful operations.
> >>> For example the execution counters of a long running query
> >>> will only be updated at the execution end, without showing any progress
> >>> report before that.
> >>
> >> Probably since this is not the example for explaining the relationship of
> >> planning and execution stats, it's better to explain this separately or 
> >> just
> >> drop it?
> >>
> >> What about the attached patch based on your proposals?
> >>
> >
> > Thanks Fuji-san, it looks perfect to me!
>
> Thanks for the check! Pushed!

Thanks a lot Fuji-san!

For the record, the commit is available, but I didn't receive the
usual mail, and it's also not present in the archives apparently:
https://www.postgresql.org/list/pgsql-committers/since/20200409/
(although Amit's latest commit was delivered as expected).

Given your previous discussion with Magnus, I'm assuming that your
address is now allowed to post for a year. I'm not sure what went
wrong here, so I'm adding Magnus in Cc.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread Fujii Masao




On 2020/04/08 21:32, legrand legrand wrote:

Fujii Masao-4 wrote

On 2020/04/03 16:26
[...]


"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example the execution counters of a long running query
will only be updated at the execution end, without showing any progress
report before that.


Probably since this is not the example for explaining the relationship of
planning and execution stats, it's better to explain this separately or
just
drop it?

What about the attached patch based on your proposals?


+1
Your patch is perfect ;^>


Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread Fujii Masao




On 2020/04/08 18:31, Julien Rouhaud wrote:

On Wed, Apr 08, 2020 at 05:37:27PM +0900, Fujii Masao wrote:



On 2020/04/03 16:26, Julien Rouhaud wrote:

On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:

Fujii Masao-4 wrote

On 2020/04/01 18:19, Fujii Masao wrote:

Finally I pushed the patch!
Many thanks for all involved in this patch!

As a remaining TODO item, I'm thinking that the document would need to
be improved. For example, previously the query was not stored in pgss
when it failed. But, in v13, if pgss_planning is enabled, such a query is
stored because the planning succeeds. Without the explanation about
that behavior in the document, I'm afraid that users will get confused.
Thought?


Thank you all for this work and especially to Julian for its major
contribution !



Thanks a lot to everyone!  This was quite a long journey.



Regarding the TODO point: Yes I agree that it can be improved.
My proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successfull operations.
For exemple executions counters of a long running SELECT query,
will be updated at the execution end, without showing any progress
report in the interval.
Other exemple, if the statement is successfully planned but fails in
the execution phase, only its planning statistics are stored.
This may give uncorrelated plans vs calls informations."


Thanks for the proposal!


There are numerous reasons for lack of correlation between number of planning
and number of execution, so I'm afraid that this will give users the false
impression that only failed execution can lead to that.

Here's some enhancement on your proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example the execution counters of a long running query
will only be updated at the execution end, without showing any progress
report before that.


Probably since this is not the example for explaining the relationship of
planning and execution stats, it's better to explain this separately or just
drop it?

What about the attached patch based on your proposals?



Thanks Fuji-san, it looks perfect to me!


Thanks for the check! Pushed!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread legrand legrand
Fujii Masao-4 wrote
> On 2020/04/03 16:26
> [...]
>> 
>> "Note that planning and execution statistics are updated only at their
>> respective end phase, and only for successful operations.
>> For example the execution counters of a long running query
>> will only be updated at the execution end, without showing any progress
>> report before that.
> 
> Probably since this is not the example for explaining the relationship of
> planning and execution stats, it's better to explain this separately or
> just
> drop it?
> 
> What about the attached patch based on your proposals?

+1
Your patch is perfect ;^>

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread Julien Rouhaud
On Wed, Apr 08, 2020 at 05:37:27PM +0900, Fujii Masao wrote:
> 
> 
> On 2020/04/03 16:26, Julien Rouhaud wrote:
> > On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:
> > > Fujii Masao-4 wrote
> > > > On 2020/04/01 18:19, Fujii Masao wrote:
> > > > 
> > > > Finally I pushed the patch!
> > > > Many thanks for all involved in this patch!
> > > > 
> > > > As a remaining TODO item, I'm thinking that the document would need to
> > > > be improved. For example, previously the query was not stored in pgss
> > > > when it failed. But, in v13, if pgss_planning is enabled, such a query 
> > > > is
> > > > stored because the planning succeeds. Without the explanation about
> > > > that behavior in the document, I'm afraid that users will get confused.
> > > > Thought?
> > > 
> > > Thank you all for this work and especially to Julian for its major
> > > contribution !
> > 
> > 
> > Thanks a lot to everyone!  This was quite a long journey.
> > 
> > 
> > > Regarding the TODO point: Yes I agree that it can be improved.
> > > My proposal:
> > > 
> > > "Note that planning and execution statistics are updated only at their
> > > respective end phase, and only for successfull operations.
> > > For exemple executions counters of a long running SELECT query,
> > > will be updated at the execution end, without showing any progress
> > > report in the interval.
> > > Other exemple, if the statement is successfully planned but fails in
> > > the execution phase, only its planning statistics are stored.
> > > This may give uncorrelated plans vs calls informations."
> 
> Thanks for the proposal!
> 
> > There are numerous reasons for lack of correlation between number of 
> > planning
> > and number of execution, so I'm afraid that this will give users the false
> > impression that only failed execution can lead to that.
> > 
> > Here's some enhancement on your proposal:
> > 
> > "Note that planning and execution statistics are updated only at their
> > respective end phase, and only for successful operations.
> > For example the execution counters of a long running query
> > will only be updated at the execution end, without showing any progress
> > report before that.
> 
> Probably since this is not the example for explaining the relationship of
> planning and execution stats, it's better to explain this separately or just
> drop it?
> 
> What about the attached patch based on your proposals?
> 

Thanks Fuji-san, it looks perfect to me!




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-08 Thread Fujii Masao



On 2020/04/03 16:26, Julien Rouhaud wrote:

On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:

Fujii Masao-4 wrote

On 2020/04/01 18:19, Fujii Masao wrote:

Finally I pushed the patch!
Many thanks for all involved in this patch!

As a remaining TODO item, I'm thinking that the document would need to
be improved. For example, previously the query was not stored in pgss
when it failed. But, in v13, if pgss_planning is enabled, such a query is
stored because the planning succeeds. Without the explanation about
that behavior in the document, I'm afraid that users will get confused.
Thought?


Thank you all for this work and especially to Julian for its major
contribution !



Thanks a lot to everyone!  This was quite a long journey.



Regarding the TODO point: Yes I agree that it can be improved.
My proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successfull operations.
For exemple executions counters of a long running SELECT query,
will be updated at the execution end, without showing any progress
report in the interval.
Other exemple, if the statement is successfully planned but fails in
the execution phase, only its planning statistics are stored.
This may give uncorrelated plans vs calls informations."


Thanks for the proposal!


There are numerous reasons for lack of correlation between number of planning
and number of execution, so I'm afraid that this will give users the false
impression that only failed execution can lead to that.

Here's some enhancement on your proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example the execution counters of a long running query
will only be updated at the execution end, without showing any progress
report before that.


Probably since this is not the example for explaining the relationship of
planning and execution stats, it's better to explain this separately or just
drop it?

What about the attached patch based on your proposals?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/doc/src/sgml/pgstatstatements.sgml 
b/doc/src/sgml/pgstatstatements.sgml
index 3d26108649..5a962feb39 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -129,7 +129,7 @@
   calls
   bigint
   
-  Number of times executed
+  Number of times the statement was executed
  
 
  
@@ -398,6 +398,16 @@
reducing pg_stat_statements.max to prevent
recurrences.
   
+
+  
+   plans and calls aren't
+   always expected to match because planning and execution statistics are
+   updated at their respective end phase, and only for successful operations.
+   For example, if a statement is successfully planned but fails during
+   the execution phase, only its planning statistics will be updated.
+   If planning is skipped because a cached plan is used, only its execution
+   statistics will be updated.
+  
  
 
  


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-03 Thread Julien Rouhaud
On Thu, Apr 02, 2020 at 01:04:28PM -0700, legrand legrand wrote:
> Fujii Masao-4 wrote
> > On 2020/04/01 18:19, Fujii Masao wrote:
> > 
> > Finally I pushed the patch!
> > Many thanks for all involved in this patch!
> > 
> > As a remaining TODO item, I'm thinking that the document would need to
> > be improved. For example, previously the query was not stored in pgss
> > when it failed. But, in v13, if pgss_planning is enabled, such a query is
> > stored because the planning succeeds. Without the explanation about
> > that behavior in the document, I'm afraid that users will get confused.
> > Thought?
> 
> Thank you all for this work and especially to Julian for its major
> contribution !


Thanks a lot to everyone!  This was quite a long journey.


> Regarding the TODO point: Yes I agree that it can be improved.
> My proposal:
> 
> "Note that planning and execution statistics are updated only at their 
> respective end phase, and only for successfull operations.
> For exemple executions counters of a long running SELECT query, 
> will be updated at the execution end, without showing any progress 
> report in the interval.
> Other exemple, if the statement is successfully planned but fails in 
> the execution phase, only its planning statistics are stored.
> This may give uncorrelated plans vs calls informations."


There are numerous reasons for lack of correlation between number of planning
and number of execution, so I'm afraid that this will give users the false
impression that only failed execution can lead to that.

Here's some enhancement on your proposal:

"Note that planning and execution statistics are updated only at their
respective end phase, and only for successful operations.
For example the execution counters of a long running query
will only be updated at the execution end, without showing any progress
report before that.
Similarly, if a statement is successfully planned but fails during
the execution phase, only its planning statistics will be displayed.
Please also note that the number of planning and number of execution aren't
expected to match, as the planification of a query won't always be followed by
its execution and reciprocally."




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-02 Thread legrand legrand
Fujii Masao-4 wrote
> On 2020/04/01 18:19, Fujii Masao wrote:
> 
> Finally I pushed the patch!
> Many thanks for all involved in this patch!
> 
> As a remaining TODO item, I'm thinking that the document would need to
> be improved. For example, previously the query was not stored in pgss
> when it failed. But, in v13, if pgss_planning is enabled, such a query is
> stored because the planning succeeds. Without the explanation about
> that behavior in the document, I'm afraid that users will get confused.
> Thought?
> 
> Regards,
> 
> -- 
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

Thank you all for this work and especially to Julian for its major
contribution !

Regarding the TODO point: Yes I agree that it can be improved.
My proposal:

"Note that planning and execution statistics are updated only at their 
respective end phase, and only for successfull operations.
For exemple executions counters of a long running SELECT query, 
will be updated at the execution end, without showing any progress 
report in the interval.
Other exemple, if the statement is successfully planned but fails in 
the execution phase, only its planning statistics are stored.
This may give uncorrelated plans vs calls informations."

Regards
PAscal




--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-01 Thread Fujii Masao




On 2020/04/01 18:19, Fujii Masao wrote:



On 2020/04/01 3:42, Julien Rouhaud wrote:

On Wed, Apr 01, 2020 at 02:43:10AM +0900, Fujii Masao wrote:



On 2020/03/31 16:33, Julien Rouhaud wrote:


v12 attached!


Thanks for updating the patch! The patch looks good to me.

I applied minor and cosmetic changes into the patch. Attached is
the updated version of the patch. Barring any objection, I'd like to
commit this version.

BTW, the minor and cosmetic changes that I applied are, for example,

- Rename pgss_planner_hook to pgss_planner for the sake of consistency.
    Other function using hook in pgss doesn't use "_hook" in their names, too.
- Make pgss_planner use PG_FINALLY() instead of PG_CATCH().
- Make PGSS_NUMKIND as the last value in enum pgssStoreKind.



+1, and the PGSS_INVALID is also way better.



- Update the sample output in the document.
etc



Thanks a lot.  It all looks good to me!


Finally I pushed the patch!
Many thanks for all involved in this patch!

As a remaining TODO item, I'm thinking that the document would need to
be improved. For example, previously the query was not stored in pgss
when it failed. But, in v13, if pgss_planning is enabled, such a query is
stored because the planning succeeds. Without the explanation about
that behavior in the document, I'm afraid that users will get confused.
Thought?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-04-01 Thread Fujii Masao




On 2020/04/01 3:42, Julien Rouhaud wrote:

On Wed, Apr 01, 2020 at 02:43:10AM +0900, Fujii Masao wrote:



On 2020/03/31 16:33, Julien Rouhaud wrote:


v12 attached!


Thanks for updating the patch! The patch looks good to me.

I applied minor and cosmetic changes into the patch. Attached is
the updated version of the patch. Barring any objection, I'd like to
commit this version.

BTW, the minor and cosmetic changes that I applied are, for example,

- Rename pgss_planner_hook to pgss_planner for the sake of consistency.
Other function using hook in pgss doesn't use "_hook" in their names, too.
- Make pgss_planner use PG_FINALLY() instead of PG_CATCH().
- Make PGSS_NUMKIND as the last value in enum pgssStoreKind.



+1, and the PGSS_INVALID is also way better.



- Update the sample output in the document.
etc



Thanks a lot.  It all looks good to me!


Thanks for the check!

I tried to pick up the names of authors and reviewers of this patch,
from the past discussions. Then I'm thinking to write the followings
in the commit log. Are there any other developers that should be
credited as author or reviewer?

Author: Julien Rouhaud, Pascal Legrand, Thomas Munro, Fujii Masao
Reviewed-by: Sergei Kornilov, Tomas Vondra, Yoshikazu Imai, Haribabu Kommi, Tom 
Lane

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-31 Thread Julien Rouhaud
On Wed, Apr 01, 2020 at 02:43:10AM +0900, Fujii Masao wrote:
>
>
> On 2020/03/31 16:33, Julien Rouhaud wrote:
> >
> > v12 attached!
>
> Thanks for updating the patch! The patch looks good to me.
>
> I applied minor and cosmetic changes into the patch. Attached is
> the updated version of the patch. Barring any objection, I'd like to
> commit this version.
>
> BTW, the minor and cosmetic changes that I applied are, for example,
>
> - Rename pgss_planner_hook to pgss_planner for the sake of consistency.
>Other function using hook in pgss doesn't use "_hook" in their names, too.
> - Make pgss_planner use PG_FINALLY() instead of PG_CATCH().
> - Make PGSS_NUMKIND as the last value in enum pgssStoreKind.


+1, and the PGSS_INVALID is also way better.


> - Update the sample output in the document.
> etc


Thanks a lot.  It all looks good to me!




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-31 Thread Fujii Masao



On 2020/03/31 16:33, Julien Rouhaud wrote:

On Tue, Mar 31, 2020 at 04:10:47PM +0900, Fujii Masao wrote:



On 2020/03/31 15:03, Julien Rouhaud wrote:

On Tue, Mar 31, 2020 at 12:21:43PM +0900, Fujii Masao wrote:


On 2020/03/31 3:16, Julien Rouhaud wrote:

On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao  wrote:


While testing the patched pgss, I found that the patched version
may track the statements that the original version doesn't.
Please imagine the case where the following queries are executed,
with pgss.track = top.

PREPARE hoge AS SELECT * FROM t;
EXPLAIN EXECUTE hoge;

The pgss view returned "PREPARE hoge AS SELECT * FROM t"
in the patched version, but not in the orignal version.

Is this problematic?


Oh indeed. That's a side effect of having different the executed query
and the planned query being different.

I guess the question is to chose if the top level executed query of a
utilty statement containing an optimisable query, should the top level
planner call of that optimisable statement be considered at top level
or not.  I tend to think that's the correct behavior here, as this is
also what would happen if a regular DML was provided.  What do you
think?


TBH, not sure if that's ok yet...

I'm now just wondering if both plan_nested_level and
exec_nested_level should be incremented in pgss_ProcessUtility().
This is just a guess, so I need more investigation about this.


Yeah, after a second thought I realize that my comparison was wrong.  Allowing
*any* top-level planner call when pgss.track = top would mean that we should
also consider all planner calls from queries executed for FK checks and such,
which is definitely not the intended behavior.


Yes. So, basically any planner activity that happens during
the execution phase of the statement is not tracked.


FTR with this patch such calls still don't get tracked, but only because those
query don't get a queryid assigned, not because the nesting level says so.

How about simply passing (plan_nested_level + exec_nested_level) for
pgss_enabled call in pgss_planner_hook?


Looks good to me! The comment about why this treatment is necessary only in
pgss_planner() should be added.



Yes of course.  It also requires some changes in the macro to make it safe when
called with an expression.

v12 attached!


Thanks for updating the patch! The patch looks good to me.

I applied minor and cosmetic changes into the patch. Attached is
the updated version of the patch. Barring any objection, I'd like to
commit this version.

BTW, the minor and cosmetic changes that I applied are, for example,

- Rename pgss_planner_hook to pgss_planner for the sake of consistency.
   Other function using hook in pgss doesn't use "_hook" in their names, too.
- Make pgss_planner use PG_FINALLY() instead of PG_CATCH().
- Make PGSS_NUMKIND as the last value in enum pgssStoreKind.
- Update the sample output in the document.
etc

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/contrib/pg_stat_statements/Makefile 
b/contrib/pg_stat_statements/Makefile
index 80042a0905..081f997d70 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,8 @@ OBJS = \
pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.6--1.7.sql \
+DATA = pg_stat_statements--1.4.sql \
+   pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out 
b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 6787ec1efd..45dbe9e677 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -96,25 +96,26 @@ EXECUTE pgss_test(1);
 
 DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-   query   | calls | rows 
+---+--
- PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 | 1 |1
- SELECT $1+| 4 |4
-  +|   | 
-   AS "text"   |   | 
- SELECT $1 + $2| 2 |2
- SELECT $1 + $2 + $3 AS "add"  | 3 |3
- SELECT $1 AS "float"  | 1 |1
- SELECT $1 AS "int"| 2 |2
- SELECT $1 AS i UNION SELECT $2 ORDER BY i | 1 |2
- SELECT $1 || $2   | 1 |

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-31 Thread Julien Rouhaud
On Tue, Mar 31, 2020 at 04:10:47PM +0900, Fujii Masao wrote:
> 
> 
> On 2020/03/31 15:03, Julien Rouhaud wrote:
> > On Tue, Mar 31, 2020 at 12:21:43PM +0900, Fujii Masao wrote:
> > > 
> > > On 2020/03/31 3:16, Julien Rouhaud wrote:
> > > > On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao 
> > > >  wrote:
> > > > > 
> > > > > While testing the patched pgss, I found that the patched version
> > > > > may track the statements that the original version doesn't.
> > > > > Please imagine the case where the following queries are executed,
> > > > > with pgss.track = top.
> > > > > 
> > > > >PREPARE hoge AS SELECT * FROM t;
> > > > >EXPLAIN EXECUTE hoge;
> > > > > 
> > > > > The pgss view returned "PREPARE hoge AS SELECT * FROM t"
> > > > > in the patched version, but not in the orignal version.
> > > > > 
> > > > > Is this problematic?
> > > > 
> > > > Oh indeed. That's a side effect of having different the executed query
> > > > and the planned query being different.
> > > > 
> > > > I guess the question is to chose if the top level executed query of a
> > > > utilty statement containing an optimisable query, should the top level
> > > > planner call of that optimisable statement be considered at top level
> > > > or not.  I tend to think that's the correct behavior here, as this is
> > > > also what would happen if a regular DML was provided.  What do you
> > > > think?
> > > 
> > > TBH, not sure if that's ok yet...
> > > 
> > > I'm now just wondering if both plan_nested_level and
> > > exec_nested_level should be incremented in pgss_ProcessUtility().
> > > This is just a guess, so I need more investigation about this.
> > 
> > Yeah, after a second thought I realize that my comparison was wrong.  
> > Allowing
> > *any* top-level planner call when pgss.track = top would mean that we should
> > also consider all planner calls from queries executed for FK checks and 
> > such,
> > which is definitely not the intended behavior.
> 
> Yes. So, basically any planner activity that happens during
> the execution phase of the statement is not tracked.
> 
> > FTR with this patch such calls still don't get tracked, but only because 
> > those
> > query don't get a queryid assigned, not because the nesting level says so.
> > 
> > How about simply passing (plan_nested_level + exec_nested_level) for
> > pgss_enabled call in pgss_planner_hook?
> 
> Looks good to me! The comment about why this treatment is necessary only in
> pgss_planner() should be added.


Yes of course.  It also requires some changes in the macro to make it safe when
called with an expression.

v12 attached!
>From e3c470b0adf6cfb911f6d609ef9b79a6c311484c Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Wed, 27 Mar 2019 11:24:35 +0100
Subject: [PATCH v12] Add planning counters to pg_stat_statements

---
 contrib/pg_stat_statements/Makefile   |   3 +-
 .../expected/pg_stat_statements.out   | 231 +-
 .../pg_stat_statements--1.7--1.8.sql  |  55 
 .../pg_stat_statements/pg_stat_statements.c   | 295 ++
 .../pg_stat_statements.control|   2 +-
 .../sql/pg_stat_statements.sql|  16 +
 doc/src/sgml/pgstatstatements.sgml|  80 -
 7 files changed, 528 insertions(+), 154 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 80042a0905..0f45e85b5b 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,8 @@ OBJS = \
 	pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.6--1.7.sql \
+DATA = pg_stat_statements--1.4.sql \
+pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
 	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 6787ec1efd..7612bed6d5 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -96,25 +96,26 @@ EXECUTE pgss_test(1);
 
 DEALLOCATE pgss_test;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
-   query   | calls | rows 
+---+--
- PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 | 1 |1
- SELECT $1+| 4 |4
-  +|   | 
-   AS "text"   |   | 
- SELECT $1 + $2| 2 |2
- SELECT $1 + $2 + $3 AS "add" 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-31 Thread Fujii Masao




On 2020/03/31 15:03, Julien Rouhaud wrote:

On Tue, Mar 31, 2020 at 12:21:43PM +0900, Fujii Masao wrote:


On 2020/03/31 3:16, Julien Rouhaud wrote:

On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao  wrote:


While testing the patched pgss, I found that the patched version
may track the statements that the original version doesn't.
Please imagine the case where the following queries are executed,
with pgss.track = top.

   PREPARE hoge AS SELECT * FROM t;
   EXPLAIN EXECUTE hoge;

The pgss view returned "PREPARE hoge AS SELECT * FROM t"
in the patched version, but not in the orignal version.

Is this problematic?


Oh indeed. That's a side effect of having different the executed query
and the planned query being different.

I guess the question is to chose if the top level executed query of a
utilty statement containing an optimisable query, should the top level
planner call of that optimisable statement be considered at top level
or not.  I tend to think that's the correct behavior here, as this is
also what would happen if a regular DML was provided.  What do you
think?


TBH, not sure if that's ok yet...

I'm now just wondering if both plan_nested_level and
exec_nested_level should be incremented in pgss_ProcessUtility().
This is just a guess, so I need more investigation about this.


Yeah, after a second thought I realize that my comparison was wrong.  Allowing
*any* top-level planner call when pgss.track = top would mean that we should
also consider all planner calls from queries executed for FK checks and such,
which is definitely not the intended behavior.


Yes. So, basically any planner activity that happens during
the execution phase of the statement is not tracked.


FTR with this patch such calls still don't get tracked, but only because those
query don't get a queryid assigned, not because the nesting level says so.

How about simply passing (plan_nested_level + exec_nested_level) for
pgss_enabled call in pgss_planner_hook?


Looks good to me! The comment about why this treatment is necessary only in
pgss_planner() should be added.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-31 Thread Julien Rouhaud
On Tue, Mar 31, 2020 at 12:21:43PM +0900, Fujii Masao wrote:
> 
> On 2020/03/31 3:16, Julien Rouhaud wrote:
> > On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao  
> > wrote:
> > > 
> > > While testing the patched pgss, I found that the patched version
> > > may track the statements that the original version doesn't.
> > > Please imagine the case where the following queries are executed,
> > > with pgss.track = top.
> > > 
> > >   PREPARE hoge AS SELECT * FROM t;
> > >   EXPLAIN EXECUTE hoge;
> > > 
> > > The pgss view returned "PREPARE hoge AS SELECT * FROM t"
> > > in the patched version, but not in the orignal version.
> > > 
> > > Is this problematic?
> > 
> > Oh indeed. That's a side effect of having different the executed query
> > and the planned query being different.
> > 
> > I guess the question is to chose if the top level executed query of a
> > utilty statement containing an optimisable query, should the top level
> > planner call of that optimisable statement be considered at top level
> > or not.  I tend to think that's the correct behavior here, as this is
> > also what would happen if a regular DML was provided.  What do you
> > think?
> 
> TBH, not sure if that's ok yet...
> 
> I'm now just wondering if both plan_nested_level and
> exec_nested_level should be incremented in pgss_ProcessUtility().
> This is just a guess, so I need more investigation about this.

Yeah, after a second thought I realize that my comparison was wrong.  Allowing
*any* top-level planner call when pgss.track = top would mean that we should
also consider all planner calls from queries executed for FK checks and such,
which is definitely not the intended behavior.

FTR with this patch such calls still don't get tracked, but only because those
query don't get a queryid assigned, not because the nesting level says so.

How about simply passing (plan_nested_level + exec_nested_level) for
pgss_enabled call in pgss_planner_hook?




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-30 Thread Fujii Masao




On 2020/03/31 3:16, Julien Rouhaud wrote:

On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao  wrote:


On 2020/03/30 17:03, Julien Rouhaud wrote:

On Mon, Mar 30, 2020 at 01:56:43PM +0900, Fujii Masao wrote:



On 2020/03/29 15:15, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  wrote:





So what I'd like to say is that the information that users are interested
in would vary on each situation and case. At least for me it seems
enough for pgss to report only the basic information. Then users
can calculate to get the numbers (like total_time) they're interested in,
from those basic information.

But of course, I'd like to hear more opinions about this...


+1

Unless someone chime in by tomorrow, I'll just drop the sum as it
seems less controversial and not a blocker in userland if users are
interested.


Done in attached v11, with also the s/querytext/query_text/ discrepancy noted
previously.


Thanks for updating the patch! But I still think query_string is better
name because it's used in other several places, for the sake of consistency.


You're absolutely right.  That's what I actually wanted to do given your
previous comment, but somehow managed to miss it, sorry about that and thanks
for fixing.


So I changed the argument name that way and commit the 0001 patch.
If you think query_text is better, let's keep discussing this topic!

Anyway many thanks for your great job!


Thanks a lot!




I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.


Many thanks!! I'm thinking to commit this part separately.
So I made that patch based on your patch. Attached.


Thanks! It looks good to me.


I also kept that part in a distinct commit for convenience.


I also pushed 0002 patch. Thanks!

I will review 0003 patch again.


And thanks for that too :)


While testing the patched pgss, I found that the patched version
may track the statements that the original version doesn't.
Please imagine the case where the following queries are executed,
with pgss.track = top.

  PREPARE hoge AS SELECT * FROM t;
  EXPLAIN EXECUTE hoge;

The pgss view returned "PREPARE hoge AS SELECT * FROM t"
in the patched version, but not in the orignal version.

Is this problematic?


Oh indeed. That's a side effect of having different the executed query
and the planned query being different.

I guess the question is to chose if the top level executed query of a
utilty statement containing an optimisable query, should the top level
planner call of that optimisable statement be considered at top level
or not.  I tend to think that's the correct behavior here, as this is
also what would happen if a regular DML was provided.  What do you
think?


TBH, not sure if that's ok yet...

I'm now just wondering if both plan_nested_level and
exec_nested_level should be incremented in pgss_ProcessUtility().
This is just a guess, so I need more investigation about this.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-30 Thread Julien Rouhaud
On Mon, Mar 30, 2020 at 6:36 PM Fujii Masao  wrote:
>
> On 2020/03/30 17:03, Julien Rouhaud wrote:
> > On Mon, Mar 30, 2020 at 01:56:43PM +0900, Fujii Masao wrote:
> >>
> >>
> >> On 2020/03/29 15:15, Julien Rouhaud wrote:
> >>> On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:
>  On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao 
>   wrote:
> >
> 
> > So what I'd like to say is that the information that users are 
> > interested
> > in would vary on each situation and case. At least for me it seems
> > enough for pgss to report only the basic information. Then users
> > can calculate to get the numbers (like total_time) they're interested 
> > in,
> > from those basic information.
> >
> > But of course, I'd like to hear more opinions about this...
> 
>  +1
> 
>  Unless someone chime in by tomorrow, I'll just drop the sum as it
>  seems less controversial and not a blocker in userland if users are
>  interested.
> >>>
> >>> Done in attached v11, with also the s/querytext/query_text/ discrepancy 
> >>> noted
> >>> previously.
> >>
> >> Thanks for updating the patch! But I still think query_string is better
> >> name because it's used in other several places, for the sake of 
> >> consistency.
> >
> > You're absolutely right.  That's what I actually wanted to do given your
> > previous comment, but somehow managed to miss it, sorry about that and 
> > thanks
> > for fixing.
> >
> >> So I changed the argument name that way and commit the 0001 patch.
> >> If you think query_text is better, let's keep discussing this topic!
> >>
> >> Anyway many thanks for your great job!
> >
> > Thanks a lot!
> >
> >>
> >> I also exported BufferUsageAccumDiff as mentioned previously, as it 
> >> seems
> >> clearner and will avoid future useless code churn, and run pgindent.
> >
> > Many thanks!! I'm thinking to commit this part separately.
> > So I made that patch based on your patch. Attached.
> 
>  Thanks! It looks good to me.
> >>>
> >>> I also kept that part in a distinct commit for convenience.
> >>
> >> I also pushed 0002 patch. Thanks!
> >>
> >> I will review 0003 patch again.
> >
> > And thanks for that too :)
>
> While testing the patched pgss, I found that the patched version
> may track the statements that the original version doesn't.
> Please imagine the case where the following queries are executed,
> with pgss.track = top.
>
>  PREPARE hoge AS SELECT * FROM t;
>  EXPLAIN EXECUTE hoge;
>
> The pgss view returned "PREPARE hoge AS SELECT * FROM t"
> in the patched version, but not in the orignal version.
>
> Is this problematic?

Oh indeed. That's a side effect of having different the executed query
and the planned query being different.

I guess the question is to chose if the top level executed query of a
utilty statement containing an optimisable query, should the top level
planner call of that optimisable statement be considered at top level
or not.  I tend to think that's the correct behavior here, as this is
also what would happen if a regular DML was provided.  What do you
think?




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-30 Thread Fujii Masao




On 2020/03/30 17:03, Julien Rouhaud wrote:

On Mon, Mar 30, 2020 at 01:56:43PM +0900, Fujii Masao wrote:



On 2020/03/29 15:15, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  wrote:





So what I'd like to say is that the information that users are interested
in would vary on each situation and case. At least for me it seems
enough for pgss to report only the basic information. Then users
can calculate to get the numbers (like total_time) they're interested in,
from those basic information.

But of course, I'd like to hear more opinions about this...


+1

Unless someone chime in by tomorrow, I'll just drop the sum as it
seems less controversial and not a blocker in userland if users are
interested.


Done in attached v11, with also the s/querytext/query_text/ discrepancy noted
previously.


Thanks for updating the patch! But I still think query_string is better
name because it's used in other several places, for the sake of consistency.


You're absolutely right.  That's what I actually wanted to do given your
previous comment, but somehow managed to miss it, sorry about that and thanks
for fixing.


So I changed the argument name that way and commit the 0001 patch.
If you think query_text is better, let's keep discussing this topic!

Anyway many thanks for your great job!


Thanks a lot!




I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.


Many thanks!! I'm thinking to commit this part separately.
So I made that patch based on your patch. Attached.


Thanks! It looks good to me.


I also kept that part in a distinct commit for convenience.


I also pushed 0002 patch. Thanks!

I will review 0003 patch again.


And thanks for that too :)


While testing the patched pgss, I found that the patched version
may track the statements that the original version doesn't.
Please imagine the case where the following queries are executed,
with pgss.track = top.

PREPARE hoge AS SELECT * FROM t;
EXPLAIN EXECUTE hoge;

The pgss view returned "PREPARE hoge AS SELECT * FROM t"
in the patched version, but not in the orignal version.

Is this problematic?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-30 Thread Julien Rouhaud
On Mon, Mar 30, 2020 at 01:56:43PM +0900, Fujii Masao wrote:
> 
> 
> On 2020/03/29 15:15, Julien Rouhaud wrote:
> > On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:
> > > On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  
> > > wrote:
> > > > 
> > > 
> > > > So what I'd like to say is that the information that users are 
> > > > interested
> > > > in would vary on each situation and case. At least for me it seems
> > > > enough for pgss to report only the basic information. Then users
> > > > can calculate to get the numbers (like total_time) they're interested 
> > > > in,
> > > > from those basic information.
> > > > 
> > > > But of course, I'd like to hear more opinions about this...
> > > 
> > > +1
> > > 
> > > Unless someone chime in by tomorrow, I'll just drop the sum as it
> > > seems less controversial and not a blocker in userland if users are
> > > interested.
> > 
> > Done in attached v11, with also the s/querytext/query_text/ discrepancy 
> > noted
> > previously.
> 
> Thanks for updating the patch! But I still think query_string is better
> name because it's used in other several places, for the sake of consistency.

You're absolutely right.  That's what I actually wanted to do given your
previous comment, but somehow managed to miss it, sorry about that and thanks
for fixing.

> So I changed the argument name that way and commit the 0001 patch.
> If you think query_text is better, let's keep discussing this topic!
> 
> Anyway many thanks for your great job!

Thanks a lot!

> 
> > > > > I also exported BufferUsageAccumDiff as mentioned previously, as it 
> > > > > seems
> > > > > clearner and will avoid future useless code churn, and run pgindent.
> > > > 
> > > > Many thanks!! I'm thinking to commit this part separately.
> > > > So I made that patch based on your patch. Attached.
> > > 
> > > Thanks! It looks good to me.
> > 
> > I also kept that part in a distinct commit for convenience.
> 
> I also pushed 0002 patch. Thanks!
> 
> I will review 0003 patch again.

And thanks for that too :)




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-29 Thread Fujii Masao




On 2020/03/29 15:15, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:

On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  wrote:





So what I'd like to say is that the information that users are interested
in would vary on each situation and case. At least for me it seems
enough for pgss to report only the basic information. Then users
can calculate to get the numbers (like total_time) they're interested in,
from those basic information.

But of course, I'd like to hear more opinions about this...


+1

Unless someone chime in by tomorrow, I'll just drop the sum as it
seems less controversial and not a blocker in userland if users are
interested.


Done in attached v11, with also the s/querytext/query_text/ discrepancy noted
previously.


Thanks for updating the patch! But I still think query_string is better
name because it's used in other several places, for the sake of consistency.
So I changed the argument name that way and commit the 0001 patch.
If you think query_text is better, let's keep discussing this topic!

Anyway many thanks for your great job!


I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.


Many thanks!! I'm thinking to commit this part separately.
So I made that patch based on your patch. Attached.


Thanks! It looks good to me.


I also kept that part in a distinct commit for convenience.


I also pushed 0002 patch. Thanks!

I will review 0003 patch again.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-29 Thread Julien Rouhaud
On Fri, Mar 27, 2020 at 03:42:50PM +0100, Julien Rouhaud wrote:
> On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  
> wrote:
> >
> 
> > So what I'd like to say is that the information that users are interested
> > in would vary on each situation and case. At least for me it seems
> > enough for pgss to report only the basic information. Then users
> > can calculate to get the numbers (like total_time) they're interested in,
> > from those basic information.
> >
> > But of course, I'd like to hear more opinions about this...
> 
> +1
> 
> Unless someone chime in by tomorrow, I'll just drop the sum as it
> seems less controversial and not a blocker in userland if users are
> interested.

Done in attached v11, with also the s/querytext/query_text/ discrepancy noted
previously.

> > >
> > > I also exported BufferUsageAccumDiff as mentioned previously, as it seems
> > > clearner and will avoid future useless code churn, and run pgindent.
> >
> > Many thanks!! I'm thinking to commit this part separately.
> > So I made that patch based on your patch. Attached.
> 
> Thanks! It looks good to me.

I also kept that part in a distinct commit for convenience.
>From 0d9bf628dac15ce9e790d3a040ef1123c7c6e7d9 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v11 1/3] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ff2f45cfb2..ee0e638f33 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -376,7 +376,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = pg_plan_query(query, cursorOptions, params);
+		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 00cf4ef268..38cbea385a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -751,7 +751,7 @@ execute_sql_string(const char *sql)
 		   NULL,
 		   0,
 		   NULL);
-		stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
 
 		foreach(lc2, stmt_list)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index c3954f3e24..e5a5eef102 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	CHECK_FOR_INTERRUPTS();
 
 	/* Plan the query which will generate data for the refresh. */
-	plan = pg_plan_query(query, 0, NULL);
+	plan = pg_plan_query(query, queryString, 0, NULL);
 
 	/*
 	 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 40be5069fe..6a2c233615 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -90,7 +90,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 		elog(ERROR, "non-SELECT 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-27 Thread Julien Rouhaud
On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao  wrote:
>
> On 2020/03/27 19:00, Julien Rouhaud wrote:
> > On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:
> >> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
> >>>
> >>> Here are other comments.
> >>>
> >>> -   if (jstate)
> >>> +   if (kind == PGSS_JUMBLE)
> >>>
> >>> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, 
> >>> instead.
> >>>
> >>> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
> >>> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?
> >>
> >> Yes, we could be using jstate here.  I originally used that to avoid 
> >> passing
> >> PGSS_EXEC (or the other one) as a way to say "ignore this information as
> >> there's the jstate which says it's yet another meaning".  If that's not an
> >> issue, I can change that as PGSS_NUM_KIND will clearly improve the 
> >> explicit "2"
> >> all over the place.
> >
> > Done, passing PGSS_PLAN when jumble is intended, with a comment saying that 
> > the
> > pgss_kind is ignored in that case.
> >
> >>> +  total_time
> >>> +  double precision
> >>> +  
> >>> +  
> >>> +Total time spend planning and executing the statement, in 
> >>> milliseconds
> >>> +  
> >>> + 
> >>>
> >>> pg_stat_statements view has this column but the function not.
> >>> We should make both have the column or not at all, for consistency?
> >>> I'm not sure if it's good thing to expose the sum of total_plan_time
> >>> and total_exec_time as total_time. If some users want that, they can
> >>> easily calculate it from total_plan_time and total_exec_time by using
> >>> their own logic.
> >>
> >> I think we originally added it as a way to avoid too much compatibility 
> >> break,
> >> and also because it seems like a field most users will be interested in 
> >> anyway.
> >> Now that I'm thinking about it again, I indeed think it was a mistake to 
> >> have
> >> that in view part only.  Not mainly for consistency, but for users who 
> >> would be
> >> interested in the total_time field while not wanting to pay the overhead of
> >> retrieving the query text if they don't need it.  So I'll change that!
> >
> > Done
> >
>
> Thanks for updating the patch! But I'm still wondering if it's really
> good thing to expose total_time. For example, when the query fails
> with an error many times and "calls" becomes very different from
> "plans", "total_plan_time" + "total_exec_time" is really what the users
> are interested in?

That's also the case when running explain without analyze, or prepared
statements that fallback to generic plans.  As a user, knowing how
long postgres actually spent processing a query is interesting as a
way to find likely low hanging fruits, even if there's no
planning/execution strict correlation.  The planning/execution detail
is also useful but that's probably not what I'd be starting from (at
least in OLTP workload).

The error scenario is unfortunate, but that's yet another topic.

> Some users may be interested in the sum of mean
> times, but it's not exposed...

Yes, we had a discussion about summing the other fields, but it seems
to me that doing a sum of computed fields doesn't really make sense.
Mean without variance is already not that useful.

> So what I'd like to say is that the information that users are interested
> in would vary on each situation and case. At least for me it seems
> enough for pgss to report only the basic information. Then users
> can calculate to get the numbers (like total_time) they're interested in,
> from those basic information.
>
> But of course, I'd like to hear more opinions about this...

+1

Unless someone chime in by tomorrow, I'll just drop the sum as it
seems less controversial and not a blocker in userland if users are
interested.

>
> +   if (api_version >= PGSS_V1_8)
> +   values[i++] = Int64GetDatumFast(tmp.total_time[0] +
> + 
>   tmp.total_time[1]);
>
> BTW, Int64GetDatumFast() should be Float8GetDatumFast()?

Oh indeed, embarrassing copy/pasto.

>
> >>> +   nested_level++;
> >>> +   PG_TRY();
> >>>
> >>> In old thread [1], Tom Lane commented the usage of nested_level
> >>> in the planner hook. There seems no reply to that so far. What's
> >>> your opinion about that comment?
> >>>
> >>> [1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us
> >>
> >> Oh thanks, I didn't noticed this part of the discussion.  I agree with 
> >> Tom's
> >> concern, and I think that having a specific nesting level variable for the
> >> planner is the best workaround, so I'll implement that.
> >
> > Done.
> >
> > I also exported BufferUsageAccumDiff as mentioned previously, as it seems
> > clearner and will avoid future useless code churn, and run pgindent.
>
> Many thanks!! I'm thinking to commit this part 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-27 Thread Julien Rouhaud
On Fri, Mar 27, 2020 at 12:02 PM Fujii Masao
 wrote:
>
> On 2020/03/27 19:00, Julien Rouhaud wrote:
> > On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:
> >> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
> >>>
> >>> Here are other comments.
> >>>
> >>> -   if (jstate)
> >>> +   if (kind == PGSS_JUMBLE)
> >>>
> >>> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, 
> >>> instead.
> >>>
> >>> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
> >>> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?
> >>
> >> Yes, we could be using jstate here.  I originally used that to avoid 
> >> passing
> >> PGSS_EXEC (or the other one) as a way to say "ignore this information as
> >> there's the jstate which says it's yet another meaning".  If that's not an
> >> issue, I can change that as PGSS_NUM_KIND will clearly improve the 
> >> explicit "2"
> >> all over the place.
> >
> > Done, passing PGSS_PLAN when jumble is intended, with a comment saying that 
> > the
> > pgss_kind is ignored in that case.
> >
> >>> +  total_time
> >>> +  double precision
> >>> +  
> >>> +  
> >>> +Total time spend planning and executing the statement, in 
> >>> milliseconds
> >>> +  
> >>> + 
> >>>
> >>> pg_stat_statements view has this column but the function not.
> >>> We should make both have the column or not at all, for consistency?
> >>> I'm not sure if it's good thing to expose the sum of total_plan_time
> >>> and total_exec_time as total_time. If some users want that, they can
> >>> easily calculate it from total_plan_time and total_exec_time by using
> >>> their own logic.
> >>
> >> I think we originally added it as a way to avoid too much compatibility 
> >> break,
> >> and also because it seems like a field most users will be interested in 
> >> anyway.
> >> Now that I'm thinking about it again, I indeed think it was a mistake to 
> >> have
> >> that in view part only.  Not mainly for consistency, but for users who 
> >> would be
> >> interested in the total_time field while not wanting to pay the overhead of
> >> retrieving the query text if they don't need it.  So I'll change that!
> >
> > Done
> >
> >>> +   nested_level++;
> >>> +   PG_TRY();
> >>>
> >>> In old thread [1], Tom Lane commented the usage of nested_level
> >>> in the planner hook. There seems no reply to that so far. What's
> >>> your opinion about that comment?
> >>>
> >>> [1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us
> >>
> >> Oh thanks, I didn't noticed this part of the discussion.  I agree with 
> >> Tom's
> >> concern, and I think that having a specific nesting level variable for the
> >> planner is the best workaround, so I'll implement that.
> >
> > Done.
> >
> > I also exported BufferUsageAccumDiff as mentioned previously, as it seems
> > clearner and will avoid future useless code churn, and run pgindent.
> >
> > v10 attached.
>
> Thanks for updating the patches!
>
> Regarding 0001 patch, I have one nitpicking comment;
>
> -   result = standard_planner(parse, cursorOptions, boundParams);
> +   result = standard_planner(parse, query_text, cursorOptions, 
> boundParams);
>
> -standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
> +standard_planner(Query *parse, const char *querytext, int cursorOptions,
> +ParamListInfo boundParams)
>
> -pg_plan_query(Query *querytree, int cursorOptions, ParamListInfo boundParams)
> +pg_plan_query(Query *querytree, const char *query_text, int cursorOptions,
> + ParamListInfo boundParams)
>
> The patch uses "query_text" and "querytext" as the name of newly-added
> argument. They should be unified? IMO "query_string" looks better name
> because it's used in other functions like pg_analyze_and_rewrite(),
> pg_parse_query() for the sake of consistency. Thought?

Indeed, and +1 for query_text.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-27 Thread Fujii Masao



On 2020/03/27 19:00, Julien Rouhaud wrote:

On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:

On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:


Here are other comments.

-   if (jstate)
+   if (kind == PGSS_JUMBLE)

Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.

If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?


Yes, we could be using jstate here.  I originally used that to avoid passing
PGSS_EXEC (or the other one) as a way to say "ignore this information as
there's the jstate which says it's yet another meaning".  If that's not an
issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit "2"
all over the place.


Done, passing PGSS_PLAN when jumble is intended, with a comment saying that the
pgss_kind is ignored in that case.


+  total_time
+  double precision
+  
+  
+Total time spend planning and executing the statement, in milliseconds
+  
+ 

pg_stat_statements view has this column but the function not.
We should make both have the column or not at all, for consistency?
I'm not sure if it's good thing to expose the sum of total_plan_time
and total_exec_time as total_time. If some users want that, they can
easily calculate it from total_plan_time and total_exec_time by using
their own logic.


I think we originally added it as a way to avoid too much compatibility break,
and also because it seems like a field most users will be interested in anyway.
Now that I'm thinking about it again, I indeed think it was a mistake to have
that in view part only.  Not mainly for consistency, but for users who would be
interested in the total_time field while not wanting to pay the overhead of
retrieving the query text if they don't need it.  So I'll change that!


Done



Thanks for updating the patch! But I'm still wondering if it's really
good thing to expose total_time. For example, when the query fails
with an error many times and "calls" becomes very different from
"plans", "total_plan_time" + "total_exec_time" is really what the users
are interested in? Some users may be interested in the sum of mean
times, but it's not exposed...

So what I'd like to say is that the information that users are interested
in would vary on each situation and case. At least for me it seems
enough for pgss to report only the basic information. Then users
can calculate to get the numbers (like total_time) they're interested in,
from those basic information.

But of course, I'd like to hear more opinions about this...

+   if (api_version >= PGSS_V1_8)
+   values[i++] = Int64GetDatumFast(tmp.total_time[0] +
+   
tmp.total_time[1]);

BTW, Int64GetDatumFast() should be Float8GetDatumFast()?


+   nested_level++;
+   PG_TRY();

In old thread [1], Tom Lane commented the usage of nested_level
in the planner hook. There seems no reply to that so far. What's
your opinion about that comment?

[1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us


Oh thanks, I didn't noticed this part of the discussion.  I agree with Tom's
concern, and I think that having a specific nesting level variable for the
planner is the best workaround, so I'll implement that.


Done.

I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.


Many thanks!! I'm thinking to commit this part separately.
So I made that patch based on your patch. Attached.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c 
b/contrib/pg_stat_statements/pg_stat_statements.c
index 20dc8c605b..50c345378d 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1016,30 +1016,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char 
*queryString,
rows = (qc && qc->commandTag == CMDTAG_COPY) ? qc->nprocessed : 
0;
 
/* calc differences of buffer counters. */
-   bufusage.shared_blks_hit =
-   pgBufferUsage.shared_blks_hit - 
bufusage_start.shared_blks_hit;
-   bufusage.shared_blks_read =
-   pgBufferUsage.shared_blks_read - 
bufusage_start.shared_blks_read;
-   bufusage.shared_blks_dirtied =
-   pgBufferUsage.shared_blks_dirtied - 
bufusage_start.shared_blks_dirtied;
-   bufusage.shared_blks_written =
-   pgBufferUsage.shared_blks_written - 
bufusage_start.shared_blks_written;
-   bufusage.local_blks_hit =
-   pgBufferUsage.local_blks_hit - 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-27 Thread Fujii Masao




On 2020/03/27 19:00, Julien Rouhaud wrote:

On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:

On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:


Here are other comments.

-   if (jstate)
+   if (kind == PGSS_JUMBLE)

Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.

If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?


Yes, we could be using jstate here.  I originally used that to avoid passing
PGSS_EXEC (or the other one) as a way to say "ignore this information as
there's the jstate which says it's yet another meaning".  If that's not an
issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit "2"
all over the place.


Done, passing PGSS_PLAN when jumble is intended, with a comment saying that the
pgss_kind is ignored in that case.


+  total_time
+  double precision
+  
+  
+Total time spend planning and executing the statement, in milliseconds
+  
+ 

pg_stat_statements view has this column but the function not.
We should make both have the column or not at all, for consistency?
I'm not sure if it's good thing to expose the sum of total_plan_time
and total_exec_time as total_time. If some users want that, they can
easily calculate it from total_plan_time and total_exec_time by using
their own logic.


I think we originally added it as a way to avoid too much compatibility break,
and also because it seems like a field most users will be interested in anyway.
Now that I'm thinking about it again, I indeed think it was a mistake to have
that in view part only.  Not mainly for consistency, but for users who would be
interested in the total_time field while not wanting to pay the overhead of
retrieving the query text if they don't need it.  So I'll change that!


Done


+   nested_level++;
+   PG_TRY();

In old thread [1], Tom Lane commented the usage of nested_level
in the planner hook. There seems no reply to that so far. What's
your opinion about that comment?

[1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us


Oh thanks, I didn't noticed this part of the discussion.  I agree with Tom's
concern, and I think that having a specific nesting level variable for the
planner is the best workaround, so I'll implement that.


Done.

I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.

v10 attached.


Thanks for updating the patches!

Regarding 0001 patch, I have one nitpicking comment;

-   result = standard_planner(parse, cursorOptions, boundParams);
+   result = standard_planner(parse, query_text, cursorOptions, 
boundParams);

-standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
+standard_planner(Query *parse, const char *querytext, int cursorOptions,
+ParamListInfo boundParams)

-pg_plan_query(Query *querytree, int cursorOptions, ParamListInfo boundParams)
+pg_plan_query(Query *querytree, const char *query_text, int cursorOptions,
+ ParamListInfo boundParams)

The patch uses "query_text" and "querytext" as the name of newly-added
argument. They should be unified? IMO "query_string" looks better name
because it's used in other functions like pg_analyze_and_rewrite(),
pg_parse_query() for the sake of consistency. Thought?

Regards,


--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-27 Thread Julien Rouhaud
On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote:
> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
> > 
> > Here are other comments.
> > 
> > -   if (jstate)
> > +   if (kind == PGSS_JUMBLE)
> > 
> > Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, 
> > instead.
> > 
> > If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
> > and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?
> 
> Yes, we could be using jstate here.  I originally used that to avoid passing
> PGSS_EXEC (or the other one) as a way to say "ignore this information as
> there's the jstate which says it's yet another meaning".  If that's not an
> issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit 
> "2"
> all over the place.

Done, passing PGSS_PLAN when jumble is intended, with a comment saying that the
pgss_kind is ignored in that case.

> > +  total_time
> > +  double precision
> > +  
> > +  
> > +Total time spend planning and executing the statement, in 
> > milliseconds
> > +  
> > + 
> > 
> > pg_stat_statements view has this column but the function not.
> > We should make both have the column or not at all, for consistency?
> > I'm not sure if it's good thing to expose the sum of total_plan_time
> > and total_exec_time as total_time. If some users want that, they can
> > easily calculate it from total_plan_time and total_exec_time by using
> > their own logic.
> 
> I think we originally added it as a way to avoid too much compatibility break,
> and also because it seems like a field most users will be interested in 
> anyway.
> Now that I'm thinking about it again, I indeed think it was a mistake to have
> that in view part only.  Not mainly for consistency, but for users who would 
> be
> interested in the total_time field while not wanting to pay the overhead of
> retrieving the query text if they don't need it.  So I'll change that!

Done

> > +   nested_level++;
> > +   PG_TRY();
> > 
> > In old thread [1], Tom Lane commented the usage of nested_level
> > in the planner hook. There seems no reply to that so far. What's
> > your opinion about that comment?
> > 
> > [1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us
> 
> Oh thanks, I didn't noticed this part of the discussion.  I agree with Tom's
> concern, and I think that having a specific nesting level variable for the
> planner is the best workaround, so I'll implement that.

Done.

I also exported BufferUsageAccumDiff as mentioned previously, as it seems
clearner and will avoid future useless code churn, and run pgindent.

v10 attached.
>From b4db0211ccc9113b14a1e8a668551791add3f6dc Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v10 1/2] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ff2f45cfb2..ee0e638f33 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -376,7 +376,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-26 Thread Julien Rouhaud
On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote:
> 
> On 2020/03/25 22:45, Julien Rouhaud wrote:
> > On Wed, Mar 25, 2020 at 10:09:37PM +0900, Fujii Masao wrote:
> > > + /*
> > > +  * We can't process the query if no query_text is provided, as 
> > > pgss_store
> > > +  * needs it.  We also ignore query without queryid, as it would be 
> > > treated
> > > +  * as a utility statement, which may not be the case.
> > > +  */
> > > 
> > > Could you tell me why the planning stats are not tracked when executing
> > > utility statements? In some utility statements like REFRESH MATERIALIZED 
> > > VIEW,
> > > the planner would work.
> > 
> > I explained that in [1].  The problem is that the underlying statement 
> > doesn't
> > get the proper stmt_location and stmt_len, so you eventually end up with two
> > different entries.
> 
> It's not problematic to have two different entries in that case. Right?

I will unnecessarily bloat the entries, and makes users life harder too.  This
example is quite easy to deal with, but if the application is sending
multi-query statements, you'll just end up with a mess impossible to properly
handle.

> The actual problem is that the statements reported in those entries are
> very similar? For example, when "create table test as select 1;" is executed,
> it's strange to get the following two entries, as you explained.
> 
> create table test as select 1;
> create table test as select 1
> 
> But it seems valid to get the following two entries in that case?
> 
> select 1
> create table test as select 1
> 
> The former is the nested statement and the latter is the top statement.

I think that there should only be 1 entry, the utility command.  It seems easy
to correlate the planning time to the underlying query, but I'm not entirely
sure that the execution counters won't be impacted by the fact is being run in
a utilty statements.  Also, for now current pgss behavior is to always merge
underlying optimisable statements in the utility command, and it seems a bit
late in this release cycle to revisit that.

I'd be happy to work on improving that for the next release, among other
things.  For instance the total lack of normalization for utility commands [2]
is also something that has been bothering me for a long time.  In some
workloads, you can end up with the entries almost entirely filled with
1-time-execution commands, just because it's using random identifiers, so you
have no other choice than to disable track_utility, although it would have been
useful for other commands.

> Here are other comments.
> 
> - if (jstate)
> + if (kind == PGSS_JUMBLE)
> 
> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.
> 
> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?

Yes, we could be using jstate here.  I originally used that to avoid passing
PGSS_EXEC (or the other one) as a way to say "ignore this information as
there's the jstate which says it's yet another meaning".  If that's not an
issue, I can change that as PGSS_NUM_KIND will clearly improve the explicit "2"
all over the place.

> +  total_time
> +  double precision
> +  
> +  
> +Total time spend planning and executing the statement, in 
> milliseconds
> +  
> + 
> 
> pg_stat_statements view has this column but the function not.
> We should make both have the column or not at all, for consistency?
> I'm not sure if it's good thing to expose the sum of total_plan_time
> and total_exec_time as total_time. If some users want that, they can
> easily calculate it from total_plan_time and total_exec_time by using
> their own logic.

I think we originally added it as a way to avoid too much compatibility break,
and also because it seems like a field most users will be interested in anyway.
Now that I'm thinking about it again, I indeed think it was a mistake to have
that in view part only.  Not mainly for consistency, but for users who would be
interested in the total_time field while not wanting to pay the overhead of
retrieving the query text if they don't need it.  So I'll change that!

> + nested_level++;
> + PG_TRY();
> 
> In old thread [1], Tom Lane commented the usage of nested_level
> in the planner hook. There seems no reply to that so far. What's
> your opinion about that comment?
> 
> [1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us

Oh thanks, I didn't noticed this part of the discussion.  I agree with Tom's
concern, and I think that having a specific nesting level variable for the
planner is the best workaround, so I'll implement that.

[2] https://twitter.com/fujii_masao/status/1242978261572837377




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-26 Thread Fujii Masao




On 2020/03/25 22:45, Julien Rouhaud wrote:

On Wed, Mar 25, 2020 at 10:09:37PM +0900, Fujii Masao wrote:


On 2020/03/21 4:30, Julien Rouhaud wrote:

On Fri, Mar 20, 2020 at 05:09:05PM +0300, Sergei Kornilov wrote:

Hello

Yet another is missed in docs: total_time


Oh good catch!  I rechecked many time the field, and totally missed that the
documentation is referring to the view, which has an additional column, and not
the function.  Attached v9 fixes that.


Thanks for the patch! Here are the review comments from me.

-   PGSS_V1_3
+   PGSS_V1_3,
+   PGSS_V1_8

WAL usage patch [1] increments this version to 1_4 instead of 1_8.
I *guess* that's because previously this version was maintained
independently from pg_stat_statements' version. For example,
pg_stat_statements 1.4 seems to have used PGSS_V1_3.


Oh right.  It seems that I changed that many versions ago, I'm not sure why.
I'm personally fine with any, but I think this was previously raised and
consensus was to keep distinct counters.  Unless you prefer to keep it this
way, I'll send an updated version (with other possible modifications depending
on the rest of the mail) using PGSS_V1_4.


+   /*
+* We can't process the query if no query_text is provided, as 
pgss_store
+* needs it.  We also ignore query without queryid, as it would be 
treated
+* as a utility statement, which may not be the case.
+*/

Could you tell me why the planning stats are not tracked when executing
utility statements? In some utility statements like REFRESH MATERIALIZED VIEW,
the planner would work.


I explained that in [1].  The problem is that the underlying statement doesn't
get the proper stmt_location and stmt_len, so you eventually end up with two
different entries.


It's not problematic to have two different entries in that case. Right?
The actual problem is that the statements reported in those entries are
very similar? For example, when "create table test as select 1;" is executed,
it's strange to get the following two entries, as you explained.

create table test as select 1;
create table test as select 1

But it seems valid to get the following two entries in that case?

select 1
create table test as select 1

The former is the nested statement and the latter is the top statement.


I suggested fixing transformTopLevelStmt() to handle the
various DDL that can contain optimisable statements, but everyone preferred to
postpone that for a future enhencement.


Understood. Thanks for explanation!


+static BufferUsage
+compute_buffer_counters(BufferUsage start, BufferUsage stop)
+{
+   BufferUsage result;

BufferUsageAccumDiff() has very similar logic. Isn't it better to expose
and use that function rather than creating new similar function?


Oh, I thought this wouldn't be acceptable.  That's indeed better so I'll do
that instead.


Thanks! But of course this is trivial thing, so it's ok to do that later.


values[i++] = Int64GetDatumFast(tmp.rows);
values[i++] = Int64GetDatumFast(tmp.shared_blks_hit);
values[i++] = Int64GetDatumFast(tmp.shared_blks_read);

Previously (without the patch) pg_stat_statements_1_3() reported
the buffer usage counters updated only in execution phase. But,
in the patched version, pg_stat_statements_1_3() reports the total
of buffer usage counters updated in both planning and execution
phases. Is this OK? I'm not sure how seriously we should ensure
the backward compatibility for pg_stat_statements


That's indeed a behavior change, although the new behavior is probably better
as user want to know how much resource a query is consuming overall.  We could
distinguish all buffers with a plan/exec version, but it seems quite overkill.


Ok.




+/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */

ISTM it's good timing to have also pg_stat_statements--1.8.sql since
the definition of pg_stat_statements() is changed. Thought?


I thought that since CreateExtension() was modified to be able to find it's way
automatically, we shouldn't provide base version anymore, to minimize
maintenance burden and also avoid possible bug/discrepancy.  The only drawback
is that it'll do multiple CREATE or DROP/CREATE of the function usually once
per database, which doesn't seem like a big problem.


Ok.

Here are other comments.

-   if (jstate)
+   if (kind == PGSS_JUMBLE)

Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, instead.

If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead
and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought?

+  total_time
+  double precision
+  
+  
+Total time spend planning and executing the statement, in milliseconds
+  
+ 

pg_stat_statements view has this column but the function not.
We should make both have the column or not at all, for consistency?
I'm not sure if it's good 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-25 Thread Fujii Masao




On 2020/03/26 2:17, Sergei Kornilov wrote:

Hello


WAL usage patch [1] increments this version to 1_4 instead of 1_8.
I *guess* that's because previously this version was maintained
independently from pg_stat_statements' version. For example,
pg_stat_statements 1.4 seems to have used PGSS_V1_3.


As far as I remember, this was my proposed change in review a year ago.
I think that having a clear analogy between the extension version and the 
function name would be more clear than sequential numbering of PGSS_V with 
different extension versions.
For pgss 1.4 it was fine to use PGSS_V1_3, because there were no changes in 
pg_stat_statements_internal.
pg_stat_statements 1.3 will call pg_stat_statements_1_3
pg_stat_statements 1.4 - 1.7 will still call pg_stat_statements_1_3. In my 
opinion, this is the correct naming, since we did not need a new function.
but pg_stat_statements 1.8 will call pg_stat_statements_1_4. It's not confusing?


Yeah, I withdraw my comment and agree that 1_8 looks less confusing.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-25 Thread Sergei Kornilov
Hello

> WAL usage patch [1] increments this version to 1_4 instead of 1_8.
> I *guess* that's because previously this version was maintained
> independently from pg_stat_statements' version. For example,
> pg_stat_statements 1.4 seems to have used PGSS_V1_3.

As far as I remember, this was my proposed change in review a year ago.
I think that having a clear analogy between the extension version and the 
function name would be more clear than sequential numbering of PGSS_V with 
different extension versions.
For pgss 1.4 it was fine to use PGSS_V1_3, because there were no changes in 
pg_stat_statements_internal.
pg_stat_statements 1.3 will call pg_stat_statements_1_3
pg_stat_statements 1.4 - 1.7 will still call pg_stat_statements_1_3. In my 
opinion, this is the correct naming, since we did not need a new function.
but pg_stat_statements 1.8 will call pg_stat_statements_1_4. It's not confusing?

Well, no strong opinion.

regards, Sergei




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-25 Thread Julien Rouhaud
On Wed, Mar 25, 2020 at 10:09:37PM +0900, Fujii Masao wrote:
> 
> On 2020/03/21 4:30, Julien Rouhaud wrote:
> > On Fri, Mar 20, 2020 at 05:09:05PM +0300, Sergei Kornilov wrote:
> > > Hello
> > > 
> > > Yet another is missed in docs: total_time
> > 
> > Oh good catch!  I rechecked many time the field, and totally missed that the
> > documentation is referring to the view, which has an additional column, and 
> > not
> > the function.  Attached v9 fixes that.
> 
> Thanks for the patch! Here are the review comments from me.
> 
> - PGSS_V1_3
> + PGSS_V1_3,
> + PGSS_V1_8
> 
> WAL usage patch [1] increments this version to 1_4 instead of 1_8.
> I *guess* that's because previously this version was maintained
> independently from pg_stat_statements' version. For example,
> pg_stat_statements 1.4 seems to have used PGSS_V1_3.

Oh right.  It seems that I changed that many versions ago, I'm not sure why.
I'm personally fine with any, but I think this was previously raised and
consensus was to keep distinct counters.  Unless you prefer to keep it this
way, I'll send an updated version (with other possible modifications depending
on the rest of the mail) using PGSS_V1_4.

> + /*
> +  * We can't process the query if no query_text is provided, as 
> pgss_store
> +  * needs it.  We also ignore query without queryid, as it would be 
> treated
> +  * as a utility statement, which may not be the case.
> +  */
> 
> Could you tell me why the planning stats are not tracked when executing
> utility statements? In some utility statements like REFRESH MATERIALIZED VIEW,
> the planner would work.

I explained that in [1].  The problem is that the underlying statement doesn't
get the proper stmt_location and stmt_len, so you eventually end up with two
different entries.  I suggested fixing transformTopLevelStmt() to handle the
various DDL that can contain optimisable statements, but everyone preferred to
postpone that for a future enhencement.

> +static BufferUsage
> +compute_buffer_counters(BufferUsage start, BufferUsage stop)
> +{
> + BufferUsage result;
> 
> BufferUsageAccumDiff() has very similar logic. Isn't it better to expose
> and use that function rather than creating new similar function?

Oh, I thought this wouldn't be acceptable.  That's indeed better so I'll do
that instead.

>   values[i++] = Int64GetDatumFast(tmp.rows);
>   values[i++] = Int64GetDatumFast(tmp.shared_blks_hit);
>   values[i++] = Int64GetDatumFast(tmp.shared_blks_read);
> 
> Previously (without the patch) pg_stat_statements_1_3() reported
> the buffer usage counters updated only in execution phase. But,
> in the patched version, pg_stat_statements_1_3() reports the total
> of buffer usage counters updated in both planning and execution
> phases. Is this OK? I'm not sure how seriously we should ensure
> the backward compatibility for pg_stat_statements

That's indeed a behavior change, although the new behavior is probably better
as user want to know how much resource a query is consuming overall.  We could
distinguish all buffers with a plan/exec version, but it seems quite overkill.

> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
> 
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?

I thought that since CreateExtension() was modified to be able to find it's way
automatically, we shouldn't provide base version anymore, to minimize
maintenance burden and also avoid possible bug/discrepancy.  The only drawback
is that it'll do multiple CREATE or DROP/CREATE of the function usually once
per database, which doesn't seem like a big problem.

[1] 
https://www.postgresql.org/message-id/caobau_y-y+vohtzgdoudk6-9v72-zxdwccxo_kx0p4ddbee...@mail.gmail.com




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-25 Thread Fujii Masao




On 2020/03/21 4:30, Julien Rouhaud wrote:

On Fri, Mar 20, 2020 at 05:09:05PM +0300, Sergei Kornilov wrote:

Hello

Yet another is missed in docs: total_time


Oh good catch!  I rechecked many time the field, and totally missed that the
documentation is referring to the view, which has an additional column, and not
the function.  Attached v9 fixes that.


Thanks for the patch! Here are the review comments from me.

-   PGSS_V1_3
+   PGSS_V1_3,
+   PGSS_V1_8

WAL usage patch [1] increments this version to 1_4 instead of 1_8.
I *guess* that's because previously this version was maintained
independently from pg_stat_statements' version. For example,
pg_stat_statements 1.4 seems to have used PGSS_V1_3.

+   /*
+* We can't process the query if no query_text is provided, as 
pgss_store
+* needs it.  We also ignore query without queryid, as it would be 
treated
+* as a utility statement, which may not be the case.
+*/

Could you tell me why the planning stats are not tracked when executing
utility statements? In some utility statements like REFRESH MATERIALIZED VIEW,
the planner would work.

+static BufferUsage
+compute_buffer_counters(BufferUsage start, BufferUsage stop)
+{
+   BufferUsage result;

BufferUsageAccumDiff() has very similar logic. Isn't it better to expose
and use that function rather than creating new similar function?

values[i++] = Int64GetDatumFast(tmp.rows);
values[i++] = Int64GetDatumFast(tmp.shared_blks_hit);
values[i++] = Int64GetDatumFast(tmp.shared_blks_read);

Previously (without the patch) pg_stat_statements_1_3() reported
the buffer usage counters updated only in execution phase. But,
in the patched version, pg_stat_statements_1_3() reports the total
of buffer usage counters updated in both planning and execution
phases. Is this OK? I'm not sure how seriously we should ensure
the backward compatibility for pg_stat_statements

+/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */

ISTM it's good timing to have also pg_stat_statements--1.8.sql since
the definition of pg_stat_statements() is changed. Thought?

[1]
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-20 Thread Julien Rouhaud
On Fri, Mar 20, 2020 at 05:09:05PM +0300, Sergei Kornilov wrote:
> Hello
> 
> Yet another is missed in docs: total_time

Oh good catch!  I rechecked many time the field, and totally missed that the
documentation is referring to the view, which has an additional column, and not
the function.  Attached v9 fixes that.

> I specifically verified that the new loaded library works with the old 
> version of the extension in the database. I have not noticed issues here.

Thanks for those extra checks.

> > 2.2% is a bit large but I think it is still acceptable because people using 
> > this feature
> > might take account that some overhead will happen for additional calling of 
> > a
> > gettime function.
> 
> I will be happy even with 10% overhead due to enabled track_planning... (but 
> in this case disabled by default) log_min_duration_statement = 0 with log 
> parsing is much more expensive.
> I think 1-2% is acceptable and we can set track_planning = on by default as 
> patch does.
> 
> > * Rows for executor time are changed from {total/min/max/mean/stddev}_time 
> > to {total/min/max/mean/stddev}_exec_time.
> 
> Maybe release it as 2.0 version instead of minor update 1.18?

I don't have an opinion on that, I'd be fine with any version.  I kept 1.18 in
the patch for now.
>From 16c888f7861300d08d21f3ebb97af4fad3ec1fa5 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v9 1/2] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d901dc4a50..fb772aa5cd 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -375,7 +375,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = pg_plan_query(query, cursorOptions, params);
+		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 00cf4ef268..38cbea385a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -751,7 +751,7 @@ execute_sql_string(const char *sql)
 		   NULL,
 		   0,
 		   NULL);
-		stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
 
 		foreach(lc2, stmt_list)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index c3954f3e24..e5a5eef102 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	CHECK_FOR_INTERRUPTS();
 
 	/* Plan the query which will generate data for the refresh. */
-	plan = pg_plan_query(query, 0, NULL);
+	plan = pg_plan_query(query, queryString, 0, NULL);
 
 	/*
 	 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 40be5069fe..6a2c233615 100644
--- a/src/backend/commands/portalcmds.c
+++ 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-20 Thread Sergei Kornilov
Hello

I was inactive for a while... Sorry.

>>  BTW, I recheck the patchset.
>>  I think codes are ready for committer but should we modify the 
>> documentation?
>>  {min,max,mean,stddev}_time is now obsoleted so it is better to modify it to
>>  {min,max,mean,stddev}_exec_time and add {min,max,mean,stddev}_plan_time.
>
> Oh indeed, I totally forgot about this. I'm attaching v8 with updated
> documentation that should match what was implemented since some versions.

Yet another is missed in docs: total_time

I specifically verified that the new loaded library works with the old version 
of the extension in the database. I have not noticed issues here.

> 2.2% is a bit large but I think it is still acceptable because people using 
> this feature
> might take account that some overhead will happen for additional calling of a
> gettime function.

I will be happy even with 10% overhead due to enabled track_planning... (but in 
this case disabled by default) log_min_duration_statement = 0 with log parsing 
is much more expensive.
I think 1-2% is acceptable and we can set track_planning = on by default as 
patch does.

> * Rows for executor time are changed from {total/min/max/mean/stddev}_time to 
> {total/min/max/mean/stddev}_exec_time.

Maybe release it as 2.0 version instead of minor update 1.18?

regards, Sergei




RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-16 Thread imai.yoshik...@fujitsu.com
On Mon, Mar 16, 2020 at 9:49 PM, Julien Rouhaud wrote:
> On Mon, Mar 16, 2020 at 01:34:11AM +, imai.yoshik...@fujitsu.com
> wrote:
> > On Sat, Mar 14, 2020 at 5:28 PM, Julien Rouhaud wrote:
> > > I don't think that's a correct assumption.  I obviously didn't read
> > > all of citus extension, but it looks like what's happening is that
> > > they get generate a custom Query from the original one, with all the
> > > modification needed for distributed execution and whatnot, which is
> > > then fed to the planner.  I think it's entirely mossible that the
> > > modified Query herits from a previously set queryid, while still not
> > > really having a query text.  And if citus doesn't do that, it doesn't 
> > > seem like
> an illegal use cuse anyway.
> >
> > Indeed. It can happen that queryid has some value while query_text is NULL.
> >
> >
> > > I'm instead attaching a v7 which removes the assert in
> > > pg_plan_query, and modify pgss_planner_hook to also ignore queries
> > > without a query text, as this seems the best option.
> >
> > Thank you.
> > It also seems to me that is the best option.
> 
> 
> Thanks Imai-san and PAscal for the feedback, it seems that we have an
> agreement!
> 
> 
> > BTW, I recheck the patchset.
> > I think codes are ready for committer but should we modify the
> documentation?
> > {min,max,mean,stddev}_time is now obsoleted so it is better to modify
> > it to {min,max,mean,stddev}_exec_time and add
> {min,max,mean,stddev}_plan_time.
> 
> 
> Oh indeed, I totally forgot about this.  I'm attaching v8 with updated
> documentation that should match what was implemented since some
> versions.

Okay, I checked it.
So I'll mark this as a ready for committer.

Thanks
--
Yoshikazu Imai




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-16 Thread Julien Rouhaud
On Mon, Mar 16, 2020 at 01:34:11AM +, imai.yoshik...@fujitsu.com wrote:
> On Sat, Mar 14, 2020 at 5:28 PM, Julien Rouhaud wrote:
> > I don't think that's a correct assumption.  I obviously didn't read all of
> > citus extension, but it looks like what's happening is that they get 
> > generate a
> > custom Query from the original one, with all the modification needed for
> > distributed execution and whatnot, which is then fed to the planner.  I 
> > think
> > it's entirely mossible that the modified Query herits from a previously set
> > queryid, while still not really having a query text.  And if citus doesn't 
> > do
> > that, it doesn't seem like an illegal use cuse anyway.
>
> Indeed. It can happen that queryid has some value while query_text is NULL.
>
>
> > I'm instead attaching a v7 which removes the assert in pg_plan_query, and
> > modify pgss_planner_hook to also ignore queries without a query text, as 
> > this
> > seems the best option.
>
> Thank you.
> It also seems to me that is the best option.


Thanks Imai-san and PAscal for the feedback, it seems that we have an
agreement!


> BTW, I recheck the patchset.
> I think codes are ready for committer but should we modify the documentation?
> {min,max,mean,stddev}_time is now obsoleted so it is better to modify it to
> {min,max,mean,stddev}_exec_time and add {min,max,mean,stddev}_plan_time.


Oh indeed, I totally forgot about this.  I'm attaching v8 with updated
documentation that should match what was implemented since some versions.
>From 16c888f7861300d08d21f3ebb97af4fad3ec1fa5 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v8 1/2] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d901dc4a50..fb772aa5cd 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -375,7 +375,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = pg_plan_query(query, cursorOptions, params);
+		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 00cf4ef268..38cbea385a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -751,7 +751,7 @@ execute_sql_string(const char *sql)
 		   NULL,
 		   0,
 		   NULL);
-		stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
 
 		foreach(lc2, stmt_list)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index c3954f3e24..e5a5eef102 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	CHECK_FOR_INTERRUPTS();
 
 	/* Plan the query which will generate data for the refresh. */
-	plan = pg_plan_query(query, 0, NULL);
+	plan = pg_plan_query(query, queryString, 0, NULL);
 
 	/*
 	 * Use a snapshot with an 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-16 Thread legrand legrand
> I'm instead attaching a v7 which removes the assert in pg_plan_query, and
> modify pgss_planner_hook to also ignore queries without a query text, as
> this
> seems the best option. 

Ok, it was the second solution, go on !



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-15 Thread imai.yoshik...@fujitsu.com
On Sat, Mar 14, 2020 at 5:28 PM, Julien Rouhaud wrote:
> On Sat, Mar 14, 2020 at 03:04:00AM -0700, legrand legrand wrote:
> > imai.yoshik...@fujitsu.com wrote
> > > On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
> > >> That's very interesting feedback, thanks!  I'm not a fan of giving a way
> > >> for
> > >> queries to say that they want to be ignored by pg_stat_statements, but
> > >> double
> > >> counting the planning counters seem even worse, so I'm +0.5 to accept
> > >> NULL
> > >> query string in the planner, incidentally making pgss ignore those.
> > >
> > > It is preferable that we can count various queries statistics as much as
> > > possible
> > > but if it causes overhead even when without using pg_stat_statements, we
> > > would
> > > not have to force them to set appropriate query_text.
> > > About settings a fixed string in query_text, I think it doesn't make much
> > > sense
> > > because users can't take any actions by seeing those queries' stats.
> > > Moreover, if
> > > we set a fixed string in query_text to avoid pg_stat_statement's errors,
> > > codes
> > > would be inexplicable for other developers who don't know there's such
> > > requirements.
> > > After all, I agree accepting NULL query string in the planner.
> > >
> > > I don't know it is useful but there are also codes that avoid an error
> > > when
> > > sourceText is NULL.
> > >
> > > executor_errposition(EState *estate, int location)
> > > {
> > > ...
> > > /* Can't do anything if source text is not available */
> > > if (estate == NULL || estate->es_sourceText == NULL)
> 
> 
> I'm wondering if that's really possible.  But pgss uses the QueryDesc, which
> should always have a query text (since pgss relies on that).

I cited those codes because I just wanted to say there's already an assumption
that query text in QueryDesc would be NULL, whether or not it is true.


> > My understanding of V5 patch, that checks for Non-Zero queryid,
> > manage properly case where sourceText is NULL.
> >
> > A NULL sourceText means that there was no Parsing for the associated
> > query, if there was no Parsing, there is no queryid (queryId=0),
> > and no planning counters update.
> >
> > It doesn't change pg_plan_query behaviour (no regression for Citus, IVM,
> > ...),
> > and was tested with success for IVM.
> >
> > If my understanding is wrong, then setting track_planning = false
> > would still be the work arround for the very rare (no core) extension(s)
> > that may hit the NULL query text assertion failure.
> >
> > What do you think about this ?
> 
> 
> I don't think that's a correct assumption.  I obviously didn't read all of
> citus extension, but it looks like what's happening is that they get generate 
> a
> custom Query from the original one, with all the modification needed for
> distributed execution and whatnot, which is then fed to the planner.  I think
> it's entirely mossible that the modified Query herits from a previously set
> queryid, while still not really having a query text.  And if citus doesn't do
> that, it doesn't seem like an illegal use cuse anyway.

Indeed. It can happen that queryid has some value while query_text is NULL.


> I'm instead attaching a v7 which removes the assert in pg_plan_query, and
> modify pgss_planner_hook to also ignore queries without a query text, as this
> seems the best option.

Thank you.
It also seems to me that is the best option.

BTW, I recheck the patchset.
I think codes are ready for committer but should we modify the documentation?
{min,max,mean,stddev}_time is now obsoleted so it is better to modify it to
{min,max,mean,stddev}_exec_time and add {min,max,mean,stddev}_plan_time.


--
Yoshikazu Imai





Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-14 Thread Julien Rouhaud
On Sat, Mar 14, 2020 at 03:04:00AM -0700, legrand legrand wrote:
> imai.yoshik...@fujitsu.com wrote
> > On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
> >> That's very interesting feedback, thanks!  I'm not a fan of giving a way
> >> for
> >> queries to say that they want to be ignored by pg_stat_statements, but
> >> double
> >> counting the planning counters seem even worse, so I'm +0.5 to accept
> >> NULL
> >> query string in the planner, incidentally making pgss ignore those.
> >
> > It is preferable that we can count various queries statistics as much as
> > possible
> > but if it causes overhead even when without using pg_stat_statements, we
> > would
> > not have to force them to set appropriate query_text.
> > About settings a fixed string in query_text, I think it doesn't make much
> > sense
> > because users can't take any actions by seeing those queries' stats.
> > Moreover, if
> > we set a fixed string in query_text to avoid pg_stat_statement's errors,
> > codes
> > would be inexplicable for other developers who don't know there's such
> > requirements.
> > After all, I agree accepting NULL query string in the planner.
> >
> > I don't know it is useful but there are also codes that avoid an error
> > when
> > sourceText is NULL.
> >
> > executor_errposition(EState *estate, int location)
> > {
> > ...
> > /* Can't do anything if source text is not available */
> > if (estate == NULL || estate->es_sourceText == NULL)


I'm wondering if that's really possible.  But pgss uses the QueryDesc, which
should always have a query text (since pgss relies on that).


> My understanding of V5 patch, that checks for Non-Zero queryid,
> manage properly case where sourceText is NULL.
>
> A NULL sourceText means that there was no Parsing for the associated
> query, if there was no Parsing, there is no queryid (queryId=0),
> and no planning counters update.
>
> It doesn't change pg_plan_query behaviour (no regression for Citus, IVM,
> ...),
> and was tested with success for IVM.
>
> If my understanding is wrong, then setting track_planning = false
> would still be the work arround for the very rare (no core) extension(s)
> that may hit the NULL query text assertion failure.
>
> What do you think about this ?


I don't think that's a correct assumption.  I obviously didn't read all of
citus extension, but it looks like what's happening is that they get generate a
custom Query from the original one, with all the modification needed for
distributed execution and whatnot, which is then fed to the planner.  I think
it's entirely mossible that the modified Query herits from a previously set
queryid, while still not really having a query text.  And if citus doesn't do
that, it doesn't seem like an illegal use cuse anyway.

I'm instead attaching a v7 which removes the assert in pg_plan_query, and
modify pgss_planner_hook to also ignore queries without a query text, as this
seems the best option.
>From 8412a8792bbf230406af33e4c1a93b425ac983ab Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v7 1/2] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 

RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-14 Thread legrand legrand
> I don't know it is useful but there are also codes that avoid an error when
> sourceText is NULL.

> executor_errposition(EState *estate, int location)
> {
> ...
>/* Can't do anything if source text is not available */
>if (estate == NULL || estate->es_sourceText == NULL)
> }


or maybe would you prefer to replace the Non-Zero queryid test 
by Non-NULL sourcetext one ?



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-14 Thread legrand legrand
imai.yoshik...@fujitsu.com wrote
> On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
>> On Thu, Mar 12, 2020 at 1:11 PM Marco Slot 

> marco.slot@

>  wrote:
>> > On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud 

> rjuju123@

> 
>> wrote:
>> > > There's at least the current version of IVM patchset that lacks the
>> > > querytext.  Looking at various extensions, I see that pg_background
>> > > and pglogical call pg_plan_query internally but shouldn't have any
>> > > issue providing the query text.  But there's also citus extension,
>> > > which don't keep around the query string at least when distributing
>> > > plans, which makes sense since it's of no use and they're heavily
>> > > modifying the original Query.  I think that citus folks opinion on
>> > > the subject would be useful, so I'm Cc-ing Marco.
>> >
>> > Most of the time we keep our Query * data structures in a form that
>> > can be deparsed back into a query string by a modified copy of
>> > ruleutils.c, so we could generate a correct query string if absolutely
>> > necessary, though there are performance-sensitive cases where we'd
>> > rather not have the deparsing overhead.
>> 
>> Yes, deparsing is probably too expensive for that use case.
>> 
>> > In case of INSERT..SELECT into a distributed table, we might call
>> > pg_plan_query on the SELECT part (Query *) and send the output into a
>> > DestReceiver that sends tuples into shards of the distributed table
>> > via COPY. The fact that SELECT does not show up in pg_stat_statements
>> > separately is generally fine because it's an implementation detail,
>> > and it would probably be a little confusing because the user never ran
>> > the SELECT query. Moreover, the call to pg_plan_query would already be
>> > reflected in the planning or execution time of the top-level query, so
>> > it would be double counted if it had its own entry.
>> 
>> Well, surprising statements can already appears in pg_stat_statements
>> when
>> you use some psql features, or if you have triggers as those will run
>> additional
>> queries under the hood.
>> 
>> The difference here is that since citus is a CustomNode, underlying calls
>> to
>> planner will be accounted for that node, and that's indeed annoying.  I
>> can see
>> that citus is doing some calls to spi_exec or
>> Executor* (in ExecuteLocalTaskPlan), which could also trigger
>> pg_stat_statements, but I don't know if a queryid is present there.
>> 
>> > Another case is when some of the shards turn out to be local to the
>> > server that handles the distributed query. In that case we plan the
>> > queries on those shards via pg_plan_query instead of deparsing and
>> > sending the query string to a remote server. It would be less
>> > confusing for these queries to show in pg_stat_statements, because the
>> > queries on the shards on remote servers will show up as well. However,
>> > this is a performance-sensitive code path where we'd rather avoid
>> > deparsing.
>> 
>> Agreed.
>> 
>> > In general, I'd prefer if there was no requirement to pass a correct
>> > query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
>> > if that does not lead to issues. Passing NULL to signal that the
>> > planner call should not be tracked separately does seem a bit cleaner.
>> 
>> That's very interesting feedback, thanks!  I'm not a fan of giving a way
>> for
>> queries to say that they want to be ignored by pg_stat_statements, but
>> double
>> counting the planning counters seem even worse, so I'm +0.5 to accept
>> NULL
>> query string in the planner, incidentally making pgss ignore those.
> 
> It is preferable that we can count various queries statistics as much as
> possible
> but if it causes overhead even when without using pg_stat_statements, we
> would
> not have to force them to set appropriate query_text.
> About settings a fixed string in query_text, I think it doesn't make much
> sense
> because users can't take any actions by seeing those queries' stats.
> Moreover, if
> we set a fixed string in query_text to avoid pg_stat_statement's errors,
> codes
> would be inexplicable for other developers who don't know there's such
> requirements.
> After all, I agree accepting NULL query string in the planner.
> 
> I don't know it is useful but there are also codes that avoid an error
> when
> sourceText is NULL.
> 
> executor_errposition(EState *estate, int location)
> {
> ...
> /* Can't do anything if source text is not available */
> if (estate == NULL || estate->es_sourceText == NULL)
> }
> 
> --
> Yoshikazu Imai

Hello Imai,

My understanding of V5 patch, that checks for Non-Zero queryid,
manage properly case where sourceText is NULL.

A NULL sourceText means that there was no Parsing for the associated 
query, if there was no Parsing, there is no queryid (queryId=0), 
and no planning counters update.

It doesn't change pg_plan_query behaviour (no regression for Citus, IVM,
...),
and was tested with success for IVM.

If my understanding 

RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-13 Thread imai.yoshik...@fujitsu.com
On Thu, Mar 12, 2020 at 6:37 PM, Julien Rouhaud wrote:
> On Thu, Mar 12, 2020 at 1:11 PM Marco Slot  wrote:
> > On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud 
> wrote:
> > > There's at least the current version of IVM patchset that lacks the
> > > querytext.  Looking at various extensions, I see that pg_background
> > > and pglogical call pg_plan_query internally but shouldn't have any
> > > issue providing the query text.  But there's also citus extension,
> > > which don't keep around the query string at least when distributing
> > > plans, which makes sense since it's of no use and they're heavily
> > > modifying the original Query.  I think that citus folks opinion on
> > > the subject would be useful, so I'm Cc-ing Marco.
> >
> > Most of the time we keep our Query * data structures in a form that
> > can be deparsed back into a query string by a modified copy of
> > ruleutils.c, so we could generate a correct query string if absolutely
> > necessary, though there are performance-sensitive cases where we'd
> > rather not have the deparsing overhead.
> 
> Yes, deparsing is probably too expensive for that use case.
> 
> > In case of INSERT..SELECT into a distributed table, we might call
> > pg_plan_query on the SELECT part (Query *) and send the output into a
> > DestReceiver that sends tuples into shards of the distributed table
> > via COPY. The fact that SELECT does not show up in pg_stat_statements
> > separately is generally fine because it's an implementation detail,
> > and it would probably be a little confusing because the user never ran
> > the SELECT query. Moreover, the call to pg_plan_query would already be
> > reflected in the planning or execution time of the top-level query, so
> > it would be double counted if it had its own entry.
> 
> Well, surprising statements can already appears in pg_stat_statements when
> you use some psql features, or if you have triggers as those will run 
> additional
> queries under the hood.
> 
> The difference here is that since citus is a CustomNode, underlying calls to
> planner will be accounted for that node, and that's indeed annoying.  I can 
> see
> that citus is doing some calls to spi_exec or
> Executor* (in ExecuteLocalTaskPlan), which could also trigger
> pg_stat_statements, but I don't know if a queryid is present there.
> 
> > Another case is when some of the shards turn out to be local to the
> > server that handles the distributed query. In that case we plan the
> > queries on those shards via pg_plan_query instead of deparsing and
> > sending the query string to a remote server. It would be less
> > confusing for these queries to show in pg_stat_statements, because the
> > queries on the shards on remote servers will show up as well. However,
> > this is a performance-sensitive code path where we'd rather avoid
> > deparsing.
> 
> Agreed.
> 
> > In general, I'd prefer if there was no requirement to pass a correct
> > query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
> > if that does not lead to issues. Passing NULL to signal that the
> > planner call should not be tracked separately does seem a bit cleaner.
> 
> That's very interesting feedback, thanks!  I'm not a fan of giving a way for
> queries to say that they want to be ignored by pg_stat_statements, but double
> counting the planning counters seem even worse, so I'm +0.5 to accept NULL
> query string in the planner, incidentally making pgss ignore those.

It is preferable that we can count various queries statistics as much as 
possible
but if it causes overhead even when without using pg_stat_statements, we would
not have to force them to set appropriate query_text.
About settings a fixed string in query_text, I think it doesn't make much sense
because users can't take any actions by seeing those queries' stats. Moreover, 
if
we set a fixed string in query_text to avoid pg_stat_statement's errors, codes
would be inexplicable for other developers who don't know there's such
requirements.
After all, I agree accepting NULL query string in the planner.

I don't know it is useful but there are also codes that avoid an error when
sourceText is NULL.

executor_errposition(EState *estate, int location)
{
...
/* Can't do anything if source text is not available */
if (estate == NULL || estate->es_sourceText == NULL)
}

--
Yoshikazu Imai


RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-13 Thread imai.yoshik...@fujitsu.com
On Thu, Mar 12, 2020 at 10:31 AM, Julien Rouhaud wrote:
> > * bufusage still only counts the buffer usage during executor.
> >
> >   Now we have the ability to count the buffer usage during planner but we
> keep
> >   the bufusage count the buffer usage during executor for now.
> 
> The bufusage should reflect the sum of planning and execution usage if
> track_planning is enabled.  Did I miss something there?

Ah, you're right. I somehow misunderstood it. Sorry for the annoying.

> > * We add Assert in pg_plan_query so that query_text will not be NULL
> > when executing planner.
> >
> >   There's no case query_text will be NULL with current sources. It is not
> >   ensured there will be any case query_text will be possibly NULL in the
> >   future though. Some considerations are needed by other people about
> this.
> 
> There's at least the current version of IVM patchset that lacks the querytext.

I saw IVM patchset but I thought it is difficult to impose them to give 
appropriate
querytext.


> Looking at various extensions, I see that pg_background and pglogical call
> pg_plan_query internally but shouldn't have any issue providing the query 
> text.
> But there's also citus extension, which don't keep around the query string at
> least when distributing plans, which makes sense since it's of no use and
> they're heavily modifying the original Query.  I think that citus folks 
> opinion on
> the subject would be useful, so I'm Cc-ing Marco.

Thank you for looking those codes. I will comment about this in another mail.

--
Yoshikazu Imai


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-12 Thread Julien Rouhaud
On Thu, Mar 12, 2020 at 1:11 PM Marco Slot  wrote:
>
> On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud  wrote:
> > There's at least the current version of IVM patchset that lacks the
> > querytext.  Looking at various extensions, I see that pg_background
> > and pglogical call pg_plan_query internally but shouldn't have any
> > issue providing the query text.  But there's also citus extension,
> > which don't keep around the query string at least when distributing
> > plans, which makes sense since it's of no use and they're heavily
> > modifying the original Query.  I think that citus folks opinion on the
> > subject would be useful, so I'm Cc-ing Marco.
>
> Most of the time we keep our Query * data structures in a form that
> can be deparsed back into a query string by a modified copy of
> ruleutils.c, so we could generate a correct query string if absolutely
> necessary, though there are performance-sensitive cases where we'd
> rather not have the deparsing overhead.

Yes, deparsing is probably too expensive for that use case.

> In case of INSERT..SELECT into a distributed table, we might call
> pg_plan_query on the SELECT part (Query *) and send the output into a
> DestReceiver that sends tuples into shards of the distributed table
> via COPY. The fact that SELECT does not show up in pg_stat_statements
> separately is generally fine because it's an implementation detail,
> and it would probably be a little confusing because the user never ran
> the SELECT query. Moreover, the call to pg_plan_query would already be
> reflected in the planning or execution time of the top-level query, so
> it would be double counted if it had its own entry.

Well, surprising statements can already appears in pg_stat_statements
when you use some psql features, or if you have triggers as those will
run additional queries under the hood.

The difference here is that since citus is a CustomNode, underlying
calls to planner will be accounted for that node, and that's indeed
annoying.  I can see that citus is doing some calls to spi_exec or
Executor* (in ExecuteLocalTaskPlan), which could also trigger
pg_stat_statements, but I don't know if a queryid is present there.

> Another case is when some of the shards turn out to be local to the
> server that handles the distributed query. In that case we plan the
> queries on those shards via pg_plan_query instead of deparsing and
> sending the query string to a remote server. It would be less
> confusing for these queries to show in pg_stat_statements, because the
> queries on the shards on remote servers will show up as well. However,
> this is a performance-sensitive code path where we'd rather avoid
> deparsing.

Agreed.

> In general, I'd prefer if there was no requirement to pass a correct
> query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
> if that does not lead to issues. Passing NULL to signal that the
> planner call should not be tracked separately does seem a bit cleaner.

That's very interesting feedback, thanks!  I'm not a fan of giving a
way for queries to say that they want to be ignored by
pg_stat_statements, but double counting the planning counters seem
even worse, so I'm +0.5 to accept NULL query string in the planner,
incidentally making pgss ignore those.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-12 Thread Marco Slot
On Thu, Mar 12, 2020 at 11:31 AM Julien Rouhaud  wrote:
> There's at least the current version of IVM patchset that lacks the
> querytext.  Looking at various extensions, I see that pg_background
> and pglogical call pg_plan_query internally but shouldn't have any
> issue providing the query text.  But there's also citus extension,
> which don't keep around the query string at least when distributing
> plans, which makes sense since it's of no use and they're heavily
> modifying the original Query.  I think that citus folks opinion on the
> subject would be useful, so I'm Cc-ing Marco.

Most of the time we keep our Query * data structures in a form that
can be deparsed back into a query string by a modified copy of
ruleutils.c, so we could generate a correct query string if absolutely
necessary, though there are performance-sensitive cases where we'd
rather not have the deparsing overhead.

In case of INSERT..SELECT into a distributed table, we might call
pg_plan_query on the SELECT part (Query *) and send the output into a
DestReceiver that sends tuples into shards of the distributed table
via COPY. The fact that SELECT does not show up in pg_stat_statements
separately is generally fine because it's an implementation detail,
and it would probably be a little confusing because the user never ran
the SELECT query. Moreover, the call to pg_plan_query would already be
reflected in the planning or execution time of the top-level query, so
it would be double counted if it had its own entry.

Another case is when some of the shards turn out to be local to the
server that handles the distributed query. In that case we plan the
queries on those shards via pg_plan_query instead of deparsing and
sending the query string to a remote server. It would be less
confusing for these queries to show in pg_stat_statements, because the
queries on the shards on remote servers will show up as well. However,
this is a performance-sensitive code path where we'd rather avoid
deparsing.

In general, I'd prefer if there was no requirement to pass a correct
query string. I'm ok with passing "SELECT 'citus_internal'" or just ""
if that does not lead to issues. Passing NULL to signal that the
planner call should not be tracked separately does seem a bit cleaner.

Marco




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-12 Thread Julien Rouhaud
On Thu, Mar 12, 2020 at 10:19 AM imai.yoshik...@fujitsu.com
 wrote:
>
> I'll summary code review of this thread.

Thanks for the summary!  I just have some minor comments

> [Performance]
>
> If track_planning is not enabled, performance will drop 0.2-0.6% which can be
> ignored. If track_planning is enabled, performance will drop 0-2.2%. 2.2% is a
> bit large but I think it is still acceptable because people using this feature
> might take account that some overhead will happen for additional calling of a
> gettime function.
>
> https://www.postgresql.org/message-id/CY4PR20MB12227E5CE199FFBB90C68A13BCB40%40CY4PR20MB1222.namprd20.prod.outlook.com
>
> [Values in each row]
>
> * bufusage still only counts the buffer usage during executor.
>
>   Now we have the ability to count the buffer usage during planner but we keep
>   the bufusage count the buffer usage during executor for now.

The bufusage should reflect the sum of planning and execution usage if
track_planning is enabled.  Did I miss something there?

> [Coding]
>
> * We add Assert in pg_plan_query so that query_text will not be NULL when
> executing planner.
>
>   There's no case query_text will be NULL with current sources. It is not
>   ensured there will be any case query_text will be possibly NULL in the
>   future though. Some considerations are needed by other people about this.

There's at least the current version of IVM patchset that lacks the
querytext.  Looking at various extensions, I see that pg_background
and pglogical call pg_plan_query internally but shouldn't have any
issue providing the query text.  But there's also citus extension,
which don't keep around the query string at least when distributing
plans, which makes sense since it's of no use and they're heavily
modifying the original Query.  I think that citus folks opinion on the
subject would be useful, so I'm Cc-ing Marco.




RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-12 Thread imai.yoshik...@fujitsu.com
On Thu, Mar 12, 2020 at 6:31 AM, Julien Rouhaud wrote:
> On Thu, Mar 12, 2020 at 05:28:38AM +, imai.yoshik...@fujitsu.com wrote:
> > Hi Julien,
> >
> > On Mon, Mar 9, 2020 at 10:32 AM, Julien Rouhaud wrote:
> > > On Thu, Mar 05, 2020 at 01:26:19PM -0700, legrand legrand wrote:
> > > > Please consider PG13 shortest path ;o)
> > > >
> > > > My one is  parse->queryId != UINT64CONST(0) in pgss_planner_hook().
> > > > It fixes IVM problem (verified),
> > > > and keep CTAS equal to pgss without planning counters (verified too).
> > >
> > > I still disagree that hiding this problem is the right fix, but since no 
> > > one
> > > objected here's a v5 with that behavior.  Hopefully this will be fixed in 
> > > v14.
> >
> > Is there any case that query_text will be NULL when executing pg_plan_query?
> 
> With current sources, there are no cases where the query text isn't provided
> AFAICS.
> 
> > If query_text will be NULL, we need to add codes to avoid errors in
> > pgss_store like as current patch. If query_text will not be NULL, we should
> > add Assert in pg_plan_query so that other developers can notice that they
> > would not mistakenly set query_text as NULL even without using pgss_planning
> > counter.
> 
> I totally agree.  I already had such assert locally, and regression tests pass
> without any problem.  I'm attaching a v6 with that extra assert.  If the
> first patch is committed, it'll now be a requirement to provide it.  Or if
> people think it's not, it'll make sure that we'll discuss it.

I see. I also don't come up with any case of query_text is NULL. Now we need
other people's opinion about here.


I'll summary code review of this thread.

[Performance]

If track_planning is not enabled, performance will drop 0.2-0.6% which can be
ignored. If track_planning is enabled, performance will drop 0-2.2%. 2.2% is a
bit large but I think it is still acceptable because people using this feature
might take account that some overhead will happen for additional calling of a
gettime function.

https://www.postgresql.org/message-id/CY4PR20MB12227E5CE199FFBB90C68A13BCB40%40CY4PR20MB1222.namprd20.prod.outlook.com

[Values in each row]

* Rows for planner time are added as {total/min/max/mean/stddev}_plan_time. 

  These are enough statistics for users who want to investigate the
  planning time.

* Rows for executor time are changed from {total/min/max/mean/stddev}_time
to {total/min/max/mean/stddev}_exec_time.

  Because of changing the name of the rows, there's no backward compatibility.
  Thus some users needs to modify scripts which using previous version of the
  pg_stat_statements. I believe it is not expensive to rewrite scripts along
  this change and it would be better to give an appropriate name to a row
  for future users.
  I also haven't seen big opposition about losing backward compatibility so
  far.

* We don't provide {total/min/max/mean/stddev}_time.

  Users can calculate total_time as total_plan_time + total_exec_time on their
  own. About {min/max/mean/stddev}_time, it will not make much sense
  because it is not ensured that executor follows planner and each counter
  value will be different largely between planner and executor.

* bufusage still only counts the buffer usage during executor.

  Now we have the ability to count the buffer usage during planner but we keep
  the bufusage count the buffer usage during executor for now.

[Coding]

* We add Assert in pg_plan_query so that query_text will not be NULL when
executing planner.

  There's no case query_text will be NULL with current sources. It is not
  ensured there will be any case query_text will be possibly NULL in the
  future though. Some considerations are needed by other people about this.


I don't have any other comments for now. After looking patches over again and
if there are no other comments about this patch, I'll set this patch as ready
for committer for getting more opinion.

--
Yoshikazu Imai





Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-12 Thread Julien Rouhaud
Hi Imai-san,

On Thu, Mar 12, 2020 at 05:28:38AM +, imai.yoshik...@fujitsu.com wrote:
> Hi Julien,
>
> On Mon, Mar 9, 2020 at 10:32 AM, Julien Rouhaud wrote:
> > On Thu, Mar 05, 2020 at 01:26:19PM -0700, legrand legrand wrote:
> > > Please consider PG13 shortest path ;o)
> > >
> > > My one is  parse->queryId != UINT64CONST(0) in pgss_planner_hook().
> > > It fixes IVM problem (verified),
> > > and keep CTAS equal to pgss without planning counters (verified too).
> >
> > I still disagree that hiding this problem is the right fix, but since no one
> > objected here's a v5 with that behavior.  Hopefully this will be fixed in 
> > v14.
>
> Is there any case that query_text will be NULL when executing pg_plan_query?

With current sources, there are no cases where the query text isn't provided
AFAICS.

> If query_text will be NULL, we need to add codes to avoid errors in
> pgss_store like as current patch. If query_text will not be NULL, we should
> add Assert in pg_plan_query so that other developers can notice that they
> would not mistakenly set query_text as NULL even without using pgss_planning
> counter.

I totally agree.  I already had such assert locally, and regression tests pass
without any problem.  I'm attaching a v6 with that extra assert.  If the
first patch is committed, it'll now be a requirement to provide it.  Or if
people think it's not, it'll make sure that we'll discuss it.
>From 3be06d43f0ef306f159a1c4ce0755d3bb0a4dd03 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v6 1/3] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 16 +++-
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbde9f88e7..efb1e0d03e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1580,7 +1580,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d901dc4a50..fb772aa5cd 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -375,7 +375,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = pg_plan_query(query, cursorOptions, params);
+		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 00cf4ef268..38cbea385a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -751,7 +751,7 @@ execute_sql_string(const char *sql)
 		   NULL,
 		   0,
 		   NULL);
-		stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
 
 		foreach(lc2, stmt_list)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index c3954f3e24..e5a5eef102 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	CHECK_FOR_INTERRUPTS();
 
 	/* Plan the query which will generate data for the refresh. */
-	plan = pg_plan_query(query, 0, NULL);
+	plan = pg_plan_query(query, queryString, 0, NULL);
 
 	/*
 	 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/portalcmds.c 

RE: Planning counters in pg_stat_statements (using pgss_store)

2020-03-11 Thread imai.yoshik...@fujitsu.com
Hi Julien,

On Mon, Mar 9, 2020 at 10:32 AM, Julien Rouhaud wrote:
> On Thu, Mar 05, 2020 at 01:26:19PM -0700, legrand legrand wrote:
> > Please consider PG13 shortest path ;o)
> >
> > My one is  parse->queryId != UINT64CONST(0) in pgss_planner_hook().
> > It fixes IVM problem (verified),
> > and keep CTAS equal to pgss without planning counters (verified too).
> 
> I still disagree that hiding this problem is the right fix, but since no one
> objected here's a v5 with that behavior.  Hopefully this will be fixed in v14.

Is there any case that query_text will be NULL when executing pg_plan_query?
If query_text will be NULL, we need to add codes to avoid errors in
pgss_store like as current patch. If query_text will not be NULL, we should
add Assert in pg_plan_query so that other developers can notice that they
would not mistakenly set query_text as NULL even without using pgss_planning
counter.

--
Yoshikazu Imai





Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-09 Thread Julien Rouhaud
On Thu, Mar 05, 2020 at 01:26:19PM -0700, legrand legrand wrote:
> Please consider PG13 shortest path ;o)
>
> My one is  parse->queryId != UINT64CONST(0) in pgss_planner_hook().
> It fixes IVM problem (verified),
> and keep CTAS equal to pgss without planning counters (verified too).

I still disagree that hiding this problem is the right fix, but since no one
objected here's a v5 with that behavior.  Hopefully this will be fixed in v14.
>From 1e9e7d7223482bb1976292176da7fa01deb84e93 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH v5 1/3] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index e79ede4cb8..62d7635a34 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1579,7 +1579,8 @@ BeginCopy(ParseState *pstate,
}
 
/* plan the query */
-   plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+   plan = pg_plan_query(query, pstate->p_sourcetext,
+
CURSOR_OPT_PARALLEL_OK, NULL);
 
/*
 * With row level security and a user using "COPY relation TO", 
we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 3a5676fb39..9febdc5165 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt 
*stmt,
Assert(query->commandType == CMD_SELECT);
 
/* plan the query */
-   plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+   plan = pg_plan_query(query, pstate->p_sourcetext,
+
CURSOR_OPT_PARALLEL_OK, params);
 
/*
 * Use a snapshot with an updated command ID to ensure this 
query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d901dc4a50..fb772aa5cd 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -375,7 +375,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
INSTR_TIME_SET_CURRENT(planstart);
 
/* plan the query */
-   plan = pg_plan_query(query, cursorOptions, params);
+   plan = pg_plan_query(query, queryString, cursorOptions, params);
 
INSTR_TIME_SET_CURRENT(planduration);
INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 00cf4ef268..38cbea385a 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -751,7 +751,7 @@ execute_sql_string(const char *sql)

   NULL,

   0,

   NULL);
-   stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, 
NULL);
+   stmt_list = pg_plan_queries(stmt_list, sql, 
CURSOR_OPT_PARALLEL_OK, NULL);
 
foreach(lc2, stmt_list)
{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index c3954f3e24..e5a5eef102 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
CHECK_FOR_INTERRUPTS();
 
/* Plan the query which will generate data for the refresh. */
-   plan = pg_plan_query(query, 0, NULL);
+   plan = pg_plan_query(query, queryString, 0, NULL);
 
/*
 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/portalcmds.c 
b/src/backend/commands/portalcmds.c
index 40be5069fe..6a2c233615 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -90,7 +90,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt 
*cstmt, ParamListInfo pa
elog(ERROR, "non-SELECT statement in DECLARE CURSOR");
 
/* Plan the query, applying 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-05 Thread legrand legrand
Never mind ...

Please consider PG13 shortest path ;o)

My one is  parse->queryId != UINT64CONST(0) in pgss_planner_hook().
It fixes IVM problem (verified), 
and keep CTAS equal to pgss without planning counters (verified too).

Regards
PAscal






--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-02 Thread Julien Rouhaud
On Mon, Mar 2, 2020 at 1:01 PM legrand legrand
 wrote:
>
> Julien Rouhaud wrote
> > On Sun, Mar 1, 2020 at 3:55 PM legrand legrand
> > 
>
> > legrand_legrand@
>
> >  wrote:
> >>
> >> >> I like the idea of adding a check for a non-zero queryId in the new
> >> >> pgss_planner_hook() (zero queryid shouldn't be reserved for
> >> >> utility_statements ?).
> >>
> >> > Some assert hit later, I can say that it's not always true.  For
> >> > instance a CREATE TABLE AS won't run parse analysis for the underlying
> >> > query, as this has already been done for the original statement, but
> >> > will still call the planner.  I'll change pgss_planner_hook to ignore
> >> > such cases, as pgss_store would otherwise think that it's a utility
> >> > statement.  That'll probably incidentally fix the IVM incompatibility.
> >>
> >> Today with or without test on parse->queryId != UINT64CONST(0),
> >> CTAS is collected as a utility_statement without planning counter.
> >> This seems to me respectig the rule, not sure that this needs any
> >> new (risky) change to the actual (quite stable) patch.
> >
> > But the queryid ends up not being computed the same way:
> >
> > # select queryid, query, plans, calls from pg_stat_statements where
> > query like 'create table%';
> >queryid   | query  | plans | calls
> > -++---+---
> >  8275950546884151007 | create table test as select 1; | 1 | 0
> >  7546197440584636081 | create table test as select 1  | 0 | 1
> > (2 rows)
> >
> > That's because CreateTableAsStmt->query doesn't have a query
> > location/len, as transformTopLevelStmt is only setting that for the
> > top level Query.  That's probably an oversight in ab1f0c82257, but I'm
> > not sure what's the best way to fix that.  Should we pass that
> > information to all transformXXX function, or let transformTopLevelStmt
> > handle that.
>
>
> arf, this was not the case in my testing env (that is not up to date) :o(
> and would not have appeared at all with the proposed test on
> parse->queryId != UINT64CONST(0) ...

I'm not sure what was the exact behavior you had, but that shouldn't
have changed since previous version.  The underlying query isn't a top
level statement, so maybe you didn't set pg_stat_statements.track =
'all'?




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-02 Thread legrand legrand
Julien Rouhaud wrote
> On Sun, Mar 1, 2020 at 3:55 PM legrand legrand
> 

> legrand_legrand@

>  wrote:
>>
>> >> I like the idea of adding a check for a non-zero queryId in the new
>> >> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> >> utility_statements ?).
>>
>> > Some assert hit later, I can say that it's not always true.  For
>> > instance a CREATE TABLE AS won't run parse analysis for the underlying
>> > query, as this has already been done for the original statement, but
>> > will still call the planner.  I'll change pgss_planner_hook to ignore
>> > such cases, as pgss_store would otherwise think that it's a utility
>> > statement.  That'll probably incidentally fix the IVM incompatibility.
>>
>> Today with or without test on parse->queryId != UINT64CONST(0),
>> CTAS is collected as a utility_statement without planning counter.
>> This seems to me respectig the rule, not sure that this needs any
>> new (risky) change to the actual (quite stable) patch.
> 
> But the queryid ends up not being computed the same way:
> 
> # select queryid, query, plans, calls from pg_stat_statements where
> query like 'create table%';
>queryid   | query  | plans | calls
> -++---+---
>  8275950546884151007 | create table test as select 1; | 1 | 0
>  7546197440584636081 | create table test as select 1  | 0 | 1
> (2 rows)
> 
> That's because CreateTableAsStmt->query doesn't have a query
> location/len, as transformTopLevelStmt is only setting that for the
> top level Query.  That's probably an oversight in ab1f0c82257, but I'm
> not sure what's the best way to fix that.  Should we pass that
> information to all transformXXX function, or let transformTopLevelStmt
> handle that.


arf, this was not the case in my testing env (that is not up to date) :o(
and would not have appeared at all with the proposed test on 
parse->queryId != UINT64CONST(0) ...




--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-02 Thread Julien Rouhaud
On Sun, Mar 1, 2020 at 3:55 PM legrand legrand
 wrote:
>
> >> I like the idea of adding a check for a non-zero queryId in the new
> >> pgss_planner_hook() (zero queryid shouldn't be reserved for
> >> utility_statements ?).
>
> > Some assert hit later, I can say that it's not always true.  For
> > instance a CREATE TABLE AS won't run parse analysis for the underlying
> > query, as this has already been done for the original statement, but
> > will still call the planner.  I'll change pgss_planner_hook to ignore
> > such cases, as pgss_store would otherwise think that it's a utility
> > statement.  That'll probably incidentally fix the IVM incompatibility.
>
> Today with or without test on parse->queryId != UINT64CONST(0),
> CTAS is collected as a utility_statement without planning counter.
> This seems to me respectig the rule, not sure that this needs any
> new (risky) change to the actual (quite stable) patch.

But the queryid ends up not being computed the same way:

# select queryid, query, plans, calls from pg_stat_statements where
query like 'create table%';
   queryid   | query  | plans | calls
-++---+---
 8275950546884151007 | create table test as select 1; | 1 | 0
 7546197440584636081 | create table test as select 1  | 0 | 1
(2 rows)

That's because CreateTableAsStmt->query doesn't have a query
location/len, as transformTopLevelStmt is only setting that for the
top level Query.  That's probably an oversight in ab1f0c82257, but I'm
not sure what's the best way to fix that.  Should we pass that
information to all transformXXX function, or let transformTopLevelStmt
handle that.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-01 Thread legrand legrand
>> I like the idea of adding a check for a non-zero queryId in the new
>> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> utility_statements ?).

> Some assert hit later, I can say that it's not always true.  For
> instance a CREATE TABLE AS won't run parse analysis for the underlying
> query, as this has already been done for the original statement, but
> will still call the planner.  I'll change pgss_planner_hook to ignore
> such cases, as pgss_store would otherwise think that it's a utility
> statement.  That'll probably incidentally fix the IVM incompatibility. 

Today with or without test on parse->queryId != UINT64CONST(0),
CTAS is collected as a utility_statement without planning counter.
This seems to me respectig the rule, not sure that this needs any 
new (risky) change to the actual (quite stable) patch. 





--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-01 Thread Julien Rouhaud
Hi,

On Fri, Feb 28, 2020 at 4:06 PM legrand legrand
 wrote:
>
> It seems IVM team does not consider this point as a priority ...

Well, IVM is a big project and I agree that fixing this issue isn't
the most urgent one, especially since there's no guarantee that this
pgss planning patch will be committed, or with the current behavior.

> We should not wait for them, if we want to keep a chance to be
> included in PG13.
>
> So we have to make this feature more robust, an assert failure being
> considered as a severe regression (even if this is not coming from pgss).

I'm still not convinced that handling NULL query string, as in
sometimes ignoring planning counters, is the right solution here. For
now all code is able to provide it (or at least all the code that goes
through make installcheck).   I'm wondering if it'd be better to
instead add a similar assert in pg_plan_query, to make sure that this
requirements is always met even without using pg_stat_statements, or
any other extension that would also rely on that.

I also realized that the last version of the patch I sent was a rebase
of the wrong version, I'll send the correct version soon.

> I like the idea of adding a check for a non-zero queryId in the new
> pgss_planner_hook() (zero queryid shouldn't be reserved for
> utility_statements ?).

Some assert hit later, I can say that it's not always true.  For
instance a CREATE TABLE AS won't run parse analysis for the underlying
query, as this has already been done for the original statement, but
will still call the planner.  I'll change pgss_planner_hook to ignore
such cases, as pgss_store would otherwise think that it's a utility
statement.  That'll probably incidentally fix the IVM incompatibility.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-02-28 Thread legrand legrand
Hi Julien,

>> But I would have prefered this new feature to work the same way with or
>> without track_planning activated ;o(

> Definitely, but fixing the issue in pgss (ignoring planner calls when
> we don't have a query text) means that pgss won't give an exhaustive
> view of activity anymore, so a fix in IVM would be a better solution.
> Let's wait and see if Nagata-san and other people involved in that
> have an opinion on it.

It seems IVM team does not consider this point as a priority ... 
We should not wait for them, if we want to keep a chance to be 
included in PG13.

So we have to make this feature more robust, an assert failure being 
considered as a severe regression (even if this is not coming from pgss).

I like the idea of adding a check for a non-zero queryId in the new 
pgss_planner_hook() (zero queryid shouldn't be reserved for
utility_statements ?).

Fixing the corner case where a query (with no sql text) can be planned 
without being parsed is an other subject that should be resolved in an 
other thread.

This kind of query was ignored in pgss, it should be ignored in pgss with 
planning counters.

Any thoughts ?
Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-21 Thread Julien Rouhaud
Hi,

On Sat, Jan 18, 2020 at 6:14 PM legrand legrand
 wrote:
>
> Hi Julien,
>
> bot is still unhappy
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/638701399
>
> portalcmds.c: In function ‘PerformCursorOpen’:
> portalcmds.c:93:7: error: ‘queryString’ may be used uninitialized in this
> function [-Werror=maybe-uninitialized]
>   plan = pg_plan_query(query, queryString, cstmt->options, params);
>^
> portalcmds.c:50:8: note: ‘queryString’ was declared here
>   char*queryString;
> ^
> cc1: all warnings being treated as errors
> : recipe for target 'portalcmds.o' failed
> make[3]: *** [portalcmds.o] Error 1
> make[3]: Leaving directory
> '/home/travis/build/postgresql-cfbot/postgresql/src/backend/commands'
> common.mk:39: recipe for target 'commands-recursive' failed
> make[2]: *** [commands-recursive] Error 2
> make[2]: *** Waiting for unfinished jobs

Indeed, thanks for the report!  PFA rebased v4 version of the patchset.
From 25176605492e3ad354f9be771c0a389324cc1780 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Thu, 28 Mar 2019 13:33:23 +0100
Subject: [PATCH 1/2] Pass query string to the planner

---
 src/backend/commands/copy.c  |  3 ++-
 src/backend/commands/createas.c  |  3 ++-
 src/backend/commands/explain.c   |  2 +-
 src/backend/commands/extension.c |  2 +-
 src/backend/commands/matview.c   |  2 +-
 src/backend/commands/portalcmds.c|  2 +-
 src/backend/executor/functions.c |  1 +
 src/backend/optimizer/plan/planner.c | 10 ++
 src/backend/tcop/postgres.c  | 13 -
 src/backend/utils/cache/plancache.c  |  3 ++-
 src/include/optimizer/optimizer.h|  3 ++-
 src/include/optimizer/planner.h  |  4 +++-
 src/include/tcop/tcopprot.h  |  6 --
 13 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 40a8ec1abd..77e4bea59f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1579,7 +1579,8 @@ BeginCopy(ParseState *pstate,
 		}
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, NULL);
 
 		/*
 		 * With row level security and a user using "COPY relation TO", we
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 9f387b5f5f..6558845194 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -329,7 +329,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
 		Assert(query->commandType == CMD_SELECT);
 
 		/* plan the query */
-		plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);
+		plan = pg_plan_query(query, pstate->p_sourcetext,
+			 CURSOR_OPT_PARALLEL_OK, params);
 
 		/*
 		 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d189b8d573..7a04d0396c 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -363,7 +363,7 @@ ExplainOneQuery(Query *query, int cursorOptions,
 		INSTR_TIME_SET_CURRENT(planstart);
 
 		/* plan the query */
-		plan = pg_plan_query(query, cursorOptions, params);
+		plan = pg_plan_query(query, queryString, cursorOptions, params);
 
 		INSTR_TIME_SET_CURRENT(planduration);
 		INSTR_TIME_SUBTRACT(planduration, planstart);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 01de398dcb..f1590e1b89 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -740,7 +740,7 @@ execute_sql_string(const char *sql)
 		   NULL,
 		   0,
 		   NULL);
-		stmt_list = pg_plan_queries(stmt_list, CURSOR_OPT_PARALLEL_OK, NULL);
+		stmt_list = pg_plan_queries(stmt_list, sql, CURSOR_OPT_PARALLEL_OK, NULL);
 
 		foreach(lc2, stmt_list)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 1ee37c1aeb..dfb6cf72e2 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -391,7 +391,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
 	CHECK_FOR_INTERRUPTS();
 
 	/* Plan the query which will generate data for the refresh. */
-	plan = pg_plan_query(query, 0, NULL);
+	plan = pg_plan_query(query, queryString, 0, NULL);
 
 	/*
 	 * Use a snapshot with an updated command ID to ensure this query sees
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 7e5c805a1e..9141e83402 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -90,7 +90,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
 		elog(ERROR, "non-SELECT statement in DECLARE CURSOR");
 
 	/* Plan the query, applying the specified options */
-	plan = pg_plan_query(query, cstmt->options, params);
+	plan = pg_plan_query(query, pstate->p_sourcetext, 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-18 Thread legrand legrand
Hi Julien,

bot is still unhappy
https://travis-ci.org/postgresql-cfbot/postgresql/builds/638701399

portalcmds.c: In function ‘PerformCursorOpen’:
portalcmds.c:93:7: error: ‘queryString’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]
  plan = pg_plan_query(query, queryString, cstmt->options, params);
   ^
portalcmds.c:50:8: note: ‘queryString’ was declared here
  char*queryString;
^
cc1: all warnings being treated as errors
: recipe for target 'portalcmds.o' failed
make[3]: *** [portalcmds.o] Error 1
make[3]: Leaving directory
'/home/travis/build/postgresql-cfbot/postgresql/src/backend/commands'
common.mk:39: recipe for target 'commands-recursive' failed
make[2]: *** [commands-recursive] Error 2
make[2]: *** Waiting for unfinished jobs

regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread Julien Rouhaud
On Sun, Jan 5, 2020 at 7:02 PM legrand legrand
 wrote:
>
> Julien Rouhaud wrote
> > On Sun, Jan 5, 2020 at 4:11 PM legrand legrand
> > 
>
> > legrand_legrand@
>
> >  wrote:
> >>
> >> Hi Julien,
> >>
> >> I would like to create a link with
> >> https://www.postgresql.org/message-id/
>
> > 1577490124579-0.post@.nabble
>
> >>
> >> where we met an ASSET FAILURE because query text was not initialized ...
> >>
> >> The question raised is:
> >>
> >> - should query text be always provided
> >> or
> >> - if not how to deal that case (in pgss).
> >
> > I'd think that since the query text was until now always provided,
> > there's no reason why this patch should change that.  That being said,
> > there has been other concerns raised wrt. temporary tables in the IVM
> > patchset, so ISTM that there might be important architectural changes
> > upcoming, so having to deal with this case in pgss is not rushed
> > (especially since handling that in pgss would be trivial), and can
> > help to catch issue with the query text pasing.
>
> IVM revealed that ASSERT,
> but IVM works fine with pg_stat_statements.track_planning = off.

Yes, but on the other hand the current IVM patchset also adds the only
pg_plan_query call that don't provide a query text.  I didn't see any
other possibility, and if there are other cases they're unfortunately
not covered by the full regression tests.

> There may be others parts of postgresql that would have workede fine as
> well.
>
> This means 2 things:
> - there is a (litle) risk to meet other assert failures when using planning
> counters in pgss,
> - we have an easy workarround to fix it (disabling track_planning).
>
> But I would have prefered this new feature to work the same way with or
> without track_planning activated ;o(

Definitely, but fixing the issue in pgss (ignoring planner calls when
we don't have a query text) means that pgss won't give an exhaustive
view of activity anymore, so a fix in IVM would be a better solution.
Let's wait and see if Nagata-san and other people involved in that
have an opinion on it.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread legrand legrand
Julien Rouhaud wrote
> On Sun, Jan 5, 2020 at 4:11 PM legrand legrand
> 

> legrand_legrand@

>  wrote:
>>
>> Hi Julien,
>>
>> I would like to create a link with
>> https://www.postgresql.org/message-id/

> 1577490124579-0.post@.nabble

>>
>> where we met an ASSET FAILURE because query text was not initialized ...
>>
>> The question raised is:
>>
>> - should query text be always provided
>> or
>> - if not how to deal that case (in pgss).
> 
> I'd think that since the query text was until now always provided,
> there's no reason why this patch should change that.  That being said,
> there has been other concerns raised wrt. temporary tables in the IVM
> patchset, so ISTM that there might be important architectural changes
> upcoming, so having to deal with this case in pgss is not rushed
> (especially since handling that in pgss would be trivial), and can
> help to catch issue with the query text pasing.

IVM revealed that ASSERT,
but IVM works fine with pg_stat_statements.track_planning = off.
There may be others parts of postgresql that would have workede fine as
well.

This means 2 things:
- there is a (litle) risk to meet other assert failures when using planning
counters in pgss,
- we have an easy workarround to fix it (disabling track_planning).

But I would have prefered this new feature to work the same way with or
without track_planning activated ;o(



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread Julien Rouhaud
On Sun, Jan 5, 2020 at 4:11 PM legrand legrand
 wrote:
>
> Hi Julien,
>
> I would like to create a link with
> https://www.postgresql.org/message-id/1577490124579-0.p...@n3.nabble.com
>
> where we met an ASSET FAILURE because query text was not initialized ...
>
> The question raised is:
>
> - should query text be always provided
> or
> - if not how to deal that case (in pgss).

I'd think that since the query text was until now always provided,
there's no reason why this patch should change that.  That being said,
there has been other concerns raised wrt. temporary tables in the IVM
patchset, so ISTM that there might be important architectural changes
upcoming, so having to deal with this case in pgss is not rushed
(especially since handling that in pgss would be trivial), and can
help to catch issue with the query text pasing.




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread legrand legrand
Hi Julien,

I would like to create a link with
https://www.postgresql.org/message-id/1577490124579-0.p...@n3.nabble.com

where we met an ASSET FAILURE because query text was not initialized ...

The question raised is:

- should query text be always provided 
or 
- if not how to deal that case (in pgss).

Regards
PAscal



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Planning counters in pg_stat_statements (using pgss_store)

2020-01-05 Thread Julien Rouhaud
On Fri, Nov 22, 2019 at 11:23 AM imai.yoshik...@fujitsu.com
 wrote:
>
> On Wed, Nov 20, 2019 at 4:55 PM, Julien Rouhaud wrote:
> > On Wed, Nov 20, 2019 at 2:06 AM imai.yoshik...@fujitsu.com 
> >  wrote:
> > >
> > > On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:
> > > > On Fri, Nov 15, 2019 at 2:00 AM imai.yoshik...@fujitsu.com 
> > > >  wrote:
> > > > >
> > > > > Actually I also don't have strong opinion but I thought someone
> > > > > would complain about renaming of those columns and
> > > > also some tools like monitoring which use those columns will not
> > > > work. If we use {total, min, max, mean, stddev}_time, someone might
> > > > mistakenly understand {total, min, max, mean, stddev}_time mean {total, 
> > > > min, max, mean, stddev} of planning and
> > execution.
> > > > > If I need to choose {total, min, max, mean, stddev}_time or
> > > > > {total, min, max, mean, stddev}_exec_time, I choose former
> > > > one because choosing best name is not worth destructing the existing 
> > > > scripts or tools.
> > > >
> > > > We could definitely keep (plan|exec)_time for the SRF, and have the
> > > > {total, min, max, mean, stddev}_time created by the view to be a sum
> > > > of planning + execution for each counter
> > >
> > > I might misunderstand but if we define {total, min, max, mean,
> > > stddev}_time is just a sum of planning + execution for each counter
> > > like "select total_plan_time + total_exec_time as total_time from
> > > pg_stat_statements", I wonder we can calculate stddev_time correctly.
> > > If we prepare variables in the codes to calculate those values, yes,
> > > we can correctly calculate those values even for the total_stddev.
> >
> > Yes you're right, this can't possibly work for most of the counters.
> > And also, since there's no guarantee that each execution will follow a 
> > planning, providing such global counters for
> > min/max/mean and stddev wouldn't make much sense.
>
> Ah, I see. Planning counts and execution counts differ.
> It might be difficult to redefine the meaning of {min, max, mean, 
> stddev}_time precisely, and even if we can redefine it correctly, it would 
> not be intuitive.

Thomas' automatic patch tester just warned me that the patchset is
broken since 3fd40b628c7db4, which removed the queryString from
ExecCreateTableAs.  New patch version that re-add the queryString, no
changes otherwise.


0001-Pass-query-string-to-the-planner-v3.patch
Description: Binary data


0002-Add-planning-counters-to-pg_stat_statements-v3.patch
Description: Binary data


RE: Planning counters in pg_stat_statements (using pgss_store)

2019-11-22 Thread imai.yoshik...@fujitsu.com
On Wed, Nov 20, 2019 at 4:55 PM, Julien Rouhaud wrote:
> On Wed, Nov 20, 2019 at 2:06 AM imai.yoshik...@fujitsu.com 
>  wrote:
> >
> > On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:
> > > On Fri, Nov 15, 2019 at 2:00 AM imai.yoshik...@fujitsu.com 
> > >  wrote:
> > > >
> > > > Actually I also don't have strong opinion but I thought someone
> > > > would complain about renaming of those columns and
> > > also some tools like monitoring which use those columns will not
> > > work. If we use {total, min, max, mean, stddev}_time, someone might
> > > mistakenly understand {total, min, max, mean, stddev}_time mean {total, 
> > > min, max, mean, stddev} of planning and
> execution.
> > > > If I need to choose {total, min, max, mean, stddev}_time or
> > > > {total, min, max, mean, stddev}_exec_time, I choose former
> > > one because choosing best name is not worth destructing the existing 
> > > scripts or tools.
> > >
> > > We could definitely keep (plan|exec)_time for the SRF, and have the
> > > {total, min, max, mean, stddev}_time created by the view to be a sum
> > > of planning + execution for each counter
> >
> > I might misunderstand but if we define {total, min, max, mean,
> > stddev}_time is just a sum of planning + execution for each counter
> > like "select total_plan_time + total_exec_time as total_time from
> > pg_stat_statements", I wonder we can calculate stddev_time correctly.
> > If we prepare variables in the codes to calculate those values, yes,
> > we can correctly calculate those values even for the total_stddev.
> 
> Yes you're right, this can't possibly work for most of the counters.
> And also, since there's no guarantee that each execution will follow a 
> planning, providing such global counters for
> min/max/mean and stddev wouldn't make much sense.

Ah, I see. Planning counts and execution counts differ.
It might be difficult to redefine the meaning of {min, max, mean, stddev}_time 
precisely, and even if we can redefine it correctly, it would not be intuitive.

--
Yoshikazu Imai


Re: Planning counters in pg_stat_statements (using pgss_store)

2019-11-20 Thread Julien Rouhaud
On Wed, Nov 20, 2019 at 2:06 AM imai.yoshik...@fujitsu.com
 wrote:
>
> On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:
> > On Fri, Nov 15, 2019 at 2:00 AM imai.yoshik...@fujitsu.com 
> >  wrote:
> > >
> > > Actually I also don't have strong opinion but I thought someone would 
> > > complain about renaming of those columns and
> > also some tools like monitoring which use those columns will not work. If 
> > we use {total, min, max, mean, stddev}_time,
> > someone might mistakenly understand {total, min, max, mean, stddev}_time 
> > mean {total, min, max, mean, stddev} of planning
> > and execution.
> > > If I need to choose {total, min, max, mean, stddev}_time or {total, min, 
> > > max, mean, stddev}_exec_time, I choose former
> > one because choosing best name is not worth destructing the existing 
> > scripts or tools.
> >
> > We could definitely keep (plan|exec)_time for the SRF, and have the {total, 
> > min, max, mean, stddev}_time created by
> > the view to be a sum of planning + execution for each counter
>
> I might misunderstand but if we define {total, min, max, mean, stddev}_time is
> just a sum of planning + execution for each counter like
> "select total_plan_time + total_exec_time as total_time from 
> pg_stat_statements",
> I wonder we can calculate stddev_time correctly. If we prepare variables in
> the codes to calculate those values, yes, we can correctly calculate those
> values even for the total_stddev.

Yes you're right, this can't possibly work for most of the counters.
And also, since there's no guarantee that each execution will follow a
planning, providing such global counters for min/max/mean and stddev
wouldn't make much sense.




RE: Planning counters in pg_stat_statements (using pgss_store)

2019-11-19 Thread imai.yoshik...@fujitsu.com
On Tue, Nov 19, 2019 at 2:27 PM, Julien Rouhaud wrote:
> On Fri, Nov 15, 2019 at 2:00 AM imai.yoshik...@fujitsu.com 
>  wrote:
> >
> > Actually I also don't have strong opinion but I thought someone would 
> > complain about renaming of those columns and
> also some tools like monitoring which use those columns will not work. If we 
> use {total, min, max, mean, stddev}_time,
> someone might mistakenly understand {total, min, max, mean, stddev}_time mean 
> {total, min, max, mean, stddev} of planning
> and execution.
> > If I need to choose {total, min, max, mean, stddev}_time or {total, min, 
> > max, mean, stddev}_exec_time, I choose former
> one because choosing best name is not worth destructing the existing scripts 
> or tools.
> 
> We could definitely keep (plan|exec)_time for the SRF, and have the {total, 
> min, max, mean, stddev}_time created by
> the view to be a sum of planning + execution for each counter

I might misunderstand but if we define {total, min, max, mean, stddev}_time is
just a sum of planning + execution for each counter like
"select total_plan_time + total_exec_time as total_time from 
pg_stat_statements",
I wonder we can calculate stddev_time correctly. If we prepare variables in
the codes to calculate those values, yes, we can correctly calculate those
values even for the total_stddev.

> and it doesn't sound like a bad idea if having even
> more columns in the view is not an issue.

I also wondered having many columns in the view is ok, but if it's ok, I agree
all of those columns are in the view. Only problem I can come up with is the
view will look bad with many columns, but it already looks bad because query
column values tend to be long and each row can't fit in the one row in the
console.

--
Yoshikazu Imai


Re: Planning counters in pg_stat_statements (using pgss_store)

2019-11-19 Thread Julien Rouhaud
On Fri, Nov 15, 2019 at 2:00 AM imai.yoshik...@fujitsu.com
 wrote:
>
> Actually I also don't have strong opinion but I thought someone would 
> complain about renaming of those columns and also some tools like monitoring 
> which use those columns will not work. If we use {total, min, max, mean, 
> stddev}_time, someone might mistakenly understand {total, min, max, mean, 
> stddev}_time mean {total, min, max, mean, stddev} of planning and execution.
> If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, 
> mean, stddev}_exec_time, I choose former one because choosing best name is 
> not worth destructing the existing scripts or tools.

We could definitely keep (plan|exec)_time for the SRF, and have the
{total, min, max, mean, stddev}_time created by the view to be a sum
of planning + execution for each counter, and it doesn't sound like a
bad idea if having even more columns in the view is not an issue.




RE: Planning counters in pg_stat_statements (using pgss_store)

2019-11-14 Thread imai.yoshik...@fujitsu.com
On Wed, Nov 13, 2019 at 10:50 AM, Julien Rouhaud wrote:
> On Tue, Nov 12, 2019 at 5:41 AM imai.yoshik...@fujitsu.com 
>  wrote:
> >
> > On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> > >
> > > I attach v3 patches implementing those counters.
> >
> > Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
> >
> >
> > > Note that to avoid duplicating some code (related to Welford's
> > > method), I switched some of the counters to arrays rather than
> > > scalar variables.  It unfortunately makes
> > > pg_stat_statements_internal() a little bit messy, but I hope that it's 
> > > still acceptable.
> >
> > Yeah, I also think it's acceptable, but I think the codes like below
> > one is more understandable than using for loop in
> > pg_stat_statements_internal(), although some codes will be duplicated.
> >
> > pg_stat_statements_internal():
> >
> > if (api_version >= PGSS_V1_8)
> > {
> > kind = PGSS_PLAN;
> > values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> > values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> > values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> > values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> > values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> > values[i++] = Float8GetDatumFast(stddev(tmp)); }
> >
> > kind = PGSS_EXEC;
> > values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> > values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> > if (api_version >= PGSS_V1_3)
> > {
> > values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> > values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> > values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> > values[i++] = Float8GetDatumFast(stddev(tmp)); }
> >
> >
> > stddev(Counters counters)
> > {
> > /*
> >  * Note we are calculating the population variance here, not the
> >  * sample variance, as we have data for the whole population, so
> >  * Bessel's correction is not used, and we don't divide by
> >  * tmp.calls - 1.
> >  */
> > if (counters.calls[kind] > 1)
> > return stddev = sqrt(counters.sum_var_time[kind] / 
> > counters.calls[kind]);
> >
> > return 0.0;
> > }
> 
> Yes, that's also a possibility (though this should also take the
> "kind" as parameter).  I wanted to avoid adding a new function and
> save some redundant code, but I can change it in the next version of
> the patch if needed.

Okay. It's not much a serious problem, so we can leave it as it is.


> > > While doing this refactoring
> > > I saw that previous patches were failing to accumulate the buffers used 
> > > during planning, which is now fixed.
> >
> > Checked.
> > Now buffer usage columns include buffer usage during planning and executing,
> > but if we turn off track_planning, buffer usage during planning is also not
> > tracked which is good for users who don't want to take into account of that.
> 
> Indeed.  Note that there's a similar discussion on adding planning
> buffer counters to explain in [1].  I'm unsure if merging planning and
> execution counters in the same columns is ok or not.

Storing buffer usage to different columns is useful to detect the cause of the 
problems if there are the cases many buffers are used during planning, but I'm 
also unsure those cases actually exist. 


> > What I'm concerned about is column names will not be backward-compatible.
> > {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> > correctly show the meaning of its value, but we can't use
> > {total, min, max, mean, stddev}_time anymore and it might be harmful for
> > some users.
> > I don't come up with any good idea for that...
> 
> Well, perhaps keeping the old {total, min, max, mean, stddev}_time
> would be ok, as those historically meant "execution".  I don't have a
> strong opinion here.

Actually I also don't have strong opinion but I thought someone would complain 
about renaming of those columns and also some tools like monitoring which use 
those columns will not work. If we use {total, min, max, mean, stddev}_time, 
someone might mistakenly understand {total, min, max, mean, stddev}_time mean 
{total, min, max, mean, stddev} of planning and execution. 
If I need to choose {total, min, max, mean, stddev}_time or {total, min, max, 
mean, stddev}_exec_time, I choose former one because choosing best name is not 
worth destructing the existing scripts or tools.

Thanks.
--
Yoshikazu Imai 


Re: Planning counters in pg_stat_statements (using pgss_store)

2019-11-13 Thread Julien Rouhaud
On Tue, Nov 12, 2019 at 5:41 AM imai.yoshik...@fujitsu.com
 wrote:
>
> On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> >
> > I attach v3 patches implementing those counters.
>
> Thanks for updating the patch! Now I can see min/max/mean/stddev plan time.
>
>
> > Note that to avoid duplicating some code (related to Welford's method),
> > I switched some of the counters to arrays rather than scalar variables.  It 
> > unfortunately makes
> > pg_stat_statements_internal() a little bit messy, but I hope that it's 
> > still acceptable.
>
> Yeah, I also think it's acceptable, but I think the codes like below one is 
> more
> understandable than using for loop in pg_stat_statements_internal(),
> although some codes will be duplicated.
>
> pg_stat_statements_internal():
>
> if (api_version >= PGSS_V1_8)
> {
> kind = PGSS_PLAN;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
> kind = PGSS_EXEC;
> values[i++] = Int64GetDatumFast(tmp.calls[kind]);
> values[i++] = Float8GetDatumFast(tmp.total_time[kind]);
> if (api_version >= PGSS_V1_3)
> {
> values[i++] = Float8GetDatumFast(tmp.min_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.max_time[kind]);
> values[i++] = Float8GetDatumFast(tmp.mean_time[kind]);
> values[i++] = Float8GetDatumFast(stddev(tmp));
> }
>
>
> stddev(Counters counters)
> {
> /*
>  * Note we are calculating the population variance here, not the
>  * sample variance, as we have data for the whole population, so
>  * Bessel's correction is not used, and we don't divide by
>  * tmp.calls - 1.
>  */
> if (counters.calls[kind] > 1)
> return stddev = sqrt(counters.sum_var_time[kind] / 
> counters.calls[kind]);
>
> return 0.0;
> }

Yes, that's also a possibility (though this should also take the
"kind" as parameter).  I wanted to avoid adding a new function and
save some redundant code, but I can change it in the next version of
the patch if needed.

> > While doing this refactoring
> > I saw that previous patches were failing to accumulate the buffers used 
> > during planning, which is now fixed.
>
> Checked.
> Now buffer usage columns include buffer usage during planning and executing,
> but if we turn off track_planning, buffer usage during planning is also not
> tracked which is good for users who don't want to take into account of that.

Indeed.  Note that there's a similar discussion on adding planning
buffer counters to explain in [1].  I'm unsure if merging planning and
execution counters in the same columns is ok or not.

> What I'm concerned about is column names will not be backward-compatible.
> {total, min, max, mean, stddev}_{plan, exec}_time are the best names which
> correctly show the meaning of its value, but we can't use
> {total, min, max, mean, stddev}_time anymore and it might be harmful for
> some users.
> I don't come up with any good idea for that...

Well, perhaps keeping the old {total, min, max, mean, stddev}_time
would be ok, as those historically meant "execution".  I don't have a
strong opinion here.

[1] 
https://www.postgresql.org/message-id/20191112205506.rvadbx2dnku3p...@alap3.anarazel.de




RE: Planning counters in pg_stat_statements (using pgss_store)

2019-11-11 Thread imai.yoshik...@fujitsu.com
On Sat, Nov 9, 2019 at 1:36 PM, Julien Rouhaud wrote:
> On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud  wrote:
> >
> > On Fri, Nov 8, 2019 at 5:35 AM imai.yoshik...@fujitsu.com
> >  wrote:
> > >
> > > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > > [0002 patch]
> > > > > In pgss_planner_hook:
> > > > >
> > > > > +   /* calc differences of buffer counters. */
> > > > > +   bufusage =
> > > > > + compute_buffer_counters(bufusage_start, pgBufferUsage);
> > > > > +
> > > > > +   /*
> > > > > +* we only store planning duration, query text has 
> > > > > been initialized
> > > > > +* during previous pgss_post_parse_analyze as it not 
> > > > > available inside
> > > > > +* pgss_planner_hook.
> > > > > +*/
> > > > > +   pgss_store(query_text,
> > > > >
> > > > > Do we need to calculate bufusage in here?
> > > > > We only store planning duration in the following pgss_store.
> > > >
> > > > Good point!  Postgres can definitely access some buffers while
> > > > planning a query (the most obvious example would be
> > > > get_actual_variable_range()), but as far as I can tell those were
> > > > previously not accounted for with pg_stat_statements as
> > > > queryDesc->totaltime->bufusage is only accumulating buffer usage
> > > > queryDesc->totaltime->in
> > > > the  executor, and indeed current patch also ignore such computed
> > > > counters.
> > > >
> > > > I think it would be better to keep this bufusage calculation
> > > > during planning and fix pgss_store() to process them, but this
> > > > would add
> > > slightly more overhead.
> > >
> > > Sorry for my late reply.
> > > I think overhead would be trivial and we can include bufusage of
> > > planning from the POV of overhead, but yeah, it will be backward
> > > incompatibility if we include them.
> >
> > Ok, let's keep planning's bufusage then.
> >
> > > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time
> > > to pg_stat_statements. Generally plan_time would be almost the same
> > > time in each execution for the same query, but there are some
> > > exceptions. For example, if we use prepare statements which uses
> > > partition tables, time differs largely between creating a general plan 
> > > and creating a custom plan.
> > >
> > > 1. Create partition table which has 1024 partitions.
> > > 2. Prepare select and update statements.
> > >   sel) prepare sel(int) as select * from pt where a = $1;
> > >   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > > 3. Execute each statement for 8 times.
> > >   3-1. Select from pg_stat_statements view after every execution.
> > > select query, plans, total_plan_time, calls, total_exec_time
> > > from pg_stat_statements where query like 'prepare%';
> > >
> > >
> > > Results of pg_stat_statements of sel) are
> > > query   | plans | total_plan_time | calls | 
> > > total_exec_time
> > > ---+---+-+---+-
> > >  prepare sel(int) as select * from pt where a = $1 | 1 |
> > > 0.164361 | 1 |0.004613
> > >  prepare sel(int) as select * from pt where a = $1 | 2 | 
> > > 0.277155004 | 2 |0.009447
> > >  prepare sel(int) as select * from pt where a = $1 | 3 | 
> > > 0.391001004 | 3 |0.014281
> > >  prepare sel(int) as select * from pt where a = $1 | 4 |
> > > 0.504004 | 4 |0.019265
> > >  prepare sel(int) as select * from pt where a = $1 | 5 |
> > > 0.628242 | 5 |0.024091
> > >  prepare sel(int) as select * from pt where a = $1 | 7 | 
> > > 24.2135863 | 6 |0.029144
> > >  prepare sel(int) as select * from pt where a = $1 | 8 | 
> > > 24.3689004 | 7 |0.034099
> > >  prepare sel(int) as select * from pt where a = $1 | 9 | 
> > > 24.5279563 | 8 |0.046152
> > >
> > >
> > > Results of pg_stat_statements of upd) are
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 | 
> > >0.280099 | 1 |0.013138
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 | 
> > >0.405416 | 2 | 0.01894
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 | 
> > >0.532361 | 3 |0.040716
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 | 
> > >0.671445 | 4 |0.046566
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 | 
> > >0.798531 | 5 | 0.0527290005
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 | 
> > >  896.915458 | 6 | 0.058886001
> > >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 | 
> > >  897.043512 | 7 |

Re: Planning counters in pg_stat_statements (using pgss_store)

2019-11-09 Thread Julien Rouhaud
On Fri, Nov 8, 2019 at 3:31 PM Julien Rouhaud  wrote:
>
> On Fri, Nov 8, 2019 at 5:35 AM imai.yoshik...@fujitsu.com
>  wrote:
> >
> > On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > > [0002 patch]
> > > > In pgss_planner_hook:
> > > >
> > > > +   /* calc differences of buffer counters. */
> > > > +   bufusage = compute_buffer_counters(bufusage_start, 
> > > > pgBufferUsage);
> > > > +
> > > > +   /*
> > > > +* we only store planning duration, query text has been 
> > > > initialized
> > > > +* during previous pgss_post_parse_analyze as it not 
> > > > available inside
> > > > +* pgss_planner_hook.
> > > > +*/
> > > > +   pgss_store(query_text,
> > > >
> > > > Do we need to calculate bufusage in here?
> > > > We only store planning duration in the following pgss_store.
> > >
> > > Good point!  Postgres can definitely access some buffers while
> > > planning a query (the most obvious example would be
> > > get_actual_variable_range()), but as far as I can tell those were
> > > previously not accounted for with pg_stat_statements as
> > > queryDesc->totaltime->bufusage is only accumulating buffer usage in
> > > the  executor, and indeed current patch also ignore such computed
> > > counters.
> > >
> > > I think it would be better to keep this bufusage calculation during
> > > planning and fix pgss_store() to process them, but this would add
> > slightly more overhead.
> >
> > Sorry for my late reply.
> > I think overhead would be trivial and we can include bufusage of planning 
> > from
> > the POV of overhead, but yeah, it will be backward incompatibility if we
> > include them.
>
> Ok, let's keep planning's bufusage then.
>
> > BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
> > pg_stat_statements. Generally plan_time would be almost the same time in 
> > each
> > execution for the same query, but there are some exceptions. For example, 
> > if we
> > use prepare statements which uses partition tables, time differs largely
> > between creating a general plan and creating a custom plan.
> >
> > 1. Create partition table which has 1024 partitions.
> > 2. Prepare select and update statements.
> >   sel) prepare sel(int) as select * from pt where a = $1;
> >   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> > 3. Execute each statement for 8 times.
> >   3-1. Select from pg_stat_statements view after every execution.
> > select query, plans, total_plan_time, calls, total_exec_time from 
> > pg_stat_statements where query like 'prepare%';
> >
> >
> > Results of pg_stat_statements of sel) are
> > query   | plans | total_plan_time | calls | 
> > total_exec_time
> > ---+---+-+---+-
> >  prepare sel(int) as select * from pt where a = $1 | 1 |
> > 0.164361 | 1 |0.004613
> >  prepare sel(int) as select * from pt where a = $1 | 2 | 
> > 0.277155004 | 2 |0.009447
> >  prepare sel(int) as select * from pt where a = $1 | 3 | 
> > 0.391001004 | 3 |0.014281
> >  prepare sel(int) as select * from pt where a = $1 | 4 |
> > 0.504004 | 4 |0.019265
> >  prepare sel(int) as select * from pt where a = $1 | 5 |
> > 0.628242 | 5 |0.024091
> >  prepare sel(int) as select * from pt where a = $1 | 7 | 
> > 24.2135863 | 6 |0.029144
> >  prepare sel(int) as select * from pt where a = $1 | 8 | 
> > 24.3689004 | 7 |0.034099
> >  prepare sel(int) as select * from pt where a = $1 | 9 | 
> > 24.5279563 | 8 |0.046152
> >
> >
> > Results of pg_stat_statements of upd) are
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 |   
> >  0.280099 | 1 |0.013138
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 |   
> >  0.405416 | 2 | 0.01894
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 |   
> >  0.532361 | 3 |0.040716
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 |   
> >  0.671445 | 4 |0.046566
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 |   
> >  0.798531 | 5 | 0.0527290005
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 |  
> > 896.915458 | 6 | 0.058886001
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 |  
> > 897.043512 | 7 |0.064446
> >  prepare upd(int, int) as update pt set a = $2 where a = $1 | 9 |  
> > 897.169711 | 8 |0.070644
> >
> >
> > How do you think about that?
>
> That's indeed a very valid point and something we should help user to
> investigate.  I'll submit an updated 

Re: Planning counters in pg_stat_statements (using pgss_store)

2019-11-08 Thread Julien Rouhaud
On Fri, Nov 8, 2019 at 5:35 AM imai.yoshik...@fujitsu.com
 wrote:
>
> On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > > [0002 patch]
> > > In pgss_planner_hook:
> > >
> > > +   /* calc differences of buffer counters. */
> > > +   bufusage = compute_buffer_counters(bufusage_start, 
> > > pgBufferUsage);
> > > +
> > > +   /*
> > > +* we only store planning duration, query text has been 
> > > initialized
> > > +* during previous pgss_post_parse_analyze as it not 
> > > available inside
> > > +* pgss_planner_hook.
> > > +*/
> > > +   pgss_store(query_text,
> > >
> > > Do we need to calculate bufusage in here?
> > > We only store planning duration in the following pgss_store.
> >
> > Good point!  Postgres can definitely access some buffers while
> > planning a query (the most obvious example would be
> > get_actual_variable_range()), but as far as I can tell those were
> > previously not accounted for with pg_stat_statements as
> > queryDesc->totaltime->bufusage is only accumulating buffer usage in
> > the  executor, and indeed current patch also ignore such computed
> > counters.
> >
> > I think it would be better to keep this bufusage calculation during
> > planning and fix pgss_store() to process them, but this would add
> slightly more overhead.
>
> Sorry for my late reply.
> I think overhead would be trivial and we can include bufusage of planning from
> the POV of overhead, but yeah, it will be backward incompatibility if we
> include them.

Ok, let's keep planning's bufusage then.

> BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
> pg_stat_statements. Generally plan_time would be almost the same time in each
> execution for the same query, but there are some exceptions. For example, if 
> we
> use prepare statements which uses partition tables, time differs largely
> between creating a general plan and creating a custom plan.
>
> 1. Create partition table which has 1024 partitions.
> 2. Prepare select and update statements.
>   sel) prepare sel(int) as select * from pt where a = $1;
>   upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
> 3. Execute each statement for 8 times.
>   3-1. Select from pg_stat_statements view after every execution.
> select query, plans, total_plan_time, calls, total_exec_time from 
> pg_stat_statements where query like 'prepare%';
>
>
> Results of pg_stat_statements of sel) are
> query   | plans | total_plan_time | calls | 
> total_exec_time
> ---+---+-+---+-
>  prepare sel(int) as select * from pt where a = $1 | 1 |0.164361 
> | 1 |0.004613
>  prepare sel(int) as select * from pt where a = $1 | 2 | 
> 0.277155004 | 2 |0.009447
>  prepare sel(int) as select * from pt where a = $1 | 3 | 
> 0.391001004 | 3 |0.014281
>  prepare sel(int) as select * from pt where a = $1 | 4 |0.504004 
> | 4 |0.019265
>  prepare sel(int) as select * from pt where a = $1 | 5 |0.628242 
> | 5 |0.024091
>  prepare sel(int) as select * from pt where a = $1 | 7 | 
> 24.2135863 | 6 |0.029144
>  prepare sel(int) as select * from pt where a = $1 | 8 | 
> 24.3689004 | 7 |0.034099
>  prepare sel(int) as select * from pt where a = $1 | 9 | 
> 24.5279563 | 8 |0.046152
>
>
> Results of pg_stat_statements of upd) are
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 |
> 0.280099 | 1 |0.013138
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 |
> 0.405416 | 2 | 0.01894
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 |
> 0.532361 | 3 |0.040716
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 |
> 0.671445 | 4 |0.046566
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 |
> 0.798531 | 5 | 0.0527290005
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 |  
> 896.915458 | 6 | 0.058886001
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 |  
> 897.043512 | 7 |0.064446
>  prepare upd(int, int) as update pt set a = $2 where a = $1 | 9 |  
> 897.169711 | 8 |0.070644
>
>
> How do you think about that?

That's indeed a very valid point and something we should help user to
investigate.  I'll submit an updated patch with support for
min/max/mean/stddev plan time shortly.




RE: Planning counters in pg_stat_statements (using pgss_store)

2019-11-07 Thread imai.yoshik...@fujitsu.com
On Tue, Sept 10, 2019 at 11:27 PM, Julien Rouhaud wrote:
> > [0002 patch]
> > In pgss_planner_hook:
> >
> > +   /* calc differences of buffer counters. */
> > +   bufusage = compute_buffer_counters(bufusage_start, 
> > pgBufferUsage);
> > +
> > +   /*
> > +* we only store planning duration, query text has been 
> > initialized
> > +* during previous pgss_post_parse_analyze as it not 
> > available inside
> > +* pgss_planner_hook.
> > +*/
> > +   pgss_store(query_text,
> >
> > Do we need to calculate bufusage in here?
> > We only store planning duration in the following pgss_store.
> 
> Good point!  Postgres can definitely access some buffers while
> planning a query (the most obvious example would be
> get_actual_variable_range()), but as far as I can tell those were
> previously not accounted for with pg_stat_statements as
> queryDesc->totaltime->bufusage is only accumulating buffer usage in
> the  executor, and indeed current patch also ignore such computed
> counters.
> 
> I think it would be better to keep this bufusage calculation during
> planning and fix pgss_store() to process them, but this would add
slightly more overhead.

Sorry for my late reply.
I think overhead would be trivial and we can include bufusage of planning from
the POV of overhead, but yeah, it will be backward incompatibility if we
include them.


BTW, ISTM it is good for including {min,max,mean,stddev}_plan_time to
pg_stat_statements. Generally plan_time would be almost the same time in each
execution for the same query, but there are some exceptions. For example, if we
use prepare statements which uses partition tables, time differs largely
between creating a general plan and creating a custom plan.

1. Create partition table which has 1024 partitions.
2. Prepare select and update statements.
  sel) prepare sel(int) as select * from pt where a = $1;
  upd) prepare upd(int, int) as update pt set a = $2 where a = $1;
3. Execute each statement for 8 times.
  3-1. Select from pg_stat_statements view after every execution.
select query, plans, total_plan_time, calls, total_exec_time from 
pg_stat_statements where query like 'prepare%';


Results of pg_stat_statements of sel) are
query   | plans | total_plan_time | calls | total_exec_time 
---+---+-+---+-
 prepare sel(int) as select * from pt where a = $1 | 1 |0.164361 |  
   1 |0.004613
 prepare sel(int) as select * from pt where a = $1 | 2 | 
0.277155004 | 2 |0.009447
 prepare sel(int) as select * from pt where a = $1 | 3 | 
0.391001004 | 3 |0.014281
 prepare sel(int) as select * from pt where a = $1 | 4 |0.504004 |  
   4 |0.019265
 prepare sel(int) as select * from pt where a = $1 | 5 |0.628242 |  
   5 |0.024091
 prepare sel(int) as select * from pt where a = $1 | 7 | 24.2135863 
| 6 |0.029144
 prepare sel(int) as select * from pt where a = $1 | 8 | 24.3689004 
| 7 |0.034099
 prepare sel(int) as select * from pt where a = $1 | 9 | 24.5279563 
| 8 |0.046152


Results of pg_stat_statements of upd) are
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 1 |
0.280099 | 1 |0.013138
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 2 |
0.405416 | 2 | 0.01894
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 3 |
0.532361 | 3 |0.040716
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 4 |
0.671445 | 4 |0.046566
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 5 |
0.798531 | 5 | 0.0527290005
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 7 |  
896.915458 | 6 | 0.058886001
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 8 |  
897.043512 | 7 |0.064446
 prepare upd(int, int) as update pt set a = $2 where a = $1 | 9 |  
897.169711 | 8 |0.070644


How do you think about that?


--
Yoshikazu Imai 



Re: Planning counters in pg_stat_statements (using pgss_store)

2019-09-10 Thread Julien Rouhaud
Hello,

On Sun, Sep 8, 2019 at 11:45 AM Imai Yoshikazu  wrote:
>
> I also saw the codes and have one comment.

Thanks for looking at this patch!

> [0002 patch]
> In pgss_planner_hook:
>
> +   /* calc differences of buffer counters. */
> +   bufusage = compute_buffer_counters(bufusage_start, 
> pgBufferUsage);
> +
> +   /*
> +* we only store planning duration, query text has been 
> initialized
> +* during previous pgss_post_parse_analyze as it not 
> available inside
> +* pgss_planner_hook.
> +*/
> +   pgss_store(query_text,
>
> Do we need to calculate bufusage in here?
> We only store planning duration in the following pgss_store.

Good point!  Postgres can definitely access some buffers while
planning a query (the most obvious example would be
get_actual_variable_range()), but as far as I can tell those were
previously not accounted for with pg_stat_statements as
queryDesc->totaltime->bufusage is only accumulating buffer usage in
the  executor, and indeed current patch also ignore such computed
counters.

I think it would be better to keep this bufusage calculation during
planning and fix pgss_store() to process them, but this would add
slightly more overhead.


> We only store planning duration in the following pgss_store.
>
> --
> Yoshikazu Imai




Re: Planning counters in pg_stat_statements (using pgss_store)

2019-09-10 Thread Julien Rouhaud
On Fri, Sep 6, 2019 at 4:19 PM Tomas Vondra
 wrote:
>
> On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:
> >
> > ...
> >
> >Results:
> >
> > test |   mode   | average_tps | degradation_perc
> >--+--+-+--
> > head_no_pgss | extended |   13816 |1.000
> > patch_not_loaded | extended |   13755 |0.996
> > head_track_none  | extended |   13607 |0.985
> > patch_track_none | extended |   13560 |0.981
> > head_track_top   | extended |   13277 |0.961
> > patch_track_top  | extended |   13189 |0.955
> > patch_track_planning | extended |   12983 |0.940
> > head_no_pgss | prepared |   29101 |1.000
> > head_track_none  | prepared |   28510 |0.980
> > patch_track_none | prepared |   28481 |0.979
> > patch_not_loaded | prepared |   28382 |0.975
> > patch_track_planning | prepared |   28046 |0.964
> > head_track_top   | prepared |   28035 |0.963
> > patch_track_top  | prepared |   27973 |0.961
> > head_no_pgss | simple   |   16733 |1.000
> > patch_not_loaded | simple   |   16552 |0.989
> > head_track_none  | simple   |   16452 |0.983
> > patch_track_none | simple   |   16365 |0.978
> > head_track_top   | simple   |   15867 |0.948
> > patch_track_top  | simple   |   15820 |0.945
> > patch_track_planning | simple   |   15739 |0.941
> >
> >So I found slight slowdown with track_planning = off compared to HEAD. 
> >Possibly just at the level of measurement error. I think this is ok.
> >track_planning = on also has no dramatic impact. In my opinion proposed 
> >design with pgss_store call is acceptable.
> >
>
> FWIW I've done some benchmarking on this too, with a single pgbench client
> running select-only test on a tiny database, in different modes (simple,
> extended, prepared). I've done that on two systems with different CPUs
> (spreadsheet with results attached).
>
> I don't see any performance regression - there are some small variations
> in both directions (say, ~1%) but that's well within the noise. So I think
> the patch is fine in this regard.

Thanks a lot Sergei and Tomas!  It's good to know that this patch
doesn't add significant overhead.




Re: Planning counters in pg_stat_statements (using pgss_store)

2019-09-08 Thread Imai Yoshikazu
Hello

On 2019/09/06 23:19, Tomas Vondra wrote:
> On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:
>>
>> ...
>>
>> Results:
>>
>>     test |   mode   | average_tps | degradation_perc
>> --+--+-+--
>> head_no_pgss | extended |   13816 |    1.000
>> patch_not_loaded | extended |   13755 |    0.996
>> head_track_none  | extended |   13607 |    0.985
>> patch_track_none | extended |   13560 |    0.981
>> head_track_top   | extended |   13277 |    0.961
>> patch_track_top  | extended |   13189 |    0.955
>> patch_track_planning | extended |   12983 |    0.940
>> head_no_pgss | prepared |   29101 |    1.000
>> head_track_none  | prepared |   28510 |    0.980
>> patch_track_none | prepared |   28481 |    0.979
>> patch_not_loaded | prepared |   28382 |    0.975
>> patch_track_planning | prepared |   28046 |    0.964
>> head_track_top   | prepared |   28035 |    0.963
>> patch_track_top  | prepared |   27973 |    0.961
>> head_no_pgss | simple   |   16733 |    1.000
>> patch_not_loaded | simple   |   16552 |    0.989
>> head_track_none  | simple   |   16452 |    0.983
>> patch_track_none | simple   |   16365 |    0.978
>> head_track_top   | simple   |   15867 |    0.948
>> patch_track_top  | simple   |   15820 |    0.945
>> patch_track_planning | simple   |   15739 |    0.941
>>
>> So I found slight slowdown with track_planning = off compared to HEAD. 
>> Possibly just at the level of measurement error. I think this is ok.
>> track_planning = on also has no dramatic impact. In my opinion 
>> proposed design with pgss_store call is acceptable.
>>
> 
> FWIW I've done some benchmarking on this too, with a single pgbench client
> running select-only test on a tiny database, in different modes (simple,
> extended, prepared). I've done that on two systems with different CPUs
> (spreadsheet with results attached).

Refering to Sergei's results, if a user, currently using pgss with 
tracking execute time, uses the new feature, a user will see 0~2.2% 
performance regression as below.

 >> head_track_top   | extended |   13277 |0.961
 >> patch_track_planning | extended |   12983 |0.940
 >> patch_track_planning | prepared |   28046 |0.964
 >> head_track_top   | prepared |   28035 |0.963
 >> head_track_top   | simple   |   15867 |0.948
 >> patch_track_planning | simple   |   15739 |0.941

If a user will not turn on the track_planning, a user will see 0.2-0.6% 
performance regression as below.

 >> head_track_top   | extended |   13277 |0.961
 >> patch_track_top  | extended |   13189 |0.955
 >> head_track_top   | prepared |   28035 |0.963
 >> patch_track_top  | prepared |   27973 |0.961
 >> head_track_top   | simple   |   15867 |0.948
 >> patch_track_top  | simple   |   15820 |0.945

> 
> I don't see any performance regression - there are some small variations
> in both directions (say, ~1%) but that's well within the noise. So I think
> the patch is fine in this regard.

+1


I also saw the codes and have one comment.

[0002 patch]
In pgss_planner_hook:

+   /* calc differences of buffer counters. */
+   bufusage = compute_buffer_counters(bufusage_start, 
pgBufferUsage);
+
+   /*
+* we only store planning duration, query text has been 
initialized
+* during previous pgss_post_parse_analyze as it not available 
inside
+* pgss_planner_hook.
+*/
+   pgss_store(query_text,

Do we need to calculate bufusage in here?
We only store planning duration in the following pgss_store.

--
Yoshikazu Imai


Re: Planning counters in pg_stat_statements (using pgss_store)

2019-09-06 Thread Tomas Vondra

On Wed, Sep 04, 2019 at 07:19:47PM +0300, Sergei Kornilov wrote:


...

Results:

test |   mode   | average_tps | degradation_perc
--+--+-+--
head_no_pgss | extended |   13816 |1.000
patch_not_loaded | extended |   13755 |0.996
head_track_none  | extended |   13607 |0.985
patch_track_none | extended |   13560 |0.981
head_track_top   | extended |   13277 |0.961
patch_track_top  | extended |   13189 |0.955
patch_track_planning | extended |   12983 |0.940
head_no_pgss | prepared |   29101 |1.000
head_track_none  | prepared |   28510 |0.980
patch_track_none | prepared |   28481 |0.979
patch_not_loaded | prepared |   28382 |0.975
patch_track_planning | prepared |   28046 |0.964
head_track_top   | prepared |   28035 |0.963
patch_track_top  | prepared |   27973 |0.961
head_no_pgss | simple   |   16733 |1.000
patch_not_loaded | simple   |   16552 |0.989
head_track_none  | simple   |   16452 |0.983
patch_track_none | simple   |   16365 |0.978
head_track_top   | simple   |   15867 |0.948
patch_track_top  | simple   |   15820 |0.945
patch_track_planning | simple   |   15739 |0.941

So I found slight slowdown with track_planning = off compared to HEAD. Possibly 
just at the level of measurement error. I think this is ok.
track_planning = on also has no dramatic impact. In my opinion proposed design 
with pgss_store call is acceptable.



FWIW I've done some benchmarking on this too, with a single pgbench client
running select-only test on a tiny database, in different modes (simple,
extended, prepared). I've done that on two systems with different CPUs
(spreadsheet with results attached).

I don't see any performance regression - there are some small variations
in both directions (say, ~1%) but that's well within the noise. So I think
the patch is fine in this regard.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





  1   2   >