Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-16 Thread Robert Haas
On Mon, Oct 16, 2017 at 2:36 AM, David Rowley
 wrote:
> I personally think it's great we're starting to see a useful feature
> materialise that can help with poor row estimates from the planner.

I agree.  My original post to this thread was more of a throw-away
comment than anything, and I'm not attacking the feature.  I didn't
think it was a very clear example and, TBH, I still don't.  But I
don't want to blow that up into a big debate on the virtues of this
feature, which I never intended to question, or on the correctness of
the patch, which I also did not intend to question.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-16 Thread David Rowley
On 15 October 2017 at 06:49, Robert Haas  wrote:
> On Fri, Oct 13, 2017 at 4:49 PM, David Rowley
>  wrote:
>> tps = 8282.481310 (including connections establishing)
>> tps = 8282.750821 (excluding connections establishing)
>
> vs.
>
>> tps = 8520.822410 (including connections establishing)
>> tps = 8521.132784 (excluding connections establishing)
>>
>> With the patch we are making use of the extended statistics, which we
>> do expect to be more work for the planner. Although, we didn't add
>> extended statistics to speed up the planner.
>
> Sure, I understand.  That's actually a pretty substantial regression -
> I guess that means that it's pretty important to avoid creating
> extended statistics that are not needed, at least for short-running
> queries.

To be honest, I ran that on a VM on my laptop. I was getting quite a
bit of noise. I just posted that to show that the 12x slowdown didn't
exist. I don't know what the actual slowdown is. I just know extended
stats are not free and that nobody expected that they ever would be.
The good news is that they're off by default and if the bad ever
outweighs the good then the fix for that starts with "DROP STATISTICS"

I personally think it's great we're starting to see a useful feature
materialise that can help with poor row estimates from the planner.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-14 Thread Tomas Vondra


On 10/14/2017 07:49 PM, Robert Haas wrote:
> On Fri, Oct 13, 2017 at 4:49 PM, David Rowley
>  wrote:
>> tps = 8282.481310 (including connections establishing)
>> tps = 8282.750821 (excluding connections establishing)
> 
> vs.
> 
>> tps = 8520.822410 (including connections establishing)
>> tps = 8521.132784 (excluding connections establishing)
>>
>> With the patch we are making use of the extended statistics, which
>> we do expect to be more work for the planner. Although, we didn't
>> add extended statistics to speed up the planner.
> 
> Sure, I understand. That's actually a pretty substantial regression
> - I guess that means that it's pretty important to avoid creating 
> extended statistics that are not needed, at least for short-running 
> queries.
> 

Well, it's only about 3% difference in a single run, which may be easily
due to slightly different binary layout, random noise etc. So I wouldn't
call that "substantial regression", at least not based on this one test.

I've done more thorough testing, and what I see is 1.0-1.2% drop, but on
a test that's rather extreme (statistics on empty table). So again,
likely well within noise, and on larger tables it'll get even less
significant.

But of course - it's not free. It's a bit more work we need to do. But
if you don't need multi-column statistics, don't create them. If your
queries are already fast, you probably don't need them at all.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-14 Thread Robert Haas
On Fri, Oct 13, 2017 at 4:49 PM, David Rowley
 wrote:
> tps = 8282.481310 (including connections establishing)
> tps = 8282.750821 (excluding connections establishing)

vs.

> tps = 8520.822410 (including connections establishing)
> tps = 8521.132784 (excluding connections establishing)
>
> With the patch we are making use of the extended statistics, which we
> do expect to be more work for the planner. Although, we didn't add
> extended statistics to speed up the planner.

Sure, I understand.  That's actually a pretty substantial regression -
I guess that means that it's pretty important to avoid creating
extended statistics that are not needed, at least for short-running
queries.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-14 Thread Robert Haas
On Fri, Oct 13, 2017 at 4:44 PM, Tomas Vondra
 wrote:
> On 10/13/2017 10:04 PM, Robert Haas wrote:
>> On Mon, Oct 9, 2017 at 11:03 PM, David Rowley
>>  wrote:
>>> -- Unpatched
>>>  Planning time: 0.184 ms
>>>  Execution time: 105.878 ms
>>>
>>> -- Patched
>>>  Planning time: 2.175 ms
>>>  Execution time: 106.326 ms
>>
>> This might not be the best example to show the advantages of the
>> patch, honestly.
>
> Not sure what exactly is your point? If you're suggesting this example
> is bad because the planning time increased from 0.184 to 2.175 ms, then
> perhaps consider the plans were likely generated on a assert-enabled
> build and on a laptop (both of which adds quite a bit of noise to
> occasional timings). The patch has no impact on planning time (at least
> I've been unable to measure any).

I don't really think there's a problem with the patch; I just noticed
that with the patch applied both the planning and execution time went
up.  I understand that's because this is a toy example, not a real
one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-13 Thread David Rowley
On 14 October 2017 at 09:04, Robert Haas  wrote:
> On Mon, Oct 9, 2017 at 11:03 PM, David Rowley
>  wrote:
>> -- Unpatched
>>  Planning time: 0.184 ms
>>  Execution time: 105.878 ms
>>
>> -- Patched
>>  Planning time: 2.175 ms
>>  Execution time: 106.326 ms
>
> This might not be the best example to show the advantages of the
> patch, honestly.

The focus was on the row estimate. I try to highlight that by
mentioning "Note rows=1 vs rows=98 in the Gather node.". I can't
imagine the test I added would have made the planner about 12 times
slower, but just for the record:

create table ab (a varchar, b varchar);
insert into ab select (x%1000)::varchar, (x%1)::Varchar from
generate_Series(1,100)x;
create statistics ab_a_b_stats (dependencies) on a,b from ab;
vacuum analyze ab;

$ cat a.sql
explain select * from ab where a = '1' and b = '1';

e9ef11ac8bb2acc2d2462fc17ec3291a959589e7 (Patched)

$ pgbench -f a.sql -T 60 -n
transaction type: a.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 496950
latency average = 0.121 ms
tps = 8282.481310 (including connections establishing)
tps = 8282.750821 (excluding connections establishing)

e9ef11ac8bb2acc2d2462fc17ec3291a959589e7~1 (Unpatched)

$ pgbench -f a.sql -T 60 -n
transaction type: a.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 511250
latency average = 0.117 ms
tps = 8520.822410 (including connections establishing)
tps = 8521.132784 (excluding connections establishing)

With the patch we are making use of the extended statistics, which we
do expect to be more work for the planner. Although, we didn't add
extended statistics to speed up the planner.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-13 Thread Tomas Vondra

On 10/13/2017 10:04 PM, Robert Haas wrote:
> On Mon, Oct 9, 2017 at 11:03 PM, David Rowley
>  wrote:
>> -- Unpatched
>>  Planning time: 0.184 ms
>>  Execution time: 105.878 ms
>>
>> -- Patched
>>  Planning time: 2.175 ms
>>  Execution time: 106.326 ms
> 
> This might not be the best example to show the advantages of the
> patch, honestly.
> 

Not sure what exactly is your point? If you're suggesting this example
is bad because the planning time increased from 0.184 to 2.175 ms, then
perhaps consider the plans were likely generated on a assert-enabled
build and on a laptop (both of which adds quite a bit of noise to
occasional timings). The patch has no impact on planning time (at least
I've been unable to measure any).

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-13 Thread Robert Haas
On Mon, Oct 9, 2017 at 11:03 PM, David Rowley
 wrote:
> -- Unpatched
>  Planning time: 0.184 ms
>  Execution time: 105.878 ms
>
> -- Patched
>  Planning time: 2.175 ms
>  Execution time: 106.326 ms

This might not be the best example to show the advantages of the
patch, honestly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread David Rowley
On 13 October 2017 at 04:56, Alvaro Herrera  wrote:
> I pushed your original fix.

Thanks for committing

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-12 Thread David Rowley
On 13 October 2017 at 02:17, Alvaro Herrera  wrote:
> I propose this slightly larger change.

hmm, this is not right. You're not checking that there's a Var below
the RelabelType.

I tried with:

explain select * from ab where (a||a)::varchar = '' and b = '';

and your code assumed the OpExpr was a Var.

The reason Tomas coded it the way it was coded is due to the fact that
there's already code that works exactly the same way in
clauselist_selectivity(). Personally, I don't particularly like that
code, but I'd rather not invent a new way to do the same thing.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Extended statistics is not working on Vars hidden under a RelabelType

2017-10-09 Thread David Rowley
Basically, $subject is causing us not to properly find matching
extended stats in this case.

The attached patch fixes it.

The following test cases is an example of the misbehaviour. Note
rows=1 vs rows=98 in the Gather node.

create table ab (a varchar, b varchar);
insert into ab select (x%1000)::varchar, (x%1)::Varchar from
generate_Series(1,100)x;
create statistics ab_a_b_stats (dependencies) on a,b from ab;
analyze ab;

-- Unpatched
explain analyze select * from ab where a = '1' and b = '1';
   QUERY PLAN
-
 Gather  (cost=1000.00..12466.10 rows=1 width=7) (actual
time=0.441..90.515 rows=100 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on ab  (cost=0.00..11466.00 rows=1 width=7)
(actual time=1.081..74.944 rows=33 loops=3)
 Filter: (((a)::text = '1'::text) AND ((b)::text = '1'::text))
 Rows Removed by Filter: 00
 Planning time: 0.184 ms
 Execution time: 105.878 ms
(8 rows)

-- Patched
explain analyze select * from ab where a = '1' and b = '1';
QUERY PLAN
--
 Gather  (cost=1000.00..12475.80 rows=98 width=7) (actual
time=1.076..92.595 rows=100 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on ab  (cost=0.00..11466.00 rows=41 width=7)
(actual time=0.491..77.833 rows=33 loops=3)
 Filter: (((a)::text = '1'::text) AND ((b)::text = '1'::text))
 Rows Removed by Filter: 00
 Planning time: 2.175 ms
 Execution time: 106.326 ms
(8 rows)

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


allow_relabelled_vars_in_dependency_stats.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers