Re: Partition prune with stable Expr

2020-10-15 Thread Andy Fan
On Mon, Sep 28, 2020 at 8:21 PM Andy Fan  wrote:

>
>
> On Mon, Sep 28, 2020 at 7:15 AM Tom Lane  wrote:
>
>> Andy Fan  writes:
>> > On Mon, Sep 28, 2020 at 4:46 AM David Rowley 
>> wrote:
>> >> Thanks for showing an interest in partition pruning. Unfortunately,
>> >> it's not possible to use stable functions to prune partitions during
>> >> planning.
>>
>> > Sigh.. I understand you now, I ignored the plan can be cached for later
>> use.
>> > Without that,  we should be able to prune with stable function.
>>
>> No, that's still wrong.  The contract for a stable function is that
>> its result won't change over execution of a single query; but that
>> says *execution*, not *planning and execution*.
>>
>
> I have a slightly different opinion about the impact of "cached the plan
> for later use will be wrong" now.  Generic plan will never be partition
> pruned plan since we don't know which partition to prune at plan time.
> So for any cached plan, it is not a  plan time partition pruned plan.
> Partition prune with stable expr is still  unacceptable even this is not
> an issue but hope the snapshot issue will be the only one issue to
> fix in future for this direction.   I'd like to know if I am wrong again.
>

Indeed I was wrong again.  I'd like to end this thread with this
understanding fix.

prepare s as select * from measurement where logdate = $1 ;
execute s(now());

In this case, even if we run the planning time partition prune with a stable
function, we can still get the correct result (ignore the different
snapshot case).
since the generic plan includes all the partitions and just do initial
partition
prune case.

However if we create the prepared stmt like prepare s as select * from
measurement where logdate = now();  Then the cached plan should be
wrong.  Actually this example is exactly the same as Daivd's example
at the beginning..

-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-28 Thread Andy Fan
On Mon, Sep 28, 2020 at 9:17 PM Thomas Kellerer  wrote:

> Andy Fan schrieb am 28.09.2020 um 02:54:
> > Well, that's very interesting.  Specific to my user case,
> > SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
>
> Why use to_date() at all for a constant value?
> Can't you use a standard ANSI date literal (date '2018-12-13')?
>
>
This is from lots of legacy code from Oracle and the real case is
to_date('20181213', 'mmdd').

Alternatively, you could use make_date(2018,12,13) which is marked as
> immutable.
>
>
Thanks, Nice to know make_date function.

-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-28 Thread Thomas Kellerer
Andy Fan schrieb am 28.09.2020 um 02:54:
> Well, that's very interesting.  Specific to my user case, 
> SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';

Why use to_date() at all for a constant value?
Can't you use a standard ANSI date literal (date '2018-12-13')?

Alternatively, you could use make_date(2018,12,13) which is marked as immutable.

Thomas




Re: Partition prune with stable Expr

2020-09-28 Thread Andy Fan
On Mon, Sep 28, 2020 at 7:15 AM Tom Lane  wrote:

> Andy Fan  writes:
> > On Mon, Sep 28, 2020 at 4:46 AM David Rowley 
> wrote:
> >> Thanks for showing an interest in partition pruning. Unfortunately,
> >> it's not possible to use stable functions to prune partitions during
> >> planning.
>
> > Sigh.. I understand you now, I ignored the plan can be cached for later
> use.
> > Without that,  we should be able to prune with stable function.
>
> No, that's still wrong.  The contract for a stable function is that
> its result won't change over execution of a single query; but that
> says *execution*, not *planning and execution*.
>

I have a slightly different opinion about the impact of "cached the plan
for later use will be wrong" now.  Generic plan will never be partition
pruned plan since we don't know which partition to prune at plan time.
So for any cached plan, it is not a  plan time partition pruned plan.
Partition prune with stable expr is still  unacceptable even this is not
an issue but hope the snapshot issue will be the only one issue to
fix in future for this direction.   I'd like to know if I am wrong again.

-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-28 Thread Juan José Santamaría Flecha
On Mon, Sep 28, 2020 at 9:23 AM Andy Fan  wrote:

> On Mon, Sep 28, 2020 at 2:44 PM Jesse Zhang  wrote:
>
>> On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
>> > On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
>> >>
>> >> Yeah.  It depends on the lc_time setting, and possibly also the
>> timezone
>> >> GUC.  (Admittedly, common values of the format string would not have
>> >> any lc_time dependency, but the immutability property is not
>> fine-grained
>> >> enough to recognize that.)
>> >
>> > Thanks for your reply. Even it has something on GUC or lc_time setting,
>> suppose
>> > it should be decided at planning time.  Do we have concerns about
>> changes
>> > between planning and execution?
>>
>> Immutable functions need to produce the same output for the same
>> argument values. None of the functions changed in the patch is
>> immutable: they are all stable because they all depend on GUC settings
>> (e.g. to_tsvector depends on default_text_search_config).
>>
>
> Thanks,  how can I misunderstand Tom's comment above!!
>

To make data type formatting functions immutable you would need a third
input argument with the locale used for that specific function call, as
proposed in [1].

As for the original use case, in the documentation there is a tip about
using to_date() to handle input formats that cannot be converted by simple
casting, and that for most standard date/time formats a cast is the easier
way to do so [2], I may also add that is better performing in Postgres.

[1]
https://www.postgresql.org/message-id/01fe23b2-7779-d3ee-056a-074a7385e248%40mail.de
[2] https://www.postgresql.org/docs/current/functions-formatting.html

Regards,

Juan José Santamaría Flecha

>
>


Re: Partition prune with stable Expr

2020-09-28 Thread Andy Fan
On Mon, Sep 28, 2020 at 2:44 PM Jesse Zhang  wrote:

> On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
> >
> >
> > On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
> >>
> >> Andy Fan writes:
> >> > Well, that's very interesting.  Specific to my user case,
> >> > SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
> >> > p has 1500+ partitions and planning takes lots of time, which is so
> same
> >> > with SELECT * FROM p WHERE pkey = '2018-12-13',  however the planning
> >> > time difference is so huge, that doesn't make sense in human view.
> Can
> >> > we do something for that?  to_date(text, text) should be a "immutable"
> >> > function IMO.  Does that have a semantic issue or other issues?
> >>
> >> Yeah.  It depends on the lc_time setting, and possibly also the timezone
> >> GUC.  (Admittedly, common values of the format string would not have
> >> any lc_time dependency, but the immutability property is not
> fine-grained
> >> enough to recognize that.)
> >>
> >> regards, tom lane
> >
> >
> > Thanks for your reply. Even it has something on GUC or lc_time setting,
> suppose
> > it should be decided at planning time.  Do we have concerns about changes
> > between planning and execution?
>
> Planner can be called at prepared statement creation time, like
>
> PREPARE yolo() AS SELECT * FROM foo WHERE pk = to_date(...);
>
> Here, there's an arbitrary gap between planning time, and execution.
>
> >
> > The attached patch marked some common formatting function as immutable,
> > only one partition prune test case needed fixing because of this. I only
> changed
> > to_char/to_date/to_timestamp,  however the whole list is below. I can
> change
> > all of them if needed.
> >
> >  proname | count
> > -+---
> >  to_ascii| 3
> >  to_char | 8
> >  to_date | 1
> >  to_hex  | 2
> >  to_json | 1
> >  to_jsonb| 1
> >  to_number   | 1
> >  to_regclass | 1
> >  to_regcollation | 1
> >  to_regnamespace | 1
> >  to_regoper  | 1
> >  to_regoperator  | 1
> >  to_regproc  | 1
> >  to_regprocedure | 1
> >  to_regrole  | 1
> >  to_regtype  | 1
> >  to_timestamp| 2
> >  to_tsquery  | 2
> >  to_tsvector | 6
> > (19 rows)
> >
> This patch is ridiculous.
>
> Immutable functions need to produce the same output for the same
> argument values. None of the functions changed in the patch is
> immutable: they are all stable because they all depend on GUC settings
> (e.g. to_tsvector depends on default_text_search_config).
>

Thanks,  how can I misunderstand Tom's comment above!!

-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-28 Thread Jesse Zhang
On Sun, Sep 27, 2020 at 7:52 PM Andy Fan wrote:
>
>
> On Mon, Sep 28, 2020 at 9:15 AM Tom Lane wrote:
>>
>> Andy Fan writes:
>> > Well, that's very interesting.  Specific to my user case,
>> > SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
>> > p has 1500+ partitions and planning takes lots of time, which is so same
>> > with SELECT * FROM p WHERE pkey = '2018-12-13',  however the planning
>> > time difference is so huge, that doesn't make sense in human view.  Can
>> > we do something for that?  to_date(text, text) should be a "immutable"
>> > function IMO.  Does that have a semantic issue or other issues?
>>
>> Yeah.  It depends on the lc_time setting, and possibly also the timezone
>> GUC.  (Admittedly, common values of the format string would not have
>> any lc_time dependency, but the immutability property is not fine-grained
>> enough to recognize that.)
>>
>> regards, tom lane
>
>
> Thanks for your reply. Even it has something on GUC or lc_time setting, 
> suppose
> it should be decided at planning time.  Do we have concerns about changes
> between planning and execution?

Planner can be called at prepared statement creation time, like

PREPARE yolo() AS SELECT * FROM foo WHERE pk = to_date(...);

Here, there's an arbitrary gap between planning time, and execution.

>
> The attached patch marked some common formatting function as immutable,
> only one partition prune test case needed fixing because of this. I only 
> changed
> to_char/to_date/to_timestamp,  however the whole list is below. I can change
> all of them if needed.
>
>  proname | count
> -+---
>  to_ascii| 3
>  to_char | 8
>  to_date | 1
>  to_hex  | 2
>  to_json | 1
>  to_jsonb| 1
>  to_number   | 1
>  to_regclass | 1
>  to_regcollation | 1
>  to_regnamespace | 1
>  to_regoper  | 1
>  to_regoperator  | 1
>  to_regproc  | 1
>  to_regprocedure | 1
>  to_regrole  | 1
>  to_regtype  | 1
>  to_timestamp| 2
>  to_tsquery  | 2
>  to_tsvector | 6
> (19 rows)
>
This patch is ridiculous.

Immutable functions need to produce the same output for the same
argument values. None of the functions changed in the patch is
immutable: they are all stable because they all depend on GUC settings
(e.g. to_tsvector depends on default_text_search_config).




Re: Partition prune with stable Expr

2020-09-27 Thread Andy Fan
On Mon, Sep 28, 2020 at 9:15 AM Tom Lane  wrote:

> Andy Fan  writes:
> > Well, that's very interesting.  Specific to my user case,
> > SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
> > p has 1500+ partitions and planning takes lots of time, which is so same
> > with SELECT * FROM p WHERE pkey = '2018-12-13',  however the planning
> > time difference is so huge, that doesn't make sense in human view.  Can
> > we do something for that?  to_date(text, text) should be a "immutable"
> > function IMO.  Does that have a semantic issue or other issues?
>
> Yeah.  It depends on the lc_time setting, and possibly also the timezone
> GUC.  (Admittedly, common values of the format string would not have
> any lc_time dependency, but the immutability property is not fine-grained
> enough to recognize that.)
>
> regards, tom lane
>

Thanks for your reply. Even it has something on GUC or lc_time setting,
suppose
it should be decided at planning time.  Do we have concerns about changes
between planning and execution?

The attached patch marked some common formatting function as immutable,
only one partition prune test case needed fixing because of this. I only
changed
to_char/to_date/to_timestamp,  however the whole list is below. I can
change
all of them if needed.

 proname | count
-+---
 to_ascii| 3
 to_char | 8
 to_date | 1
 to_hex  | 2
 to_json | 1
 to_jsonb| 1
 to_number   | 1
 to_regclass | 1
 to_regcollation | 1
 to_regnamespace | 1
 to_regoper  | 1
 to_regoperator  | 1
 to_regproc  | 1
 to_regprocedure | 1
 to_regrole  | 1
 to_regtype  | 1
 to_timestamp| 2
 to_tsquery  | 2
 to_tsvector | 6
(19 rows)

With this change, the exact issue on the beginning of this thread can be
fixed as
well with this patch.

-- 
Best Regards
Andy Fan


v1-0001-Mark-some-formating-builtin-function-as-immutable.patch
Description: Binary data


Re: Partition prune with stable Expr

2020-09-27 Thread Tom Lane
Andy Fan  writes:
> Well, that's very interesting.  Specific to my user case,
> SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
> p has 1500+ partitions and planning takes lots of time, which is so same
> with SELECT * FROM p WHERE pkey = '2018-12-13',  however the planning
> time difference is so huge, that doesn't make sense in human view.  Can
> we do something for that?  to_date(text, text) should be a "immutable"
> function IMO.  Does that have a semantic issue or other issues?

Yeah.  It depends on the lc_time setting, and possibly also the timezone
GUC.  (Admittedly, common values of the format string would not have
any lc_time dependency, but the immutability property is not fine-grained
enough to recognize that.)

regards, tom lane




Re: Partition prune with stable Expr

2020-09-27 Thread Andy Fan
On Mon, Sep 28, 2020 at 7:15 AM Tom Lane  wrote:

> Andy Fan  writes:
> > On Mon, Sep 28, 2020 at 4:46 AM David Rowley 
> wrote:
> >> Thanks for showing an interest in partition pruning. Unfortunately,
> >> it's not possible to use stable functions to prune partitions during
> >> planning.
>
> > Sigh.. I understand you now, I ignored the plan can be cached for later
> use.
> > Without that,  we should be able to prune with stable function.
>
> No, that's still wrong.  The contract for a stable function is that
> its result won't change over execution of a single query; but that
> says *execution*, not *planning and execution*.
>
> In particular, the canonical example of a stable function is one
> whose result depends on a database query.  The reason it can be
> considered stable is that within a single outer query, the MVCC
> snapshot it's used with won't change.  But we take a new snapshot
> (later than the planner's snapshot) when beginning execution.
>
> Somebody (Robert Haas, if memory serves, which it might not)
> tried to change that a few years ago.  It blew up pretty well,
> and was eventually reverted, because of undesirable side-effects
> on user-visible query semantics.  You'd have to check the archives
> for details.
>
> It's possible that we could make that work differently in serializable
> mode, thanks to the longer persistence of snapshots.  Not sure that
> it'd be desirable for planning to work differently in serializable
> mode, though.
>
> regards, tom lane
>

Well, that's very interesting.  Specific to my user case,
SELECT * FROM p WHERE pkey = to_date('2018-12-13', '-mm-dd)';
p has 1500+ partitions and planning takes lots of time, which is so same
with SELECT * FROM p WHERE pkey = '2018-12-13',  however the planning
time difference is so huge, that doesn't make sense in human view.  Can
we do something for that?  to_date(text, text) should be a "immutable"
function
IMO.  Does that have a semantic issue or other issues?


-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-27 Thread Tom Lane
Andy Fan  writes:
> On Mon, Sep 28, 2020 at 4:46 AM David Rowley  wrote:
>> Thanks for showing an interest in partition pruning. Unfortunately,
>> it's not possible to use stable functions to prune partitions during
>> planning.

> Sigh.. I understand you now, I ignored the plan can be cached for later use.
> Without that,  we should be able to prune with stable function.

No, that's still wrong.  The contract for a stable function is that
its result won't change over execution of a single query; but that
says *execution*, not *planning and execution*.

In particular, the canonical example of a stable function is one
whose result depends on a database query.  The reason it can be
considered stable is that within a single outer query, the MVCC
snapshot it's used with won't change.  But we take a new snapshot
(later than the planner's snapshot) when beginning execution.

Somebody (Robert Haas, if memory serves, which it might not)
tried to change that a few years ago.  It blew up pretty well,
and was eventually reverted, because of undesirable side-effects
on user-visible query semantics.  You'd have to check the archives
for details.

It's possible that we could make that work differently in serializable
mode, thanks to the longer persistence of snapshots.  Not sure that
it'd be desirable for planning to work differently in serializable
mode, though.

regards, tom lane




Re: Partition prune with stable Expr

2020-09-27 Thread Andy Fan
Thank you David for coming:)

On Mon, Sep 28, 2020 at 4:46 AM David Rowley  wrote:

> On Mon, 28 Sep 2020 at 08:59, Andy Fan  wrote:
> > I find we can't prune partitions in the planner if the qual is a stable
> function.
>
> > IMO, we should do it. Why not?
>
> Thanks for showing an interest in partition pruning. Unfortunately,
> it's not possible to use stable functions to prune partitions during
> planning.
>
> NOW() is one example of a function that's stable, but the return value
> will change over time. If we used the return value of that to perform
> partition pruning then we'd end up with a plan that's wrong over time.
>
>
Sigh.. I understand you now, I ignored the plan can be cached for later use.
Without that,  we should be able to prune with stable function.  I know the
run-time partition prune can help with this, but it can't help with
planning time.
I run into some cases that SELECT * FROM p WHERE pkey = to_date(..);
p has 1500+ partitions and planning takes lots of time.  and users are not
willing to remove the to_date('2018-11-11', '-mm-dd') style code since
too much and can't find out all of them at once.  Actually I think to_date
should
be marked as immuable rather than stable.


-- 
Best Regards
Andy Fan


Re: Partition prune with stable Expr

2020-09-27 Thread David Rowley
On Mon, 28 Sep 2020 at 08:59, Andy Fan  wrote:
> I find we can't prune partitions in the planner if the qual is a stable 
> function.

> IMO, we should do it. Why not?

Thanks for showing an interest in partition pruning. Unfortunately,
it's not possible to use stable functions to prune partitions during
planning.

NOW() is one example of a function that's stable, but the return value
will change over time. If we used the return value of that to perform
partition pruning then we'd end up with a plan that's wrong over time.

Here's an example:

create table rp (t timestamp) partition by range(t);
create table rp1 partition of rp for values from ('now'::timestamp) to
('now'::timestamp + '1 min'::interval);
create table rp2 partition of rp for values from ('now'::timestamp +
'1 min'::interval) to ('now'::timestamp + '2 min'::interval);
insert into rp select t from generate_Series('now'::timestamp,
'now'::timestamp + '1 min 59 sec'::interval, '1 sec'::interval) t;

prepare q1 as select count(*) from rp where t > now() and t < now() +
'10 sec'::interval;

Now, if you run the following command with your patch, it'll prune the
rp2 partition as it's not required for the WHERE clause (at the time
we planned). However, just wait 1 minute and execute the plan again.
Oops, my rows vanished!

execute q1; select pg_sleep(60); execute q1;

The 2nd execute should have returned 10 rows, the same as the first
(assuming you executed that directly after creating the tables)

Run-time partition pruning was invented just for this purpose.

David




Partition prune with stable Expr

2020-09-27 Thread Andy Fan
Hi:

I find we can't prune partitions in the planner if the qual is a stable
function.

CREATE TABLE measurement (
city_id int not null,
logdate date not null,
peaktempint,
unitsales   int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

postgres=# explain (costs off) select * from measurement
postgres-# where logdate = to_date('2006-03-02', '-mm-dd');
 QUERY PLAN
-
 Append
   Subplans Removed: 1  <-- Here
   ->  Seq Scan on measurement_y2006m03 measurement_1
 Filter: (logdate = to_date('2006-03-02'::text, '-mm-dd'::text))
(4 rows)

IMO, we should do it. Why not?  The attached is used to show the things
in my mind.

Btw,  why the to_date function is declared as stable rather than immutable
since it always delivers the same result for the same inputs.

-- 
Best Regards
Andy Fan


v1-0001-Allow-planner-prune-partitionn-with-stable-Expr.patch
Description: Binary data