Re: [SQL] Volatile functions in WITH

2013-02-20 Thread Ben Morrow
At 12PM -0800 on 20/02/13 you (Sergey Konoplev) wrote:
> On Wed, Feb 20, 2013 at 10:16 AM, Ben Morrow  wrote:
> >> If you got mixed up with plpgsql anyway what is the reason of making
> >> this WITH query constructions instead of implementing everything in a
> >> plpgsql trigger on DELETE on exp then?
> >
> > I'm not sure what you mean. "exp" isn't a table, it's a WITH CTE. The
> 
> Sorry, I meant "item" of course, "exp" was a typo.

OK.

> > statement is deleting some entries from "item", and replacing some of
> > them with new entries, based on the information in the "item_expired"
> > view. I can't do anything with a trigger on "item", since there are
> > other circumstances where items are deleted that shouldn't trigger
> > replacement.
> 
> Okay, I see.
> 
> If the case is specific you can make a simple plpgsql function that
> will process it like FOR _row IN DELETE ... RETORNING * LOOP ...
> RETURN NEXT _row; END LOOP;

Yes, I *know* I can write a function if I have to. I can also send the
whole lot down to the client and do the inserts from there, or use a
temporary table. I was hoping to avoid that, since the plain INSERT case
works perfectly well.

> > select *
> > from (select j.type, random() r from item j) i
> > where i.type = 1
> >
> > the planner will transform it into
> >
> > select i.type, random() r
> > from item i
> > where i.type = 1
> >
> > before planning, so even though random() is volatile it will only get
> > called for rows of item with type = 1.
> 
> Yes, functions are executed depending on the resulting plan "A query
> using a volatile function will re-evaluate the function at every row
> where its value is needed".
> 
> > I don't know if this happens, or may sometimes happen, or might happen
> > in the future, for rows eliminated because of DISTINCT.
> 
> It is a good point. Nothing guarantees it in a perspective. Optimizer
> guarantees a stable result but not the way it is reached.

Well, it makes functions which perform DML a lot less useful, so I
wonder whether this is intentional behaviour.

Ben



-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Volatile functions in WITH

2013-02-20 Thread Sergey Konoplev
On Wed, Feb 20, 2013 at 10:16 AM, Ben Morrow  wrote:
>> If you got mixed up with plpgsql anyway what is the reason of making
>> this WITH query constructions instead of implementing everything in a
>> plpgsql trigger on DELETE on exp then?
>
> I'm not sure what you mean. "exp" isn't a table, it's a WITH CTE. The

Sorry, I meant "item" of course, "exp" was a typo.

> statement is deleting some entries from "item", and replacing some of
> them with new entries, based on the information in the "item_expired"
> view. I can't do anything with a trigger on "item", since there are
> other circumstances where items are deleted that shouldn't trigger
> replacement.

Okay, I see.

If the case is specific you can make a simple plpgsql function that
will process it like FOR _row IN DELETE ... RETORNING * LOOP ...
RETURN NEXT _row; END LOOP;

> select *
> from (select j.type, random() r from item j) i
> where i.type = 1
>
> the planner will transform it into
>
> select i.type, random() r
> from item i
> where i.type = 1
>
> before planning, so even though random() is volatile it will only get
> called for rows of item with type = 1.

Yes, functions are executed depending on the resulting plan "A query
using a volatile function will re-evaluate the function at every row
where its value is needed".

> I don't know if this happens, or may sometimes happen, or might happen
> in the future, for rows eliminated because of DISTINCT.

It is a good point. Nothing guarantees it in a perspective. Optimizer
guarantees a stable result but not the way it is reached.

--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp

Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979

Skype: gray-hemp
Jabber: gray...@gmail.com


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Volatile functions in WITH

2013-02-20 Thread Ben Morrow
At  8AM -0800 on 20/02/13 you (Sergey Konoplev) wrote:
> On Wed, Feb 20, 2013 at 12:19 AM, Ben Morrow  wrote:
> > That's not reliable. A concurrent txn could insert a conflicting row
> > between the update and the insert, which would cause the insert to fail
> > with a unique constraint violation.
> 
> Okay I think I got it. The function catches exception when INSERTing
> and does UPDATE instead, correct?

Well, it tries the update first, but yes. It's pretty-much exactly the
example in the PL/pgSQL docs.

> If you got mixed up with plpgsql anyway what is the reason of making
> this WITH query constructions instead of implementing everything in a
> plpgsql trigger on DELETE on exp then?

I'm not sure what you mean. "exp" isn't a table, it's a WITH CTE. The
statement is deleting some entries from "item", and replacing some of
them with new entries, based on the information in the "item_expired"
view. I can't do anything with a trigger on "item", since there are
other circumstances where items are deleted that shouldn't trigger
replacement.

> > Yes, I can do experiments too; the alternatives I gave before both work
> > on my test database. What I was asking was whether they are guaranteed
> > to work in all situations, given that the planner can in principle see
> > that the extra table reference won't affect the result.
> 
> From the documentation "VOLATILE indicates that the function value can
> change even within a single table scan, so no optimizations can be
> made". So they are guaranteed to behave as you need in your last
> example.

Well, that's ambiguous. The return value can change even within a single
scan, so if you want 3 return values you have to make 3 calls. But what
if you don't actually need one of those three: is the planner allowed to
optimise the whole thing out? For instance, given

select * 
from (select j.type, random() r from item j) i
where i.type = 1

the planner will transform it into

select i.type, random() r
from item i
where i.type = 1

before planning, so even though random() is volatile it will only get
called for rows of item with type = 1. I don't know if this happens, or
may sometimes happen, or might happen in the future, for rows eliminated
because of DISTINCT.

(I think perhaps what I would ideally want is a PERFORM verb, which is
just like SELECT but says 'actually calculate all the rows implied here,
without pulling in additional filter conditions'. WITH would then have
to treat a top-level PERFORM inside a WITH the same as DML.)

Ben



-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Volatile functions in WITH

2013-02-20 Thread Sergey Konoplev
On Wed, Feb 20, 2013 at 12:19 AM, Ben Morrow  wrote:
> That's not reliable. A concurrent txn could insert a conflicting row
> between the update and the insert, which would cause the insert to fail
> with a unique constraint violation.

Okay I think I got it. The function catches exception when INSERTing
and does UPDATE instead, correct?

If you got mixed up with plpgsql anyway what is the reason of making
this WITH query constructions instead of implementing everything in a
plpgsql trigger on DELETE on exp then?

> Yes, I can do experiments too; the alternatives I gave before both work
> on my test database. What I was asking was whether they are guaranteed
> to work in all situations, given that the planner can in principle see
> that the extra table reference won't affect the result.

>From the documentation "VOLATILE indicates that the function value can
change even within a single table scan, so no optimizations can be
made". So they are guaranteed to behave as you need in your last
example.

What about optimizing it out in WITH - I would like to listen to
hackers' opinion, because for me it looks like a bug.

--
Sergey Konoplev
Database and Software Architect
http://www.linkedin.com/in/grayhemp

Phones:
USA +1 415 867 9984
Russia, Moscow +7 901 903 0499
Russia, Krasnodar +7 988 888 1979

Skype: gray-hemp
Jabber: gray...@gmail.com


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Volatile functions in WITH

2013-02-20 Thread Ben Morrow
Quoth gray...@gmail.com (Sergey Konoplev):
> On Sat, Feb 16, 2013 at 11:58 PM, Ben Morrow  wrote:
> > WITH "exp" AS ( -- as before
> > ),
> > "subst" AS (
> > SELECT add_item(e.basket, e.nref, e.count)
> > FROM "exp" e
> > WHERE e.nref IS NOT NULL
> > )
> > SELECT DISTINCT e.msg
> > FROM "exp" e
> 
> Alternatively I suppose you can try this one:
> 
>  WITH "exp" AS (
> DELETE FROM "item" i
> USING "item_expired" e
> WHERE e.oref = i.ref
> AND i.basket = $1
> RETURNING i.basket, e.oref, e.nref, i.count, e.msg
> ),
> "upd" AS (
> UPDATE "item" SET "count" = e.count
> FROM "exp" e
> WHERE e.nref IS NOT NULL
> AND ("basket", "nref") IS NOT DISTINCT FROM (e.basket, e.nref)
> RETURNING "basket", "nref"
> )
>"ins" AS (
> INSERT INTO "item" ("basket", "ref", "count")
> SELECT e.basket, e.nref, e.count
> FROM "exp" e LEFT JOIN "upd" u
> ON ("basket", "nref") IS NOT DISTINCT FROM (e.basket, e.nref)
> WHERE e.nref IS NOT NULL AND (u.basket, u.nref) IS NULL
> )
> SELECT DISTINCT e.msg
> FROM "exp" e

That's not reliable. A concurrent txn could insert a conflicting row
between the update and the insert, which would cause the insert to fail
with a unique constraint violation.

> > then the planner sees that the results of "subst" are not used, and
> > doesn't include it in the query plan at all.
> >
> > Is there any way I can tell WITH that add_item is actually a data-
> > modifying statement? Adding FOR UPDATE doesn't seem to help (I didn't
> > really expect it would.)
> 
> In this regard I would like to listen to gugrus' opinion too.
> 
> EXPLAIN ANALYZE WITH t AS (SELECT random()) SELECT 1;
>  QUERY PLAN
> 
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.003
> rows=1 loops=1)
>  Total runtime: 0.063 ms
> (2 rows)
> 
> EXPLAIN ANALYZE WITH t AS (SELECT random()) SELECT 1 from t;
>  QUERY PLAN
> 
>  CTE Scan on t  (cost=0.01..0.03 rows=1 width=0) (actual
> time=0.048..0.052 rows=1 loops=1)
>CTE t
>  ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
> time=0.038..0.039 rows=1 loops=1)
>  Total runtime: 0.131 ms
> (4 rows)
> 
> I couldn't manage to come to any solution except faking the reference
> in the resulting query:
> 
> WITH t AS (SELECT random()) SELECT 1 UNION ALL (SELECT 1 FROM t LIMIT 0);

Yes, I can do experiments too; the alternatives I gave before both work
on my test database. What I was asking was whether they are guaranteed
to work in all situations, given that the planner can in principle see
that the extra table reference won't affect the result.

Ben



-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql