Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and

2006-10-15 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
 The specific concern I have is large result sets, like 10s or 100s of MB
 (or more). We just added support for not buffering those in psql, so it
 seems like a step backwards to have the backend now buffering it (unless
 I'm confused on how a tuplestore works...)
 
 Well, a tuplestore can dump to disk, so at least you don't need to worry
 about out-of-memory considerations.

Would it be possible to kinda wrap the query execution into the
tuplestore interface, so that the tuples are generated on the fly when
fetched from the tuplestore, for large resultsets?

Thanks,
Markus
-- 
Markus Schaber | Logical TrackingTracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org



signature.asc
Description: OpenPGP digital signature


[HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
While investigating Merlin's bug report here:
http://archives.postgresql.org/pgsql-general/2006-10/msg00447.php
I realized that we've completely failed to consider the interactions
of $subject.  In particular, functions.c still thinks that SELECT is
the only type of query that can return rows.

ISTM that ideally, a query with RETURNING ought to act like a SELECT
for the purposes of a SQL function --- to wit, that the result rows are
discarded if it's not the last query in the function, and are returned
as the function result if it is.

The difficulty with this is that unlike SELECT, a query with RETURNING
might be queueing up AFTER triggers, which we shouldn't fire until the
query is fully executed.

Merlin's report shows that we've already got a problem in the back
branches with mishandling of after-trigger state, because we push an
AfterTrigger stack level at start of an SQL function command, and then
are willing to return from the function with that stack level still
active if it's a set-returning function.  I think we can fix this in
the back branches by the expedient of not pushing a stack level (ie,
not calling AfterTriggerBegin/EndQuery) unless it's a non-SELECT
command --- SELECT will never queue triggers, and we never return
partway through a non-SELECT command.  But this falls down for
RETURNING queries.

I thought about fixing this by extending the AfterTrigger state
structure to let it be a tree rather than just a stack, ie, we could
temporarily pop the function AfterTrigger status entry without executing
any queued triggers, and then push it back on when re-entering the
function.  This seems horribly messy however, and I'm not sure we could
still promise unsurprising order of trigger execution in complicated
cases.

I think the most promising answer may be to push RETURNING rows into a
TupleStore and then read them out from there, which is pretty much the
same approach we adopted for RETURNING queries inside portals.  This'd
allow the query to be executed completely, and its triggers fired,
before we return from the SQL function.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Jim C. Nasby
On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote:
 I think the most promising answer may be to push RETURNING rows into a
 TupleStore and then read them out from there, which is pretty much the
 same approach we adopted for RETURNING queries inside portals.  This'd
 allow the query to be executed completely, and its triggers fired,
 before we return from the SQL function.

Would this only affect RETURNING queries that are returning data via a
SRF? ISTM that pushing to a tuplestore is a lot of extra work for
simpler cases like INSERT INTO table DELETE FROM queue_table WHERE ...
RETURNING *; Even for the case of just going back to the client, it
seems like a fair amount of overhead.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Jonah H. Harris

On 10/12/06, Tom Lane [EMAIL PROTECTED] wrote:

I think the most promising answer may be to push RETURNING rows into a
TupleStore and then read them out from there, which is pretty much the
same approach we adopted for RETURNING queries inside portals.


It certainly sounds like the safest implementation and I can't think
of any simple way around using a tuplestore in this type of case.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread David Fetter
On Thu, Oct 12, 2006 at 01:28:18PM -0500, Jim C. Nasby wrote:
 On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote:
  I think the most promising answer may be to push RETURNING rows
  into a TupleStore and then read them out from there, which is
  pretty much the same approach we adopted for RETURNING queries
  inside portals.  This'd allow the query to be executed completely,
  and its triggers fired, before we return from the SQL function.
 
 Would this only affect RETURNING queries that are returning data via
 a SRF? ISTM that pushing to a tuplestore is a lot of extra work for
 simpler cases like INSERT INTO table DELETE FROM queue_table WHERE
 ...  RETURNING *; Even for the case of just going back to the
 client, it seems like a fair amount of overhead.

More generally, would this change impede promoting RETURNING to work
just as VALUES does in 8.2 (i.e. being able to join RETURNING results,
etc.)?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Thu, Oct 12, 2006 at 12:19:24PM -0400, Tom Lane wrote:
 I think the most promising answer may be to push RETURNING rows into a
 TupleStore and then read them out from there, which is pretty much the
 same approach we adopted for RETURNING queries inside portals.

 Would this only affect RETURNING queries that are returning data via a
 SRF?

Right, and specifically an SQL-language function.

 ISTM that pushing to a tuplestore is a lot of extra work for

I'm not entirely convinced of that --- the overhead of getting down
through functions.c and ExecutorRun into the per-tuple loop isn't
trivial either.  It wouldn't work at all without significant
restructuring, in fact, because as execMain stands we'd be firing per
statement triggers once per row if we tried to handle RETURNING queries
the same way that SQL functions currently handle SELECTs.  When you look
at the big picture there's a whole lot of call overhead that would go
away if SQL functions returned a tuplestore instead of a row at a time.
I was toying with the idea that we should make SELECTs return via a
tuplestore too, which would allow eliminating the special case of having
ExecutorRun return an individual tuple at all.  That's not a bugfix
though so I'll wait for 8.3 before thinking more about it ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Jim C. Nasby
On Thu, Oct 12, 2006 at 02:50:34PM -0400, Tom Lane wrote:
  ISTM that pushing to a tuplestore is a lot of extra work for
 
 I'm not entirely convinced of that --- the overhead of getting down
 through functions.c and ExecutorRun into the per-tuple loop isn't
 trivial either.  It wouldn't work at all without significant
 restructuring, in fact, because as execMain stands we'd be firing per
 statement triggers once per row if we tried to handle RETURNING queries
 the same way that SQL functions currently handle SELECTs.  When you look
 at the big picture there's a whole lot of call overhead that would go
 away if SQL functions returned a tuplestore instead of a row at a time.
 I was toying with the idea that we should make SELECTs return via a
 tuplestore too, which would allow eliminating the special case of having
 ExecutorRun return an individual tuple at all.  That's not a bugfix
 though so I'll wait for 8.3 before thinking more about it ...

The specific concern I have is large result sets, like 10s or 100s of MB
(or more). We just added support for not buffering those in psql, so it
seems like a step backwards to have the backend now buffering it (unless
I'm confused on how a tuplestore works...)
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 More generally, would this change impede promoting RETURNING to work
 just as VALUES does in 8.2 (i.e. being able to join RETURNING results,
 etc.)?

Making that happen would imply a whole lot of other changes; this issue
isn't the principal gating factor.  One of the main things I'd point to
right now, in view of this having all arisen from the question of when
triggers should fire, is where and when we'd fire BEFORE/AFTER STATEMENT
triggers for a RETURNING command embedded in a larger query.  For that
matter, the system has several not-easily-removed assumptions that a
SELECT command won't fire any triggers at all --- which would break down
if we allowed constructs like

SELECT ... FROM (INSERT ... RETURNING ...) ...

We do currently have the ability to make plpgsql functions send
RETURNING results back to a calling query, and with this change we could
say the same of plain SQL functions --- and in both cases we'll be
depending on a tuplestore buffer to keep things sane in terms of when
triggers fire.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 The specific concern I have is large result sets, like 10s or 100s of MB
 (or more). We just added support for not buffering those in psql, so it
 seems like a step backwards to have the backend now buffering it (unless
 I'm confused on how a tuplestore works...)

Well, a tuplestore can dump to disk, so at least you don't need to worry
about out-of-memory considerations.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Jim C. Nasby
On Thu, Oct 12, 2006 at 03:03:43PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  The specific concern I have is large result sets, like 10s or 100s of MB
  (or more). We just added support for not buffering those in psql, so it
  seems like a step backwards to have the backend now buffering it (unless
  I'm confused on how a tuplestore works...)
 
 Well, a tuplestore can dump to disk, so at least you don't need to worry
 about out-of-memory considerations.

Sure, it's just a lot of data to be shuffling around if we can avoid it.

Perhaps we could only do this if there's triggers on the table involved?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 Sure, it's just a lot of data to be shuffling around if we can avoid it.
 Perhaps we could only do this if there's triggers on the table involved?

Maybe, but it's awfully late in the 8.2 cycle to be worrying about
performance improvements for something that currently doesn't work at all.
I'm inclined to keep it simple for now; we can revisit the issue later
if anyone has problems in practice.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] SQL functions, INSERT/UPDATE/DELETE RETURNING, and triggers

2006-10-12 Thread Tom Lane
I wrote:
 ISTM that ideally, a query with RETURNING ought to act like a SELECT
 for the purposes of a SQL function --- to wit, that the result rows are
 discarded if it's not the last query in the function, and are returned
 as the function result if it is.

The current state of affairs is that the first part of that works as
expected, and the second part fails like so:

regression=# create function foo8(bigint,bigint) returns setof int8_tbl as
$$ insert into int8_tbl values($1,$2) returning * $$
language sql;
ERROR:  return type mismatch in function declared to return int8_tbl
DETAIL:  Function's final statement must be a SELECT.
CONTEXT:  SQL function foo8
regression=#

While this is certainly undesirable, it looks more like a missing
feature than a bug, especially since the documentation says exactly
that:

... the final command must be a SELECT that returns whatever is
specified as the function's return type.

I spent some time looking at what it would take to fix it, and I find
that the changes are a bit bigger than I want to be making in mid-beta.
So my recommendation is that for now we just add a TODO item:

* Allow SQL-language functions to return results from RETURNING queries

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match