Re: [HACKERS] Batch API for After Triggers

2013-06-18 Thread Simon Riggs
On 17 June 2013 20:53, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 9 June 2013 12:58, Craig Ringer cr...@2ndquadrant.com wrote:

 We don't currently have OLD and NEW relations so we're free to
 define how this works pretty freely.

 I think the best way, if we did do this, would be to have a
 number of different relations defined:

 OLD
 NEW
 INSERTED
 DELETED
 all of which would be defined same as main table

 and also one called
 UPDATED
 which would have two row vars called OLD and NEW
 so you would access it like e.g. IF UPDATED.OLD.id = 7

 Well, there is the SQL standard, which has a couple paragraphs on
 the topic which we might want to heed.

Yes, I already did in my proposal above.
OLD and NEW are all we need to fulfill the standard.

 For a delete there is just
 an old table; for an insert just a new one.  For an update you have
 both, with the same cardinality.  The rows in the old and new
 tables have a correspondence, but that is only visible to FOR EACH
 ROW triggers.

Yes, those are the relevant parts. SQL:2008 section 4.38 contains the
paragraphs that describe this (for later reference).

What the standard doesn't cover is recursive calls, which might
generate new events of different kinds. An UPDATE statement might have
caused DELETEs, for example, so we'd need a way to get access to DELETED
rows even when the OLD relation covers only the UPDATED rows.

For row level triggers we support macros like TRIGGER_FIRED_BY_INSERT.
Having an INSERTED relation is just the logical equivalent for statement
level triggers.

 For something like RI, why would you need to
 establish correspondence?  A row with the referenced key either
 exists after the statement completes, or it doesn't -- why would we
 care whether it is an updated version of the same row?

I wasn't doing this for RI specifically; I was looking at what we'd
need to provide a full facility.

It's not very easy to see how we can support RI via statement level triggers.

By definition, statement level triggers happen after all row level
triggers have fired. So implementing row level RI by using statement
level triggers that follow the standard isn't possible. The main
things we'd need to cope with would be recursive trigger calls, for
example DELETE cascades. The firing of the triggers generates more
trigger events, which delete more rows, and so on. If we want to implement RI
triggers using some form of set processing we would need to do that in
the middle of handling the after row events, i.e. execute a set, then
re-check for a new set of events and execute them.

Directly using the statement-level standard triggers isn't the way, I
conclude after some detailed thinking. But there could be some
intermediate form that makes sense.

 Syntax for how to refer to these is defined by the standard.

 As usual, I don't object to adding capabilities as long as the
 standard syntax is also supported with standard semantics.

Agreed.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Batch API for After Triggers

2013-06-18 Thread Simon Riggs
On 17 June 2013 23:30, Craig Ringer cr...@2ndquadrant.com wrote:

 INSERTED and DELETED could just be views...

Yes, that would be my suggestion.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-17 Thread Simon Riggs
On 9 June 2013 12:58, Craig Ringer cr...@2ndquadrant.com wrote:
 On 06/09/2013 04:58 PM, Simon Riggs wrote:
 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change.

 We don't currently have OLD and NEW relations so we're free to define
 how this works pretty freely.

 Rather than having OLD and NEW as separate relations, we could just have
 one OLD_AND_NEW relation. In that relation we exploit Pg's composite
 types to nest the old and new tuples in a single outer change record.

 OLD_AND_NEW would look to PL/PgSQL as if it were:

 CREATE TEMPORARY TABLE OLD_AND_NEW (
 OLD tabletype NOT NULL,
 NEW tabletype NOT NULL
 );

 ...though presumably without the ability to create indexes on it and the
 other things you can do to a real temp table. Though I can see cases
 where that'd be awfully handy too.

 For DELETE and INSERT we'd either provide different relations named OLD
 and NEW respectively, or we'd use OLD_AND_NEW with one field or the
 other blank. I'm not sure which would be best.

 Alternately, we could break the usual rules for relations and define OLD
 and NEW as ordered, so lock-step iteration would always return matching
 pairs of rows. That's useless in SQL since there's no way to achieve
 lock-step iteration, but if we provide a
 for_each_changed_row('some_function'::regproc) that scans them in
 lock-step and invokes `some_function` for each one...? (I haven't yet
 done enough in the core to have any idea if this approach is completely
 and absurdly impossible, or just ugly. Figured I'd throw it out there
 anyway.)


I think the best way, if we did do this, would be to have a number of
different relations defined:

OLD
NEW
INSERTED
DELETED
all of which would be defined same as main table

and also one called
UPDATED
which would have two row vars called OLD and NEW
so you would access it like e.g. IF UPDATED.OLD.id = 7
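
As a sketch of how that could look from a statement-level trigger
(hypothetical syntax only: the UPDATED relation does not exist, and the
table "accounts" and function "check_updates" are invented for
illustration):

```sql
-- Hypothetical sketch: only CREATE TRIGGER ... FOR EACH STATEMENT is
-- real syntax here; the UPDATED relation is the proposal above.
CREATE TRIGGER accounts_check
AFTER UPDATE ON accounts
FOR EACH STATEMENT
EXECUTE PROCEDURE check_updates();

-- Inside check_updates(), written in PL/pgSQL, one might then iterate
-- over the proposed UPDATED relation, whose rows pair the row vars:
--
--   FOR r IN SELECT * FROM UPDATED LOOP
--       IF r.OLD.id = 7 OR r.NEW.id = 7 THEN
--           RAISE NOTICE 'row with id 7 touched';
--       END IF;
--   END LOOP;
```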

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-17 Thread Pavel Stehule
2013/6/17 Simon Riggs si...@2ndquadrant.com:
 On 9 June 2013 12:58, Craig Ringer cr...@2ndquadrant.com wrote:
 On 06/09/2013 04:58 PM, Simon Riggs wrote:
 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change.

 We don't currently have OLD and NEW relations so we're free to define
 how this works pretty freely.

 Rather than having OLD and NEW as separate relations, we could just have
 one OLD_AND_NEW relation. In that relation we exploit Pg's composite
 types to nest the old and new tuples in a single outer change record.

 OLD_AND_NEW would look to PL/PgSQL as if it were:

 CREATE TEMPORARY TABLE OLD_AND_NEW (
 OLD tabletype NOT NULL,
 NEW tabletype NOT NULL
 );

 ...though presumably without the ability to create indexes on it and the
 other things you can do to a real temp table. Though I can see cases
 where that'd be awfully handy too.

 For DELETE and INSERT we'd either provide different relations named OLD
 and NEW respectively, or we'd use OLD_AND_NEW with one field or the
 other blank. I'm not sure which would be best.

 Alternately, we could break the usual rules for relations and define OLD
 and NEW as ordered, so lock-step iteration would always return matching
 pairs of rows. That's useless in SQL since there's no way to achieve
 lock-step iteration, but if we provide a
 for_each_changed_row('some_function'::regproc) that scans them in
 lock-step and invokes `some_function` for each one...? (I haven't yet
 done enough in the core to have any idea if this approach is completely
 and absurdly impossible, or just ugly. Figured I'd throw it out there
 anyway.)


 I think the best way, if we did do this, would be to have a number of
 different relations defined:

 OLD
 NEW
 INSERTED
 DELETED
 all of which would be defined same as main table

 and also one called
 UPDATED
 which would have two row vars called OLD and NEW
 so you would access it like e.g. IF UPDATED.OLD.id = 7


nice idea

+1

Pavel

 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services






Re: [HACKERS] Batch API for After Triggers

2013-06-17 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 On 9 June 2013 12:58, Craig Ringer cr...@2ndquadrant.com wrote:

 We don't currently have OLD and NEW relations so we're free to
 define how this works pretty freely.

 I think the best way, if we did do this, would be to have a
 number of different relations defined:

 OLD
 NEW
 INSERTED
 DELETED
 all of which would be defined same as main table

 and also one called
 UPDATED
 which would have two row vars called OLD and NEW
 so you would access it like e.g. IF UPDATED.OLD.id = 7

Well, there is the SQL standard, which has a couple paragraphs on
the topic which we might want to heed.  For a delete there is just
an old table; for an insert just a new one.  For an update you have
both, with the same cardinality.  The rows in the old and new
tables have a correspondence, but that is only visible to FOR EACH
ROW triggers.  For something like RI, why would you need to
establish correspondence?  A row with the referenced key either
exists after the statement completes, or it doesn't -- why would we
care whether it is an updated version of the same row?

Syntax for how to refer to these is defined by the standard.

As usual, I don't object to adding capabilities as long as the
standard syntax is also supported with standard semantics.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Batch API for After Triggers

2013-06-17 Thread Craig Ringer
On 06/18/2013 01:25 AM, Pavel Stehule wrote:
  and also one called
  UPDATED
  which would have two row vars called OLD and NEW
  so you would access it like e.g. IF UPDATED.OLD.id = 7
 
 nice idea

 +1
Much better naming than OLD_AND_NEW.

I'm not so sure about

OLD
NEW
INSERTED
DELETED

in that I imagine we'd want to pick one pair and stick with it. Since
using INSERTED / DELETED makes UPDATED make sense, and since OLD
and NEW are already used to refer to the magic variables of those
names in for each row triggers, I think INSERTED / UPDATED / DELETED is
the way to go.

INSERTED and DELETED could just be views of the same data as UPDATED
that show only the OLD or only the NEW composite type fields. That'd
allow you to write a trigger without TG_OP tests in many cases, as
UPDATED would always contain what you wanted. It seems slightly weird to
have INSERTED and DELETED populated for an UPDATE, but when an UPDATE is
logically an INSERT+DELETE anyway...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Simon Riggs
On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some kind of linking
 identifier. Once we'd done that, people would use it to join the
 relations back together again, which is what we already had in the
 first place. So my take is that it sounds expressive, but definitely
 not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

What I'm trying to do is tune FKs, which are currently implemented as
after row triggers. Hence why I'm looking at ways to speed up after
row triggers.

Suggesting I work on after statement triggers would imply you think
that FKs should/could be rewritten as after statement triggers. Is
that what you mean?

Doing it that way would mean rewriting a lot of code and would still
have problems 2 and 3 identified above. I can't imagine anybody would
go for that, but if you have a sketch of how it might work we can
consider it.

 Since my objective is performance, not standards, I don't see a reason
 to work on that yet. I might have time to work on it later; let's see.

 This seems like it has some overlap with the delta relations I will
 need to generate for incremental maintenance of materialized views,
 so we should coordinate on those efforts if they happen to occur
 around the same time.

IMHO you should use Andres' logical changeset extraction for
incremental maintenance of mat views. Doing mat view maintenance
using triggers would be choosing the lower-performance option.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-15 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 On 9 June 2013 14:56, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some kind of linking
 identifier. Once we'd done that, people would use it to join the
 relations back together again, which is what we already had in the
 first place. So my take is that it sounds expressive, but definitely
 not performant.

 I have used a feature like this in other database products, and can
 say from experience that these relations in a statement trigger can
 be very useful without the linkage you propose.  I can see how the
 linkage could potentially be useful, but if that is the only
 hang-up, we would be adding a powerful feature without it.

 What I'm trying to do is tune FKs, which are currently implemented as
 after row triggers. Hence why I'm looking at ways to speed up after
 row triggers.

 Suggesting I work on after statement triggers would imply you think
 that FKs should/could be rewritten as after statement triggers. Is
 that what you mean?

Well, I didn't make any suggestion -- I was asking questions, and
then when you commented on a hypothetical feature I responded.

That said, I was asking the question because I used SQL Server for
years when it had triggers but no foreign key definitions.  There
was a pattern for enforcing foreign key relationships in AFTER EACH
STATEMENT triggers using the old and new relations (with no linkage
between particular updated rows) which we used so heavily I could
write the triggers for a given foreign key mechanically in just a
minute or two.  This technique had good performance and seems to
have all the semantics you're looking for, so I was wondering
whether that might be a better approach to this problem.  It sure
seems like it would have fewer moving parts.  I know it was very
reliable with S2PL concurrency control, but there may be problems
with adapting it to MVCC that I'm not seeing without a deeper look.

If you're interested, I could try to prod those areas of my memory
to recall the pattern, or find examples of it somewhere.
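
The general shape of that pattern, reconstructed here from the
description above rather than from any actual code, might look like
this in T-SQL (the tables "child" and "parent" and the column
"parent_id" are invented):

```sql
-- Hypothetical reconstruction of FK enforcement in a SQL Server
-- statement-level trigger, using the "inserted" pseudo-table. No
-- row-to-row linkage is needed: we only check that every referenced
-- key exists once the statement completes.
CREATE TRIGGER child_fk_check ON child
AFTER INSERT, UPDATE
AS
IF EXISTS (SELECT 1
           FROM inserted i
           WHERE i.parent_id IS NOT NULL
             AND NOT EXISTS (SELECT 1 FROM parent p
                             WHERE p.id = i.parent_id))
BEGIN
    RAISERROR ('foreign key violation on child.parent_id', 16, 1);
    ROLLBACK TRANSACTION;
END;
```

A matching trigger on "parent" for DELETE and UPDATE would check the
"deleted" pseudo-table against "child" to enforce the other direction.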

 Doing it that way would mean rewriting a lot of code and would still
 have problems 2 and 3 identified above. I can't imagine anybody would
 go for that, but if you have a sketch of how it might work we can
 consider it.

I don't, but if there is interest I could probably sketch the outlines.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Batch API for After Triggers

2013-06-11 Thread Simon Riggs
On 10 June 2013 22:50, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Jun 8, 2013 at 5:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
 While fiddling with FK tuning, Noah suggested batching trigger
 executions together to avoid execution overhead.

 It turns out there is no easy way to write triggers that can take
 advantage of the knowledge that they are being executed as a set of
 trigger executions. Some API is required to allow a trigger to
 understand that there may be other related trigger executions in the
 very near future, so it can attempt to amortise call overhead across
 many invocations (batching).

 The attached patch adds two fields to the TriggerDesc trigger
 functions are handed, allowing them to inspect (if they choose) the
 additional fields and thus potentially use some form of batching.

 I'm unclear how this could be used in practice.

As described, you can use it to prepare batches.

 Are the events in a
 batch guaranteed to, say, all be related to the same relation?

Good point. I was too focused on the single large statements I was
considering. There would need to be some logic to look at relations as
well.

For individual statements it can work well, since all or at least the
majority of events are for a particular relation and we can know which
one that is.

It probably wouldn't work that well for deferred trigger events that
covered a spread of different relations.

Perhaps that's an argument for a final function after all.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-10 Thread Robert Haas
On Sat, Jun 8, 2013 at 5:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
 While fiddling with FK tuning, Noah suggested batching trigger
 executions together to avoid execution overhead.

 It turns out there is no easy way to write triggers that can take
 advantage of the knowledge that they are being executed as a set of
 trigger executions. Some API is required to allow a trigger to
 understand that there may be other related trigger executions in the
 very near future, so it can attempt to amortise call overhead across
 many invocations (batching).

 The attached patch adds two fields to the TriggerDesc trigger
 functions are handed, allowing them to inspect (if they choose) the
 additional fields and thus potentially use some form of batching.

I'm unclear how this could be used in practice.  Are the events in a
batch guaranteed to, say, all be related to the same relation?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Batch API for After Triggers

2013-06-09 Thread Simon Riggs
On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 Comments please.

 How much of this problem space do you think could be addressed by
 providing OLD and NEW *relations* to AFTER EACH STATEMENT triggers?

It's a reasonable question because those two things sound a little
like they might be related.

Providing the proposed additional info costs almost nothing since the
work to calculate that info is already performed. I've written this
patch since it was trivial to do so, while inspecting the code to see
if it was possible. As it now turns out, I'll be putting most effort
into the WHEN clause approach for FKs, but there's no reason why
others like Slony or pgmemcache wouldn't benefit here also - hence
posting the patch. The proposed API changes don't conflict in any way
with the feature you propose.

Providing the whole OLD and NEW sets as relations to a trigger would
require significant resources and wouldn't be done for performance
reasons AFAICS. There are also difficulties in semantics, since when
we have OLD and NEW at row level we know we are discussing the same
row. With sets of OLD and NEW we'd need to be able to link the
relations back together somehow, which couldn't be done by PK since
that could change. So we'd need to invent some kind of linking
identifier. Once we'd done that, people would use it to join the
relations back together again, which is what we already had in the
first place. So my take is that it sounds expressive, but definitely
not performant.

Since my objective is performance, not standards, I don't see a reason
to work on that yet. I might have time to work on it later; let's see.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-09 Thread Simon Riggs
On 9 June 2013 05:08, Stephen Frost sfr...@snowman.net wrote:
 * Simon Riggs (si...@2ndquadrant.com) wrote:
 While fiddling with FK tuning, Noah suggested batching trigger
 executions together to avoid execution overhead.

 I like the general idea, but I'd prefer a way to avoid having to queue
 up tons of trigger events to be executed in the first place.

There's already a thread on that exact topic, for FKs, which is what
spawned this thread.

 Aggregates do this by providing a way to store up information to be
 processed by an eventual 'final' function.

As I mentioned in my post, I did consider that and then chose not to
do that. However, having a final func is a major modification in the
way that we specify trigger functions. We'd also need to cope with
recursive trigger execution, which would mean the final func would get
called potentially many times, so there's no way of knowing if the
final func is actually the last call needed. That sounded complex and
confusing to me.

The proposed API allows you to do exactly that anyway, more easily, by
just waiting until tg_event_num == tg_tot_num_events.

 Another option, as Kevin
 asked about, would be statement level triggers which are able to see
 both the OLD and the NEW versions of the relation.

 The per-row trigger option with a way to be called immediately

... it already exists and is known as the WHEN clause. This is the
mechanism I expect to use to tune FKs.
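
The WHEN clause is existing syntax (PostgreSQL 9.0 and later); for
example, on an invented table "accounts" with an invented trigger
function:

```sql
-- Real syntax: fire the after-row trigger only when the key actually
-- changed, so unchanged rows never enter the trigger event queue at
-- all. Table and function names are invented for illustration.
CREATE TRIGGER accounts_key_check
AFTER UPDATE ON accounts
FOR EACH ROW
WHEN (OLD.id IS DISTINCT FROM NEW.id)
EXECUTE PROCEDURE check_key_change();
```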

 and then
 store what it cares about for a later final function strikes me as very
 generalized and able to do things that the statement-level option
 couldn't,

 but I'm not sure if there's a use-case that could solve which
 the OLD/NEW statement trigger capability couldn't.

I think the two things are quite different, as I explained on a
separate post to Kevin.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Batch API for After Triggers

2013-06-09 Thread Martijn van Oosterhout
On Sun, Jun 09, 2013 at 10:15:09AM +0100, Simon Riggs wrote:
 As I mentioned in my post, I did consider that and then chose not to
 do that. However, having a final func is a major modification in the
 way that we specify trigger functions. We'd also need to cope with
 recursive trigger execution, which would mean the final func would get
 called potentially many times, so there's no way of knowing if the
 final func is actually the last call needed. That sounded complex and
 confusing to me.
 
 The proposed API allows you to do exactly that anyway, more easily, by
 just waiting until tg_event_num == tg_tot_num_events.

Can you signal partial completion? For example, if a trigger knows that
blocks of 10,000 are optimal and it sees tg_tot_num_events == 1,000,000,
it could do work every 10,000 entries, as in when:

(tg_event_num % 10000) == 0 || tg_event_num == tg_tot_num_events

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Batch API for After Triggers

2013-06-09 Thread Craig Ringer
On 06/09/2013 04:58 PM, Simon Riggs wrote:
 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change.

We don't currently have OLD and NEW relations so we're free to define
how this works pretty freely.

Rather than having OLD and NEW as separate relations, we could just have
one OLD_AND_NEW relation. In that relation we exploit Pg's composite
types to nest the old and new tuples in a single outer change record.

OLD_AND_NEW would look to PL/PgSQL as if it were:

CREATE TEMPORARY TABLE OLD_AND_NEW (
OLD tabletype NOT NULL,
NEW tabletype NOT NULL
);

...though presumably without the ability to create indexes on it and the
other things you can do to a real temp table. Though I can see cases
where that'd be awfully handy too.
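
The composite-type nesting itself already works in PostgreSQL; only the
automatically populated OLD_AND_NEW relation would be new. A manually
built equivalent, with an invented table "accounts" (and column names
adjusted to old_row/new_row), shows the field-access syntax:

```sql
-- Real PostgreSQL syntax; the table and the hand-filled old_and_new
-- relation are invented for illustration.
CREATE TABLE accounts (id int PRIMARY KEY, balance numeric);

CREATE TEMPORARY TABLE old_and_new (
    old_row accounts NOT NULL,
    new_row accounts NOT NULL
);

INSERT INTO old_and_new
VALUES (ROW(1, 100.00)::accounts, ROW(1, 75.00)::accounts);

-- Parentheses are required when selecting a field of a composite column:
SELECT (old_row).id, (old_row).balance - (new_row).balance AS delta
FROM old_and_new;
```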

For DELETE and INSERT we'd either provide different relations named OLD
and NEW respectively, or we'd use OLD_AND_NEW with one field or the
other blank. I'm not sure which would be best.

Alternately, we could break the usual rules for relations and define OLD
and NEW as ordered, so lock-step iteration would always return matching
pairs of rows. That's useless in SQL since there's no way to achieve
lock-step iteration, but if we provide a
for_each_changed_row('some_function'::regproc) that scans them in
lock-step and invokes `some_function` for each one...? (I haven't yet
done enough in the core to have any idea if this approach is completely
and absurdly impossible, or just ugly. Figured I'd throw it out there
anyway.)



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Batch API for After Triggers

2013-06-09 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 On 8 June 2013 22:25, Kevin Grittner kgri...@ymail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 There are also difficulties in semantics, since when
 we have OLD and NEW at row level we know we are discussing the same
 row. With sets of OLD and NEW we'd need to be able to link the
 relations back together somehow, which couldn't be done by PK since
 that could change. So we'd need to invent some kind of linking
 identifier. Once we'd done that, people would use it to join the
 relations back together again, which is what we already had in the
 first place. So my take is that it sounds expressive, but definitely
 not performant.

I have used a feature like this in other database products, and can
say from experience that these relations in a statement trigger can
be very useful without the linkage you propose.  I can see how the
linkage could potentially be useful, but if that is the only
hang-up, we would be adding a powerful feature without it.

 Since my objective is performance, not standards, I don't see a reason
 to work on that yet. I might have time to work on it later; let's see.

This seems like it has some overlap with the delta relations I will
need to generate for incremental maintenance of materialized views,
so we should coordinate on those efforts if they happen to occur
around the same time.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Batch API for After Triggers

2013-06-08 Thread Simon Riggs
While fiddling with FK tuning, Noah suggested batching trigger
executions together to avoid execution overhead.

It turns out there is no easy way to write triggers that can take
advantage of the knowledge that they are being executed as a set of
trigger executions. Some API is required to allow a trigger to
understand that there may be other related trigger executions in the
very near future, so it can attempt to amortise call overhead across
many invocations (batching).

The attached patch adds two fields to the TriggerDesc trigger
functions are handed, allowing them to inspect (if they choose) the
additional fields and thus potentially use some form of batching.

This is backwards compatible with the earlier trigger API.

The two fields are:

int   tg_tot_num_events;
int   tg_event_num;

So your trigger can work out it is e.g. number 3 of 56 invocations in
the current set of after triggers.

Going back to Noah's example, this would allow you to collect all 56
values and then execute a single statement with an array of 56 values
in it. Knowing there are 56 means you can wait until the 56th
invocation before executing the batched statement, without risking
skipping some checks because you've only got half a batch left.
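
An informal sketch of how a C trigger function might use the two fields
(remember_key() and flush_batched_check() are hypothetical helpers, and
the access path to the proposed counters is assumed here, not taken from
the patch):

```c
/* Informal sketch, not from the patch. Per-row work is kept cheap, and
 * one batched statement runs on the last invocation of the set. */
Datum
batched_fk_check(PG_FUNCTION_ARGS)
{
    TriggerData *trigdata = (TriggerData *) fcinfo->context;

    if (!CALLED_AS_TRIGGER(fcinfo))
        elog(ERROR, "batched_fk_check: not called by trigger manager");

    /* Cheap per-row work: just remember this row's key. */
    remember_key(trigdata->tg_trigtuple);       /* hypothetical helper */

    /* Last event of the set: run one batched check over everything
     * collected, instead of one statement per row. */
    if (trigdata->tg_event_num == trigdata->tg_tot_num_events)
        flush_batched_check();                  /* hypothetical helper */

    return PointerGetDatum(trigdata->tg_trigtuple);
}
```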

If you don't do this, then you'd need to introduce the concept of a
final function, similar to the way aggregates work. But that seems
much too complex to be useful in the real world.

This seemed a generally useful approach for any after trigger author,
not just for RI.

Comments please.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


batch_api_after_triggers.v1.patch
Description: Binary data



Re: [HACKERS] Batch API for After Triggers

2013-06-08 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:

 Comments please.

How much of this problem space do you think could be addressed by
providing OLD and NEW *relations* to AFTER EACH STATEMENT triggers?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Batch API for After Triggers

2013-06-08 Thread Stephen Frost
* Simon Riggs (si...@2ndquadrant.com) wrote:
 While fiddling with FK tuning, Noah suggested batching trigger
 executions together to avoid execution overhead.

I like the general idea, but I'd prefer a way to avoid having to queue
up tons of trigger events to be executed in the first place.
Aggregates do this by providing a way to store up information to be
processed by an eventual 'final' function.  Another option, as Kevin
asked about, would be statement level triggers which are able to see
both the OLD and the NEW versions of the relation.

The per-row trigger option with a way to be called immediately and then
store what it cares about for a later final function strikes me as very
generalized and able to do things that the statement-level option
couldn't, but I'm not sure if there's a use-case that could solve which
the OLD/NEW statement trigger capability couldn't.

Thanks,

Stephen

