On Sun, Feb 02, 2014 at 11:53:51AM +0100, Ronan Dunklau wrote: > Le jeudi 30 janvier 2014 14:05:08 Noah Misch a écrit : > > On Thu, Jan 23, 2014 at 03:17:35PM +0100, Ronan Dunklau wrote: > > > What do you think about this approach ? Is there something I missed which > > > would make it not sustainable ? > > > > Seems basically reasonable. I foresee multiple advantages from having one > > tuplestore per query level as opposed to one for the entire transaction. > > You would remove the performance trap of backing up the tuplestore by > > rescanning. It permits reclaiming memory and disk space in > > AfterTriggerEndQuery() rather than at end of transaction. You could remove > > ate_ptr1 and ate_ptr2 from AfterTriggerEventDataFDW and just store the > > flags word: depending on AFTER_TRIGGER_2CTIDS, grab either the next one or > > the next two tuples from the tuplestore. Using work_mem per > > AfterTriggerBeginQuery() instead of per transaction is no problem. What do > > you think of that design change? > > > > I agree that this design is better, but I have some objections. > > We can remove ate_ptr2 and rely on the AFTER_TRIGGER_2CTIDS flag, but the > rescanning and ate_ptr1 (renamed ate_tupleindex in the attached patch) can't > go away. > > Consider for example the case of a foreign table with more than one AFTER > UPDATE triggers. Unless we store the tuples once for each trigger, we will > have to rescan the tuplestore.
Will we? Within a given query level, when do (non-deferred) triggers execute in an order other than the enqueue order? > To mitigate the effects of this behaviour, I added the option to perform a > reverse_seek when the looked-up tuple is nearer from the current index than > from the start. If there's still a need to seek within the tuplestore, that should get rid of the O(n^2) effect. I'm hoping that per-query-level tuplestores will eliminate the need to seek entirely. > > If you do pursue that change, make sure the code still does the right thing > > when it drops queued entries during subxact abort. > > I don't really understand what should be done at that stage. Since triggers > on > foreign tables are not allowed to be deferred, everything should be cleaned > up > at the end of each query, right ? So, there shouldn't be any queued entries. I suspect that's right. If you haven't looked over AfterTriggerEndSubXact(), please do so and ensure all its actions still make sense in the context of this new kind of trigger storage. > > > The attached patch checks this, and add documentation for this limitation. > > > I'm not really sure about how to phrase that correctly in the error > > > message > > > and the documentation. One can store at most INT_MAX foreign tuples, which > > > means that at most INT_MAX insert or delete or "half-updates" can occur. > > > By > > > half-updates, I mean that for update two tuples are stored whereas in > > > contrast to only one for insert and delete trigger. > > > > > > Besides, I don't know where this disclaimer should be in the > > > documentation. > > > Any advice here ? > > > > I wouldn't mention that limitation. > > Maybe it shouldn't be so prominent, but I still think a note somewhere > couldn't hurt. Perhaps. There's not much documentation of such implementation upper limits, and there's no usage of "INT_MAX" outside of parts that discuss writing C code. I'm not much of a visionary when it comes to the documentation; I try to document new things with an amount of detail similar to old features. > Should the use of work_mem be documented somewhere, too ? I wouldn't. Most uses of work_mem are undocumented, even relatively major ones like count(DISTINCT ...) and CTEs. So, while I'd generally favor a patch documenting all/most of the things that use work_mem, it would be odd to document one new consumer apart from the others. > > This is the performance trap I mentioned above; it brings potential O(n^2) > > complexity to certain AFTER trigger execution scenarios. > > What scenarios do you have in mind ? I "fixed" the problem when there are > multiple triggers on a foreign table, do you have any other one ? I'm not aware of such a performance trap in your latest design. For what it's worth, I don't think multiple triggers were ever a problem. In the previous design, a problem arose if you had a scenario like this: Query level 1: queue one million events ... Repeat this section many times: Query level 2: queue one event Query level 3: queue one event Query level 3: execute events Query level 2: execute events <-- had to advance through the 1M stored events ... Query level 1: execute events Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers