Re: [HACKERS] WIP fix proposal for bug #6123
Alvaro Herrera wrote:
> Kevin Grittner escribió:
>> Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> Also, it doesn't appear that we ever got around to preparing documentation updates, but I think we definitely need some if we're going to start throwing errors for things that used to be allowed. Since Kevin has the most field experience with this problem, I'd like to nominate him to write some docs ...
>>
>> OK, will do. The "redo the DELETE and RETURN NULL" workaround is not at all obvious; we should definitely include an example of that.
>
> Any chance this patch could be pushed soon? The problem is that this patch conflicts rather heavily with my FOR KEY SHARE patch. I think it makes sense to commit this one first. To me, it would be good enough that the code changes go in now; the doc patch can wait a little longer.

Sorry; I just got to this while wading through my backlog. Will push today without docs and work on docs soon.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner escribió:
> Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Also, it doesn't appear that we ever got around to preparing documentation updates, but I think we definitely need some if we're going to start throwing errors for things that used to be allowed. Since Kevin has the most field experience with this problem, I'd like to nominate him to write some docs ...
>
> OK, will do. The "redo the DELETE and RETURN NULL" workaround is not at all obvious; we should definitely include an example of that.

Any chance this patch could be pushed soon? The problem is that this patch conflicts rather heavily with my FOR KEY SHARE patch. I think it makes sense to commit this one first. To me, it would be good enough that the code changes go in now; the doc patch can wait a little longer.

-- 
Álvaro Herrera                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner <kevin.gritt...@wicourts.gov> writes:
> At any rate, I think we might want to apply Tom's patch for this while 9.3 is still early in development, to see what else might shake out from it while there is still plenty of time to fix any issues. It's now looking good from my perspective.

I just re-read the thread to refresh my memory. It seems that we pretty much had consensus on throwing an error if any operation fired by a BEFORE UPDATE/DELETE trigger changes the target row, unless the trigger returns NULL to skip the update/delete. It is not clear to me, however, whether we had consensus about what to do with SELECT FOR UPDATE locking cases.

The last patch posted in the thread was here:
http://archives.postgresql.org/pgsql-hackers/2012-01/msg01241.php

That message includes an example of the FOR UPDATE problem case and says that fixing it would require significantly more work. Do we want to worry about tackling that now, or shall we be satisfied with fixing the trigger cases?

Also, it doesn't appear that we ever got around to preparing documentation updates, but I think we definitely need some if we're going to start throwing errors for things that used to be allowed. Since Kevin has the most field experience with this problem, I'd like to nominate him to write some docs ...

			regards, tom lane
Re: [HACKERS] WIP fix proposal for bug #6123
Tom Lane <t...@sss.pgh.pa.us> wrote:
> Kevin Grittner <kevin.gritt...@wicourts.gov> writes:
>> At any rate, I think we might want to apply Tom's patch for this while 9.3 is still early in development, to see what else might shake out from it while there is still plenty of time to fix any issues. It's now looking good from my perspective.
>
> I just re-read the thread to refresh my memory. It seems that we pretty much had consensus on throwing an error if any operation fired by a BEFORE UPDATE/DELETE trigger changes the target row, unless the trigger returns NULL to skip the update/delete. It is not clear to me, however, whether we had consensus about what to do with SELECT FOR UPDATE locking cases.
>
> The last patch posted in the thread was here:
> http://archives.postgresql.org/pgsql-hackers/2012-01/msg01241.php
>
> That message includes an example of the FOR UPDATE problem case and says that fixing it would require significantly more work. Do we want to worry about tackling that now, or shall we be satisfied with fixing the trigger cases?

As you said in the referenced message, the FOR UPDATE issue seems orthogonal and should probably be addressed separately.

> Also, it doesn't appear that we ever got around to preparing documentation updates, but I think we definitely need some if we're going to start throwing errors for things that used to be allowed. Since Kevin has the most field experience with this problem, I'd like to nominate him to write some docs ...

OK, will do. The "redo the DELETE and RETURN NULL" workaround is not at all obvious; we should definitely include an example of that.

-Kevin
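The workaround mentioned above, sketched as runnable PL/pgSQL (a sketch only; the parent/child tables and the parent_id column are invented for illustration):

  create or replace function parent_before_delete() returns trigger as $$
  begin
    -- Cascade to the children first.  Triggers fired by this DELETE
    -- may update the parent row, which would otherwise cause the
    -- original DELETE to be silently skipped.
    delete from child where parent_id = old.id;
    if found then
      -- Our row may have been modified while this trigger ran, so
      -- redo the DELETE (re-firing this trigger against the current
      -- row version) and suppress the original operation.
      delete from parent where id = old.id;
      return null;
    end if;
    -- No children were removed, so our row should be unmodified.
    return old;
  end;
  $$ language plpgsql;

  create trigger parent_before_delete before delete on parent
    for each row execute procedure parent_before_delete();

On the second (recursive) invocation no child rows remain, FOUND is false, and the trigger returns OLD, so the delete completes.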
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
> We discussed it to the point of consensus, and Tom wrote a patch to implement that. Testing in my shop hit problems for which the cause was not obvious. I don't know whether there is a flaw in the designed approach that we all missed, a simple programming bug of some sort in the patch, or pilot error on this end.

It's looking like the last of those. The problem was in BEFORE DELETE triggers which, for example, would check that there were the expected child records (throwing an error if they were missing) right before deleting them. When the "reissue the DELETE and then RETURN NULL" trick was used to avoid errors with this patch, the trigger would fail the second time through. Of course, such triggers were more than a bit silly and clearly The Wrong Thing To Do in general. I don't know how widespread such practice is, but it will need to be fixed where it exists in order to use the proposed patch and the related workaround for cascade "delete-like" triggers.

Before someone says that foreign keys should just be used, I want to point out the "-like" above. In my experience, for about every ten cases where a declarative constraint like UNIQUE or FOREIGN KEY can cover a business rule, there is about one that is logically very similar to such a declarative constraint but just different enough that it must be implemented in custom code. Jeff Davis has been improving that ratio, but I doubt that the issue will ever disappear entirely.

At any rate, I think we might want to apply Tom's patch for this while 9.3 is still early in development, to see what else might shake out from it while there is still plenty of time to fix any issues. It's now looking good from my perspective.

-Kevin
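A sketch of the kind of trigger that tripped over the workaround (the table and column names are invented; the real triggers were presumably more involved):

  -- This BEFORE DELETE trigger insists that the expected child rows
  -- are still present immediately before deleting them.  Combined
  -- with the "reissue the DELETE and then RETURN NULL" trick, the
  -- trigger fires a second time after the children are already gone,
  -- and the check then raises its error.
  create or replace function court_case_before_delete() returns trigger as $$
  begin
    if not exists (select 1 from case_party where case_id = old.id) then
      raise exception 'case % has no party records', old.id;
    end if;
    delete from case_party where case_id = old.id;
    return old;
  end;
  $$ language plpgsql;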
Re: [HACKERS] WIP fix proposal for bug #6123
Bruce Momjian <br...@momjian.us> wrote:
> Did we ever decide on this?

We discussed it to the point of consensus, and Tom wrote a patch to implement that. Testing in my shop hit problems for which the cause was not obvious. I don't know whether there is a flaw in the designed approach that we all missed, a simple programming bug of some sort in the patch, or pilot error on this end. It's on my list of things to do, but it's hovering around 4th place on that list, and new things seem to be appearing at the top of that list at about the rate that I can clear them. :-(

> Is it a TODO?

We don't want to lose track of it, but with a pending patch to debug, I'm not sure the TODO list is the right place to put it.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
On Wed, Aug 8, 2012 at 09:26:41AM -0500, Kevin Grittner wrote:
> Bruce Momjian <br...@momjian.us> wrote:
>> Did we ever decide on this?
>
> We discussed it to the point of consensus, and Tom wrote a patch to implement that. Testing in my shop hit problems for which the cause was not obvious. I don't know whether there is a flaw in the designed approach that we all missed, a simple programming bug of some sort in the patch, or pilot error on this end. It's on my list of things to do, but it's hovering around 4th place on that list, and new things seem to be appearing at the top of that list at about the rate that I can clear them. :-(
>
>> Is it a TODO?
>
> We don't want to lose track of it, but with a pending patch to debug, I'm not sure the TODO list is the right place to put it.

OK, I will let it linger on your TODO list then.

-- 
Bruce Momjian  <br...@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] WIP fix proposal for bug #6123
Did we ever decide on this? Is it a TODO?

---

On Fri, Jul 22, 2011 at 04:01:20PM -0500, Kevin Grittner wrote:
> Robert Haas <robertmh...@gmail.com> wrote:
>> On Wed, Jul 20, 2011 at 2:58 PM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
>>> So basically, the goal of this patch is to ensure that a BEFORE DELETE trigger doesn't silently fail to delete a row because that row was updated during the BEFORE DELETE trigger firing (in our case by secondary trigger firing).
>>
>> I've run across this problem before while writing application code and have found it surprising, unfortunate, and difficult to work around. It's not so bad when it only bites you once, but as things get more complicated and you have more and more triggers floating around, it gets pretty darn horrible. One of the things I've done to mitigate the impact of this is to write as many triggers as possible as AFTER triggers
>
> Yeah, this is not an issue in AFTER triggers, so moving any updating to those is a solution. In most cases that's where you want triggered modifications to take place anyway. The cascading delete situation is the most obvious exception, although there certainly could be others.
>
>> but that's sometimes difficult to accomplish.
>
> Yeah, sometimes.
>
>> Your scenario is a BEFORE DELETE trigger that does an UPDATE on the same row, but I think this problem also occurs if you have a BEFORE UPDATE trigger that does an UPDATE on the same row. I believe the second update gets silently ignored.
>
> My testing shows that the primary update gets ignored, while all the triggered effects of that update are persisted. Yuck. :-( It certainly seems possible to turn that around, but that hardly seems better.
>
> In asking application programmers here what they would *expect* to happen, they all seem to think that it is surprising that the BEFORE trigger functions *return a record*, rather than a boolean to say whether to proceed with the operation. They feel it would be less confusing if a value set into NEW was effective if the operation does take effect, and the boolean controls whether or not that happens. They rather expect that if an update happens from the same transaction while a before trigger is running, the NEW record will reflect the change. I recognize how hard it would be to create that expected behavior, and how unlikely it is that the community would accept such a change at this point. But current behavior is to silently do something really dumb, so I think some change should be considered -- even if that change is to throw an error where we now allow nonsense.
>
> INSERT is not a problem -- if a BEFORE INSERT trigger inserts a row with a conflicting primary key (or other unique index key), the operation will be rolled back. That's fine.
>
> I think DELETE can be cleanly fixed with a patch similar to what I posted earlier in the thread. I found one more value that looks like it should be set, and it could use some comments, but I believe that we can get DELETE behavior which is every bit as sensible as INSERT behavior with a very small change.
>
> The worms do come crawling out of the can on BEFORE UPDATE triggers, though. When faced with an UPDATE which hasn't yet been applied, and other UPDATEs triggering from within the BEFORE UPDATE trigger which touch the same row, it doesn't seem like you can honor both the original UPDATE which was requested *and* the triggered UPDATEs. Of course, if you discard the original UPDATE, you can't very well do anything with the record returned from the BEFORE UPDATE trigger for that update.
>
> Since it seems equally evil to allow the update while ignoring some of the work caused by its trigger functions as to show the work of the triggered updates while suppressing the original update, I think the right thing is to throw an error if the old row for a BEFORE UPDATE is updated by the same transaction and the trigger function ultimately returns a non-NULL value.
>
> Thoughts?
>
> -Kevin

-- 
Bruce Momjian  <br...@momjian.us>        http://momjian.us
EnterpriseDB                             http://enterprisedb.com

+ It's impossible for everything to be true. +
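The UPDATE behavior described above (primary update ignored, triggered effects persisted) can be reproduced with a sketch like the following (table, columns, and trigger are invented for illustration; the guard on the flag column just keeps the same-row UPDATE from recursing):

  create table t (id int primary key,
                  flag boolean not null default false,
                  note text);

  create or replace function t_before_update() returns trigger as $$
  begin
    -- Same-row UPDATE from within the BEFORE UPDATE trigger.  It only
    -- fires when flag is being turned on, so the recursive invocation
    -- (which changes only note) passes through harmlessly.
    if new.flag and not old.flag then
      update t set note = 'touched by trigger' where id = old.id;
    end if;
    return new;
  end;
  $$ language plpgsql;

  create trigger t_before_update before update on t
    for each row execute procedure t_before_update();

  insert into t (id) values (1);
  update t set flag = true where id = 1;

On the releases discussed in this thread, the outer statement reports UPDATE 0 and leaves flag false, while the triggered change to note is persisted -- the silently ignored primary update, with its triggered effects kept.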
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug <f...@phlo.org> wrote:
> I think it would be helpful if we had a more precise idea about the intended use-cases. So far, the only use-case that has been described in detail is the one which made Kevin aware of the problem. But if I understood Kevin correctly, the fact that they use BEFORE and not AFTER triggers is more of an accident than the result of a conscious design decision.

For the UPDATE case we have so far only identified one place where this was a problem -- although the grep we used would only show the simplest form of this (where the UPDATE can be replaced by setting fields in the NEW record). We've scheduled people to run through the system test scripts again with my quick fix in place so that we get an error rather than silent corruption (which might easily have been missed the last time), so others could still turn up. In any event, I have neither seen nor imagined any use cases in our shop where we need to allow the UPDATE behavior.

> Though he did also mention that there might actually be advantages to using BEFORE instead of AFTER triggers, because that prevents other triggers from seeing a non-consistent state.

Right, although I've only seen that for the DELETE triggers. We are using the BEFORE triggers to delete from the bottom up, and doing this in the AFTER trigger would expose states to triggers where children still exist in the absence of parents, which might fire validation failures from hard-to-predict places. It would certainly be more convenient on this end if the DELETE behavior from my patch was accepted, but I'm confident that there are workarounds for the long term if not.

> What I can add is that AFAIR all instances of same-row UPDATEs from BEFORE triggers I ever encountered were replaceable by a simple assignment to NEW. (That, however, is more of an anti-usecase.)

If there is a valid use-case for UPDATE, it would have to be less direct -- for example, the BEFORE UPDATE trigger modifies some other table or row, and the trigger for *that* updates the original row. I really can't see any excuse for a BEFORE UPDATE ... FOR EACH ROW trigger to execute a new UPDATE statement referencing the row for which it is being fired.

-Kevin
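The "simple assignment to NEW" replacement might look like this (a sketch; the last_changed column is hypothetical, and the commented-out form is the problematic pattern a grep could find):

  -- Problematic: same-row UPDATE from a BEFORE UPDATE trigger.
  --   update t set last_changed = now() where id = new.id;
  --   return new;

  -- Equivalent, and safe: assign to NEW before returning it.
  create or replace function t_stamp() returns trigger as $$
  begin
    new.last_changed := now();
    return new;
  end;
  $$ language plpgsql;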
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug <f...@phlo.org> wrote:
> To summarize, here's what I currently believe would be a sensible approach: After every BEFORE trigger invocation, if the trigger returned non-NULL, check if the latest row version is still the same as when the trigger started. If not, complain.

That certainly has the advantages of being a small, safe change and of being easy to explain. It would certainly prevent the astonishing sorts of behaviors which now occur and which can leave people with database contents they thought they had guards against.

The down side is that something this strict does make it hard to achieve certain behaviors which could be desirable for maintaining redundant data for performance. In my bottom-up delete scenario, there would either need to be somewhere to note that a row was being deleted so that the delete of the children could skip maintaining it, or the cascade would need to be implemented in the AFTER triggers, and validations would need to accept orphans which could be created. Either approach can be made to work, but from the application development side, it's not as clean or easy.

The suggested approach for UPDATE with my original approach to DELETE would make me happier, but I'm still not clear on Robert's use cases and how that would affect him. Can you clarify why you feel UPDATE and DELETE should both do this?

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug9, 2011, at 15:41 , Kevin Grittner wrote:
> Florian Pflug <f...@phlo.org> wrote:
>> To summarize, here's what I currently believe would be a sensible approach: After every BEFORE trigger invocation, if the trigger returned non-NULL, check if the latest row version is still the same as when the trigger started. If not, complain.
>
> That certainly has the advantages of being a small, safe change and of being easy to explain. It would certainly prevent the astonishing sorts of behaviors which now occur and which can leave people with database contents they thought they had guards against.
>
> The down side is that something this strict does make it hard to achieve certain behaviors which could be desirable for maintaining redundant data for performance. In my bottom-up delete scenario, there would either need to be somewhere to note that a row was being deleted so that the delete of the children could skip maintaining it, or the cascade would need to be implemented in the AFTER triggers, and validations would need to accept orphans which could be created. Either approach can be made to work, but from the application development side, it's not as clean or easy.

Another option would be to re-issue the DELETE from the BEFORE DELETE trigger, and then return NULL. It'll cause the BEFORE DELETE trigger to be invoked recursively, but presumably the second invocation could easily detect that all required pre-delete actions have already taken place and exit early (and return OLD). In pseudo-code, something like

  BEFORE DELETE ON <parent table>:
    DELETE FROM <child table> WHERE parent_id = OLD.id;
    IF FOUND THEN
      -- Removing children might have modified our row,
      -- so returning non-NULL is not an option
      DELETE FROM <table> WHERE id = OLD.id;
      RETURN NULL;
    ELSE
      -- No children removed, so our row should be unmodified
      RETURN OLD;
    END IF;

Or, more generally, you could check for modifications of the row to be deleted. I'm not sure if the ctid column is currently available for OLD and NEW, but if it is, you could do

  BEFORE DELETE ON <table>:
    <arbitrarily complex child removal logic>
    IF EXISTS (SELECT 1 FROM <table> WHERE ctid = OLD.ctid) THEN
      -- Original row wasn't modified, actual delete should succeed
      RETURN OLD;
    ELSE
      -- Original row was modified, actual delete would fail
      DELETE FROM <table> WHERE pk = OLD.pk;
      RETURN NULL;
    END IF;

This only requires that the <arbitrarily complex child removal logic> is smart enough not to update the parent table if no children were actually removed.

> The suggested approach for UPDATE with my original approach to DELETE would make me happier, but I'm still not clear on Robert's use cases and how that would affect him. Can you clarify why you feel UPDATE and DELETE should both do this?

I'm concerned that proceeding with a DELETE even though the row has been modified has an equal chance of causing subtle data-integrity violations as the current behaviour. We might remove a row even though the row, at the time of deletion, does *not* match the DELETE's WHERE condition. That seems like a violation of very basic guarantees to me.

A real-world use-case where that might happen could consist of a table with a reference_count column, together with a BEFORE DELETE trigger which (indirectly, via other triggers) sometimes causes a reference_count to be changed from 0 to > 0. Simply inserting a new row instead of re-using the old one may not be an option because of UNIQUE constraints.

The statement

  DELETE FROM reference_counted_table WHERE reference_count = 0

might then delete rows even though they *are* referenced, if the references were added from the BEFORE DELETE trigger. The result would probably be dangling references, i.e. a data-integrity violation.

One might attempt to solve that by re-checking the DELETE's WHERE condition before carrying out the DELETE. To avoid ending up where we started, i.e. with BEFORE triggers firing but no DELETE taking place, we'd have to error out if the re-check fails instead of silently ignoring the DELETE. At which point whether or not a DELETE of a single record succeeds or fails depends on the WHERE condition used to *find* that record. That, I feel, isn't a road we want to take...

best regards,
Florian Pflug
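The reference_count scenario, sketched as runnable SQL (the schema and the trigger body are invented to make the hazard concrete; the same-row UPDATE stands in for the indirect effect described above):

  create table resource (id int primary key,
                         reference_count int not null);

  create or replace function resource_before_delete() returns trigger as $$
  begin
    -- Stand-in for the indirect effect: cleanup work performed here
    -- ends up registering a new reference to the very row being
    -- deleted.
    update resource set reference_count = reference_count + 1
      where id = old.id;
    return old;
  end;
  $$ language plpgsql;

  create trigger resource_before_delete before delete on resource
    for each row execute procedure resource_before_delete();

  -- If this DELETE proceeds despite the same-row UPDATE, it removes a
  -- row whose reference_count is now 1 -- a row which no longer
  -- matches the WHERE clause -- leaving a dangling reference.
  delete from resource where reference_count = 0;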
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug <f...@phlo.org> wrote:
> Another option would be to re-issue the DELETE from the BEFORE DELETE trigger, and then return NULL. It'll cause the BEFORE DELETE trigger to be invoked recursively, but presumably the second invocation could easily detect that all required pre-delete actions have already taken place and exit early (and return OLD). In pseudo-code, something like
>
>   BEFORE DELETE ON <parent table>:
>     DELETE FROM <child table> WHERE parent_id = OLD.id;
>     IF FOUND THEN
>       -- Removing children might have modified our row,
>       -- so returning non-NULL is not an option
>       DELETE FROM <table> WHERE id = OLD.id;
>       RETURN NULL;
>     ELSE
>       -- No children removed, so our row should be unmodified
>       RETURN OLD;
>     END IF;

Yeah, that would cover it all right. That pretty much eliminates my objections to your "check for error after firing each BEFORE trigger" approach.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas <robertmh...@gmail.com> wrote:
> Florian Pflug <f...@phlo.org> wrote:
>> Going down that road opens the door to a *lot* of subtle semantic differences between currently equivalent queries. For example, UPDATE T SET f=f, a=1 would behave differently than UPDATE T SET a=1, because in the first case the new row would depend on the old row's value of f, while in the second case it doesn't.
>
> True. But I'm not really bothered by that. If the user gets an error message that says:
>
>   ERROR: updated column f has already been modified by a BEFORE trigger
>
> ...the user will realize the error of their ways.
>
>>> There's no way to get the same result as if you'd done either one of them first, because they are inextricably intertwined. In practice, my hand-wavy reference to "reconciling the updates" is a problem because of the way the trigger interface works. It feels pretty impossible to decide that we're going to do the update, but with some other random values we dredged up from some other update replacing some of the ones the user explicitly handed to us. But if the trigger could return an indication of which column values it wished to override, then it seems to me that we could piece together a reasonable set of semantics. It's not exactly clear how to make that work, though.
>>
>> I dunno, that all still feels awfully complex. As you said yourself, this case is quite similar to a serialization anomaly. Taking that correspondence further, that reconciliation of updates is pretty much what the EPQ machinery does in READ COMMITTED mode. Now, we ourselves have warned users in the past to *not* use READ COMMITTED mode if they do complex updates (e.g., use UPDATE ... FROM ...), because the behaviour of that reconciliation machinery in the presence of concurrent updates is extremely hard to predict. I thus don't believe that it's a good idea to introduce similarly complex behaviour in other parts of the system -- and particularly not if you cannot disable it by switching to another isolation level. Simply throwing an error, on the other hand, makes the behaviour simple to understand and explain.
>
> True. I'm coming around to liking that behavior better than I did on first hearing, but I'm still not crazy about it, because as an app developer I would really like to have at least the unproblematic cases actually work. Throwing an error at least makes it clear that you've done something which isn't supported, and that's probably an improvement over the current, somewhat-baffling behavior. However, it's not even 25% as nice as having it actually work as intended. That's why, even if we can't make all the cases work sanely, I'd be a lot more enthusiastic about it if we could find a way to make at least some of them work sanely. The mind-bending cases are unlikely to be the ones people do on purpose.

So, we have three options on the table here:

(1) We (the Wisconsin Courts) are using a very simple fix to work around the issue so we can move forward with conversion to PostgreSQL triggers. A DELETE is allowed to complete if the BEFORE trigger doesn't return NULL, even if the row was updated while the trigger was executing. An UPDATE fails if the BEFORE trigger doesn't return NULL and any other update is made to the row while the trigger was executing.

(2) You (Robert) have proposed (as I understand it) modifying that approach to allow some UPDATE cases to work, where there are not conflicting updates to any one column within the row.

(3) Florian has proposed banning all updates to a row which is being processed by a BEFORE UPDATE or BEFORE DELETE trigger. As I understand it, this would be similar to the approach taken by Oracle, although less strict.

I can live with any of these solutions. How do we move forward to consensus so that a patch can be written? Does anyone else have a preference for one of these approaches versus another?

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug8, 2011, at 17:02 , Kevin Grittner wrote:
> So, we have three options on the table here:
>
> (1) We (the Wisconsin Courts) are using a very simple fix to work around the issue so we can move forward with conversion to PostgreSQL triggers. A DELETE is allowed to complete if the BEFORE trigger doesn't return NULL, even if the row was updated while the trigger was executing. An UPDATE fails if the BEFORE trigger doesn't return NULL and any other update is made to the row while the trigger was executing.
>
> (2) You (Robert) have proposed (as I understand it) modifying that approach to allow some UPDATE cases to work, where there are not conflicting updates to any one column within the row.
>
> (3) Florian has proposed banning all updates to a row which is being processed by a BEFORE UPDATE or BEFORE DELETE trigger. As I understand it, this would be similar to the approach taken by Oracle, although less strict.

Yeah, though much, much less strict.

I think it would be helpful if we had a more precise idea about the intended use-cases. So far, the only use-case that has been described in detail is the one which made Kevin aware of the problem. But if I understood Kevin correctly, the fact that they use BEFORE and not AFTER triggers is more of an accident than the result of a conscious design decision. Though he did also mention that there might actually be advantages to using BEFORE instead of AFTER triggers, because that prevents other triggers from seeing a non-consistent state.

What I can add is that AFAIR all instances of same-row UPDATEs from BEFORE triggers I ever encountered were replaceable by a simple assignment to NEW. (That, however, is more of an anti-usecase.)

best regards,
Florian Pflug
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug8, 2011, at 17:35 , Florian Pflug wrote:
> I think it would be helpful if we had a more precise idea about the intended use-cases. So far, the only use-case that has been described in detail is the one which made Kevin aware of the problem. But if I understood Kevin correctly, the fact that they use BEFORE and not AFTER triggers is more of an accident than the result of a conscious design decision. Though he did also mention that there might actually be advantages to using BEFORE instead of AFTER triggers, because that prevents other triggers from seeing a non-consistent state.

I pondered this some more, and came up with the following use-case of same-row modification from a BEFORE trigger which is not easily moved to an AFTER trigger.

Assume you have a partitioned table T with two partitions. Partition Tcompleted contains records which are completed in some sense, while Tcommencing contains records which are not. A record's state is indicated by a boolean column completed, and CHECK constraints guarantee that completed is indeed false for all records in Tcommencing and true for all records in Tcompleted.

Now, if the partitioning is supposed to be transparent to applications, a record must automatically be moved from Tcommencing to Tcompleted if its completed flag is changed from false to true. The following BEFORE UPDATE trigger on Tcommencing accomplishes that.

  IF NEW.completed THEN
    DELETE FROM Tcommencing WHERE id = OLD.id;
    INSERT INTO Tcompleted VALUES (NEW.*);  -- Probably a syntax error,
                                            -- but intent is clear
    RETURN NULL;
  ELSE
    RETURN NEW;
  END IF;

Doing this from within an AFTER trigger doesn't work because the CHECK constraint on Tcommencing would prevent the UPDATE from occurring.

This would also fail if we did as I suggested, and forbade any modifications to a row for which a BEFORE trigger was in progress. I take this as evidence that my suggestion was misguided and we should in fact only report an error if the row was modified since the before trigger started *and* the trigger returns non-NULL. I still believe we shouldn't distinguish between UPDATEs and DELETEs, so a DELETE should fail if the row is updated from within a BEFORE DELETE trigger which returns non-NULL.

BTW, the case of multiple BEFORE triggers also raises some interesting questions. Assume there are two BEFORE UPDATE triggers T1 and T2 defined on a table T, and T1 causes a same-row UPDATE and returns non-NULL. Also assume that T1 is coded in a way that prevents infinite recursion in that case (i.e., the same-row UPDATE is skipped if the T1 invocation is already the result of a same-row UPDATE). The sequence of invocations after an update of a row would be:

  1) T1 (for the original UPDATE)
  2) T1 (for the same-row UPDATE initiated in (1))
  3) T2 (for the same-row UPDATE initiated in (1))
  4) T2 (for the original UPDATE)

Now, T2 is invoked in (4) for a row which isn't visible anymore. So I'm thinking that maybe we should check after every trigger invocation whether or not a same-row UPDATE occurred, and immediately raise an error if it did and the trigger didn't return NULL, and not delay the check until all triggers have run.

It seems unlikely that delaying the check until all triggers have run adds any significant functionality. For that to be the case, T2 would have to know whether T1 did a same-row update, and return NULL, since otherwise we'd raise an error anyway.

The beauty of doing the check after every trigger lies in the fact that it makes the coding rule "Thou shalt return NULL if thou dared to modify thine own row" local (i.e., per-trigger) instead of global (i.e., per trigger-chain), and thus less likely to be violated by some strange interactions of triggers serving distinct purposes.

To summarize, here's what I currently believe would be a sensible approach: After every BEFORE trigger invocation, if the trigger returned non-NULL, check if the latest row version is still the same as when the trigger started. If not, complain.

best regards,
Florian Pflug
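A runnable version of that partition-moving trigger might look like the following sketch. The inheritance setup is elided and the partitions simplified to two stand-alone tables; note that the VALUES (NEW.*) form worried about above does appear to be accepted by PL/pgSQL, being the same idiom the PostgreSQL partitioning documentation of that era used.

  create table tcommencing (id int primary key,
                            completed boolean not null check (not completed));
  create table tcompleted  (id int primary key,
                            completed boolean not null check (completed));

  create or replace function tcommencing_before_update() returns trigger as $$
  begin
    if new.completed then
      -- Move the row to the "completed" partition and suppress the
      -- UPDATE itself by returning NULL.
      delete from tcommencing where id = old.id;
      insert into tcompleted values (new.*);
      return null;
    else
      return new;
    end if;
  end;
  $$ language plpgsql;

  create trigger tcommencing_before_update before update on tcommencing
    for each row execute procedure tcommencing_before_update();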
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug3, 2011, at 04:54 , Robert Haas wrote:
> On Tue, Aug 2, 2011 at 2:32 PM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
>>>> Would you feel comfortable with a patch which threw an error on the DELETE case, as it does on the UPDATE case?
>>>
>>> Yes, though with a little additional twist. The twist being that I'd like the error to be thrown earlier, at the time of the offending UPDATE or DELETE, not later, when the original UPDATE or DELETE which caused the BEFORE trigger invocation stumbles over the updated row. It not only seems more logical that way, but it also makes it possible for the trigger to catch the error, and react accordingly.
>>
>> I hadn't thought of that. It does seem better in every way except for how much work it takes to do it and how much testing it needs to have confidence in it. Definitely not 9.1 material.
>
> I'm not sure I understand the justification for throwing an error. Updating a row twice is not an error under any other circumstances; nor is deleting a row you've updated. Why should it be here?

Because updating or deleting a row from within a BEFORE trigger fired upon updating or deleting that very row causes two intermingled UPDATE/DELETE executions, not one UPDATE/DELETE followed by another.

Here's a step-by-step description of what happens in the case of two UPDATEs, assuming that we don't ignore any updates and throw no error.

  1) UPDATE T SET ... WHERE ID=1
  2) Row with ID=1 is found and locked
  3) BEFORE UPDATE triggers are fired, with OLD containing the
     original row's contents and NEW the values specified in (1)
  4) UPDATE T SET ... WHERE ID=1, executed from within one of the
     triggers
  5) BEFORE UPDATE triggers are fired again, with OLD containing the
     original row's contents and NEW the values specified in (4)
  6) The row is updated with the values specified in (4), possibly
     changed by the triggers in (5)
  7) The row is updated with the values specified in (1), possibly
     changed by the triggers in (3)

Now, in step (7), we're overwriting the UPDATE from steps 4-6 without even looking at the row that UPDATE produced. If, for example, both UPDATEs do counter = counter + 1, we'd end up with the counter incremented by 1, *not* by 2, even though there were two distinct UPDATEs. Or even worse, the inner UPDATE at (4) might have modified the ID. Then we'll apply the outer UPDATE in (7), even though the original WHERE condition from (1) no longer matches the row.

We could attempt to fix that by using the EvalPlanQual machinery to re-check the WHERE clause and re-compute the new row in (7). However, EvalPlanQual introduces a lot of conceptual complexity, and can lead to very surprising results for more complex UPDATEs. (There's that whole self-join problem with EvalPlanQual, for example.) Also, even if we did use EvalPlanQual, we'd still have executed the outer BEFORE trigger (step 3) with values for OLD and NEW which *don't* match the row that we actually update in (7). Doing that seems rather hazardous too -- who knows what the BEFORE trigger has used the values for.

The difference between Kevin's original patch and my suggestion is, BTW, that he made step (7) throw an error while I suggested the error be thrown already at (4).

best regards,
Florian Pflug
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas <robertmh...@gmail.com> wrote:
> Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
>>>> Would you feel comfortable with a patch which threw an error on the DELETE case, as it does on the UPDATE case?
>>>
>>> Yes, though with a little additional twist. The twist being that I'd like the error to be thrown earlier, at the time of the offending UPDATE or DELETE, not later, when the original UPDATE or DELETE which caused the BEFORE trigger invocation stumbles over the updated row. It not only seems more logical that way, but it also makes it possible for the trigger to catch the error, and react accordingly.
>>
>> I hadn't thought of that. It does seem better in every way except for how much work it takes to do it and how much testing it needs to have confidence in it. Definitely not 9.1 material.
>
> IMHO, none of this is 9.1 material.

Nobody is currently arguing for including anything to fix this in 9.1. Perhaps I should have been more explicit that in agreeing with Florian, I'm giving up on the idea of any PostgreSQL release attempting to do anything about this before 9.2.

That said, this is a data corrupting bug in the view of management here, based on reports from testers and my explanation of why they saw that behavior. They are aware that it can be worked around, but that's not acceptable if there's no reliable way to find all occurrences of patterns that hit this bug. To keep this effort alive here, I am using my quick hack for 9.0 and 9.1. Deletes will behave as they expect, and updates won't quietly discard part of the work of a transaction -- an error will be thrown in that situation.

> It's been like this forever

As have many other data mangling bugs which we fix in minor releases. Our point here is that it's never been like this in any product that the Wisconsin Courts has used for triggers, and never will be.

> People who would otherwise be tempted to write their applications this way will end up not doing so precisely because it currently does not work.

I can attest to that.

> So I think we should focus on getting the right semantics here, rather than trying to minimize the scope of the change.

Agreed. I thought that's what we were talking about.

> I'm not sure I understand the justification for throwing an error.

Nobody has suggested sane semantics for doing otherwise. There is the pipe dream of what application programmers here would like, described earlier in the thread. I just don't see that happening, and I'm not sure it addresses all of Florian's concerns anyway. You had a suggestion for how to salvage the update situation, but it was pretty hand-wavy, and I find it very hard to reconcile with some of the issues raised by Florian. Maybe someone can propose sane semantics which cover all the objections, and maybe that would even be possible to implement; but right now Florian's approach seems like the most solid proposal yet put forward.

> Updating a row twice is not an error under any other circumstances; nor is deleting a row you've updated. Why should it be here?

As Florian pointed out, it's not exactly a matter of doing those things. If you have a proposal which addresses the issues raised earlier in the thread, please share.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
On Wed, Aug 3, 2011 at 10:48 AM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
> As have many other data mangling bugs which we fix in minor releases. Our point here is that it's never been like this in any product that the Wisconsin Courts has used for triggers, and never will be.

I don't believe this is very similar at all to the sorts of bugs we fix in minor releases, but let's leave that aside for the moment since it seems that we're in violent agreement about which release we're targeting. With respect to the second part of your comment, it might be useful to describe how this works in some of those other products.

>> I'm not sure I understand the justification for throwing an error.
>
> Nobody has suggested sane semantics for doing otherwise. There is the pipe dream of what application programmers here would like, described earlier in the thread. I just don't see that happening, and I'm not sure it addresses all of Florian's concerns anyway. You had a suggestion for how to salvage the update situation, but it was pretty hand-wavy, and I find it very hard to reconcile with some of the issues raised by Florian. Maybe someone can propose sane semantics which cover all the objections, and maybe that would even be possible to implement; but right now Florian's approach seems like the most solid proposal yet put forward.

It probably is, but I'm not sure we've done enough thinking about it to be certain that it's the right way forward.

On that note, I think in some ways the problems we're hitting here are very much like serialization anomalies. If the user updates a tuple based on its PK and sets some non-PK field to a constant, and while we're doing that, a BEFORE trigger updates any field in the tuple other than the PK, then in theory it seems we ought to be able to reconcile the updates. It feels like the result ought to be the same as if we'd simply run the BEFORE-trigger update to completion, and then run the main update. However, if the BEFORE trigger modifies any columns that the main update examined while constructing its value for NEW, then the updates can't be serialized.

There's no way to get the same result as if you'd done either one of them first, because they are inextricably intertwined. In practice, my hand-wavy reference to "reconciling the updates" is a problem because of the way the trigger interface works. It feels pretty impossible to decide that we're going to do the update, but with some other random values we dredged up from some other update replacing some of the ones the user explicitly handed to us. But if the trigger could return an indication of which column values it wished to override, then it seems to me that we could piece together a reasonable set of semantics. It's not exactly clear how to make that work, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP fix proposal for bug #6123
On Wed, Aug 3, 2011 at 4:50 AM, Florian Pflug <f...@phlo.org> wrote:
> On Aug3, 2011, at 04:54 , Robert Haas wrote:
>> On Tue, Aug 2, 2011 at 2:32 PM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
>>>>> Would you feel comfortable with a patch which threw an error on the DELETE case, as it does on the UPDATE case?
>>>>
>>>> Yes, though with a little additional twist. The twist being that I'd like the error to be thrown earlier, at the time of the offending UPDATE or DELETE, not later, when the original UPDATE or DELETE which caused the BEFORE trigger invocation stumbles over the updated row. It not only seems more logical that way, but it also makes it possible for the trigger to catch the error, and react accordingly.
>>>
>>> I hadn't thought of that. It does seem better in every way except for how much work it takes to do it and how much testing it needs to have confidence in it. Definitely not 9.1 material.
>>
>> I'm not sure I understand the justification for throwing an error. Updating a row twice is not an error under any other circumstances; nor is deleting a row you've updated. Why should it be here?
>
> Because updating or deleting a row from within a BEFORE trigger fired upon updating or deleting that very row causes two intermingled UPDATE/DELETE executions, not one UPDATE/DELETE followed by another.
>
> Here's a step-by-step description of what happens in the case of two UPDATEs, assuming that we don't ignore any updates and throw no error.
>
>   1) UPDATE T SET ... WHERE ID=1
>   2) Row with ID=1 is found and locked
>   3) BEFORE UPDATE triggers are fired, with OLD containing the
>      original row's contents and NEW the values specified in (1)
>   4) UPDATE T SET ... WHERE ID=1, executed from within one of the
>      triggers
>   5) BEFORE UPDATE triggers are fired again, with OLD containing the
>      original row's contents and NEW the values specified in (4)
>   6) The row is updated with the values specified in (4), possibly
>      changed by the triggers in (5)
>   7) The row is updated with the values specified in (1), possibly
>      changed by the triggers in (3)
>
> Now, in step (7), we're overwriting the UPDATE from steps 4-6 without even looking at the row that UPDATE produced. If, for example, both UPDATEs do counter = counter + 1, we'd end up with the counter incremented by 1, *not* by 2, even though there were two distinct UPDATEs. Or even worse, the inner UPDATE at (4) might have modified the ID. Then we'll apply the outer UPDATE in (7), even though the original WHERE condition from (1) no longer matches the row.
>
> We could attempt to fix that by using the EvalPlanQual machinery to re-check the WHERE clause and re-compute the new row in (7). However, EvalPlanQual introduces a lot of conceptual complexity, and can lead to very surprising results for more complex UPDATEs. (There's that whole self-join problem with EvalPlanQual, for example.) Also, even if we did use EvalPlanQual, we'd still have executed the outer BEFORE trigger (step 3) with values for OLD and NEW which *don't* match the row that we actually update in (7). Doing that seems rather hazardous too -- who knows what the BEFORE trigger has used the values for.
>
> The difference between Kevin's original patch and my suggestion is, BTW, that he made step (7) throw an error while I suggested the error be thrown already at (4).

I think Kevin's proposal is better, because AFAICS if the BEFORE UPDATE trigger returns NULL then we haven't got a problem; and we can't know that until we get to step 7.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug3, 2011, at 17:55 , Robert Haas wrote:
> On that note, I think in some ways the problems we're hitting here are very much like serialization anomalies.

Yeah, I had the same feeling of familiarity ;-)

> If the user updates a tuple based on its PK and sets some non-PK field to a constant, and while we're doing that, a BEFORE trigger updates any field in the tuple other than the PK, then in theory it seems we ought to be able to reconcile the updates. It feels like the result ought to be the same as if we'd simply run the BEFORE-trigger update to completion, and then run the main update. However, if the BEFORE trigger modifies any columns that the main update examined while constructing its value for NEW, then the updates can't be serialized.

Going down that road opens the door to a *lot* of subtle semantic differences between currently equivalent queries. For example, UPDATE T SET f=f, a=1 would behave differently than UPDATE T SET a=1, because in the first case the new row would depend on the old row's value of f, while in the second case it doesn't.

> There's no way to get the same result as if you'd done either one of them first, because they are inextricably intertwined. In practice, my hand-wavy reference to "reconciling the updates" is a problem because of the way the trigger interface works. It feels pretty impossible to decide that we're going to do the update, but with some other random values we dredged up from some other update replacing some of the ones the user explicitly handed to us. But if the trigger could return an indication of which column values it wished to override, then it seems to me that we could piece together a reasonable set of semantics. It's not exactly clear how to make that work, though.

I dunno, that all still feels awfully complex. As you said yourself, this case is quite similar to a serialization anomaly. Taking that correspondence further, that reconciliation of updates is pretty much what the EPQ machinery does in READ COMMITTED mode. Now, we ourselves have warned users in the past to *not* use READ COMMITTED mode if they do complex updates (e.g., use UPDATE ... FROM ...), because the behaviour of that reconciliation machinery in the presence of concurrent updates is extremely hard to predict. I thus don't believe that it's a good idea to introduce similarly complex behaviour in other parts of the system -- and particularly not if you cannot disable it by switching to another isolation level. Simply throwing an error, on the other hand, makes the behaviour simple to understand and explain.

best regards,
Florian Pflug
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug3, 2011, at 17:57 , Robert Haas wrote:
> On Wed, Aug 3, 2011 at 4:50 AM, Florian Pflug <f...@phlo.org> wrote:
>> The difference between Kevin's original patch and my suggestion is, BTW, that he made step (7) throw an error while I suggested the error be thrown already at (4).
>
> I think Kevin's proposal is better, because AFAICS if the BEFORE UPDATE trigger returns NULL then we haven't got a problem; and we can't know that until we get to step 7.

Hm, not sure if I agree. A BEFORE trigger might decide to block an update, and then take some action based on the assumption that the row is actually unmodified (i.e., identical to the OLD value observed by the trigger).

To me, it still seems conceptually cleaner to just decree that a row must not be modified while BEFORE triggers are running, period. This, BTW, also matches what Oracle does, only on a per-row instead of per-table basis. Oracle AFAIR simply forbids touching of the table a BEFORE trigger is attached to from within that trigger. (They even forbid SELECTs, which is presumably because they don't have an equivalent of our per-row command id, i.e. cannot ensure that such a SELECT sees the state the table was in at the beginning of the statement.)

best regards,
Florian Pflug
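A sketch of the hazard described above (table names are invented): a trigger that blocks the operation and then acts on the assumption that OLD is still the row's current state.

  -- "Soft delete": archive the row and block the actual DELETE.
  create or replace function t_soft_delete() returns trigger as $$
  begin
    insert into t_archive values (old.*);  -- assumes OLD is current
    return null;                           -- block the DELETE
  end;
  $$ language plpgsql;

  -- If some other trigger updated the row while this one was running,
  -- the archived copy is stale -- which is why erroring out at the
  -- offending UPDATE, rather than only when a non-NULL return trips
  -- over it, seems the safer rule here.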
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas <robertmh...@gmail.com> wrote:
> Florian Pflug <f...@phlo.org> wrote:
>> Here's a step-by-step description of what happens in the case of two UPDATEs, assuming that we don't ignore any updates and throw no error.
>>
>>   1) UPDATE T SET ... WHERE ID=1
>>   2) Row with ID=1 is found and locked
>>   3) BEFORE UPDATE triggers are fired, with OLD containing the
>>      original row's contents and NEW the values specified in (1)
>>   4) UPDATE T SET ... WHERE ID=1, executed from within one of the
>>      triggers
>>   5) BEFORE UPDATE triggers are fired again, with OLD containing the
>>      original row's contents and NEW the values specified in (4)
>>   6) The row is updated with the values specified in (4), possibly
>>      changed by the triggers in (5)
>>   7) The row is updated with the values specified in (1), possibly
>>      changed by the triggers in (3)
>>
>> The difference between Kevin's original patch and my suggestion is, BTW, that he made step (7) throw an error while I suggested the error be thrown already at (4).
>
> I think Kevin's proposal is better, because AFAICS if the BEFORE UPDATE trigger returns NULL then we haven't got a problem; and we can't know that until we get to step 7.

Robert makes a good point -- at step 4 we can't know whether the BEFORE trigger will return an unmodified record, a modified record, or NULL. Without some way to deal with that, throwing the error earlier seems like a non-starter.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug <f...@phlo.org> wrote:
> To me, it still seems conceptually cleaner to just decree that a row must not be modified while BEFORE triggers are running, period. This, BTW, also matches what Oracle does, only on a per-row instead of per-table basis. Oracle AFAIR simply forbids touching of the table a BEFORE trigger is attached to from within that trigger. (They even forbid SELECTs, which is presumably because they don't have an equivalent of our per-row command id, i.e. cannot ensure that such a SELECT sees the state the table was in at the beginning of the statement.)

It appears this was in flight while I was composing my last post.

This seems pretty strict, but given the trade-offs, perhaps it is worth it. I can live with either solution, myself.

-Kevin
Re: [HACKERS] WIP fix proposal for bug #6123
On Wed, Aug 3, 2011 at 12:27 PM, Florian Pflug <f...@phlo.org> wrote:
> Going down that road opens the door to a *lot* of subtle semantic differences between currently equivalent queries. For example, UPDATE T SET f=f, a=1 would behave differently than UPDATE T SET a=1, because in the first case the new row would depend on the old row's value of f, while in the second case it doesn't.

True. But I'm not really bothered by that. If the user gets an error message that says:

  ERROR: updated column f has already been modified by a BEFORE trigger

...the user will realize the error of their ways.

>> There's no way to get the same result as if you'd done either one of them first, because they are inextricably intertwined. In practice, my hand-wavy reference to "reconciling the updates" is a problem because of the way the trigger interface works. It feels pretty impossible to decide that we're going to do the update, but with some other random values we dredged up from some other update replacing some of the ones the user explicitly handed to us. But if the trigger could return an indication of which column values it wished to override, then it seems to me that we could piece together a reasonable set of semantics. It's not exactly clear how to make that work, though.
>
> I dunno, that all still feels awfully complex. As you said yourself, this case is quite similar to a serialization anomaly. Taking that correspondence further, that reconciliation of updates is pretty much what the EPQ machinery does in READ COMMITTED mode. Now, we ourselves have warned users in the past to *not* use READ COMMITTED mode if they do complex updates (e.g., use UPDATE ... FROM ...), because the behaviour of that reconciliation machinery in the presence of concurrent updates is extremely hard to predict. I thus don't believe that it's a good idea to introduce similarly complex behaviour in other parts of the system -- and particularly not if you cannot disable it by switching to another isolation level. Simply throwing an error, on the other hand, makes the behaviour simple to understand and explain.

True. I'm coming around to liking that behavior better than I did on first hearing, but I'm still not crazy about it, because as an app developer I would really like to have at least the unproblematic cases actually work. Throwing an error at least makes it clear that you've done something which isn't supported, and that's probably an improvement over the current, somewhat-baffling behavior. However, it's not even 25% as nice as having it actually work as intended. That's why, even if we can't make all the cases work sanely, I'd be a lot more enthusiastic about it if we could find a way to make at least some of them work sanely. The mind-bending cases are unlikely to be the ones people do on purpose.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug1, 2011, at 20:02 , Kevin Grittner wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: I consider the trigger behavior addressed by this patch to be a bug, and serious enough to merit inclusion of a fix in the 9.1 release, even at this late stage. I'm sorry but I disagree, on multiple grounds. First, I'm not sure this is even a bug. To me, it seems that postgres currently resolves an inherently ambiguous situation in one possible way, while you expect it to pick another. It might be that the behaviour that you suggest is more in line with people's expectations (More on that below), but that still doesn't make the existing behaviour wrong. Secondly, even if one considers the current behaviour to be wrong, that still doesn't prove that your proposed behaviour is any better (More on that below too). There might very well be situations in which the current behaviour is preferable over the behaviour you suggest, and maybe these situations turn out to be much more common than anyone anticipates right now. But if we apply the patch now, the chance of that being discovered before 9.1 is released is virtually zero. And after the release there's no going back, so we might end up changing the behaviour for the worse. And lastly, I believe that your ultimate goal of guaranteeing that a row is actually deleted (updated) after a BEFORE DELETE (BEFORE UPDATE) trigger has fired is actually insurmountable anyway, at least for isolation level READ COMMITTED. We don't seem to lock rows before firing BEFORE DELETE or BEFORE UPDATE triggers, so a row may very well be updated by a concurrent transaction between the firing of these triggers and the actual DELETE or UPDATE. In READ COMMITTED mode, we'll then re-check if the original WHERE condition still applies (using the EvalPlanQual machinery), and only go forward with the modification if it does. Having said all that, I don't doubt that the current behaviour is a source of pain for you, and I do believe that we ought to improve things - but not by replacing one unsatisfactory behaviour with another. The root of the issue seems to be that you're trying to do things in a BEFORE trigger that really belong in an AFTER trigger. If you could explain why using an AFTER trigger doesn't work for you, maybe we could improve them to be suitable for your purposes (and please forgive me if you already did that). To recap: a trigger for UPDATE BEFORE EACH ROW or DELETE BEFORE EACH ROW, which directly or indirectly causes update of the OLD row in the trigger, can cause the triggering operation to be silently ignored even though the trigger returns a non-NULL record, and even though all triggered modifications are persisted. The only clue that an incomplete and incongruous set of operations has been performed is that the UPDATE or DELETE count from the statement is reduced by the number of rows on which this occurred: if an UPDATE would have affected 5 rows, but one of them just fired triggers *as though* it had been updated but was actually left untouched by the original UPDATE statement, you would get UPDATE 4 rather than UPDATE 5. This patch causes DELETE to behave as most people would expect, and throws an ERROR on the conflicting UPDATE case. I'm prepared to argue that whether or not people expect the behaviour that patch implements depends entirely on how you phrase the question. If you ask "do you expect a row to be actually deleted after BEFORE DELETE triggers have run?", you'll undoubtedly hear "yes".
But if you ask "do you expect a row to be deleted even though it didn't match the DELETE's WHERE condition?", I bet you'll hear a very firm "no". And I expect the "no" to be equally firm if you ask "do you expect the row that is actually deleted and the contents of the variable OLD in the delete trigger to differ?". But the behaviour indicated in the second and third questions *is* what happens with your patch applied. Here's an example:

create table t (id int);

create or replace function on_t_delete() returns trigger as $$
begin
  raise notice 'deleting row %', old.id;
  update t set id = -id where id = old.id;
  return old;
end;
$$ language plpgsql volatile;

create trigger t_delete before delete on t
  for each row execute procedure on_t_delete();

insert into t (id) values (1);
insert into t (id) values (2);

delete from t where id > 0;

Without your patch, the DELETE doesn't remove any rows, because they're updated in on_t_delete(). With your patch applied, the rows are removed - even though, due to the UPDATE in on_t_delete(), they *don't* match the DELETE's WHERE condition (id > 0) at the time they're deleted. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
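For concreteness, this is what one would expect to observe from that example on each side of the patch (illustrative output based on the description above, not captured from a live session):

delete from t where id > 0;
-- NOTICE:  deleting row 1
-- NOTICE:  deleting row 2

-- Unpatched: the statement reports DELETE 0; each row was self-updated by
-- the trigger, so the DELETE silently skips it and the table is left
-- holding ids -1 and -2.
-- Patched: the statement reports DELETE 2 and the table is left empty,
-- even though neither row matched "id > 0" at the moment it was removed.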
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug f...@phlo.org wrote: On Aug1, 2011, at 20:02 , Kevin Grittner wrote: I consider the trigger behavior addressed by this patch to be a bug, and serious enough to merit inclusion of a fix in the 9.1 release, even at this late stage. I'm sorry but I disagree, on multiple grounds. Thanks for offering your thoughts! First, I'm not sure this is even a bug. To me, it seems that postgres currently resolves an inherently ambiguous situation in one possible way, while you expect it to pick another. It might be that the behaviour that you suggest is more in line with people's expectations (More on that below), but that still doesn't make the existing behaviour wrong. That point would be a hard sell for me. To silently turn a BEFORE trigger into an INSTEAD OF trigger invites data corruption, in the sense that business rules that you thought were unconditionally enforced in triggers can be broken with no obvious hint that it happened. Secondly, even if one considers the current behaviour to be wrong, that still doesn't prove that your proposed behaviour is any better Granted. I tried to phrase my post to not preclude other solutions. There might very well be situations in which the current behaviour is preferable over the behaviour you suggest, and maybe these situations turn out to be much more common than anyone anticipates right now. My imagination is not up to the task of putting any plausible situations forward where current behavior is a good thing. Of course I realize that's not proof of anything, but it leaves me skeptical. But if we apply the patch now, the chance of that being discovered before 9.1 is released is virtually zero. And after the release there's no going back, so we might end up changing the behaviour for the worse. Perhaps throwing an error on the DELETE case as well as the UPDATE case would address this concern. That would have saved weeks of staff time here when we started seeing the mysterious behavior which currently exists. The only way we finally got to the bottom of it was to step through execution in gdb; those users not equipped to do so might have bigger problems than we did. We would probably still be patching the version we use in our shop to actually *do* the delete if no error is thrown and no BEFORE trigger returns NULL, but we would have been able to get there a lot faster. And lastly, I believe that your ultimate goal of guaranteeing that a row is actually deleted (updated) after a BEFORE DELETE (BEFORE UPDATE) trigger has fired is actually insurmountable anyway, at least for isolation level READ COMMITTED. Hmm. That's a very potent point. I had not considered READ COMMITTED because we have never, ever used it under PostgreSQL. Obviously, any solution has to work for those who *do* use it, too; and the behavior between different isolation levels can't be too convoluted. So, as you say, the *ultimate* goal may be unattainable. Having said all that, I don't doubt that the current behaviour is a source of pain for you, It has thrown off our production development and deployment schedule by weeks, and management has considered shelving the whole concept of using PostgreSQL triggers because of it. I've convinced them that this is a surmountable obstacle, and am trying to get things back on track. and I do believe that we ought to improve things - but not by replacing one unsatisfactory behaviour with another. I hope you're not suggesting *that's* my goal!
;-) The root of the issue seems to be that you're trying to do things in a BEFORE trigger that really belong in an AFTER trigger. I'll take that as a *collective* pronoun, since I haven't personally written any of the thousands of triggers. There are 20 programmers doing that. Part of the problem is that they sometimes put things in a BEFORE trigger that belong in an AFTER trigger. I caught one place they were doing an UPDATE of the target row in a BEFORE trigger rather than setting the values in the NEW record. My patch makes the latter throw an error, as I believe it should, rather than silently leaving the data in a bad state. If you could explain why using an AFTER trigger doesn't work for you, maybe we could improve them to be suitable for your purposes (and please forgive me if you already did that). I did to some extent, but really perhaps the biggest issue (which I don't think I've really covered earlier in the thread) is to not silently corrupt the data. I would settle for throwing an error on the DELETE case as well as the UPDATE case, because my data would then be protected, and programmers would be forced to work around the issue in a way that PostgreSQL can handle correctly. Robert said that he has mainly run into this on BEFORE UPDATE triggers, so perhaps he can elaborate on the usage patterns that run into it there. In my case it is mainly an issue in deletes (and the special case of a purge process where deletes are not normally allowed) doing a depth-first search for dependent records, and deleting from the bottom up.
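A minimal sketch of the mistake described above and the conventional fix, with hypothetical names:

-- The problematic pattern: a BEFORE UPDATE trigger issuing an UPDATE
-- against its own target row -- exactly what can silently cancel the
-- outer UPDATE (and here it would also re-fire this same trigger):
create table cases (id int primary key, status text, last_changed timestamptz);

create or replace function cases_bu_bad() returns trigger as $$
begin
  update cases set last_changed = now() where id = old.id;
  return new;
end;
$$ language plpgsql;

-- The fix: assign to the NEW record and let the original UPDATE carry it:
create or replace function cases_bu_good() returns trigger as $$
begin
  new.last_changed := now();
  return new;
end;
$$ language plpgsql;

create trigger cases_bu before update on cases
  for each row execute procedure cases_bu_good();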
Re: [HACKERS] WIP fix proposal for bug #6123
On Aug2, 2011, at 17:03 , Kevin Grittner wrote: Florian Pflug f...@phlo.org wrote: First, I'm not sure this is even a bug. To me, it seems that postgres currently resolves an inherently ambiguous situation in one possible way, while you expect it to pick another. It might be that the behaviour that you suggest is more in line with people's expectations (More on that below), but that still doesn't make the existing behaviour wrong. That point would be a hard sell for me. To silently turn a BEFORE trigger into an INSTEAD OF trigger invites data corruption, in the sense that business rules that you thought were unconditionally enforced in triggers can be broken with no obvious hint that it happened. Hm, I can see your point. But I still maintain that a trigger which acts based on values of OLD which don't reflect the actual tuple being deleted is an equally dangerous source of data corruption. There might very well be situations in which the current behaviour is preferable over the behaviour you suggest, and maybe these situations turn out to be much more common than anyone anticipates right now. My imagination is not up to the task of putting any plausible situations forward where current behavior is a good thing. Of course I realize that's not proof of anything, but it leaves me skeptical. One situation I had in mind was a table whose records contain a kind of reference count. A cleanup process might then issue a DELETE which removes all entries whose reference count is zero. Now, if such a table contains a BEFORE DELETE trigger which, through some elaborate chain of triggers, manages to increment some row's reference count that was initially zero, it'd be a bad idea for the DELETE to remove it. But my actual point was not so much that *I* have a reasonable use-case in mind in which the current behaviour is preferable, but more that *someone* might. But if we apply the patch now, the chance of that being discovered before 9.1 is released is virtually zero. And after the release there's no going back, so we might end up changing the behaviour for the worse. Perhaps throwing an error on the DELETE case as well as the UPDATE case would address this concern. It addresses all concerns except this one, I'm afraid. Whatever we change the behaviour to, I feel that we need to give ourselves plenty of time to tweak it, should problems arise. Which just isn't possible before 9.1 is released. Or at least that's my opinion. That would have saved weeks of staff time here when we started seeing the mysterious behavior which currently exists. The only way we finally got to the bottom of it was to step through execution in gdb; those users not equipped to do so might have bigger problems than we did. Ouch. And lastly, I believe that your ultimate goal of guaranteeing that a row is actually deleted (updated) after a BEFORE DELETE (BEFORE UPDATE) trigger has fired is actually insurmountable anyway, at least for isolation level READ COMMITTED. Hmm. That's a very potent point. I had not considered READ COMMITTED because we have never, ever used it under PostgreSQL. Obviously, any solution has to work for those who *do* use it, too; and the behavior between different isolation levels can't be too convoluted. So, as you say, the *ultimate* goal may be unattainable. After reading the code again, I take that back. We *do* seem to lock the tuple before firing BEFORE {UPDATE,DELETE} triggers - trigger.c's GetTupleForTrigger() takes care of that.
I previously missed that because I only looked at ExecUpdate() and ExecDelete() in nodeModifyTable.c. Sorry for that. and I do believe that we ought to improve things - but not by replacing one unsatisfactory behaviour with another. I hope you're not suggesting *that's* my goal! ;-) Nah, more of a case of bad wording on my part... The root of the issue seems to be that you're trying to do things in a BEFORE trigger that really belong in an AFTER trigger. I'll take that as a *collective* pronoun, since I haven't personally written any of the thousands of triggers. There are 20 programmers doing that. Part of the problem is that they sometimes put things in a BEFORE trigger that belong in an AFTER trigger. I caught one place they were doing an UPDATE of the target row in a BEFORE trigger rather than setting the values in the NEW record. My patch makes the latter throw an error, as I believe it should, rather than silently leaving the data in a bad state. If you could explain why using an AFTER trigger doesn't work for you, maybe we could improve them to be suitable for your purposes (and please forgive me if you already did that). I did to some extent, but really perhaps the biggest issue (which I don't think I've really covered earlier in the thread) is to not silently corrupt the data. I would settle for throwing an error on the DELETE case as well as the UPDATE case, because my data would then be protected, and programmers would be forced to work around the issue in a way that PostgreSQL can handle correctly.
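Florian's reference-count scenario, reduced to a runnable sketch (names are hypothetical, and the trigger body stands in for the "elaborate chain of triggers" by re-referencing the row directly):

create table objects (id int primary key, refcount int not null default 0);

create or replace function objects_bd() returns trigger as $$
begin
  -- stand-in for a trigger chain that ends up re-referencing the row
  update objects set refcount = refcount + 1 where id = old.id;
  return old;
end;
$$ language plpgsql;

create trigger objects_bd before delete on objects
  for each row execute procedure objects_bd();

insert into objects values (1, 0);

-- Cleanup pass: under current behaviour the row survives with refcount 1
-- (arguably the right outcome here); under delete-anyway semantics it
-- would be removed even though its refcount is no longer zero.
delete from objects where refcount = 0;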
Re: [HACKERS] WIP fix proposal for bug #6123
Florian Pflug f...@phlo.org wrote: On Aug2, 2011, at 17:03 , Kevin Grittner wrote: Hm, OK, I see your point now, I believe. This is not so much about wanting to put things into BEFORE triggers which don't really fit there, but instead about avoiding weeks of debugging if someone messes up. I would say that is the overriding concern. But there are arguments for sometimes putting some things in the BEFORE triggers which could raise the issue, too. For us, there would be a lot to dance around if deletes cascade in the AFTER triggers, since we'd have orphans hanging around during some validations, as described below. In my case it is mainly an issue in deletes (and the special case of a purge process where deletes are not normally allowed) doing a depth-first search for dependent records, and deleting from the bottom up. In some cases there is redundant information in higher level tables based on primary data in lower level tables -- in the form of counts, sums, or status codes. If you delete a case, and that cascades to an event history table which has a trigger which updates case status according to certain business rules, the delete of the case is canceled because of the delete of a status-changing event. That's very bad. If we at least threw an error instead, we could identify the problem reliably, and code around it somehow. That might be by moving the delete of dependent records to a point after the parent record has already been deleted, but that runs the risk of triggering other validation failures based on business rules in the triggers, because the triggers would then be seeing a state we try very hard to prevent. If we go with my suggestion below (which entails throwing the error earlier, at the time of the offending update or delete), you could get the non-throwing behaviour you want by protecting UPDATEs and DELETEs which might touch rows which are already in the process of being updated or deleted with EXCEPTION blocks. Would you feel comfortable with a patch which threw an error on the DELETE case, as it does on the UPDATE case? Yes, though with a little additional twist. The twist being that I'd like the error to be thrown earlier, at the time of the offending UPDATE or DELETE, not later, when the original UPDATE or DELETE which caused the BEFORE trigger invocation stumbles over the updated row. It not only seems more logical that way, but it also makes it possible for the trigger to catch the error, and react accordingly. I hadn't thought of that. It does seem better in every way except for how much work it takes to do it and how much testing it needs to have confidence in it. Definitely not 9.1 material. That's OK -- we have something which should work for us in 9.0 and 9.1. I'd really prefer not to fork this permanently, but if there's a better way coming in 9.2, we can use our patch for a year or so and then switch over to the superior capabilities available in 9.2. For example, if a BEFORE trigger on table A modifies table B, and one of B's BEFORE triggers in turn modifies A, B's BEFORE trigger could then catch the error and e.g. decide to skip the UPDATE. Yeah, that provides a reasonable default but still gives the application programmer fine-grained control on this. That's a good thing. Implementation-wise, I imagine we'd set a flag in the tuple header after locking the tuple (as I now discovered we do, sorry again) but before firing BEFORE triggers. We'd clear the flag again once all BEFORE triggers have run, and let the actual UPDATE or DELETE proceed.
If an UPDATE or DELETE encounters a flagged tuple, and the transaction (or one of its parents) is the current lock holder, we'd throw an error. To clean up after aborted transactions, we'd clear the flag upon acquiring a tuple lock. Instead of a flag in the tuple header, we could also keep a (backend-local) list of ctids for which we've fired BEFORE triggers but haven't done the actual modification yet. We might also want to make it possible to distinguish pending UPDATES from pending DELETES (i.e., choose a different error code for the two cases), because it seems reasonable that code would want to react differently in those cases (e.g., skip an UPDATE if there's a pending DELETE). Does anyone else want to weigh in on this overall approach or narrow down the alternatives Florian has sketched out above? If we can reach some consensus on an approach, I can work on a new patch to implement it. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
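The cascade Kevin describes, condensed into a runnable sketch (schema and business rule are hypothetical):

create table cases  (id int primary key, status text);
create table events (id int primary key, case_id int, kind text);

-- BEFORE DELETE on cases cascades the delete down to events
create or replace function cases_bd() returns trigger as $$
begin
  delete from events where case_id = old.id;
  return old;
end;
$$ language plpgsql;

-- business rule: losing a status-changing event updates the case status
create or replace function events_ad() returns trigger as $$
begin
  update cases set status = 'reopened' where id = old.case_id;
  return old;
end;
$$ language plpgsql;

create trigger cases_bd before delete on cases
  for each row execute procedure cases_bd();
create trigger events_ad after delete on events
  for each row execute procedure events_ad();

insert into cases  values (1, 'closed');
insert into events values (10, 1, 'status-change');

-- Today: the events row is deleted, but the cases row survives with
-- status 'reopened' -- the silent cancellation called out above.
delete from cases where id = 1;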
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner kevin.gritt...@wicourts.gov wrote: Florian Pflug f...@phlo.org wrote: Hm, OK I see your point now I believe. This is not so much about wanting to put things into BEFORe triggers which doesn't really fit there, but instead avoiding weeks of debugging if someones messes up. I would say that is the overriding concern. I'm going to have to argue with myself -- really, the *biggest* concern is that the current situation allows data to be quietly mangled through non-obvious action-at-a-distance. The damage might not be discovered for a very long time. Debugging the cause of the mangling, once noticed, is the *next* most significant concern. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
On Tue, Aug 2, 2011 at 2:32 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Would you feel comfortable with a patch which threw an error on the DELETE case, as it does on the UPDATE case? Yes, though with a little additional twist. The twist being that I'd like the error to be thrown earlier, at the time of the offending UPDATE or DELETE, not later, when the original UPDATE or DELETE which caused the BEFORE trigger invocation stumbles over the updated row. It not only seems more logical that way, but it also makes it possible for the trigger to catch the error, and react accordingly. I hadn't thought of that. It does seem better in every way except for how much work it takes to do it and how much testing it needs to have confidence in it. Definitely not 9.1 material. IMHO, none of this is 9.1 material. It's been like this forever, and it sucks, but it doesn't suck particularly more than any of the other things that we didn't get around to fixing in 9.1. In fact, I'd go so far as to argue that this would be just about the worst imaginable type of patch to slip into a release at the last minute. On the one hand, I think there's a real chance of breaking things in subtle ways that are difficult to detect; and on the other hand, a significant percentage of users will get no benefit from any change we make here, just because this is something that most people don't do. People who would otherwise be tempted to write their applications this way will end up not doing so precisely because it currently does not work. So I think we should focus on getting the right semantics here, rather than trying to minimize the scope of the change. I'm not sure I understand the justification for throwing an error. Updating a row twice is not an error under any other circumstances; nor is deleting a row you've updated. Why should it be here? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
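Robert's point is easy to check outside of triggers; none of the following draws an error (hypothetical table):

begin;
create table t6123 (id int primary key, a int, b int);
insert into t6123 values (1, 0, 0);
update t6123 set a = 1 where id = 1;  -- update the row
update t6123 set b = 2 where id = 1;  -- update the same row again: fine
delete from t6123 where id = 1;       -- delete a row you've updated: fine
commit;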
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner kevin.gritt...@wicourts.gov wrote: By the way, my current patch does break two existing UPDATE statements in the regression test misc.sql file: Fixed in the attached version of the patch. I consider the trigger behavior addressed by this patch to be a bug, and serious enough to merit inclusion of a fix in the 9.1 release, even at this late stage. I don't think it should be back-patched, because it changes behavior that there is some remote chance that somebody somewhere understands and is intentionally using. While I agree that Robert's suggested approach (or something functionally equivalent) would be the ideal long-term solution, I think that it would be too large of a change to consider for 9.1. I am suggesting a minimal defensive change for 9.1, with possible development of a fancier approach in a later release. To recap: a trigger for UPDATE BEFORE EACH ROW or DELETE BEFORE EACH ROW, which directly or indirectly causes update of the OLD row in the trigger, can cause the triggering operation to be silently ignored even though the trigger returns a non-NULL record, and even though all triggered modifications are persisted. The only clue that an incomplete and incongruous set of operations has been performed is that the UPDATE or DELETE count from the statement is reduced by the number of rows on which this occurred: if an UPDATE would have affected 5 rows, but one of them just fired triggers *as though* it had been updated but was actually left untouched by the original UPDATE statement, you would get UPDATE 4 rather than UPDATE 5. This patch causes DELETE to behave as most people would expect, and throws an ERROR on the conflicting UPDATE case. So far, beyond my confirmation that it passes all PostgreSQL regression tests and works as expected in a few ad hoc tests I've done, there have been six full-time days of business analysts here testing our applications with a version of 9.0.4 patched this way, confirming that the application works as expected. They have a standard set of testing scripts they use for every release, and programmers have added tests specifically targeted at this area. Plus the analysts are trying to exercise as many paths to data modification as possible, to stress the triggers, including interfaces with other agencies. They are scheduled to do a minimum of 20 more full-time days of testing in the next two weeks, so if someone wants to suggest changes or alternatives to this particular patch, there would still be time to get over 100 hours of professional testing against it -- if it comes through soon enough.
-Kevin

*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
*** 1847,1854 ****
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
- 				/* treat it as deleted; do not process */
  				ReleaseBuffer(buffer);
  				return NULL;
  			case HeapTupleMayBeUpdated:
--- 1847,1862 ----
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
  				ReleaseBuffer(buffer);
+ 				if (!ItemPointerEquals(&update_ctid, &tuple.t_self))
+ 				{
+ 					/* it was updated, so look at the updated version */
+ 					tuple.t_self = update_ctid;
+ 					/* updated row should have xmin matching this xmax */
+ 					priorXmax = update_xmax;
+ 					continue;
+ 				}
+ 				/* treat it as deleted; do not process */
  				return NULL;
  			case HeapTupleMayBeUpdated:

*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
*************** ldelete:;
*** 353,359 ****
--- 353,382 ----
  							 true /* wait for commit */ );
  	switch (result)
  	{
+ 		/*
+ 		 * Don't allow updates to a row during its BEFORE DELETE trigger
+ 		 * to prevent the deletion.  One example of where this might
+ 		 * happen is that the BEFORE DELETE trigger could delete a child
+ 		 * row, and a trigger on that child row might update some count or
+ 		 * status column in the row originally being deleted.
+ 		 */
  		case
Re: [HACKERS] WIP fix proposal for bug #6123
On Fri, Jul 22, 2011 at 5:01 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Your scenario is a BEFORE DELETE trigger that does an UPDATE on the same row, but I think this problem also occurs if you have a BEFORE UPDATE trigger that does an UPDATE on the same row. I believe the second update gets silently ignored. My testing shows that the primary update gets ignored, while all the triggered effects of that update are persisted. Yuck. :-( That was my recollection... It certainly seems possible to turn that around, but that hardly seems better. Agreed. In asking application programmers here what they would *expect* to happen, they all seem to think that it is surprising that the BEFORE trigger functions *return a record*, rather than a boolean to say whether to proceed with the operation. They feel it would be less confusing if a value set into NEW was effective if the operation does take effect, and the boolean controls whether or not that happens. They rather expect that if an update happens from the same transaction while a before trigger is running, that the NEW record will reflect the change. I think this is mostly a matter of what you get familiar with, and, as you say, not worth breaking compatibility for. I recognize how hard it would be to create that expected behavior, and how unlikely it is that the community would accept such a change at this point. But current behavior is to silently do something really dumb, so I think some change should be considered -- even if that change is to throw an error where we now allow nonsense. INSERT is not a problem -- if a BEFORE INSERT trigger inserts a row with a conflicting primary key (or other unique index key), the operation will be rolled back. That's fine. I think DELETE can be cleanly fixed with a patch similar to what I posted earlier in the thread. I found one more value that looks like it should be set, and it could use some comments, but I believe that we can get DELETE behavior which is every bit as sensible as INSERT behavior with a very small change. The worms do come crawling out of the can on BEFORE UPDATE triggers, though. When faced with an UPDATE which hasn't yet been applied, and other UPDATEs triggering from within the BEFORE UPDATE trigger which touch the same row, it doesn't seem like you can honor both the original UPDATE which was requested *and* the triggered UPDATEs. Of course, if you discard the original UPDATE, you can't very well do anything with the record returned from the BEFORE UPDATE trigger for that update. Since it seems equally evil to allow the update while ignoring some of the work caused by its trigger functions as to show the work of the triggered updates while suppressing the original update, I think the right thing is to throw an error if the old row for a BEFORE UPDATE is updated by the same transaction and the trigger function ultimately returns a non-NULL value. Thoughts? Well, it seems to me that if the trigger update and the main update were executed as separate commands (with no triggers involved) it would often be the case that they'd dovetail nicely. When this has come up for me, it's usually been the case that the sets of fields being updated are completely non-overlapping. So ideally what I'd like to happen is to have EPQ, or something like it, test whether the newest version of the row still satisfies the UPDATE criteria. If so, it applies the update to the new row version; if not, it either discards the main UPDATE or throws an error. 
There's still some room here for surprising results, but I think they would be surprising results arising out of having done something intrinsically complicated, rather than surprising results arising out of an odd implementation artifact. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
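Restated as plain commands, the intuition is that updates over disjoint column sets compose cleanly when run one after the other (hypothetical table); an EPQ-style recheck could in principle produce the same merged result by re-applying the main UPDATE to the newest row version whenever the WHERE clause still holds:

create table t (id int primary key, a int, b int);
insert into t values (1, 0, 0);

update t set a = 1 where id = 1;  -- what the trigger did
update t set b = 2 where id = 1;  -- the main update, applied afterwards

select * from t;  -- (1, 1, 2): both effects survive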
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas robertmh...@gmail.com wrote: Well, it seems to me that if the trigger update and the main update were executed as separate commands (with no triggers involved) it would often be the case that they'd dovetail nicely. When this has come up for me, it's usually been the case that the sets of fields being updated are completely non-overlapping. Agreed that this is typically the case -- that's why the application programmers here expected NEW to be effectively a dynamic representation of the WIP state of the row. A lot of things would just work that way. Of course, they're blissfully unaware of what a huge revamp of the guts of PostgreSQL that would be. So ideally what I'd like to happen is to have EPQ, or something like it, test whether the newest version of the row still satisfies the UPDATE criteria. If so, it applies the update to the new row version; if not, it either discards the main UPDATE or throws an error. There's still some room here for surprising results, but I think they would be surprising results arising out of having done something intrinsically complicated, rather than surprising results arising out of an odd implementation artifact. So, you're advocating a logical merge of the results with something exceptional done on a conflicting update of the same columns? That would effectively get you to the same end result as a live NEW tuple, but without such a radical revamp of the guts of things. Still, not trivial to do properly, and I would argue for throwing an error rather than silently doing something surprising on conflict. This issue has already forced the rearrangement of our release schedule here, so I'm going to do the simple fix of just throwing an error on update from the BEFORE UPDATE trigger (of the row for which the trigger is firing). That fix is very simple and seems very safe to me, and should allow us to deploy without further schedule slippage; then I'll see if I can code up what you're suggesting. I had a new patch I was about to post with new error language, a different SQLSTATE, comments, and regression test changes; but unless someone wants to see that I won't clutter the list with it until I've had a chance to see if I can manage to handle it the way you're requesting. There's no doubt that it would be better the way you're suggesting; but it looks to me like about five times as many lines of code, harder to be sure it's right, and probably forcing me to learn a few new subsystems of PostgreSQL internals to accomplish. Thanks for the feedback. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
On Mon, Jul 25, 2011 at 12:26 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: There's no doubt that it would be better the way you're suggesting; but it looks to me like about five times as many lines of code, harder to be sure it's right, and probably forcing me to learn a few new subsystems of PostgreSQL internals to accomplish. Sorry, I didn't mean to make homework for you. Nor am I sure that the solution will pass muster all around even if I think it's the best thing since sliced bread. I was just throwing it out there as what I would like to have happen in an ideal world... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas robertmh...@gmail.com wrote: On Mon, Jul 25, 2011 at 12:26 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: There's no doubt that it would be better the way you're suggesting; but it looks to me like about five times as many lines of code, harder to be sure it's right, and probably forcing me to learn a few new subsystems of PostgreSQL internals to accomplish. Sorry, I didn't mean to make homework for you. Nor am I sure that the solution will pass muster all around even if I think it's the best thing since sliced bread. I was just throwing it out there as what I would like to have happen in an ideal world... Well, if it can be done, it will be better and less likely to break existing code, so it's at least worth looking at. I don't object to broadening my horizons. ;-) Sorry if it sounded like a complaint; my intention was to communicate that I'm going to be looking at it, but I've got a few more urgent tasks to deal with first to get our application release out the door. By the way, my current patch does break two existing UPDATE statements in the regression test misc.sql file:

| -- This non-func update stuff needs to be examined
| -- more closely.  - jolly (2/22/96)
| --
| UPDATE tmp
|    SET stringu1 = reverse_name(onek.stringu1)
|    FROM onek
|    WHERE onek.stringu1 = 'JB' and
|          onek.stringu1 = tmp.stringu1;
|
| UPDATE tmp
|    SET stringu1 = reverse_name(onek2.stringu1)
|    FROM onek2
|    WHERE onek2.stringu1 = 'JC' and
|          onek2.stringu1 = tmp.stringu1;

Perhaps it's time -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas robertmh...@gmail.com wrote: On Wed, Jul 20, 2011 at 2:58 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: So basically, the goal of this patch is to ensure that a BEFORE DELETE trigger doesn't silently fail to delete a row because that row was updated during the BEFORE DELETE trigger firing (in our case by secondary trigger firing). I've run across this problem before while writing application code and have found it surprising, unfortunate, and difficult to work around. It's not so bad when it only bites you once, but as things get more complicated and you have more and more triggers floating around, it gets pretty darn horrible. One of the things I've done to mitigate the impact of this is to write as many triggers as possible as AFTER triggers Yeah, this is not an issue in AFTER triggers, so moving any updating to those is a solution. In most cases that's where you want triggered modifications to take place anyway. The cascading delete situation is the most obvious exception, although there certainly could be others. but that's sometimes difficult to accomplish. Yeah, sometimes. Your scenario is a BEFORE DELETE trigger that does an UPDATE on the same row, but I think this problem also occurs if you have a BEFORE UPDATE trigger that does an UPDATE on the same row. I believe the second update gets silently ignored. My testing shows that the primary update gets ignored, while all the triggered effects of that update are persisted. Yuck. :-( It certainly seems possible to turn that around, but that hardly seems better. In asking application programmers here what they would *expect* to happen, they all seem to think that it is surprising that the BEFORE trigger functions *return a record*, rather than a boolean to say whether to proceed with the operation. They feel it would be less confusing if a value set into NEW was effective if the operation does take effect, and the boolean controls whether or not that happens. They rather expect that if an update happens from the same transaction while a before trigger is running, that the NEW record will reflect the change. I recognize how hard it would be to create that expected behavior, and how unlikely it is that the community would accept such a change at this point. But current behavior is to silently do something really dumb, so I think some change should be considered -- even if that change is to throw an error where we now allow nonsense. INSERT is not a problem -- if a BEFORE INSERT trigger inserts a row with a conflicting primary key (or other unique index key), the operation will be rolled back. That's fine. I think DELETE can be cleanly fixed with a patch similar to what I posted earlier in the thread. I found one more value that looks like it should be set, and it could use some comments, but I believe that we can get DELETE behavior which is every bit as sensible as INSERT behavior with a very small change. The worms do come crawling out of the can on BEFORE UPDATE triggers, though. When faced with an UPDATE which hasn't yet been applied, and other UPDATEs triggering from within the BEFORE UPDATE trigger which touch the same row, it doesn't seem like you can honor both the original UPDATE which was requested *and* the triggered UPDATEs. Of course, if you discard the original UPDATE, you can't very well do anything with the record returned from the BEFORE UPDATE trigger for that update. 
Since it seems equally evil to allow the update while ignoring some of the work caused by its trigger functions as to show the work of the triggered updates while suppressing the original update, I think the right thing is to throw an error if the old row for a BEFORE UPDATE is updated by the same transaction and the trigger function ultimately returns a non-NULL value. Thoughts? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
Kevin Grittner kevin.gritt...@wicourts.gov wrote: I believe that we can get DELETE behavior which is every bit as sensible as INSERT behavior with a very small change. I think the right thing is to throw an error if the old row for a BEFORE UPDATE is updated by the same transaction and the trigger function ultimately returns a non-NULL value. And to make this a bit less hand-wavy, a rough patch is attached. I expect the error message could use some word-smithing, and it could use comments; but it seemed like something concrete might speed things along. -Kevin

*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
*** 1847,1854 ****
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
- 				/* treat it as deleted; do not process */
  				ReleaseBuffer(buffer);
  				return NULL;
  			case HeapTupleMayBeUpdated:
--- 1847,1862 ----
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
  				ReleaseBuffer(buffer);
+ 				if (!ItemPointerEquals(&update_ctid, &tuple.t_self))
+ 				{
+ 					/* it was updated, so look at the updated version */
+ 					tuple.t_self = update_ctid;
+ 					/* updated row should have xmin matching this xmax */
+ 					priorXmax = update_xmax;
+ 					continue;
+ 				}
+ 				/* treat it as deleted; do not process */
  				return NULL;
  			case HeapTupleMayBeUpdated:

*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
*************** ldelete:;
*** 354,359 ****
--- 354,375 ----
  	switch (result)
  	{
  		case HeapTupleSelfUpdated:
+ 			if (!ItemPointerEquals(tupleid, &update_ctid))
+ 			{
+ 				HeapTuple	copyTuple;
+
+ 				estate->es_output_cid = GetCurrentCommandId(false);
+ 				copyTuple = EvalPlanQualFetch(estate,
+ 											  resultRelationDesc,
+ 											  LockTupleExclusive,
+ 											  &update_ctid,
+ 											  update_xmax);
+ 				if (copyTuple != NULL)
+ 				{
+ 					*tupleid = update_ctid = copyTuple->t_self;
+ 					goto ldelete;
+ 				}
+ 			}
  			/* already deleted by self; nothing to do */
  			return NULL;
*************** lreplace:;
*** 570,575 ****
--- 586,595 ----
  	switch (result)
  	{
  		case HeapTupleSelfUpdated:
+ 			if (!ItemPointerEquals(tupleid, &update_ctid))
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ 						 errmsg("row modified by same transaction during trigger execution")));
  			/* already deleted by self; nothing to do */
  			return NULL;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
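In user-visible terms, the patch changes sessions like the following sketch (hypothetical schema; the trigger fires only on updates of "n" so that its self-update of "stamp" doesn't recurse, and the unpatched/patched outcomes follow the behaviour described in this thread):

create table t (id int primary key, n int, stamp timestamptz);

create or replace function t_bu() returns trigger as $$
begin
  -- self-update of the row being updated; touches only "stamp"
  update t set stamp = now() where id = old.id;
  return new;
end;
$$ language plpgsql;

create trigger t_bu before update of n on t
  for each row execute procedure t_bu();

insert into t values (1, 0, null);

update t set n = 100 where id = 1;
-- unpatched: reports UPDATE 0 -- "stamp" gets set, but n silently stays 0
-- patched:   ERROR:  row modified by same transaction during trigger execution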
[HACKERS] WIP fix proposal for bug #6123
We're nearing completion of testing the migration of a lot of code which used our custom Java framework into PostgreSQL functions and triggers. Yesterday our testers ran into surprising behavior related to delete triggers. A test case is presented on the -bugs list, but basically it amounts to this: (1) We have some detail which is summarized by triggers into related higher-level tables for performance reasons. (2) On delete, some of the higher level tables should cascade the delete down to the lower levels. (3) Sometimes the same tables are involved in both. This is complicated by the foreign key situation -- due to conversion of less-than-perfect data and the fact that there is a legal concept of the elected Clerk of Court in each county being the custodian of the data, we have orphaned detail in some tables which we don't have authority to delete or create bogus parent rows for. (It would actually be a felony for us to do so, I think.) Until 9.2 we won't be able to create foreign keys for these relationships, but in each county we've created foreign keys for all relationships without orphans. So, this is one reason we can't count on foreign key declarations, with the ON DELETE CASCADE option, yet we don't want to drop the foreign keys where they exist, as they help prevent further degradation of the data integrity. So the DELETE from the children should occur in the BEFORE trigger to avoid upsetting FK logic. Otherwise we could move the cascading deletes to the AFTER DELETE trigger, where this odd behavior doesn't occur. So basically, the goal of this patch is to ensure that a BEFORE DELETE trigger doesn't silently fail to delete a row because that row was updated during the BEFORE DELETE trigger firing (in our case by secondary trigger firing). If that description was too hard to follow, let me know and I'll try again. :-/ [Summarizing discussion on the -bugs list,] Tom didn't feel that there was a need to support application code which does what I describe above, and he felt that fixing it would open a can of worms, with logical quandaries about correct behavior. Basically, the changes I made were within switch statements where if the row was found to be HeapTupleUpdated in READ COMMITTED, it would follow the ctid chain; I used similar logic for HeapTupleSelfUpdated regardless of transaction isolation level. The reason for not re-firing delete triggers here is the same as the reason delete triggers are not fired in the existing case -- it's just one delete. No claims are made for completeness of this patch -- it's a proof of concept on which I hope to get comments. Before this patch would be production ready I would need to check for similar needs on UPDATE, and would need to check to make sure there is no resource leakage. It passes `make check-world`, `make installcheck-world` with a database set for serializable transaction isolation, and the isolation tests. And of course, it doesn't show the behavior which we found so astonishing -- we no longer see an attempted delete of a parent succeed in deleting the children but leave the parent sitting there. A patch against 9.0 based on this approach may find its way into production here in about two weeks if there are no contra-indications, so any review would be very much appreciated.
-Kevin

*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
*** 1847,1854 ****
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
- 				/* treat it as deleted; do not process */
  				ReleaseBuffer(buffer);
  				return NULL;
  			case HeapTupleMayBeUpdated:
--- 1847,1862 ----
  		switch (test)
  		{
  			case HeapTupleSelfUpdated:
  				ReleaseBuffer(buffer);
+ 				if (!ItemPointerEquals(&update_ctid, &tuple.t_self))
+ 				{
+ 					/* it was updated, so look at the updated version */
+ 					tuple.t_self = update_ctid;
+ 					/* updated row should have xmin matching this xmax */
+ 					priorXmax = update_xmax;
+ 					continue;
+ 				}
+ 				/* treat it as deleted; do not process */
  				return NULL;
  			case HeapTupleMayBeUpdated:

*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
***
Re: [HACKERS] WIP fix proposal for bug #6123
On Wed, Jul 20, 2011 at 2:58 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: So basically, the goal of this patch is to ensure that a BEFORE DELETE trigger doesn't silently fail to delete a row because that row was updated during the BEFORE DELETE trigger firing (in our case by secondary trigger firing). I've run across this problem before while writing application code and have found it surprising, unfortunate, and difficult to work around. It's not so bad when it only bites you once, but as things get more complicated and you have more and more triggers floating around, it gets pretty darn horrible. One of the things I've done to mitigate the impact of this is to write as many triggers as possible as AFTER triggers, but that's sometimes difficult to accomplish. Your scenario is a BEFORE DELETE trigger that does an UPDATE on the same row, but I think this problem also occurs if you have a BEFORE UPDATE trigger that does an UPDATE on the same row. I believe the second update gets silently ignored. I'm unfortunately unqualified to speak to the correctness of your patch, but I concur with your view that the current behavior is lousy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP fix proposal for bug #6123
Robert Haas robertmh...@gmail.com wrote: I think this problem also occurs if you have a BEFORE UPDATE trigger that does an UPDATE on the same row. I'm pretty sure you're right, and I do intend to cover that in the final patch. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers