Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
> > Out of curiosity, could you please tell me the concrete situations
> > where you wanted to delete one of two identical records?

In my case, there is a table with known duplicates, and we would like to delete all but the one with the lowest ctid, and then add a unique index to the table, which then allows us to use INSERT ON CONFLICT in a meaningful way.

The other need for a DELETE...LIMIT or UPDATE...LIMIT is when you're worried about flooding a replica, so you parcel out the DML into chunks that don't cause unacceptable lag on the replica.

Both of these can be accomplished via

    DELETE FROM foo WHERE ctid IN (SELECT ... FROM foo ... LIMIT 1000)

but until recently such a construct would result in a full table scan, and you'd take the same hit with each subsequent DML. I *believe* that the ctid range scan can now limit those scans, especially if you can order the limited set by ctid, but those techniques are not widely known at this time.
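[Editor's note: the two workarounds described above can be sketched in SQL as follows. The table name `foo`, the duplicate key `col1`, and the batching predicate are hypothetical, not from the original post; this assumes duplicates are rows that agree on `col1`.]

```sql
-- Keep only the lowest-ctid row of each duplicate group, then add the
-- unique index that makes INSERT ... ON CONFLICT meaningful.
DELETE FROM foo a
USING foo b
WHERE a.col1 = b.col1
  AND a.ctid > b.ctid;

CREATE UNIQUE INDEX ON foo (col1);

-- Chunked deletion to avoid flooding a replica: re-run this statement
-- until it reports DELETE 0.  "some_condition" is a placeholder.
DELETE FROM foo
WHERE ctid IN (
    SELECT ctid FROM foo
    WHERE some_condition
    LIMIT 1000
);
```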
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
Hello Greg,

On Fri, 17 Dec 2021 01:40:45 -0500 Greg Stark wrote:
> On Thu, 16 Dec 2021 at 22:18, Tom Lane wrote:
> >
> > * If the sort order is underspecified, or you omit ORDER BY
> > entirely, then it's not clear which rows will be operated on.
> > The LIMIT might stop after just some of the rows in a peer
> > group, and you can't predict which ones.
>
> Meh, that never seemed very compelling to me. I think that's on the
> user, and there are *lots* of cases where the user can easily know
> enough extra context to know that's what she wants. In particular,
> I've often wanted to delete one of two identical records, and it
> would have been really easy to just do a DELETE LIMIT 1. I know there
> are ways to do it, but it's always seemed like unnecessary hassle
> when there was a natural way to express it.

Out of curiosity, could you please tell me the concrete situations where you wanted to delete one of two identical records?

> > * UPDATE/DELETE necessarily involve the equivalent of SELECT
> > FOR UPDATE, which may cause the rows to be ordered more
> > surprisingly than you expected, ie the sort happens *before*
> > rows are replaced by their latest versions, which might have
> > different sort keys.
>
> This... is a real issue. Or is it? Just to be clear, I think what
> you're referring to is a case like:
>
> INSERT INTO t values (1),(2)
>
> In another session: BEGIN; UPDATE t set c=0 where c=2
>
> DELETE FROM t ORDER BY c ASC LIMIT 1
>
> In the other session: COMMIT
>
> Which row was deleted? In this case it was the row where c=1, even
> though the UPDATE reported success (i.e. 1 row updated), so presumably
> it happened "before" the delete. The delete saw the ordering from
> before it was blocked, with c=1 and c=2 still present, so apparently
> it happened "before" the update.

When I tried this using my patch, the DELETE deleted the row where c=1, the same as above, but it did not block. Is your result from an experiment using my patch, or from another RDBMS such as MySQL?
> There are plenty of other ways to see the same surprising
> serialization failure. If instead of a DELETE you did an UPDATE,
> there are any number of functions or other features that have been
> added which can expose the order in which the updates happened.
>
> The way to solve those cases would be to use serializable isolation
> (or even just repeatable read in this case). Wouldn't that also solve
> the DELETE serialization failure too?

Do you mean that such serialization failures would be avoided in SERIALIZABLE or REPEATABLE READ by aborting the transaction, while they may be accepted in READ COMMITTED?

Regards,
Yugo Nagata

--
Yugo NAGATA
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
On Thu, 16 Dec 2021 at 22:18, Tom Lane wrote:
>
> * If the sort order is underspecified, or you omit ORDER BY
> entirely, then it's not clear which rows will be operated on.
> The LIMIT might stop after just some of the rows in a peer
> group, and you can't predict which ones.

Meh, that never seemed very compelling to me. I think that's on the user, and there are *lots* of cases where the user can easily know enough extra context to know that's what she wants. In particular, I've often wanted to delete one of two identical records, and it would have been really easy to just do a DELETE LIMIT 1. I know there are ways to do it, but it's always seemed like unnecessary hassle when there was a natural way to express it.

> * UPDATE/DELETE necessarily involve the equivalent of SELECT
> FOR UPDATE, which may cause the rows to be ordered more
> surprisingly than you expected, ie the sort happens *before*
> rows are replaced by their latest versions, which might have
> different sort keys.

This... is a real issue. Or is it? Just to be clear, I think what you're referring to is a case like:

    INSERT INTO t values (1),(2)

In another session:

    BEGIN; UPDATE t set c=0 where c=2

Then:

    DELETE FROM t ORDER BY c ASC LIMIT 1

In the other session:

    COMMIT

Which row was deleted? In this case it was the row where c=1, even though the UPDATE reported success (i.e. 1 row updated), so presumably it happened "before" the delete. The delete saw the ordering from before it was blocked, with c=1 and c=2 still present, so apparently it happened "before" the update.

There are plenty of other ways to see the same surprising serialization failure. If instead of a DELETE you did an UPDATE, there are any number of functions or other features that have been added which can expose the order in which the updates happened.

The way to solve those cases would be to use serializable isolation (or even just repeatable read in this case). Wouldn't that also solve the DELETE serialization failure too?

--
greg
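[Editor's note: Greg's scenario can be written out as an explicit two-session timeline. This assumes the proposed DELETE ... ORDER BY ... LIMIT syntax from the patch, which does not exist in stock PostgreSQL, and assumes the DELETE blocks on the uncommitted UPDATE as Greg describes; Yugo's reply below reports that the patch did not actually block.]

```sql
-- Session A
CREATE TABLE t (c int);
INSERT INTO t VALUES (1), (2);

-- Session B
BEGIN;
UPDATE t SET c = 0 WHERE c = 2;   -- row lock held until COMMIT

-- Session A: sorts the snapshot it can see (c = 1, c = 2)
DELETE FROM t ORDER BY c ASC LIMIT 1;

-- Session B
COMMIT;

-- Session A then finds the row with c = 1 gone: the sort ran before
-- the UPDATE's new version (c = 0) became visible, so the "smallest"
-- row at sort time was c = 1, not c = 0.
```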
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
On Thu, 16 Dec 2021 20:56:59 -0700 "David G. Johnston" wrote:
> On Thursday, December 16, 2021, Yugo NAGATA wrote:
> >
> > Also, here seem to be some use cases. For example,
> > - when you want to delete a specified number of rows from a table
> > that doesn't have a primary key and contains duplicated tuples.
>
> Not our problem…use the tools correctly; there is always the hack
> work-around for the few who didn’t.
>
> > - when you want to delete the bottom 10 items with bad scores
> > (without using the rank() window function).
>
> This one doesn’t make sense to me.
>
> > - when you want to delete only some of the rows because it takes time
> > to delete all of them.
>
> This seems potentially compelling, though I’d be more concerned about the
> memory aspects than simply taking a long amount of time. If this is a
> problem then maybe discuss it without having a solution in hand? But given
> the intense I/O cost that would happen, spreading this out over time seems
> acceptable, and it should be an infrequent thing to do. Expecting users to
> plan and execute some custom code for their specific need seems reasonable.
>
> So even if Tom’s technical concerns aren’t enough, working on this based
> upon these use cases doesn’t seem of high enough benefit.

Thank you for your comments. OK. I agree that the use cases are not so strong.

Regards,
Yugo Nagata

--
Yugo NAGATA
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
On Thu, 16 Dec 2021 22:17:58 -0500 Tom Lane wrote:
> Yugo NAGATA writes:
> > We cannot use ORDER BY or LIMIT/OFFSET in the current
> > DELETE statement syntax, so all the rows matching the
> > WHERE condition are deleted. However, the tuple-retrieving
> > process of DELETE is basically the same as in a SELECT statement,
> > so I think that we can also allow DELETE to use ORDER BY
> > and LIMIT/OFFSET.
>
> Indeed, this is technically possible, but we've rejected the idea
> before and I'm not aware of any reason to change our minds.
> The problem is that a partial DELETE is not very deterministic
> about which rows are deleted, and that does not seem like a
> great property for a data-updating command. (The same applies
> to UPDATE, which is why we don't allow these options in that
> command either.) The core issues are:
>
> * If the sort order is underspecified, or you omit ORDER BY
> entirely, then it's not clear which rows will be operated on.
> The LIMIT might stop after just some of the rows in a peer
> group, and you can't predict which ones.
>
> * UPDATE/DELETE necessarily involve the equivalent of SELECT
> FOR UPDATE, which may cause the rows to be ordered more
> surprisingly than you expected, ie the sort happens *before*
> rows are replaced by their latest versions, which might have
> different sort keys.
>
> We live with this amount of indeterminism in SELECT, but that
> doesn't make it a brilliant idea to allow it in UPDATE/DELETE.

Thank you for explaining it! I'm glad to understand why this idea is not good and has been rejected.

Regards,
Yugo Nagata

--
Yugo NAGATA
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
On Thursday, December 16, 2021, Yugo NAGATA wrote:
>
> Also, here seem to be some use cases. For example,
> - when you want to delete a specified number of rows from a table
> that doesn't have a primary key and contains duplicated tuples.

Not our problem…use the tools correctly; there is always the hack work-around for the few who didn’t.

> - when you want to delete the bottom 10 items with bad scores
> (without using the rank() window function).

This one doesn’t make sense to me.

> - when you want to delete only some of the rows because it takes time
> to delete all of them.

This seems potentially compelling, though I’d be more concerned about the memory aspects than simply taking a long amount of time. If this is a problem then maybe discuss it without having a solution in hand? But given the intense I/O cost that would happen, spreading this out over time seems acceptable, and it should be an infrequent thing to do. Expecting users to plan and execute some custom code for their specific need seems reasonable.

So even if Tom’s technical concerns aren’t enough, working on this based upon these use cases doesn’t seem of high enough benefit.

David J.
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
Yugo NAGATA writes:
> We cannot use ORDER BY or LIMIT/OFFSET in the current
> DELETE statement syntax, so all the rows matching the
> WHERE condition are deleted. However, the tuple-retrieving
> process of DELETE is basically the same as in a SELECT statement,
> so I think that we can also allow DELETE to use ORDER BY
> and LIMIT/OFFSET.

Indeed, this is technically possible, but we've rejected the idea before and I'm not aware of any reason to change our minds. The problem is that a partial DELETE is not very deterministic about which rows are deleted, and that does not seem like a great property for a data-updating command. (The same applies to UPDATE, which is why we don't allow these options in that command either.) The core issues are:

* If the sort order is underspecified, or you omit ORDER BY entirely, then it's not clear which rows will be operated on. The LIMIT might stop after just some of the rows in a peer group, and you can't predict which ones.

* UPDATE/DELETE necessarily involve the equivalent of SELECT FOR UPDATE, which may cause the rows to be ordered more surprisingly than you expected, ie the sort happens *before* rows are replaced by their latest versions, which might have different sort keys.

We live with this amount of indeterminism in SELECT, but that doesn't make it a brilliant idea to allow it in UPDATE/DELETE.

regards, tom lane
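[Editor's note: Tom's first point can be made concrete with a small sketch. The table, columns, and data are hypothetical, and the DELETE uses the proposed syntax, which does not exist in stock PostgreSQL.]

```sql
CREATE TABLE scores (name text, score int);
INSERT INTO scores VALUES ('a', 10), ('b', 10), ('c', 10);

-- 'a', 'b', and 'c' are peers under ORDER BY score, so the order
-- among them is underspecified: any one of the three rows may be
-- the one deleted, and re-running need not pick the same row.
DELETE FROM scores ORDER BY score LIMIT 1;
```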
Re: Allow DELETE to use ORDER BY and LIMIT/OFFSET
On Fri, 17 Dec 2021 09:47:18 +0900 Yugo NAGATA wrote:
> Hello hackers,
>
> We cannot use ORDER BY or LIMIT/OFFSET in the current
> DELETE statement syntax, so all the rows matching the
> WHERE condition are deleted. However, the tuple-retrieving
> process of DELETE is basically the same as in a SELECT statement,
> so I think that we can also allow DELETE to use ORDER BY
> and LIMIT/OFFSET.
>
> Attached is a concept patch. This enables the following
> operations:

After posting this, I noticed that there are several similar proposals in the past:

https://www.postgresql.org/message-id/flat/AANLkTi%3D6fBZh9yZT7f7kKh%2BzmQngAyHgZWBPM3eiEMj1%40mail.gmail.com
https://www.postgresql.org/message-id/flat/1393112801.59251.YahooMailNeo%40web163006.mail.bf1.yahoo.com
https://www.postgresql.org/message-id/flat/CADB9FDf-Vh6RnKAMZ4Rrg_YP9p3THdPbji8qe4qkxRuiOwm%3Dmg%40mail.gmail.com
https://www.postgresql.org/message-id/flat/CALAY4q9fcrscybax7fg_uojFwjw_Wg0UMuSrf-FvN68SeSAPAA%40mail.gmail.com

Anyway, I'll review these threads before progressing with it.

> postgres=# select * from t order by i;
>  i
> ----
>   1
>   2
>   2
>   2
>   2
>   5
>  10
>  20
>  33
>  35
>  53
> (11 rows)
>
> postgres=# delete from t where i = 2 limit 2;
> DELETE 2
> postgres=# select * from t order by i;
>  i
> ----
>   1
>   2
>   2
>   5
>  10
>  20
>  33
>  35
>  53
> (9 rows)
>
> postgres=# delete from t order by i offset 3 limit 3;
> DELETE 3
> postgres=# select * from t order by i;
>  i
> ----
>   1
>   2
>   2
>  33
>  35
>  53
> (6 rows)
>
> Although we can do similar operations using ctid and a subquery
> such as
>
>   DELETE FROM t WHERE ctid IN (SELECT ctid FROM t WHERE ... ORDER BY ... LIMIT ...),
>
> it is more user-friendly and intuitive to allow it in the DELETE
> syntax, because ctid is a system column and most users may not be
> familiar with it.
>
> Although this is not allowed in the SQL standard, it is supported
> in MySQL[1]. DB2 also supports it, although the syntax is somewhat
> strange.[2]
>
> Also, there seem to be some use cases. For example,
> - when you want to delete a specified number of rows from a table
> that doesn't have a primary key and contains duplicated tuples.
> - when you want to delete the bottom 10 items with bad scores
> (without using the rank() window function).
> - when you want to delete only some of the rows because it takes time
> to delete all of them.
>
> [1] https://dev.mysql.com/doc/refman/8.0/en/delete.html
> [2] https://www.dba-db2.com/2015/04/delete-first-1000-rows-in-a-db2-table-using-fetch-first.html
>
> What do you think about it?
>
> --
> Yugo NAGATA

--
Yugo NAGATA
Allow DELETE to use ORDER BY and LIMIT/OFFSET
Hello hackers,

We cannot use ORDER BY or LIMIT/OFFSET in the current DELETE statement syntax, so all the rows matching the WHERE condition are deleted. However, the tuple-retrieving process of DELETE is basically the same as in a SELECT statement, so I think that we can also allow DELETE to use ORDER BY and LIMIT/OFFSET.

Attached is a concept patch. This enables the following operations:

postgres=# select * from t order by i;
 i
----
  1
  2
  2
  2
  2
  5
 10
 20
 33
 35
 53
(11 rows)

postgres=# delete from t where i = 2 limit 2;
DELETE 2
postgres=# select * from t order by i;
 i
----
  1
  2
  2
  5
 10
 20
 33
 35
 53
(9 rows)

postgres=# delete from t order by i offset 3 limit 3;
DELETE 3
postgres=# select * from t order by i;
 i
----
  1
  2
  2
 33
 35
 53
(6 rows)

Although we can do similar operations using ctid and a subquery such as

  DELETE FROM t WHERE ctid IN (SELECT ctid FROM t WHERE ... ORDER BY ... LIMIT ...),

it is more user-friendly and intuitive to allow it in the DELETE syntax, because ctid is a system column and most users may not be familiar with it.

Although this is not allowed in the SQL standard, it is supported in MySQL[1]. DB2 also supports it, although the syntax is somewhat strange.[2]

Also, there seem to be some use cases. For example,
- when you want to delete a specified number of rows from a table
  that doesn't have a primary key and contains duplicated tuples.
- when you want to delete the bottom 10 items with bad scores
  (without using the rank() window function).
- when you want to delete only some of the rows because it takes time
  to delete all of them.

[1] https://dev.mysql.com/doc/refman/8.0/en/delete.html
[2] https://www.dba-db2.com/2015/04/delete-first-1000-rows-in-a-db2-table-using-fetch-first.html

What do you think about it?
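[Editor's note: for the second use case above, the ctid-based workaround already expressible in stock PostgreSQL looks like this. The table name `items` and column `score` are hypothetical.]

```sql
-- Delete the 10 rows with the worst (lowest) scores without the
-- proposed syntax, using the ctid subquery workaround.
DELETE FROM items
WHERE ctid IN (
    SELECT ctid FROM items
    ORDER BY score ASC
    LIMIT 10
);
```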
--
Yugo NAGATA

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 297b6ee715..c596bd80ea 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3241,6 +3241,11 @@ _copyDeleteStmt(const DeleteStmt *from)
 	COPY_NODE_FIELD(returningList);
 	COPY_NODE_FIELD(withClause);
 
+	COPY_NODE_FIELD(sortClause);
+	COPY_NODE_FIELD(limitOffset);
+	COPY_NODE_FIELD(limitCount);
+	COPY_SCALAR_FIELD(limitOption);
+
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index f537d3eb96..84f509b57b 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1050,6 +1050,11 @@ _equalDeleteStmt(const DeleteStmt *a, const DeleteStmt *b)
 	COMPARE_NODE_FIELD(returningList);
 	COMPARE_NODE_FIELD(withClause);
 
+	COMPARE_NODE_FIELD(sortClause);
+	COMPARE_NODE_FIELD(limitOffset);
+	COMPARE_NODE_FIELD(limitCount);
+	COMPARE_SCALAR_FIELD(limitOption);
+
 	return true;
 }
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index e276264882..8a201749bf 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3668,6 +3668,12 @@ raw_expression_tree_walker(Node *node,
 					return true;
 				if (walker(stmt->withClause, context))
 					return true;
+				if (walker(stmt->sortClause, context))
+					return true;
+				if (walker(stmt->limitOffset, context))
+					return true;
+				if (walker(stmt->limitCount, context))
+					return true;
 			}
 			break;
 		case T_UpdateStmt:
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 146ee8dd1e..015e879f7a 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -471,6 +471,12 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
 	qual = transformWhereClause(pstate, stmt->whereClause,
 								EXPR_KIND_WHERE, "WHERE");
 
+	qry->sortClause = transformSortClause(pstate,
+										  stmt->sortClause,
+										  &qry->targetList,
+										  EXPR_KIND_ORDER_BY,
+										  false /* allow SQL92 rules */ );
+
 	qry->returningList = transformReturningList(pstate, stmt->returningList);
 
 	/* done building the range table and jointree */
@@ -482,6 +488,15 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
 	qry->hasTargetSRFs = pstate->p_hasTargetSRFs;
 	qry->hasAggs = pstate->p_hasAggs;
 
+	/* transform LIMIT */
+	qry->limitOffset = transformLimitClause(pstate, stmt->limitOffset,
+											EXPR_KIND_OFFSET, "OFFSET",
+											stmt->limitOption);
+	qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
+											EXPR_KIND_LIMIT, "LIMIT",
+											stmt->limitOption);
+	qry->limitOption = stmt->limitOption;
+
 	assign_query_collations(pstate, qry);
 
 	/* this must be done after collations, for reliable comparison of exprs */
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 86ce33bd97..0c6d11c23c 100644
--- a/src/backend/parse