Re: [HACKERS] UPDATE using sub selects
Hi, Hmm. That sounds like it would be a horrid mess. You need to decouple the execution of the subplan from the use of its outputs, apparently. There is some precedent for this in the way that InitPlans are handled: the result of the subplan is stored into a ParamList array entry that's later referenced by a Param node in the parent's expression tree. That would generalize easily enough to setting more than one Param, but I'm not clear on where you'd want to stick the subplan itself in the plan tree, nor on what controls how often it needs to get evaluated. Ended up using something similar to the above suggestion. I have posted the patch to -patches based on this. An important concern was where to stick the evaluation of the subqueries so that they end up becoming subplans which are used in the execution. For this I have added a new field in the Query structure. This entry gets preprocessed similar to other fields of the Query from within subquery_planner. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
Hi, No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow multiple output columns, or invent a ROW_SUBLINK SubLinkType that is just like EXPR_SUBLINK except for allowing multiple output columns. The latter would probably be less likely to break other things... Yeah, was looking at EXPR_SUBLINK and its single column use case and drove to the same conclusion that inventing a new sublink type would be better too. It is indeed becoming a not so simple and narrow fix as you had mentioned earlier in your first response :) I have invented a ROWEXPR_SUBLINK type that handles multiple output columns. The trouble is that since eventually columns of the parents have to be part of the query's targetList, I am sending the entire subquery as one of the entries in that list and the targetList gets populated with entries dependent on the subquery much later via make_subplan. This breaks code in rewriteTargetList (which expects every list entry to be of type TargetEntry), and expand_targetlist (which expects the targets to be present in attrno order, the entries added because of the subquery will not be in order as compared to normal SET colname = expr targets). Is there a simpler way of doing things? Should I try generating a resjunk TargetEntry in transformUpdateStmt and have its expr point to the subquery and see if that works? Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
NikhilS [EMAIL PROTECTED] writes: I have invented a ROWEXPR_SUBLINK type that handles multiple output columns. The trouble is that since eventually columns of the parents have to be part of the query's targetList, I am sending the entire subquery as one of the entries in that list and the targetList gets populated with entries dependent on the subquery much later via make_subplan. This breaks code in rewriteTargetList (which expects every list entry to be of type TargetEntry), and expand_targetlist (which expects the targets to be present in attrno order, the entries added because of the subquery will not be in order as compared to normal SET colname = expr targets). Hmm. That sounds like it would be a horrid mess. You need to decouple the execution of the subplan from the use of its outputs, apparently. There is some precedent for this in the way that InitPlans are handled: the result of the subplan is stored into a ParamList array entry that's later referenced by a Param node in the parent's expression tree. That would generalize easily enough to setting more than one Param, but I'm not clear on where you'd want to stick the subplan itself in the plan tree, nor on what controls how often it needs to get evaluated. An alternative approach is to put the subplan into the rangetable and use Vars to reference its outputs. Again it's not quite clear what drives re-execution of the subplan. It strikes me though that an approach like this might also serve for SQL2003's LATERAL construct, which'd be a nice thing to support. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] UPDATE using sub selects
Hi, On 3/31/07, Tom Lane [EMAIL PROTECTED] wrote: NikhilS [EMAIL PROTECTED] writes: I have invented a ROWEXPR_SUBLINK type that handles multiple output columns. The trouble is that since eventually columns of the parents have to be part of the query's targetList, I am sending the entire subquery as one of the entries in that list and the targetList gets populated with entries dependent on the subquery much later via make_subplan. This breaks code in rewriteTargetList (which expects every list entry to be of type TargetEntry), and expand_targetlist (which expects the targets to be present in attrno order, the entries added because of the subquery will not be in order as compared to normal SET colname = expr targets). Hmm. That sounds like it would be a horrid mess. You need to decouple the execution of the subplan from the use of its outputs, apparently. There is some precedent for this in the way that InitPlans are handled: the result of the subplan is stored into a ParamList array entry that's later referenced by a Param node in the parent's expression tree. That would generalize easily enough to setting more than one Param, but I'm not clear on where you'd want to stick the subplan itself in the plan tree, nor on what controls how often it needs to get evaluated. Yes, I have tried this already. As you suspect, it seems that the subplan does not get evaluated if its not part of the targetList at all. An alternative approach is to put the subplan into the rangetable and use Vars to reference its outputs. Again it's not quite clear what drives re-execution of the subplan. It strikes me though that an approach like this might also serve for SQL2003's LATERAL construct, which'd be a nice thing to support. Ok, I will try this out. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
Hi, What's the expected result if the tuple from subselect is more than 1? Error, per SQL99 section 7.14: 1) If the cardinality of a row subquery is greater than 1 (one), then an exception condition is raised: cardinality violation. I expect no update at all in case of void result set, is this the case ? No, you get nulls; it's a subquery not a join. Per SQL99 7.1: c) If the row value constructor is a row subquery, then: i) Let R be the result of the row subquery and let D be the degree of R. ii) If the cardinality of R is 0 (zero), then the result of the row value constructor is D null values. iii) If the cardinality of R is 1 (one), then the result of the row value constructor is R. regards, tom lane To allow both of the above to hold, I think the subselect will have to be treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
NikhilS [EMAIL PROTECTED] writes: To allow both of the above to hold, I think the subselect will have to be treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow multiple output columns, or invent a ROW_SUBLINK SubLinkType that is just like EXPR_SUBLINK except for allowing multiple output columns. The latter would probably be less likely to break other things... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] UPDATE using sub selects
Hi, On 3/16/07, Tom Lane [EMAIL PROTECTED] wrote: NikhilS [EMAIL PROTECTED] writes: To allow both of the above to hold, I think the subselect will have to be treated like a EXPR_SUBLINK subquery. I was wondering if we have a similar mechanism for plain selects/subselects to check and restrict their output to a single row. No. Offhand I think you'd either need to relax EXPR_SUBLINK to allow multiple output columns, or invent a ROW_SUBLINK SubLinkType that is just like EXPR_SUBLINK except for allowing multiple output columns. The latter would probably be less likely to break other things... Yeah, was looking at EXPR_SUBLINK and its single column use case and drove to the same conclusion that inventing a new sublink type would be better too. It is indeed becoming a not so simple and narrow fix as you had mentioned earlier in your first response :) Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 NikhilS wrote: I have tried some update-subselect variations and they seem to work. For example the case in the src/test/regress/sql/update.sql, which used to fail till now, seems to work: UPDATE update_test SET (a,b) = (select a,b FROM update_test where c = 'foo') WHERE a = 10; What's the expected result if the tuple from subselect is more than 1? I expect no update at all in case of void result set, is this the case ? Regards Gaetano Mendola -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF+Vx47UpzwH2SGd4RAvyVAKCGK5pC9B/lmrNjAFPGXhm5ialwSwCglM2n DxrxWyvJASX5WSF9B8cAMas= =AoVF -END PGP SIGNATURE- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] UPDATE using sub selects
Gaetano Mendola [EMAIL PROTECTED] writes: NikhilS wrote: I have tried some update-subselect variations and they seem to work. For example the case in the src/test/regress/sql/update.sql, which used to fail till now, seems to work: UPDATE update_test SET (a,b) = (select a,b FROM update_test where c = 'foo') WHERE a = 10; What's the expected result if the tuple from subselect is more than 1? Error, per SQL99 section 7.14: 1) If the cardinality of a row subquery is greater than 1 (one), then an exception condition is raised: cardinality violation. I expect no update at all in case of void result set, is this the case ? No, you get nulls; it's a subquery not a join. Per SQL99 7.1: c) If the row value constructor is a row subquery, then: i) Let R be the result of the row subquery and let D be the degree of R. ii) If the cardinality of R is 0 (zero), then the result of the row value constructor is D null values. iii) If the cardinality of R is 1 (one), then the result of the row value constructor is R. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] UPDATE using sub selects
Hi, I have coded up a patch which solves the following TODO. I will submit a patch for this soon: - UPDATE - Allow UPDATE tab SET ROW (col, ...) = (SELECT...) http://archives.postgresql.org/pgsql-hackers/2006-07/msg01306.php The question is that since this enhances the UPDATE syntax, what changes and where all they need to be made with respect to the documentation? Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] UPDATE using sub selects
NikhilS [EMAIL PROTECTED] writes: I have coded up a patch which solves the following TODO. I will submit a patch for this soon: http://archives.postgresql.org/pgsql-hackers/2006-07/msg01306.php Cool... The question is that since this enhances the UPDATE syntax, what changes and where all they need to be made with respect to the documentation? Documentation is the very least of your worries. What exactly is your implementation plan? If this were a simple or narrow fix it would have been done already. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] UPDATE using sub selects
Hi, The question is that since this enhances the UPDATE syntax, what changes and where all they need to be made with respect to the documentation? Documentation is the very least of your worries. What exactly is your implementation plan? If this were a simple or narrow fix it would have been done already. The implementation that I have planned is pretty similar to the way INSERT INTO ... SELECT has been implemented. Along with the grammar changes in gram.y, the changes are localized in the transformUpdateStmt code path. The SELECT clause ends up becoming a subquery to the update query with the target column expressions transformed properly to include the subquery expressions. Does this sound ok? I have tried some update-subselect variations and they seem to work. For example the case in the src/test/regress/sql/update.sql, which used to fail till now, seems to work: UPDATE update_test SET (a,b) = (select a,b FROM update_test where c = 'foo') WHERE a = 10; Will try testing out some other variations too. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com