Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
A quick recap: I submitted a patch for empty ARRAY[] syntax back in
November, and as far as I can see it never made it to the patches list.

Gregory suggested a different way of approaching the problem (quoted
below), but nobody commented further about how it might be made to work.

I'd like to ask for comments again on Gregory's idea, and if that doesn't
bear any fruit I'd like to submit the patch as-is for review.

Regards,
BJ

On 01/12/2007, Brendan Jurd <[EMAIL PROTECTED]> wrote:
> On Nov 30, 2007 9:09 PM, Gregory Stark <[EMAIL PROTECTED]> wrote:
> > I'm sorry to suggest anything at this point, but... would it be less
> > invasive if instead of requiring the immediate cast you created a
> > special case in the array code to allow a placeholder object for
> > "empty array of unknown type". The only operation which would be
> > allowed on it would be to cast it to some specific array type.
> >
> > That way things like
> >
> >     UPDATE foo SET col = array[];
> >     INSERT INTO foo (col) VALUES (array[]);
> >
> > could be allowed if they could be contrived to introduce an
> > assignment cast.
>
> Not sure it would be less invasive, but I do like the outcome of being
> able to create an empty array pending assignment. In addition to your
> examples, it might also make it possible to do things like this in
> plpgsql
>
>     DECLARE
>         a text[] := array[];
>
> Whereas my patch requires you to write
>
>         a text[] := array[]::text[];
>
> ... which seems pretty stupid.
> ...
> Any suggestions about how you would enforce the "only allow casts to
> array types" restriction on the empty array?
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 30, 2007 9:09 PM, Gregory Stark <[EMAIL PROTECTED]> wrote:
> I'm sorry to suggest anything at this point, but... would it be less
> invasive if instead of requiring the immediate cast you created a
> special case in the array code to allow a placeholder object for
> "empty array of unknown type". The only operation which would be
> allowed on it would be to cast it to some specific array type.
>
> That way things like
>
>     UPDATE foo SET col = array[];
>     INSERT INTO foo (col) VALUES (array[]);
>
> could be allowed if they could be contrived to introduce an
> assignment cast.

Hi Gregory.

Not sure it would be less invasive, but I do like the outcome of being
able to create an empty array pending assignment. In addition to your
examples, it might also make it possible to do things like this in
plpgsql

    DECLARE
        a text[] := array[];

Whereas my patch requires you to write

        a text[] := array[]::text[];

... which seems pretty stupid.

So, I like your idea a lot from a usability point of view. But I
really, really hate it from a "just spent half a week on this patch"
point of view =/

Any suggestions about how you would enforce the "only allow casts to
array types" restriction on the empty array?

Cheers
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> The patch is very invasive (at least compared to any of my previous
> patches), but so far I haven't managed to find any broken behaviour.

I'm sorry to suggest anything at this point, but... would it be less
invasive if instead of requiring the immediate cast you created a
special case in the array code to allow a placeholder object for
"empty array of unknown type". The only operation which would be
allowed on it would be to cast it to some specific array type.

That way things like

    UPDATE foo SET col = array[];
    INSERT INTO foo (col) VALUES (array[]);

could be allowed if they could be contrived to introduce an assignment
cast.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
As discussed on -hackers, this patch allows the construction of an empty
array if an explicit cast to an array type is given (as in,
ARRAY[]::int[]).

postgres=# select array[]::int[];
 array
-------
 {}

postgres=# select array[];
ERROR:  no target type for empty array
HINT:  Empty arrays must be explicitly cast to the desired array type, e.g. ARRAY[]::int[]

A few notes on the implementation:

 * The syntax now allows an ARRAY constructor with an empty expression
   list (array_expr_list may be empty).

 * I've added a new parsenode for arrays, A_ArrayExpr (previously the
   parser would create ArrayExpr primnodes).

 * transformArrayExpr() now takes two extra arguments, a type oid and a
   typmod. When transforming a typecast which casts an A_ArrayExpr to an
   array type, transformExpr passes these type details down to
   transformArrayExpr, and skips the typecast.

 * transformArrayExpr() behaves slightly differently when passed type
   information. The overall type of the array is set to the given type,
   and all elements are explicitly coerced to the equivalent element
   type. If it was not passed a type, then the behaviour is as before:
   the function looks for a common type among the elements, and coerces
   them to that type. The overall type of the array is derived from the
   common element type.

The patch is very invasive (at least compared to any of my previous
patches), but so far I haven't managed to find any broken behaviour.
All regression tests pass, and the regression tests for arrays seem to
be quite comprehensive. I did add a couple of new tests for the empty
array behaviours, but the rest I've left alone.

I look forward to your comments -- although given the length of the 8.4
patch review queue, that will probably be an exercise in extreme
patience!

Major thanks go out to Tom for all his guidance on -hackers while I
developed the patch.
Regards,
BJ

*** ./doc/src/sgml/syntax.sgml.orig	Fri Nov 30 19:31:29 2007
--- ./doc/src/sgml/syntax.sgml	Fri Nov 30 19:32:11 2007
***************
*** 1497,1503 ****
      array value from values for its member elements.  A simple array
      constructor
      consists of the key word ARRAY, a left square bracket
!     [, one or more expressions (separated by commas) for the
      array element values, and finally a right square bracket ].
      For example:
--- 1497,1503 ----
      array value from values for its member elements.  A simple array
      constructor
      consists of the key word ARRAY, a left square bracket
!     [, a list of expressions (separated by commas) for the
      array element values, and finally a right square bracket ].
      For example:
***************
*** 1507,1515 ****
   {1,2,7}
  (1 row)

!     The array element type is the common type of the member expressions,
!     determined using the same rules as for UNION or
!     CASE constructs (see ).
--- 1507,1516 ----
   {1,2,7}
  (1 row)

!     If the array is not explicitly cast to a particular type, the array element
!     type is the common type of the member expressions, determined using the
!     same rules as for UNION or CASE constructs (see
!     ).
***************
*** 1554,1559 ****
--- 1555,1573 ----
+    You can construct an empty array, but since it's impossible to have an array
+    with no type, you must explicitly cast your empty array to the desired type.  For example:
+
+ SELECT ARRAY[]::int[];
+  int4
+ --
+  {}
+ (1 row)
+
+    For more on casting, see .
+
+
     It is also possible to construct an array from the results of a
     subquery.  In this form, the array constructor is written with the
     key word ARRAY followed by a parenthesized (not
*** ./src/backend/nodes/copyfuncs.c.orig	Fri Nov 30 19:29:16 2007
--- ./src/backend/nodes/copyfuncs.c	Fri Nov 30 19:32:11 2007
***************
*** 1704,1709 ****
--- 1704,1719 ----
  	return newnode;
  }

+ static A_ArrayExpr *
+ _copyA_ArrayExpr(A_ArrayExpr *from)
+ {
+ 	A_ArrayExpr *newnode = makeNode(A_ArrayExpr);
+
+ 	COPY_NODE_FIELD(elements);
+
+ 	return newnode;
+ }
+
  static ResTarget *
  _copyResTarget(ResTarget *from)
  {
***************
*** 3538,3543 ****
--- 3548,3556 ----
  		case T_A_Indirection:
  			retval = _copyA_Indirection(from);
  			break;
+ 		case T_A_ArrayExpr:
+ 			retval = _copyA_ArrayExpr(from);
+ 			break;
  		case T_ResTarget:
  			retval = _copyResTarget(from);
  			break;
*** ./src/backend/nodes/outfuncs.c.orig	Fri Nov 30 19:29:16 2007
--- ./src/backend/nodes/outfuncs.c	Fri Nov 30 19:32:11 2007
***************
*** 1978,1983 ****
--- 1978,1991 ----
  }

  static void
+ _outA_ArrayExpr(StringInfo str, A_ArrayExpr *node)
+ {
+ 	WRITE_NODE_TYPE("A_ARRAYEXPR");
+
+ 	WRITE_NODE_FIELD(elements);
+ }
+
+ static void
  _outRes
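
The type-resolution rule described in the implementation notes of this message can be sketched in standalone C. This is a toy model under stated assumptions, not PostgreSQL source: ToyType and resolve_elem_type() are hypothetical names, and "take the widest element type" is only a stand-in for the real common-type search.

```c
#include <stddef.h>

/* Toy element types (hypothetical); a larger value stands for a "wider" type. */
typedef enum { TY_NONE = 0, TY_INT, TY_FLOAT } ToyType;

/*
 * Resolve the element type of an array constructor.
 *
 * elems, nelems:  the element types as written in ARRAY[...]
 * cast_elem_type: element type taken from an enclosing cast,
 *                 or TY_NONE when the constructor is not cast
 *
 * Returns the resolved element type. TY_NONE means the type cannot be
 * determined, which corresponds to the "no target type for empty array"
 * error in the message above.
 */
static ToyType
resolve_elem_type(const ToyType *elems, size_t nelems, ToyType cast_elem_type)
{
    ToyType common = TY_NONE;
    size_t  i;

    /* A cast decides the type outright; no common-type search is needed. */
    if (cast_elem_type != TY_NONE)
        return cast_elem_type;

    /* No cast: fall back to a (toy) common type among the elements. */
    for (i = 0; i < nelems; i++)
        if (elems[i] > common)
            common = elems[i];

    return common;
}
```

With no elements and no cast the function returns TY_NONE, which is the case the patch rejects; with a cast, the element list never influences the result.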
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
>> 1) How should we determine whether the array is multidimensional if we
>> know the type in advance?

> Well, given the array should be regular you should be able to just look
> at the first element; if it's an array, look at its first element, etc.,
> to determine the dimensions. This'll be fairly quick.

How does that work with non-constant array constructor members?

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Fri, Nov 30, 2007 at 06:13:20AM +1100, Brendan Jurd wrote:
> Hi folks,
>
> The patch is coming along nicely now. I do have a couple of questions
> about the implementation in transformArrayExpr though.

Awesome.

> 1) How should we determine whether the array is multidimensional if we
> know the type in advance?

Well, given the array should be regular you should be able to just look
at the first element; if it's an array, look at its first element, etc.,
to determine the dimensions. This'll be fairly quick.

> 2) Should the typecast propagate downwards into nested array elements?

IMHO yes, you have the info, you may as well use it.

> If we have a nested array written as, say, ARRAY[ARRAY[1, 2], ARRAY[3,
> 4], ARRAY[5, 6]]::float[], should we treat the inner arrays the same
> way as the outer array (with the advance knowledge that the array type
> should be float[])?

TBH, I think you're going to have to go through the whole array to
coerce them and check, so you may as well determine the dimensions at
the same time.

In general I think it's better to mark the type up front. I don't know
if you should actually do the conversion straight away, but at least you
don't need to guess the type anymore.

Hope this helps,

Have a nice day,
--
Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent
> revolution inevitable.
>  -- John F Kennedy
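
The "follow the first element" idea in this message, plus the full walk it says is needed anyway, can be sketched in standalone C. This is a toy model under stated assumptions: ToyNode, count_dims() and is_regular() are hypothetical names, only nesting depth is checked (not sub-array lengths), and it covers only constructors whose members are literally written out, so it does not address members that are arbitrary expressions.

```c
#include <stddef.h>

/* Toy parse node (hypothetical): either a scalar or a nested ARRAY[...]. */
typedef struct ToyNode
{
    int                    is_array;  /* nonzero for ARRAY[...] */
    size_t                 nelems;    /* number of members, if an array */
    const struct ToyNode **elems;     /* the members, if an array */
} ToyNode;

/* Count dimensions by descending through the first member at each level. */
static int
count_dims(const ToyNode *node)
{
    int ndims = 0;

    while (node != NULL && node->is_array)
    {
        ndims++;
        node = (node->nelems > 0) ? node->elems[0] : NULL;
    }
    return ndims;
}

/*
 * Check that every member sits at the nesting depth implied by the first
 * member, so a mixture of scalars and arrays at one level is rejected.
 * Returns 1 if consistent, 0 otherwise.
 */
static int
is_regular(const ToyNode *node, int expected_dims)
{
    size_t i;

    if (!node->is_array)
        return expected_dims == 0;
    if (expected_dims == 0)
        return 0;

    for (i = 0; i < node->nelems; i++)
        if (!is_regular(node->elems[i], expected_dims - 1))
            return 0;
    return 1;
}
```

Calling is_regular(node, count_dims(node)) gives the kind of sanity check that the common-type search otherwise provides for mixed scalar and array members.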
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Hi folks,

The patch is coming along nicely now. I do have a couple of questions
about the implementation in transformArrayExpr though.

1) How should we determine whether the array is multidimensional if we
know the type in advance?

Currently, transformArrayExpr uses the results of its search for a
common element type to figure out whether the array is multidimensional.
If we know the type in advance, we don't need to do the common type
search (a nice side-effect), so we need some other way of figuring out
how to set ArrayExpr->multidims on the new node.

I could just check the nodeTag of the elements as they are transformed,
but I'm concerned that the existing code might be relying on
select_common_type to catch stupid input, like a mixture of scalar and
array elements. If that's the case it might be unwise to bypass
select_common_type or, at least, I'd need to come up with something else
to provide the same level of sanity assurance in both code paths.

2) Should the typecast propagate downwards into nested array elements?

If we have a nested array written as, say, ARRAY[ARRAY[1, 2], ARRAY[3,
4], ARRAY[5, 6]]::float[], should we treat the inner arrays the same way
as the outer array (with the advance knowledge that the array type
should be float[])?

If I'm reading the code correctly, the end result should be much the
same, because the inner arrays will end up being coerced to float[]
anyway. But shortcutting the coercion could save some cycles.

Comments?

Regards,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Brendan Jurd wrote:

> If the only reason for keeping A_Const->typename around is the alleged
> code saving (as indicated by the code comments), my offer to do away
> with it is still on the table.

Code cleanup is always welcome.

--
Alvaro Herrera                 Developer, http://www.PostgreSQL.org/
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 28, 2007 9:49 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> > I had a bit of a dig into this. A_Const->typename gets set directly
> > by the parse paths for "INTERVAL [(int)] string [interval range]". In
> > fact, as far as I can tell that's the _only_ place A_Const->typename
> > gets used at all.
>
> Uh, you missed quite a lot of others ... see CURRENT_DATE and a lot of
> other productions.
>

Thanks again. I missed those because they don't use makeStringConst().

Looking again, it turns out "many productions" is more like 15. That's
a bigger number, certainly, but it's still manageable. It wouldn't be
hard to convert them to generate a const-in-a-cast. In fact with the
addition of a makeCastStringConst(), I think the code saving from
A_Const->typename would be cancelled out.

If the only reason for keeping A_Const->typename around is the alleged
code saving (as indicated by the code comments), my offer to do away
with it is still on the table.

Regards,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> I actually thought that A_ArrayExpr would be a good addition even if
> you ignore the matter of typecasting. It always seemed weird to me
> that the parser generates an ArrayExpr directly. ArrayExpr has a
> bunch of members that are only set by the transform; all the parser
> does is set the 'elements' member.

Well, that's a reasonable argument. And now that I think about it,
a parser-only node type doesn't have nearly the support overhead that
a full-fledged executable node does. So no objection to A_ArrayExpr
if you want to do that.

> I had a bit of a dig into this. A_Const->typename gets set directly
> by the parse paths for "INTERVAL [(int)] string [interval range]". In
> fact, as far as I can tell that's the _only_ place A_Const->typename
> gets used at all.

Uh, you missed quite a lot of others ... see CURRENT_DATE and a lot of
other productions.

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 28, 2007 4:19 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
> > Now I'm thinking I leave the grammar rules alone (apart from making it
> > legal to specify an empty list of elements), and instead push the
> > typename down into the child node from makeTypeCast(), if the child is
> > an A_ArrayExpr. Does that work better?
>
> Actually, if you do that you might as well forego the separate node type
> (which requires a nontrivial amount of infrastructure). I think it
> would work just about as well to have transformExpr check whether the
> argument of a TypeCast is an ArrayExpr, and if so call
> transformArrayExpr directly from there, passing the TypeName as an
> additional argument.

I actually thought that A_ArrayExpr would be a good addition even if
you ignore the matter of typecasting. It always seemed weird to me
that the parser generates an ArrayExpr directly. ArrayExpr has a
bunch of members that are only set by the transform; all the parser
does is set the 'elements' member. And then the transform creates a
brand new ArrayExpr and populates it based on what's in the 'elements'
member of the otherwise-empty ArrayExpr passed to it.

So my feeling is that an A_ArrayExpr is a better fit for the parser
output than ArrayExpr, and more in keeping with how the rest of the
code does things.

Mind you I'm also okay with your suggestion to let transformExpr take
care of it. But I'm not averse to putting in the legwork to set up the
infrastructure for A_ArrayExpr, if it's a nice outcome.

> Kinda ugly, but not really any worse than the way
> A_Const is handled in that same routine. (In fact, we could use the
> same technique to get rid of the typename field in A_Const ... might
> be worth doing?)

I had a bit of a dig into this. A_Const->typename gets set directly
by the parse paths for "INTERVAL [(int)] string [interval range]". In
fact, as far as I can tell that's the _only_ place A_Const->typename
gets used at all.

And all the transform does with that piece of information is treat the
node like a typecast. I'm not seeing a huge amount of value in this
special treatment. Why not just have the parser build this as an
A_Const inside a TypeCast and then let the transform deal with it in
the usual way?

I found the following comment at parsenodes.h:244

 * NOTE: for mostly historical reasons, A_Const parsenodes contain
 * room for a TypeName; we only generate a separate TypeCast node if the
 * argument to be casted is not a constant.  In theory either representation
 * would work, but the combined representation saves a bit of code in many
 * productions in gram.y.

However, this is no longer the case. makeTypeCast() doesn't care about
whether its argument is a constant anymore:

 * Earlier we would determine whether an A_Const would
 * be acceptable, however Domains require coerce_type()
 * to process them -- applying constraints as required.

And in "many productions in gram.y", "many" == 2. Currently the
combined representation requires more code than it saves.

So, I get the impression the use-case for A_Const->typename has become
extinct. I think it could be removed with a minimum of fuss, and I'd
be happy to include same with my patch (or, submit it as a separate
patch; let me know your preference).

Regards,
BJ
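
Tom's suggestion quoted in this message (have transformExpr notice a TypeCast whose argument is an array constructor and hand the target type straight to transformArrayExpr) can be sketched in standalone C. This is a toy model under stated assumptions, not PostgreSQL source: ToyExpr, transform_expr() and transform_array() are hypothetical names, types are plain strings, and "int4[]" is only a placeholder for the result of the common-type search.

```c
#include <stddef.h>

/* Toy parse-tree node kinds (hypothetical). */
typedef enum { N_CONST, N_ARRAY, N_TYPECAST } ToyTag;

typedef struct ToyExpr
{
    ToyTag                tag;
    const char           *type_name; /* N_CONST: own type; N_TYPECAST: target */
    const struct ToyExpr *arg;       /* N_TYPECAST: the expression being cast */
    size_t                nelems;    /* N_ARRAY: number of members */
} ToyExpr;

/*
 * Toy stand-in for transformArrayExpr: returns the array's type name,
 * or NULL when the type cannot be determined.
 */
static const char *
transform_array(const ToyExpr *arr, const char *target_type)
{
    if (target_type != NULL)
        return target_type;     /* type supplied by the enclosing cast */
    if (arr->nelems == 0)
        return NULL;            /* empty array with no target type: error */
    return "int4[]";            /* placeholder for the common-type search */
}

/* Toy stand-in for transformExpr: returns the expression's type name. */
static const char *
transform_expr(const ToyExpr *expr)
{
    switch (expr->tag)
    {
        case N_TYPECAST:
            /* Special case: pass the cast target down to the array. */
            if (expr->arg->tag == N_ARRAY)
                return transform_array(expr->arg, expr->type_name);
            /* Ordinary cast: transform the argument, then coerce. */
            if (transform_expr(expr->arg) == NULL)
                return NULL;
            return expr->type_name;
        case N_ARRAY:
            return transform_array(expr, NULL);
        case N_CONST:
            return expr->type_name;
    }
    return NULL;
}
```

Because the check lives in the transform step rather than in the grammar, any syntax that produces a cast node over an array constructor takes the same path.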
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> Now I'm thinking I leave the grammar rules alone (apart from making it
> legal to specify an empty list of elements), and instead push the
> typename down into the child node from makeTypeCast(), if the child is
> an A_ArrayExpr. Does that work better?

Actually, if you do that you might as well forego the separate node type
(which requires a nontrivial amount of infrastructure). I think it
would work just about as well to have transformExpr check whether the
argument of a TypeCast is an ArrayExpr, and if so call
transformArrayExpr directly from there, passing the TypeName as an
additional argument. Kinda ugly, but not really any worse than the way
A_Const is handled in that same routine. (In fact, we could use the
same technique to get rid of the typename field in A_Const ... might
be worth doing?)

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 28, 2007 2:56 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> > I wonder whether we are also interested in catching CAST(), e.g.:
>
> > CAST(ARRAY[] AS text[])
>
> I think you'll find that it's just about impossible to not handle both,
> because they look the same after the grammar gets done.

Thanks Tom ... your comment makes me suspect I've been barking up the
wrong tree.

My original intent was to modify the grammar rules to catch an array
expression followed by a typecast, and put the target typename of the
cast directly into the A_ArrayExpr struct.

That notion came from looking at the way that TypeName gets put into
A_Const -- makeStringConst() takes an optional TypeName argument.

Looking at the code in the context of your comment, that was probably
a bad approach. I may've taken the A_Const analogy too far.

Now I'm thinking I leave the grammar rules alone (apart from making it
legal to specify an empty list of elements), and instead push the
typename down into the child node from makeTypeCast(), if the child is
an A_ArrayExpr. Does that work better?

Regards,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> So far I've only considered the '::' cast syntax suggested in the
> original proposal, e.g.:
>     ARRAY[]::text[]

> I wonder whether we are also interested in catching CAST(), e.g.:
>     CAST(ARRAY[] AS text[])

I think you'll find that it's just about impossible to not handle both,
because they look the same after the grammar gets done.

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
So far I've only considered the '::' cast syntax suggested in the
original proposal, e.g.:

    ARRAY[]::text[]

I wonder whether we are also interested in catching CAST(), e.g.:

    CAST(ARRAY[] AS text[])

I'm personally okay with leaving it at support for '::', but admittedly
I am heavily biased towards this syntax (I find CAST very ugly). I
suppose supporting CAST as well would be the more predictable behaviour;
I think people might be surprised if we supported one form of casting
but not the other.

Comments?

Regards,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr
> a good name for the parse-time structure?

Yeah, might as well use that for consistency. The A_ doesn't seem very
meaningful to me either, but I don't want to rename the existing
examples ...

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 27, 2007 8:04 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
> > ... So
> > unfortunately I can't just add a TypeName member to ArrayExpr.
>
> That would be quite the wrong thing to do anyway, since ArrayExpr is
> a run-time representation and shouldn't have any such thing attached
> to it. What you probably need is a separate parse-time representation
> of ARRAY[], a la the difference between A_Const and Const.
>

Ah. I wasn't aware of the distinction; I started by looking in gram.y
and saw that the ARRAY parse path creates an ArrayExpr node, whilst the
constant parse paths create A_Const nodes. I didn't realise that
ArrayExpr was "skipping ahead" and creating the same kind of object
that the transform produces.

Glad I stopped and asked for directions then. =)

I'm not 100% clear on what the A_ prefix signifies ... is A_ArrayExpr
a good name for the parse-time structure?

Thanks for your time,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> This approach is making sense to me, but I've run into a bit of a
> dependency issue. A_Const does indeed have a slot for typecasts by
> way of a TypeName member. A_Const and TypeName are both defined in
> parsenodes.h, whereas ArrayExpr is defined in primnodes.h. So
> unfortunately I can't just add a TypeName member to ArrayExpr.

That would be quite the wrong thing to do anyway, since ArrayExpr is
a run-time representation and shouldn't have any such thing attached
to it. What you probably need is a separate parse-time representation
of ARRAY[], a la the difference between A_Const and Const.

Another possibility is to just hack up a private communication path
between transformExpr and transformArrayExpr, ie when you see TypeCast
check to see if its argument is ArrayExpr and do something different.
This would be a mite klugy but it'd be a much smaller patch that way.

			regards, tom lane
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
Quoting Tom, from the previous thread linked by Martijn:

> It could be pretty ugly, because type assignment normally proceeds
> bottom-up :-(. What you might have to do is make the raw grammar
> representation of ARRAY[] work like A_Const does, ie, there's a
> slot to plug in a typecast. That's pretty much vestigial now for
> A_Const, if memory serves, but it'd be needful if ARRAY[] has to
> be able to "see" the typecast that would otherwise be above it in
> the parse tree.

This approach is making sense to me, but I've run into a bit of a
dependency issue. A_Const does indeed have a slot for typecasts by
way of a TypeName member. A_Const and TypeName are both defined in
parsenodes.h, whereas ArrayExpr is defined in primnodes.h. So
unfortunately I can't just add a TypeName member to ArrayExpr.

I'm new to this area of the codebase (and parsers generally), so I'm
treading carefully. What would be the best way to resolve this?
Would moving TypeName into primnodes.h be acceptable?

Thanks for your time,
BJ
Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]
On Nov 26, 2007 3:58 AM, Martijn van Oosterhout <[EMAIL PROTECTED]> wrote:
> On Mon, Nov 26, 2007 at 03:51:37AM +1100, Brendan Jurd wrote:
> > I noticed in the 8.3 release notes that ARRAY(SELECT ...) now returns
> > an empty array if there are no rows returned by the subquery.
>
> This has come up before, Tom had an idea about how to fix it:
>
> http://groups.google.com/group/pgsql.general/browse_thread/thread/911791e145a17daa/6b035035aeaac399
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg90681.html

[moving thread to -hackers]

Thanks for the link Martijn.

I'd be interested in taking a swing at this if nobody else has laid
claim. Since that thread died back in January, I'm guessing it's wide
open.

Regards,
BJ