On 03-Apr-2020, at 00:05, David G. Johnston <david.g.johns...@gmail.com> wrote:

On Thu, Apr 2, 2020 at 8:46 PM Bryn Llewellyn <b...@yugabyte.com 
<mailto:b...@yugabyte.com>> wrote:
On 02-Apr-2020, at 19:25, Tom Lane <t...@sss.pgh.pa.us 
<mailto:t...@sss.pgh.pa.us>> wrote:

Bryn Llewellyn <b...@yugabyte.com <mailto:b...@yugabyte.com>> writes:
> The documentation in section “8.16.2. Constructing Composite Values” here:
> https://www.postgresql.org/docs/11/rowtypes.html#ROWTYPES-IO-SYNTAX 
> <https://www.postgresql.org/docs/11/rowtypes.html#ROWTYPES-IO-SYNTAX> 
> <https://www.postgresql.org/docs/11/rowtypes.html#ROWTYPES-IO-SYNTAX 
> <https://www.postgresql.org/docs/11/rowtypes.html#ROWTYPES-IO-SYNTAX>>

The authoritative documentation for that is at 8.16.6 "Composite Type
Input and Output Syntax", and it says quite clearly that whitespace is
not ignored (except for before and after the outer parentheses). 
[...] 

“Whitespace outside the parentheses is ignored, but within the parentheses it 
is considered part of the field value, and might or might not be significant 
depending on the input conversion rules for the field data type. For example, 
in:

'(  42)'

the whitespace will be ignored if the field type is integer, but not if it is 
text. As shown previously, when writing a composite value you can write double 
quotes around any individual field value.”

Notice the wording “double quotes around any individual field value.” The word 
“around” was the source of my confusion. For the docs to communicate what, it 
seems, they ought to, then the word should be “within”. This demonstrates my 
point:

Actually, they do mean around (as in, the double-quotes must be adjacent to the 
commas that delimit the fields, or the parens).

The next two sentences clear things up a bit:

"You must do so if the field value would otherwise confuse the composite-value 
parser.  In particular, fields containing parentheses, commas, double quotes, 
or backslashes must be double-quoted."

That said the documentation doesn't match the behavior (which is considerably 
more forgiving and also willing to simply discard double-quotes instead of 
error-ing out when the documented rules are not adhered to)

Specifically:  '(a \"b c\" d,     e \"f,g\" h)'::rt leaves the double-quote 
while '(a ""b c"" d,     e ""f,g"" h)'::rt does not.  Neither have the field 
surround with double-quotes so should be invalid per the documentation.  When 
you follow the documentation they then both retain the double-quote.

So if you follow the guidelines set forth in the documentation you get the 
result the documentation promises.  If you fail to follow the guidelines you 
may still get a result but there is no promise made as to its contents.  Not 
ideal but also not likely to be changed after all this time.

create type rt as (a text, b text);
with v as (select '(a "b c" d,     e "f,g" h)'::rt as r)
select
  '>'||(r).a||'<' as a,
  '>'||(r).b||'<' as b
from v;

This demonstrates that, in my input, the double quotes are *within* each of the 
two text values—and definitely *do not surround* them.

Yep, which is why you have an issue.   The "surround them" is indeed what the 
text meant to say.


I really would appreciate a reply to the second part of my earlier question:

“please explain the rationale for what seems to me to be a counter-intuitive 
syntax design choice.”
[...]
They chose the rule that a string value *must* be surrounded by double quotes 
and that other values must *not* be so surrounded. The JSON model resonates 
with my intuition.

This point wasn't answered because there is no good answer to be given.  The 
above is how someone in the mid-90s or so decided PostgreSQL should handle 
this.  I'll personally agree that more verbose but explicit, and less 
forgiving, syntax and parsing, would have been a better choice.  But the choice 
has been made and isn't subject to change.

But regardless of what drove the original design choice if you really care 
about it in a "want to learn" mode then it is very easy to observe the defined 
behavior and critique it independently of how it came to be.  If all you want 
to do is make a left-handed jab at PostgreSQL for having a non-intuitive to you 
design choice do act surprised when people don't choose to respond - especially 
when none of us made the original choice.

The only thing we can do today is describe the system more clearly if we have 
failed to do so adequately.  You are probably in the best position, then, to 
learn what it does and propose new wording that someone with inexperienced (or 
biased toward a different system) eyes would understand quickly and clearly.

David J.


Thanks for the explanation, David. It helped me a lot. You said “If all you 
want to do is make a left-handed jab at PostgreSQL”. That was not at all my 
intention. Rather, my question was driven by my experience of learning things 
over the years—and by my experience of teaching others. Sometimes, at the early 
stage of learning, a new idea strikes one as counter intuitive, and one feels 
that there must, surely, have been a better way. The common answer goes along 
these lines: “No, you’re missing the big picture. Consider use case X. And use 
case Y. See how your proposed design would fail to handle these. And see how 
the adopted design does handle them.” And so the questioner is enlightened—but 
only by asking. This is a key aspect of learning. Occasionally, the answer is 
“You’re right. A different design would have been nicer. But it’s too late, 
now, to change things. However, if you follow the rules, you can handle all use 
cases.” That answer, too, is very useful. It tells the questioner that they are 
not missing any subtlety. This is *key*. Then they can simply move on and use 
the feature as it is.

Thanks for suggesting that I might propose some rewording of the PG doc. I 
believe that I can see how this might be done. I just submitted my proposal 
using this form:

https://www.postgresql.org/account/comments/new/11/rowtypes.html/ 
<https://www.postgresql.org/account/comments/new/11/rowtypes.html/>











Reply via email to