Re: [HACKERS] Unicode string literals versus the world

2009-05-28 Thread Peter Eisentraut
On Friday 29 May 2009 06:31:23 Bruce Momjian wrote: > Peter Eisentraut wrote: > > On Tuesday 05 May 2009 03:01:05 Tom Lane wrote: > > > Peter Eisentraut writes: > > > > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote: > > > >> I think we can handle that and the cases Tom presents by error

Re: [HACKERS] Unicode string literals versus the world

2009-05-28 Thread Bruce Momjian
Peter Eisentraut wrote: > On Tuesday 05 May 2009 03:01:05 Tom Lane wrote: > > Peter Eisentraut writes: > > > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote: > > >> I think we can handle that and the cases Tom presents by erroring out > > >> when the U& syntax is used with stdstr off. > >

Re: [HACKERS] Unicode string literals versus the world

2009-05-05 Thread Peter Eisentraut
On Tuesday 05 May 2009 03:01:05 Tom Lane wrote: > Peter Eisentraut writes: > > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote: > >> I think we can handle that and the cases Tom presents by erroring out > >> when the U& syntax is used with stdstr off. > > > > Proposed patch for that attac

Re: [HACKERS] Unicode string literals versus the world

2009-05-04 Thread Hiroshi Saito
Hi. quick test for great patch. ! == SCRIPT == set CLIENT_ENCODING to 'UTF-8'; DROP TABLE ucheck CASCADE; CREATE TABLE ucheck (key VARCHAR(10) PRIMARY KEY, data NCHAR(50)); set STANDARD_CONFORMING_STRINGS to on; INSERT INTO ucheck VALUES('ucheck1',u&'\68ee\9dd7\5916'); SELECT * FROM ucheck; set

Re: [HACKERS] Unicode string literals versus the world

2009-05-04 Thread Tom Lane
Peter Eisentraut writes: > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote: >> I think we can handle that and the cases Tom presents by erroring out when >> the U& syntax is used with stdstr off. > Proposed patch for that attached. I have not been able to think of any security hole in t

Re: [HACKERS] Unicode string literals versus the world

2009-05-04 Thread Peter Eisentraut
On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote: > On Tuesday 14 April 2009 17:13:00 Marko Kreen wrote: > > If the parsing does not happen in 2 passes and it does not take account > > of stdstr setting then the default breakage would be: > > > >stdstr=off, U&' \' UESCAPE '!'. > > I th

Re: [HACKERS] Unicode string literals versus the world

2009-04-17 Thread Sam Mason
On Fri, Apr 17, 2009 at 10:15:57AM -0400, Tom Lane wrote: > Sam Mason writes: > > Just noticed that the spec only supports four hex digits; > > Better read it again. You're right of course. My ability to read patches seems not to be very good. -- Sam http://samason.me.uk/ -- Sent via pgs

Re: [HACKERS] Unicode string literals versus the world

2009-04-17 Thread Tom Lane
Sam Mason writes: > Just noticed that the spec only supports four hex digits; Better read it again. regards, tom lane

Re: [HACKERS] Unicode string literals versus the world

2009-04-17 Thread Sam Mason
On Thu, Apr 16, 2009 at 12:08:37PM -0400, Tom Lane wrote: > Sam Mason writes: > > I've failed to keep up with the discussion so I'm not sure where this > > conversation has got to! Is the consensus for 8.4 to enable SQL2003 > > style U&lit escaped literals if and only if standard_conforming_strin

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Tom Lane
Sam Mason writes: > I've failed to keep up with the discussion so I'm not sure where this > conversation has got to! Is the consensus for 8.4 to enable SQL2003 > style U&lit escaped literals if and only if standard_conforming_strings > is set? That was Peter's proposal, and no one's shot a hole

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Sam Mason
On Thu, Apr 16, 2009 at 06:34:06PM +0300, Marko Kreen wrote: > Which hints that you can as well enter the pairs directly: \uxx\uxx. > If I'd be language designer, I would not see any reason to disallow it. > > And anyway, at least mono seems to support it: > > using System; > public class HelloWor

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Marko Kreen
On 4/16/09, Tom Lane wrote: > Sam Mason writes: > > I'd never heard of UTF-16 surrogate pairs before this discussion and > > hence didn't realise that it's valid to have a surrogate pair in place > > of a single code point. The docs say that corresponds to > > U+10302, Python would appear t

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Andrew Dunstan
Tom Lane wrote: Sam Mason writes: I'd never heard of UTF-16 surrogate pairs before this discussion and hence didn't realise that it's valid to have a surrogate pair in place of a single code point. The docs say that corresponds to U+10302, Python would appear to follow my intuitions in t

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Marko Kreen
On 4/16/09, Sam Mason wrote: > On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote: > > On 4/16/09, Sam Mason wrote: > > > Microsoft have also gone this way in C#, named code points are not > > > supported however. > > > > And it handles also non-BMP codepoints with \u escape similarl

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Sam Mason
On Thu, Apr 16, 2009 at 10:54:16AM -0400, Tom Lane wrote: > Sam Mason writes: > > I'd never heard of UTF-16 surrogate pairs before this discussion and > > hence didn't realise that it's valid to have a surrogate pair in place > > of a single code point. The docs say that corresponds to > > U+103

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Tom Lane
Sam Mason writes: > I'd never heard of UTF-16 surrogate pairs before this discussion and > hence didn't realise that it's valid to have a surrogate pair in place > of a single code point. The docs say that corresponds to > U+10302, Python would appear to follow my intuitions in that: > ord(u'
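For readers following the surrogate-pair point: the arithmetic that maps a UTF-16 surrogate pair to a single code point is short enough to sketch. The function name below is hypothetical and is not taken from any patch in this thread. The snippet above does not show which pair the docs used, but D800 DF02 is the only pair that encodes U+10302.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into one Unicode code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 payload bits; the result is offset by 0x10000
    # because code points below that are encoded without surrogates.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xD800, 0xDF02)
print(f"U+{code_point:04X}")  # U+10302
```

Whether a pair written as two \u escapes should be accepted in place of the single code point is exactly the design question being argued here; the sketch only shows the mapping, not a recommendation.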

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Sam Mason
On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote: > On 4/16/09, Sam Mason wrote: > > Microsoft have also gone this way in C#, named code points are not > > supported however. > > And it handles also non-BMP codepoints with \u escape similarly: > > http://en.csharp-online.net/ECMA-33

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Andrew Dunstan
Tatsuo Ishii wrote: I could live with either. Wikipedia says: "The characters outside the first plane usually have very specialized or rare use." For years we rejected all characters beyond the first plane, and while that's fixed now, the volume of complaints wasn't huge. I you mean "

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Marko Kreen
On 4/16/09, Sam Mason wrote: > On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote: > > On 4/15/09, Tom Lane wrote: > > > > Given Martijn's complaint about more-than-16-bit code points, I think > > > the \u proposal is not mature enough to go into 8.4. We can think > > > about some

Re: [HACKERS] Unicode string literals versus the world

2009-04-16 Thread Sam Mason
On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote: > On 4/15/09, Tom Lane wrote: > > Given Martijn's complaint about more-than-16-bit code points, I think > > the \u proposal is not mature enough to go into 8.4. We can think > > about some version of that later, if there's enough inte

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Tatsuo Ishii
> > >>> I still stand on my proposal, how about extending E'' strings with > > >>> unicode escapes (eg. \u)? The E'' strings are already more > > >>> clearly defined than '' and they are our "own", we don't need to > > >>> consider random standards, but can consider our sanity. > > >>>

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Tatsuo Ishii
> >>> I still stand on my proposal, how about extending E'' strings with > >>> unicode escapes (eg. \u)? The E'' strings are already more > >>> clearly defined than '' and they are our "own", we don't need to > >>> consider random standards, but can consider our sanity. > >>> > >> I sus

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Marko Kreen
On 4/15/09, Tom Lane wrote: > Marko Kreen writes: > > As both this and the doubling-\\ way would mean we should have usable > > alternative in case of stdstr=off also, so in the end we have agreed > > to accept \u also? > > Given Martijn's complaint about more-than-16-bit code points, I think >

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Tom Lane
Marko Kreen writes: > As both this and the doubling-\\ way would mean we should have usable > alternative in case of stdstr=off also, so in the end we have agreed > to accept \u also? Given Martijn's complaint about more-than-16-bit code points, I think the \u proposal is not mature enough to go

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Andrew Dunstan
Martijn van Oosterhout wrote: On Tue, Apr 14, 2009 at 08:10:54AM -0400, Andrew Dunstan wrote: Marko Kreen wrote: I still stand on my proposal, how about extending E'' strings with unicode escapes (eg. \u)? The E'' strings are already more clearly defined than '' and they are our

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Marko Kreen
On 4/15/09, Tom Lane wrote: > Marko Kreen writes: > > What's wrong with requiring U& to conform with stdstr=off quoting rules? > > The sole and only excuse for that misbegotten syntax is to be exactly > SQL spec compliant --- otherwise we might as well pick something saner. > So it needs to wor

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Tom Lane
Marko Kreen writes: > What's wrong with requiring U& to conform with stdstr=off quoting rules? The sole and only excuse for that misbegotten syntax is to be exactly SQL spec compliant --- otherwise we might as well pick something saner. So it needs to work like stdstr=on. I thought Peter's propos

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Marko Kreen
On 4/15/09, Greg Stark wrote: > On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote: > >> Wouldn't we just then say that U&'' strings are always standard- > >> conforming? > > > > That's exactly what's causing the problem --- they are, but there > > is lots of software that won't know it. > > We

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Greg Stark
On Wed, Apr 15, 2009 at 6:52 PM, Greg Stark wrote: > On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote: >>> Wouldn't we just then say that U&'' strings are always standard- >>> conforming? >> >> That's exactly what's causing the problem --- they are, but there >> is lots of software that won't know

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Greg Stark
On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote: >> Wouldn't we just then say that U&'' strings are always standard- >> conforming? > > That's exactly what's causing the problem --- they are, but there > is lots of software that won't know it. We could say U&'' escapes only work if you have stan

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Tom Lane
"David E. Wheeler" writes: > Wouldn't we just then say that U&'' strings are always standard- > conforming? That's exactly what's causing the problem --- they are, but there is lots of software that won't know it. regards, tom lane -- Sent via pgsql-hackers mailing lis

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Martijn van Oosterhout
On Tue, Apr 14, 2009 at 08:10:54AM -0400, Andrew Dunstan wrote: > Marko Kreen wrote: > >I still stand on my proposal, how about extending E'' strings with > >unicode escapes (eg. \u)? The E'' strings are already more > >clearly defined than '' and they are our "own", we don't need to > >consid

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread David E. Wheeler
On Apr 15, 2009, at 4:45 AM, Sam Mason wrote: Doh, yes it does doesn't it. Sorry I searched for a bit and failed to find anything before. Looks as though the signal to noise ratio was far too low as I've just searched again and found a (single) reference to their docs describing the feature

Re: [HACKERS] Unicode string literals versus the world

2009-04-15 Thread Sam Mason
On Tue, Apr 14, 2009 at 04:01:48PM +0300, Peter Eisentraut wrote: > On Saturday 11 April 2009 18:20:47 Sam Mason wrote: > > I can't see much support in the other database engines; searched for > > Oracle, MS-SQL, DB2 and Firebird. MySQL has it planned for 7.1, so not > > for a while. > > DB2 supp

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Robert Haas
On Tue, Apr 14, 2009 at 2:55 PM, Tom Lane wrote: > Robert Haas writes: >> Well, that's fine, but that's a long way from Peter's statement that >> "I think the tendency should be to get rid of E'' usage". > > Bear in mind that that's Peter's opinion; it's not necessarily shared > by anyone else.

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Marko Kreen
On 4/14/09, Tom Lane wrote: > Peter Eisentraut writes: > > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: > >> The other proposal that seemed > >> attractive to me was a decode-like function: > >> > >> uescape('foo\00e9bar') > >> uescape('foo\00e9bar', '\') > > > This was discussed prev

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Meredith L. Patterson
Tom Lane wrote: > This is *not* about code within Postgres. One typically provides libraries for this sort of thing, but your point is taken; suggestion withdrawn. --mlp _ Meredith L. Patterson Founder and CTO Osogato, Inc.

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Meredith L. Patterson
Tom Lane wrote: > I suspect that it's actually impossible to parse such a thing correctly > without a full-fledged flex lexer or something of equivalent complexity. > Certainly it's a couple of orders of magnitude harder than it is for > either standard-conforming or E'' literals. Is there a reaso

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
"Meredith L. Patterson" writes: > Tom Lane wrote: >> I suspect that it's actually impossible to parse such a thing correctly >> without a full-fledged flex lexer or something of equivalent complexity. > Is there a reason not to use a full-fledged flex lexer? The point is that that's a pretty lar

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 21:48:12 Tom Lane wrote: > Peter Eisentraut writes: > > I think we can handle that and the cases Tom presents by erroring out > > when the U& syntax is used with stdstr off. > > I think you're missing the point --- this is not about whether the > syntax is unambiguous (it

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Robert Haas writes: > Well, that's fine, but that's a long way from Peter's statement that > "I think the tendency should be to get rid of E'' usage". Bear in mind that that's Peter's opinion; it's not necessarily shared by anyone else. I was just responding to your assertion of the diametricall

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Peter Eisentraut writes: > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: >> The other proposal that seemed >> attractive to me was a decode-like function: >> >> uescape('foo\00e9bar') >> uescape('foo\00e9bar', '\') > This was discussed previously, but rejected with the following argument: >

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Robert Haas
On Tue, Apr 14, 2009 at 2:22 PM, Tom Lane wrote: > Robert Haas writes: >> Maybe I've just got my head deeply in the sand, but I don't understand >> what the alternative to E'' supposedly is. How am I supposed to write >> the equivalent of E'\t\n\f' without using E''? The >> standard_conforming_

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Peter Eisentraut writes: > I think we can handle that and the cases Tom presents by erroring out > when the U& syntax is used with stdstr off. I think you're missing the point --- this is not about whether the syntax is unambiguous (it is already) but about whether a frontend that doesn't underst

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 17:32:00 Tom Lane wrote: > I admit that the SQL:2008 way also covers Unicode code > points in identifiers, which we can't emulate without a lexical change; > but frankly I think the use-case for that is so thin as to be almost > nonexistent. Who is going to choose identif

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 18:54:33 Tom Lane wrote: > The other proposal that seemed > attractive to me was a decode-like function: > > uescape('foo\00e9bar') > uescape('foo\00e9bar', '\') This was discussed previously, but rejected with the following argument: There are some other

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 21:22:29 Tom Lane wrote: > BTW, does anyone know whether Unicode includes the ASCII control > characters ... ie, is \u0009 a name for tab? If so, maybe this > syntax is in part an attempt to cover that use-case in the standard. Yes on both.

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 17:13:00 Marko Kreen wrote: > If the parsing does not happen in 2 passes and it does not take account > of stdstr setting then the default breakage would be: > >stdstr=off, U&' \' UESCAPE '!'. I think we can handle that and the cases Tom presents by erroring out when
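The breakage Marko describes can be illustrated with a toy scanner. This is a minimal sketch, not PostgreSQL's flex lexer; `literal_end` is a hypothetical helper. It finds the closing quote of a literal under two rule sets: one where a backslash is an ordinary character (stdstr=on) and one where it escapes the next character (stdstr=off).

```python
def literal_end(sql: str, open_quote: int, backslash_escapes: bool) -> int:
    """Return the index of the quote closing the literal opened at open_quote."""
    i = open_quote + 1
    while i < len(sql):
        ch = sql[i]
        if backslash_escapes and ch == "\\":
            i += 2  # stdstr=off rule: the backslash swallows the next character
        elif ch == "'":
            if sql[i + 1:i + 2] == "'":
                i += 2  # a doubled quote is a literal quote in both modes
            else:
                return i
        else:
            i += 1
    raise ValueError("unterminated string literal")


sql = r"U&' \' UESCAPE '!'"
start = sql.index("'")

end_on = literal_end(sql, start, backslash_escapes=False)
end_off = literal_end(sql, start, backslash_escapes=True)

print(repr(sql[start + 1:end_on]))   # ' \\'
print(repr(sql[start + 1:end_off]))  # " \\' UESCAPE "
```

Under the stdstr=on rules the literal is a space and a backslash, followed by a UESCAPE clause. A tool applying stdstr=off rules instead runs on through the UESCAPE keyword and leaves `!'` dangling, so the two readers disagree about where the string ends.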

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread David E. Wheeler
On Apr 14, 2009, at 11:22 AM, Tom Lane wrote: BTW, does anyone know whether Unicode includes the ASCII control characters ... ie, is \u0009 a name for tab? If so, maybe this syntax is in part an attempt to cover that use-case in the standard. Yes, you can use, e.g., in HTML to represent a t

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 20:35:21 Robert Haas wrote: > Maybe I've just got my head deeply in the sand, but I don't understand > what the alternative to E'' supposedly is. How am I supposed to write > the equivalent of E'\t\n\f' without using E''? Well, the first alternative is to type those chara

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Robert Haas writes: > Maybe I've just got my head deeply in the sand, but I don't understand > what the alternative to E'' supposedly is. How am I supposed to write > the equivalent of E'\t\n\f' without using E''? The > standard_conforming_strings syntax apparently supports no escapes of > any k

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Robert Haas
On Tue, Apr 14, 2009 at 8:53 AM, Peter Eisentraut wrote: > This doesn't excite me. I think the tendency should be to get rid of E'' > usage, because its definition of escape sequences is single-byte and ASCII > centric and thus overall a legacy construct. Certainly, we will want to keep > around

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Saturday 11 April 2009 18:20:47 Sam Mason wrote: > I can't see much support in the other database engines; searched for > Oracle, MS-SQL, DB2 and Firebird. MySQL has it planned for 7.1, so not > for a while. DB2 supports it, as far as I know.

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote: > I think the problem is that they should not act like E'' strings, but they > should act like plain '' strings - they should follow stdstr setting. > > That way existing tools that may (or may not..) understand E'' and stdstr > settings, but def

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Marko Kreen writes: > I would prefer that such quoting extensions would wait until > stdstr=on setting is the only mode Postgres will operate. > Fitting new quoting ways to environment with flippable stdstr setting > will be rather painful for everyone. It would certainly be a lot safer to wait u

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Andrew Dunstan
Marko Kreen wrote: I still stand on my proposal, how about extending E'' strings with unicode escapes (eg. \u)? The E'' strings are already more clearly defined than '' and they are our "own", we don't need to consider random standards, but can consider our sanity. I suspect there wo

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Tom Lane
Peter Eisentraut writes: > On Saturday 11 April 2009 00:54:25 Tom Lane wrote: >> If we let this go into 8.4, our previous rounds with security holes >> caused by careless string parsing will look like a day at the beach. > Note that the escape character marks the Unicode escapes; it doesn't > aff

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Marko Kreen
On 4/14/09, Peter Eisentraut wrote: > On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote: > > I think the problem is that they should not act like E'' strings, but they > > should act like plain '' strings - they should follow stdstr setting. > > > > That way existing tools that may (or may n

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Marko Kreen
On 4/14/09, Peter Eisentraut wrote: > On Saturday 11 April 2009 00:54:25 Tom Lane wrote: > > It gets worse though: I have seldom seen such a badly designed piece of > > syntax as the Unicode string syntax --- see > > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL >

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Saturday 11 April 2009 21:50:29 Josh Berkus wrote: > On 4/11/09 11:47 AM, Marko Kreen wrote: > > On 4/11/09, Tom Lane wrote: > >> It gets worse though: I have seldom seen such a badly designed piece > >> of syntax as the Unicode string syntax --- see > >> > >> http://developer.postgresql.or

Re: [HACKERS] Unicode string literals versus the world

2009-04-14 Thread Peter Eisentraut
On Saturday 11 April 2009 00:54:25 Tom Lane wrote: > It gets worse though: I have seldom seen such a badly designed piece of > syntax as the Unicode string syntax --- see > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL >-SYNTAX-STRINGS-UESCAPE > > You scan the string,

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Josh Berkus
Peter put it in, I think. It is in the SQL:2008 spec, but that doesn't change the fact that it's a horribly bad piece of design. Hmmm. We're not going to implement *everything* in the spec; nobody does, even IBM. I think maybe these kinds of additions need to be hashed out for value so we

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Tom Lane
Josh Berkus writes: >> On 4/11/09, Tom Lane wrote: >>> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE > WTF? Whose feature is this? What's the use case? Peter put it in, I think. It is in the SQL:2008 spec, but that doesn't change the fact

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Josh Berkus
On 4/11/09 11:47 AM, Marko Kreen wrote: On 4/11/09, Tom Lane wrote: It gets worse though: I have seldom seen such a badly designed piece of syntax as the Unicode string syntax --- see http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE WTF

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Marko Kreen
On 4/11/09, Tom Lane wrote: > It gets worse though: I have seldom seen such a badly designed piece of > syntax as the Unicode string syntax --- see > > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE > > You scan the string, and then after th

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Andrew Dunstan
Tom Lane wrote: It gets worse though: I have seldom seen such a badly designed piece of syntax as the Unicode string syntax --- see http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE You scan the string, and then after that they tell you what the
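To make the complaint concrete, here is a rough sketch of how the body of a U&'...' literal is expanded once the escape character is known. It assumes the two escape forms in the linked documentation (the escape character followed by four hex digits, or by `+` and six hex digits) and that a doubled escape character stands for itself. `decode_uescape` is a hypothetical name, not a PostgreSQL function.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def _hex_value(digits: str, width: int) -> int:
    """Parse exactly `width` hex digits or raise ValueError."""
    if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
        raise ValueError(f"invalid Unicode escape digits: {digits!r}")
    return int(digits, 16)


def decode_uescape(body: str, escape: str = "\\") -> str:
    """Expand Unicode escapes in the body of a U& literal (quotes already removed)."""
    out = []
    i = 0
    while i < len(body):
        if body[i] != escape:
            out.append(body[i])
            i += 1
        elif body[i + 1:i + 2] == escape:
            out.append(escape)  # doubled escape character stands for itself
            i += 2
        elif body[i + 1:i + 2] == "+":
            out.append(chr(_hex_value(body[i + 2:i + 8], 6)))  # escape, '+', 6 digits
            i += 8
        else:
            out.append(chr(_hex_value(body[i + 1:i + 5], 4)))  # escape, 4 digits
            i += 5
    return "".join(out)


print(decode_uescape(r"d\0061t\+000061"))       # data
print(decode_uescape("d!0061t!+000061", "!"))   # data

# The same body means different things depending on the escape character,
# which a reader only learns after the closing quote:
body = r"d!0061t\0061"
print(decode_uescape(body))        # d!0061ta
print(decode_uescape(body, "!"))   # dat\0061
```

The last two calls show why the body cannot be interpreted in a single left-to-right pass: the UESCAPE clause that selects the escape character comes after the string it applies to.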

Re: [HACKERS] Unicode string literals versus the world

2009-04-11 Thread Sam Mason
On Fri, Apr 10, 2009 at 05:54:25PM -0400, Tom Lane wrote: > It gets worse though: I have seldom seen such a badly designed piece of > syntax as the Unicode string syntax --- see > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE > > I think we need

[HACKERS] Unicode string literals versus the world

2009-04-10 Thread Tom Lane
So I started to look at what might be involved in teaching plpgsql about standard_conforming_strings, and was soon dismayed by the sheer epic nature of its failure to act like the core lexer. It was shaky enough before, but the recent introduction of Unicode strings and identifiers into the core h