On Friday 29 May 2009 06:31:23 Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > On Tuesday 05 May 2009 03:01:05 Tom Lane wrote:
> > > Peter Eisentraut writes:
> > > > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
> > > >> I think we can handle that and the cases Tom presents by error
Peter Eisentraut wrote:
> On Tuesday 05 May 2009 03:01:05 Tom Lane wrote:
> > Peter Eisentraut writes:
> > > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
> > >> I think we can handle that and the cases Tom presents by erroring out
> > >> when the U& syntax is used with stdstr off.
> >
On Tuesday 05 May 2009 03:01:05 Tom Lane wrote:
> Peter Eisentraut writes:
> > On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
> >> I think we can handle that and the cases Tom presents by erroring out
> >> when the U& syntax is used with stdstr off.
> >
> > Proposed patch for that attac
Hi.
A quick test of the great patch!
== SCRIPT ==
set CLIENT_ENCODING to 'UTF-8';
DROP TABLE ucheck CASCADE;
CREATE TABLE ucheck (key VARCHAR(10) PRIMARY KEY, data NCHAR(50));
set STANDARD_CONFORMING_STRINGS to on;
INSERT INTO ucheck VALUES('ucheck1',u&'\68ee\9dd7\5916');
SELECT * FROM ucheck;
set
Peter Eisentraut writes:
> On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
>> I think we can handle that and the cases Tom presents by erroring out when
>> the U& syntax is used with stdstr off.
> Proposed patch for that attached.
I have not been able to think of any security hole in t
On Tuesday 14 April 2009 21:34:51 Peter Eisentraut wrote:
> On Tuesday 14 April 2009 17:13:00 Marko Kreen wrote:
> > If the parsing does not happen in 2 passes and it does not take account
> > of stdstr setting then the default breakage would be:
> >
> >stdstr=off, U&' \' UESCAPE '!'.
>
> I th
On Fri, Apr 17, 2009 at 10:15:57AM -0400, Tom Lane wrote:
> Sam Mason writes:
> > Just noticed that the spec only supports four hex digits;
>
> Better read it again.
You're right of course. My ability to read patches seems not to be very
good.
--
Sam http://samason.me.uk/
Sam Mason writes:
> Just noticed that the spec only supports four hex digits;
Better read it again.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hac
On Thu, Apr 16, 2009 at 12:08:37PM -0400, Tom Lane wrote:
> Sam Mason writes:
> > I've failed to keep up with the discussion so I'm not sure where this
> > conversation has got to! Is the consensus for 8.4 to enable SQL2003
> > style U&lit escaped literals if and only if standard_conforming_strin
Sam Mason writes:
> I've failed to keep up with the discussion so I'm not sure where this
> conversation has got to! Is the consensus for 8.4 to enable SQL2003
> style U&lit escaped literals if and only if standard_conforming_strings
> is set?
That was Peter's proposal, and no one's shot a hole
On Thu, Apr 16, 2009 at 06:34:06PM +0300, Marko Kreen wrote:
> Which hints that you can as well enter the pairs directly: \uxx\uxx.
> If I were the language designer, I would not see any reason to disallow it.
>
> And anyway, at least mono seems to support it:
>
> using System;
> public class HelloWor
On 4/16/09, Tom Lane wrote:
> Sam Mason writes:
> > I'd never heard of UTF-16 surrogate pairs before this discussion and
> > hence didn't realise that it's valid to have a surrogate pair in place
> > of a single code point. The docs say that corresponds to
> > U+10302, Python would appear t
Tom Lane wrote:
Sam Mason writes:
I'd never heard of UTF-16 surrogate pairs before this discussion and
hence didn't realise that it's valid to have a surrogate pair in place
of a single code point. The docs say that corresponds to
U+10302, Python would appear to follow my intuitions in t
On 4/16/09, Sam Mason wrote:
> On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote:
> > On 4/16/09, Sam Mason wrote:
> > > Microsoft have also gone this way in C#, named code points are not
> > > supported however.
> >
> > And it handles also non-BMP codepoints with \u escape similarl
On Thu, Apr 16, 2009 at 10:54:16AM -0400, Tom Lane wrote:
> Sam Mason writes:
> > I'd never heard of UTF-16 surrogate pairs before this discussion and
> > hence didn't realise that it's valid to have a surrogate pair in place
> > of a single code point. The docs say that corresponds to
> > U+103
Sam Mason writes:
> I'd never heard of UTF-16 surrogate pairs before this discussion and
> hence didn't realise that it's valid to have a surrogate pair in place
> of a single code point. The docs say that corresponds to
> U+10302, Python would appear to follow my intuitions in that:
> ord(u'
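For readers following the surrogate-pair discussion, here is a minimal sketch of the UTF-16 arithmetic that maps a high/low surrogate pair to a single code point and back. The function names are hypothetical and are not taken from the patch or from PostgreSQL; the only value borrowed from the thread is U+10302.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF):
        raise ValueError("not a high surrogate: %#06x" % high)
    if not (0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a low surrogate: %#06x" % low)
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def split_code_point(code_point: int) -> tuple:
    """Split a non-BMP code point into its (high, low) surrogate pair."""
    if not (0x10000 <= code_point <= 0x10FFFF):
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# U+10302 is the code point mentioned in the thread.
pair = split_code_point(0x10302)
round_trip = combine_surrogates(*pair)
```

A parser that accepts only four-digit escapes would have to apply something like combine_surrogates to two consecutive escapes in order to reach code points beyond the first plane.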
On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote:
> On 4/16/09, Sam Mason wrote:
> > Microsoft have also gone this way in C#, named code points are not
> > supported however.
>
> And it handles also non-BMP codepoints with \u escape similarly:
>
> http://en.csharp-online.net/ECMA-33
Tatsuo Ishii wrote:
I could live with either. Wikipedia says: "The characters outside the
first plane usually have very specialized or rare use." For years we
rejected all characters beyond the first plane, and while that's fixed
now, the volume of complaints wasn't huge.
If you mean "
On 4/16/09, Sam Mason wrote:
> On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote:
> > On 4/15/09, Tom Lane wrote:
>
> > > Given Martijn's complaint about more-than-16-bit code points, I think
> > > the \u proposal is not mature enough to go into 8.4. We can think
> > > about some
On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote:
> On 4/15/09, Tom Lane wrote:
> > Given Martijn's complaint about more-than-16-bit code points, I think
> > the \u proposal is not mature enough to go into 8.4. We can think
> > about some version of that later, if there's enough inte
> > >>> I still stand on my proposal, how about extending E'' strings with
> > >>> unicode escapes (eg. \u)? The E'' strings are already more
> > >>> clearly defined than '' and they are our "own", we don't need to
> > >>> consider random standards, but can consider our sanity.
> > >>>
> >>> I still stand on my proposal, how about extending E'' strings with
> >>> unicode escapes (eg. \u)? The E'' strings are already more
> >>> clearly defined than '' and they are our "own", we don't need to
> >>> consider random standards, but can consider our sanity.
> >>>
> >> I sus
On 4/15/09, Tom Lane wrote:
> Marko Kreen writes:
> > As both this and the doubling-\\ way would mean we should have usable
> > alternative in case of stdstr=off also, so in the end we have agreed
> > to accept \u also?
>
> Given Martijn's complaint about more-than-16-bit code points, I think
>
Marko Kreen writes:
> As both this and the doubling-\\ way would mean we should have usable
> alternative in case of stdstr=off also, so in the end we have agreed
> to accept \u also?
Given Martijn's complaint about more-than-16-bit code points, I think
the \u proposal is not mature enough to go
Martijn van Oosterhout wrote:
On Tue, Apr 14, 2009 at 08:10:54AM -0400, Andrew Dunstan wrote:
Marko Kreen wrote:
I still stand on my proposal, how about extending E'' strings with
unicode escapes (eg. \u)? The E'' strings are already more
clearly defined than '' and they are our
On 4/15/09, Tom Lane wrote:
> Marko Kreen writes:
> > What's wrong with requiring U& to conform with stdstr=off quoting rules?
>
> The sole and only excuse for that misbegotten syntax is to be exactly
> SQL spec compliant --- otherwise we might as well pick something saner.
> So it needs to wor
Marko Kreen writes:
> What's wrong with requiring U& to conform with stdstr=off quoting rules?
The sole and only excuse for that misbegotten syntax is to be exactly
SQL spec compliant --- otherwise we might as well pick something saner.
So it needs to work like stdstr=on. I thought Peter's propos
On 4/15/09, Greg Stark wrote:
> On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote:
> >> Wouldn't we just then say that U&'' strings are always standard-
> >> conforming?
> >
> > That's exactly what's causing the problem --- they are, but there
> > is lots of software that won't know it.
>
> We
On Wed, Apr 15, 2009 at 6:52 PM, Greg Stark wrote:
> On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote:
>>> Wouldn't we just then say that U&'' strings are always standard-
>>> conforming?
>>
>> That's exactly what's causing the problem --- they are, but there
>> is lots of software that won't know
On Wed, Apr 15, 2009 at 6:23 PM, Tom Lane wrote:
>> Wouldn't we just then say that U&'' strings are always standard-
>> conforming?
>
> That's exactly what's causing the problem --- they are, but there
> is lots of software that won't know it.
We could say U&'' escapes only work if you have
stan
"David E. Wheeler" writes:
> Wouldn't we just then say that U&'' strings are always standard-
> conforming?
That's exactly what's causing the problem --- they are, but there
is lots of software that won't know it.
regards, tom lane
On Tue, Apr 14, 2009 at 08:10:54AM -0400, Andrew Dunstan wrote:
> Marko Kreen wrote:
> >I still stand on my proposal, how about extending E'' strings with
> >unicode escapes (eg. \u)? The E'' strings are already more
> >clearly defined than '' and they are our "own", we don't need to
> >consid
On Apr 15, 2009, at 4:45 AM, Sam Mason wrote:
Doh, yes it does doesn't it. Sorry I searched for a bit and failed to
find anything before. Looks as though the signal to noise ratio was far
too low as I've just searched again and found a (single) reference to
their docs describing the feature
On Tue, Apr 14, 2009 at 04:01:48PM +0300, Peter Eisentraut wrote:
> On Saturday 11 April 2009 18:20:47 Sam Mason wrote:
> > I can't see much support in the other database engines; searched for
> > Oracle, MS-SQL, DB2 and Firebird. MySQL has it planned for 7.1, so not
> > for a while.
>
> DB2 supp
On Tue, Apr 14, 2009 at 2:55 PM, Tom Lane wrote:
> Robert Haas writes:
>> Well, that's fine, but that's a long way from Peter's statement that
>> "I think the tendency should be to get rid of E'' usage".
>
> Bear in mind that that's Peter's opinion; it's not necessarily shared
> by anyone else.
On 4/14/09, Tom Lane wrote:
> Peter Eisentraut writes:
> > On Tuesday 14 April 2009 18:54:33 Tom Lane wrote:
> >> The other proposal that seemed
> >> attractive to me was a decode-like function:
> >>
> >> uescape('foo\00e9bar')
> >> uescape('foo\00e9bar', '\')
>
> > This was discussed prev
Tom Lane wrote:
> This is *not* about code within Postgres.
One typically provides libraries for this sort of thing, but your point
is taken; suggestion withdrawn.
--mlp
_
Meredith L. Patterson
Founder and CTO
Osogato, Inc.
Tom Lane wrote:
> I suspect that it's actually impossible to parse such a thing correctly
> without a full-fledged flex lexer or something of equivalent complexity.
> Certainly it's a couple of orders of magnitude harder than it is for
> either standard-conforming or E'' literals.
Is there a reaso
"Meredith L. Patterson" writes:
> Tom Lane wrote:
>> I suspect that it's actually impossible to parse such a thing correctly
>> without a full-fledged flex lexer or something of equivalent complexity.
> Is there a reason not to use a full-fledged flex lexer?
The point is that that's a pretty lar
On Tuesday 14 April 2009 21:48:12 Tom Lane wrote:
> Peter Eisentraut writes:
> > I think we can handle that and the cases Tom presents by erroring out
> > when the U& syntax is used with stdstr off.
>
> I think you're missing the point --- this is not about whether the
> syntax is unambiguous (it
Robert Haas writes:
> Well, that's fine, but that's a long way from Peter's statement that
> "I think the tendency should be to get rid of E'' usage".
Bear in mind that that's Peter's opinion; it's not necessarily shared
by anyone else. I was just responding to your assertion of the
diametricall
Peter Eisentraut writes:
> On Tuesday 14 April 2009 18:54:33 Tom Lane wrote:
>> The other proposal that seemed
>> attractive to me was a decode-like function:
>>
>> uescape('foo\00e9bar')
>> uescape('foo\00e9bar', '\')
> This was discussed previously, but rejected with the following argument:
>
On Tue, Apr 14, 2009 at 2:22 PM, Tom Lane wrote:
> Robert Haas writes:
>> Maybe I've just got my head deeply in the sand, but I don't understand
>> what the alternative to E'' supposedly is. How am I supposed to write
>> the equivalent of E'\t\n\f' without using E''? The
>> standard_conforming_
Peter Eisentraut writes:
> I think we can handle that and the cases Tom presents by erroring out
> when the U& syntax is used with stdstr off.
I think you're missing the point --- this is not about whether the
syntax is unambiguous (it is already) but about whether a frontend that
doesn't underst
On Tuesday 14 April 2009 17:32:00 Tom Lane wrote:
> I admit that the SQL:2008 way also covers Unicode code
> points in identifiers, which we can't emulate without a lexical change;
> but frankly I think the use-case for that is so thin as to be almost
> nonexistent. Who is going to choose identif
On Tuesday 14 April 2009 18:54:33 Tom Lane wrote:
> The other proposal that seemed
> attractive to me was a decode-like function:
>
> uescape('foo\00e9bar')
> uescape('foo\00e9bar', '\')
This was discussed previously, but rejected with the following argument:
There are some other
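To make the decode-like function proposal concrete, here is a rough sketch of what such a function might do, written in Python rather than as a server-side function. The name uescape and its two call forms come from the thread; the body is an assumption based on the SQL-style escape forms (escape character plus four hex digits, escape character plus "+" and six hex digits, and a doubled escape character standing for itself). It is not the implementation that was discussed.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def uescape(text: str, esc: str = "\\") -> str:
    """Decode Unicode escapes in text, using esc as the escape character."""
    # Assumption: hex digits and '+' are rejected because they would make
    # the escape forms ambiguous.
    if len(esc) != 1 or esc in HEX_DIGITS or esc == "+":
        raise ValueError("invalid escape character: %r" % esc)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != esc:
            out.append(ch)
            i += 1
            continue
        if text[i + 1:i + 2] == esc:
            # A doubled escape character stands for one literal copy.
            out.append(esc)
            i += 2
            continue
        if text[i + 1:i + 2] == "+":
            digits, width, step = text[i + 2:i + 8], 6, 8
        else:
            digits, width, step = text[i + 1:i + 5], 4, 5
        if len(digits) != width or any(c not in HEX_DIGITS for c in digits):
            raise ValueError("invalid Unicode escape at offset %d" % i)
        code_point = int(digits, 16)
        if code_point > 0x10FFFF:
            raise ValueError("code point out of range at offset %d" % i)
        out.append(chr(code_point))
        i += step
    return "".join(out)


default_escape = uescape("foo\\00e9bar")
custom_escape = uescape("foo!00e9bar", "!")
```

Because the argument would be an ordinary string literal, decoding would happen after lexing, so a client-side tokenizer would not need to know about the escape syntax at all.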
On Tuesday 14 April 2009 21:22:29 Tom Lane wrote:
> BTW, does anyone know whether Unicode includes the ASCII control
> characters ... ie, is \u0009 a name for tab? If so, maybe this
> syntax is in part an attempt to cover that use-case in the standard.
Yes on both.
On Tuesday 14 April 2009 17:13:00 Marko Kreen wrote:
> If the parsing does not happen in 2 passes and it does not take account
> of stdstr setting then the default breakage would be:
>
>stdstr=off, U&' \' UESCAPE '!'.
I think we can handle that and the cases Tom presents by erroring out when
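The breakage Marko describes can be illustrated with a small sketch of two client-side scanners reading the same text. The helper scan_literal is hypothetical and deliberately simplified; it is not PostgreSQL's lexer or any real driver's code. It only shows how a tool that assumes backslash escapes disagrees with one that assumes standard-conforming strings about where the literal ends.

```python
def scan_literal(sql: str, start: int, backslash_escapes: bool) -> tuple:
    """Scan the quoted literal whose opening quote is at sql[start].

    Returns (raw body, index just past the closing quote).
    """
    if sql[start] != "'":
        raise ValueError("expected an opening quote")
    body = []
    i = start + 1
    while i < len(sql):
        ch = sql[i]
        if backslash_escapes and ch == "\\" and i + 1 < len(sql):
            # Legacy rule: a backslash protects the following character.
            body.append(sql[i:i + 2])
            i += 2
            continue
        if ch == "'":
            if sql[i + 1:i + 2] == "'":
                # A doubled quote is a literal quote under both rules.
                body.append("''")
                i += 2
                continue
            return "".join(body), i + 1
        body.append(ch)
        i += 1
    raise ValueError("unterminated string literal")


# The example from the message above.
query = r"U&' \' UESCAPE '!'"
start = query.index("'")

std_body, std_end = scan_literal(query, start, backslash_escapes=False)
old_body, old_end = scan_literal(query, start, backslash_escapes=True)
```

Under the standard-conforming rule the literal body is a space followed by a backslash, and the UESCAPE clause follows it. Under the backslash rule the scanner swallows the UESCAPE keyword into the string and leaves a stray `!'` behind, which is the kind of mismatch a quoting-aware client could be tricked by.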
On Apr 14, 2009, at 11:22 AM, Tom Lane wrote:
BTW, does anyone know whether Unicode includes the ASCII control
characters ... ie, is \u0009 a name for tab? If so, maybe this
syntax is in part an attempt to cover that use-case in the standard.
Yes, you can use, e.g., in HTML to represent a t
On Tuesday 14 April 2009 20:35:21 Robert Haas wrote:
> Maybe I've just got my head deeply in the sand, but I don't understand
> what the alternative to E'' supposedly is. How am I supposed to write
> the equivalent of E'\t\n\f' without using E''?
Well, the first alternative is to type those chara
Robert Haas writes:
> Maybe I've just got my head deeply in the sand, but I don't understand
> what the alternative to E'' supposedly is. How am I supposed to write
> the equivalent of E'\t\n\f' without using E''? The
> standard_conforming_strings syntax apparently supports no escapes of
> any k
On Tue, Apr 14, 2009 at 8:53 AM, Peter Eisentraut wrote:
> This doesn't excite me. I think the tendency should be to get rid of E''
> usage, because its definition of escape sequences is single-byte and ASCII
> centric and thus overall a legacy construct. Certainly, we will want to keep
> around
On Saturday 11 April 2009 18:20:47 Sam Mason wrote:
> I can't see much support in the other database engines; searched for
> Oracle, MS-SQL, DB2 and Firebird. MySQL has it planned for 7.1, so not
> for a while.
DB2 supports it, as far as I know.
On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote:
> I think the problem is that they should not act like E'' strings, but they
> should act like plain '' strings - they should follow stdstr setting.
>
> That way existing tools that may (or may not..) understand E'' and stdstr
> settings, but def
Marko Kreen writes:
> I would prefer that such quoting extensions would wait until
> stdstr=on setting is the only mode Postgres will operate.
> Fitting new quoting ways to environment with flippable stdstr setting
> will be rather painful for everyone.
It would certainly be a lot safer to wait u
Marko Kreen wrote:
I still stand on my proposal, how about extending E'' strings with
unicode escapes (eg. \u)? The E'' strings are already more
clearly defined than '' and they are our "own", we don't need to
consider random standards, but can consider our sanity.
I suspect there wo
Peter Eisentraut writes:
> On Saturday 11 April 2009 00:54:25 Tom Lane wrote:
>> If we let this go into 8.4, our previous rounds with security holes
>> caused by careless string parsing will look like a day at the beach.
> Note that the escape character marks the Unicode escapes; it doesn't
> aff
On 4/14/09, Peter Eisentraut wrote:
> On Tuesday 14 April 2009 14:38:38 Marko Kreen wrote:
> > I think the problem is that they should not act like E'' strings, but they
> > should act like plain '' strings - they should follow stdstr setting.
> >
> > That way existing tools that may (or may n
On 4/14/09, Peter Eisentraut wrote:
> On Saturday 11 April 2009 00:54:25 Tom Lane wrote:
> > It gets worse though: I have seldom seen such a badly designed piece of
> > syntax as the Unicode string syntax --- see
> > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL
>
On Saturday 11 April 2009 21:50:29 Josh Berkus wrote:
> On 4/11/09 11:47 AM, Marko Kreen wrote:
> > On 4/11/09, Tom Lane wrote:
> >> It gets worse though: I have seldom seen such a badly designed piece
> >> of syntax as the Unicode string syntax --- see
> >>
> >> http://developer.postgresql.or
On Saturday 11 April 2009 00:54:25 Tom Lane wrote:
> It gets worse though: I have seldom seen such a badly designed piece of
> syntax as the Unicode string syntax --- see
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
>
> You scan the string,
Peter put it in, I think. It is in the SQL:2008 spec, but that doesn't
change the fact that it's a horribly bad piece of design.
Hmmm. We're not going to implement *everything* in the spec; nobody
does, even IBM. I think maybe these kinds of additions need to be
hashed out for value so we
Josh Berkus writes:
>> On 4/11/09, Tom Lane wrote:
>>> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
> WTF? Whose feature is this? What's the use case?
Peter put it in, I think. It is in the SQL:2008 spec, but that doesn't
change the fact
On 4/11/09 11:47 AM, Marko Kreen wrote:
On 4/11/09, Tom Lane wrote:
It gets worse though: I have seldom seen such a badly designed piece of
syntax as the Unicode string syntax --- see
http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
WTF
On 4/11/09, Tom Lane wrote:
> It gets worse though: I have seldom seen such a badly designed piece of
> syntax as the Unicode string syntax --- see
>
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
>
> You scan the string, and then after th
Tom Lane wrote:
It gets worse though: I have seldom seen such a badly designed piece of
syntax as the Unicode string syntax --- see
http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
You scan the string, and then after that they tell you what the
On Fri, Apr 10, 2009 at 05:54:25PM -0400, Tom Lane wrote:
> It gets worse though: I have seldom seen such a badly designed piece of
> syntax as the Unicode string syntax --- see
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-UESCAPE
>
> I think we need