[HACKERS] Status report: regex replacement

2003-02-05 Thread Tom Lane
I have just committed the latest version of Henry Spencer's regex package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively able to handle wide characters efficiently, and so it avoids the multibyte performance problems recently exhibited by Wade Klaver. I have not done extensive perfor

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Jon Jensen
On Wed, 5 Feb 2003, Tom Lane wrote: > 1. There are a couple of minor incompatibilities between the "advanced" > regex syntax implemented by this package and the syntax handled by our > old code; in particular, backslash is now a special character within > bracket expressions. It seems to me that
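A rough illustration of the bracket-expression incompatibility mentioned above, as a minimal Python sketch. Python's `re` module is not the Spencer package discussed in this thread; it is only a stand-in, chosen because it also treats backslash as an escape inside brackets. The sample string and variable names are invented for the example, and the "literal" reading is emulated by spelling the set out explicitly.

```python
import re

# Sample text (hypothetical): the four characters a, 1, backslash, d.
text = r"a1\d"

# Backslash treated as an escape inside brackets: [\d] means "any digit".
escape_style = re.findall(r"[\d]", text)

# Backslash treated as an ordinary character inside brackets: the same
# bracket expression would mean "a backslash or the letter d".
# Python has no such mode, so the set is written out explicitly here.
literal_style = re.findall(r"[\\d]", text)

print(escape_style)
print(literal_style)
```

The same bracket expression selects different characters depending on how backslash is read, which is why a pattern written for the old behavior may need its backslashes doubled.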

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Christopher Kings-Lynne
> > set regex_flavor = advanced > > set regex_flavor = extended > > set regex_flavor = basic > [snip] > > Any suggestions about the name of the parameter? > > Actually I think 'regex_flavor' sounds fine. Not more Americanisms in our config files!! :P Chris -

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Tom Lane
"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes: >> Actually I think 'regex_flavor' sounds fine. > Not more Americanisms in our config files!! :P You want regex_flavour? ;-) regards, tom lane ---(end of broadcast)--- T

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Christopher Kings-Lynne
> "Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes: > >> Actually I think 'regex_flavor' sounds fine. > > > Not more Americanisms in our config files!! :P > > You want regex_flavour? ;-) Hehe - yeah I don't really care. I have to use 'color' often enough accessing 100% of the world's programm

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Tom Lane
"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes: >> You want regex_flavour? ;-) > Hehe - yeah I don't really care. I have to use 'color' often enough > accessing 100% of the world's programming APIs... > How about regex_type, regex_mode, regex_option, etc.? ;) Well, I used "flavor" in my p

Re: [HACKERS] Status report: regex replacement

2003-02-05 Thread Hannu Krosing
Christopher Kings-Lynne wrote on Thu, 06.02.2003 at 03:56: > > > set regex_flavor = advanced > > > set regex_flavor = extended > > > set regex_flavor = basic > > [snip] > > > Any suggestions about the name of the parameter? > > > > Actually I think 'regex_flavor' sounds fine. > > Not more A

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
> I have just committed the latest version of Henry Spencer's regex > package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively > able to handle wide characters efficiently, and so it avoids the > multibyte performance problems recently exhibited by Wade Klaver. > I have not done extens

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tim Allen
On Fri, 7 Feb 2003 00:49, Hannu Krosing wrote: > Tatsuo Ishii wrote on Thu, 06.02.2003 at 17:05: > > > Perhaps we should not call the encoding UNICODE but UTF8 (which it > > > really is). UNICODE is a character set which has half a dozen official > > > encodings and calling one of them "UNICODE" do

Re: [HACKERS] Status report: regex replacement

2003-02-07 Thread Hannu Krosing
Tatsuo Ishii wrote on Fri, 07.02.2003 at 04:03: > > UTF-8 seems to be the most popular, but even XML standard requires all > > compliant implementations to deal with at least both UTF-8 and UTF-16. > > I don't think PostgreSQL is going to natively support UTF-16. By natively, do you mean "as bac

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Hannu Krosing
On Thu, 2003-02-06 at 13:25, Tatsuo Ishii wrote: > > I have just committed the latest version of Henry Spencer's regex > > package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively > > able to handle wide characters efficiently, and so it avoids the > > multibyte performance problems re

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
> Perhaps we should not call the encoding UNICODE but UTF8 (which it > really is). UNICODE is a character set which has half a dozen official > encodings and calling one of them "UNICODE" does not make things very > clear. Right. Also we perhaps should call LATIN1 or ISO-8859-1 more precisely way

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Hannu Krosing
Tatsuo Ishii wrote on Thu, 06.02.2003 at 17:05: > > Perhaps we should not call the encoding UNICODE but UTF8 (which it > > really is). UNICODE is a character set which has half a dozen official > > encodings and calling one of them "UNICODE" does not make things very > > clear. > > Right. Also we
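The point quoted above, that Unicode is a character set with several encodings of which UTF-8 is only one, can be seen in a minimal Python sketch. The chosen character and the variable names are illustrative only; nothing here reflects how PostgreSQL stores text internally.

```python
# One Unicode code point: U+00E9, LATIN SMALL LETTER E WITH ACUTE.
ch = "\u00e9"

# The same character serialized under three different encodings.
utf8 = ch.encode("utf-8")        # two bytes
utf16 = ch.encode("utf-16-be")   # two bytes, different values
latin1 = ch.encode("latin-1")    # one byte (8-bit ISO-8859-1)

for name, data in [("UTF-8", utf8), ("UTF-16BE", utf16), ("ISO-8859-1", latin1)]:
    print(f"{name:10s} {data.hex()}")
```

All three byte sequences decode back to the same code point, so "UNICODE" by itself does not say which byte layout is meant.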

Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
> > Right. Also we perhaps should call LATIN1 or ISO-8859-1 more precisely > > way since ISO-8859-1 can be encoded in either 7 bit or 8 bit(we use > > this). I don't know what it is called though. > > I don't think that calling 8-bit ISO-8859-1 ISO-8859-1 can confuse > anybody, but UCS-2 (ISO-1064

Re: [HACKERS] Status report: regex replacement

2003-02-10 Thread Peter Eisentraut
Tom Lane writes: > code is concerned: the regex library actually offers three regex > flavors, "advanced", "extended", and "basic", where "extended" matches > what we had before ("extended" and "basic" correspond to different > levels of the POSIX 1003.2 standard). We just need a way to expose >

Re: [HACKERS] Status report: regex replacement

2003-02-11 Thread Peter Eisentraut
Tatsuo Ishii writes: > > UTF-8 seems to be the most popular, but even XML standard requires all > > compliant implementations to deal with at least both UTF-8 and UTF-16. > > I don't think PostgreSQL is going to natively support UTF-16. At FOSDEM it was claimed that Windows natively uses UCS-2, a

Re: [HACKERS] Status report: regex replacement

2003-02-10 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Tom Lane writes: >> code is concerned: the regex library actually offers three regex >> flavors, "advanced", "extended", and "basic", where "extended" matches >> what we had before ("extended" and "basic" correspond to different >> levels of the POSIX