Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-05 Thread Jeff Davis
On Wed, 2008-06-04 at 06:35 +0200, Martijn van Oosterhout wrote: > Check the archives for details on how it works precisely, but it's far > nicer than merely adding an typmod, since that would cause you to throw > errors at runtime if there's a problem. Ok, that makes sense. I agree that any type

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Martijn van Oosterhout
On Tue, Jun 03, 2008 at 01:53:56PM -0700, Jeff Davis wrote: > On Tue, 2008-06-03 at 21:26 +0200, Zdenek Kotala wrote: > > The first step is per database, because it is relative easy. Collation > > per-column is very difficult. It requires a lot of changes (parser, planer, > > executor...) in whol

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Tom Lane
Jeff Davis <[EMAIL PROTECTED]> writes: > Right now typmod is already passed to all those layers, right? Would it > be a useful intermediate step to use typmod to hold this information for > the text type? No, it would not, because typmod doesn't propagate through functions.

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Jeff Davis
On Tue, 2008-06-03 at 21:26 +0200, Zdenek Kotala wrote: > The first step is per database, because it is relative easy. Collation > per-column is very difficult. It requires a lot of changes (parser, planer, > executor...) in whole source code, because you need to keep collation > information tog

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Zdenek Kotala
Pavel Stehule napsal(a): 2008/6/3 Zdenek Kotala <[EMAIL PROTECTED]>: David E. Wheeler napsal(a): On Jun 3, 2008, at 02:27, Zdenek Kotala wrote: The proposal of GSoc is there: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php It should create basic framework for full SQL COLLA

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Pavel Stehule
2008/6/3 Zdenek Kotala <[EMAIL PROTECTED]>: > David E. Wheeler napsal(a): >> >> On Jun 3, 2008, at 02:27, Zdenek Kotala wrote: >> >>> The proposal of GSoc is there: >>> http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php >>> >>> It should create basic framework for full SQL COLLATION

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Zdenek Kotala
David E. Wheeler napsal(a): On Jun 3, 2008, at 12:06, Zdenek Kotala wrote: It is simple. SQL standard does not specify notation for that (chapter 11.34). But there is proposed notation: CREATE COLLATION FOR FROM [ ] [ sensitive> ] [ ] [ LC_COLLATE ] [ LC_CTYPE ] := NO PAD | PAD SP

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread David E. Wheeler
On Jun 3, 2008, at 12:06, Zdenek Kotala wrote: It is simple. SQL standard does not specify notation for that (chapter 11.34). But there is proposed notation: CREATE COLLATION FOR FROM [ ] [ sensitive> ] [ ] [ LC_COLLATE ] [ LC_CTYPE ] := NO PAD | PAD SPACE := CASE SENSITIVE | C

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Zdenek Kotala
David E. Wheeler napsal(a): On Jun 3, 2008, at 02:27, Zdenek Kotala wrote: The proposal of GSoc is there: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php It should create basic framework for full SQL COLLATION support. All comments are welcome. That looks great, Zdenek. I'

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread David E. Wheeler
On Jun 3, 2008, at 02:27, Zdenek Kotala wrote: The proposal of GSoc is there: http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php It should create basic framework for full SQL COLLATION support. All comments are welcome. That looks great, Zdenek. I'm very excited to have impro

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-03 Thread Zdenek Kotala
Martijn van Oosterhout napsal(a): On Mon, Jun 02, 2008 at 11:08:55AM -0700, Jeff Davis wrote: http://wiki.postgresql.org/wiki/Todo:Collate The last reference I see on that page is from 2005. Is there any updated information? Are there any major obstacles holding this up aside from the platform

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Martijn van Oosterhout
On Mon, Jun 02, 2008 at 11:08:55AM -0700, Jeff Davis wrote: > http://wiki.postgresql.org/wiki/Todo:Collate > > The last reference I see on that page is from 2005. Is there any updated > information? Are there any major obstacles holding this up aside from > the platform issues mentioned on that pa

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Shane Ambler
To diverge a little - Bit of a nood question along these lines - Does LIKE and ILIKE take into consideration the locale allowing insensitive searches in any locale setting? I know that LIKE can use an index if you don't start the match with a wild card. ILIKE doesn't seem to. Is or would i

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Tino Wildenhain
Andrew Sullivan wrote: ... I think if you want some special treatment of text for some users, it should be explicit. Yes. Also, not just text. Think of currency, numeric separators, &c. Which imho, should not really be the business of the type interface but instead something to_char() and

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Andrew Sullivan
On Mon, Jun 02, 2008 at 10:29:30AM -0700, Jeff Davis wrote: > What if you had a CHECK constraint that was locale-sensitive? Would the > constraint only be non-false (true or null) for records inserted under > the same locale? That's not very useful. It would seem that this is one of the important

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Jeff Davis
On Mon, 2008-06-02 at 19:55 +0200, Martijn van Oosterhout wrote: > The SQL COLLATE syntax handles this just fine. Either the original > COLLATE patch or the new one will let people tags strings with any > collation they like. http://wiki.postgresql.org/wiki/Todo:Collate The last reference I see o

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Jeff Davis
On Sun, 2008-06-01 at 22:13 -0700, David E. Wheeler wrote: > What locale is right? If I have a Web app, there could be data in many > different languages in a single table/column. I think the values should be explicitly treated differently. It would be nice if you could just typecast, like: "l

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Martijn van Oosterhout
On Mon, Jun 02, 2008 at 10:29:30AM -0700, Jeff Davis wrote: > I think if you want some special treatment of text for some users, it > should be explicit. Text in one locale is really a different type from > text in another locale, and so changing the locale of some text variable > is really a typec

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Jeff Davis
On Mon, 2008-06-02 at 09:51 -0400, Andrew Sullivan wrote: > On Sun, Jun 01, 2008 at 10:13:07PM -0700, David E. Wheeler wrote: > > > What locale is right? If I have a Web app, there could be data in many > > different languages in a single table/column. > > I think the above amounts to a need for

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread David E. Wheeler
On Jun 2, 2008, at 09:33, Tom Lane wrote: Would the use of str_tolower() in formatting.c fix that? Yeah, you need something equivalent to that. I think that whole area is due for refactoring, though --- we've got kind of a weird collection of upper/lower/initcap APIs spread through a couple

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Tom Lane
"David E. Wheeler" <[EMAIL PROTECTED]> writes: > On Jun 1, 2008, at 21:08, Tom Lane wrote: >> [ broken record... ] Kinda depends on your locale. However, >> tolower() >> is 100% guaranteed not to work for multibyte encodings, so citext is >> quite useless if you're using UTF8. This is fixable,

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread David E. Wheeler
On Jun 1, 2008, at 21:08, Tom Lane wrote: 1. Does the use of the tolower() C function in the citext data type on pgfoundry basically give me the same results as using lower() in my SQL has for all these years? [ broken record... ] Kinda depends on your locale. However, tolower() is 100%

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread David E. Wheeler
On Jun 2, 2008, at 06:51, Andrew Sullivan wrote: On Sun, Jun 01, 2008 at 10:13:07PM -0700, David E. Wheeler wrote: What locale is right? If I have a Web app, there could be data in many different languages in a single table/column. I think the above amounts to a need for per-session locale

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Andrew Sullivan
On Sun, Jun 01, 2008 at 10:13:07PM -0700, David E. Wheeler wrote: > What locale is right? If I have a Web app, there could be data in many > different languages in a single table/column. I think the above amounts to a need for per-session locale settings or something, no? A -- Andrew Sullivan

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Oleg Bartunov
On Sun, 1 Jun 2008, David E. Wheeler wrote: On Jun 1, 2008, at 22:21, Oleg Bartunov wrote: David, we wrote contrib module (mchar) for one customer, which ports its application from mssql to postgres. It does case-insensitive comparison for new data type 'mchar' and linked with ICU for system

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler
On Jun 1, 2008, at 22:21, Oleg Bartunov wrote: David, we wrote contrib module (mchar) for one customer, which ports its application from mssql to postgres. It does case-insensitive comparison for new data type 'mchar' and linked with ICU for system independent locale. That sounds promisin

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler
On Jun 1, 2008, at 22:18, Tom Lane wrote: I'm using C. Of course you're correct that it depends on the locale, I always forget that. But does not the Unicode standard offer up some sort locale-independent case-insensitivity, so that it gets it right some large percentage of the time? Not rea

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Oleg Bartunov
David, we wrote contrib module (mchar) for one customer, which ports its application from mssql to postgres. It does case-insensitive comparison for new data type 'mchar' and linked with ICU for system independent locale. Oleg On Sun, 1 Jun 2008, David E. Wheeler wrote: Howdy, I'm sure I'm

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Tom Lane
"David E. Wheeler" <[EMAIL PROTECTED]> writes: > On Jun 1, 2008, at 21:08, Tom Lane wrote: >> Okay ... according to whose locale? > I'm using C. Of course you're correct that it depends on the locale, I > always forget that. But does not the Unicode standard offer up some > sort locale-indepen

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler
On Jun 1, 2008, at 21:08, Tom Lane wrote: "David E. Wheeler" <[EMAIL PROTECTED]> writes: I really need case-insensitive string comparison in my database. Okay ... according to whose locale? I'm using C. Of course you're correct that it depends on the locale, I always forget that. But does

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Zdenek Kotala
David E. Wheeler napsal(a): Howdy, I'm sure I'm just showing off my ignorance here, but here goes… I really need case-insensitive string comparison in my database. Collation per database level should be help you. It is now under development and I hope it will be part of 8.4. You can see htt

Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Tom Lane
"David E. Wheeler" <[EMAIL PROTECTED]> writes: > I really need case-insensitive string comparison in my database. Okay ... according to whose locale? > Ideally there'd be a nice ITEXT data type (and friends, ichar, > ivarchar, etc.). But of course there isn't, and for years I've just > used

[HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler
Howdy, I'm sure I'm just showing off my ignorance here, but here goes… I really need case-insensitive string comparison in my database. Ideally there'd be a nice ITEXT data type (and friends, ichar, ivarchar, etc.). But of course there isn't, and for years I've just used LOWER() on indexes