Re: [BUGS] BUG #6457: Regexp not processing word (with special characters on ends) correctly (UTF-8)

2012-02-15 Thread Duncan Rance
On 14 Feb 2012, at 18:28, Tom Lane wrote:
> Oh, I see the reason for this: the code in cclass() in regc_locale.c
> doesn't go further up than U+00FF, so no codes above that will be
> thought to be letters (or members of any other character class).
> Clearly we need to go further when we are deal

Re: [BUGS] BUG #6457: Regexp not processing word (with special characters on ends) correctly (UTF-8)

2012-02-14 Thread Tom Lane
albert.cieszkow...@cc.com.pl writes:
> peimp=> select 'Świnoujście' ~* '\mŚwinoujście\M';
>  ?column?
> --
>  f
> (1 row)
Oh, I see the reason for this: the code in cclass() in regc_locale.c doesn't go further up than U+00FF, so no codes above that will be thought to be letters (or mem
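
[Illustration, not part of the original message] The effect Tom describes can be modelled with a rough Python sketch. This is not the regc_locale.c code: the helper names build_alpha_class and matches_as_whole_word are hypothetical, and the \m/\M check is simplified to "the first and last characters must both be in the letter class". It only shows why a class table that stops at U+00FF rejects a word starting with Ś (U+015A).

```python
# Hypothetical model: a "letter" class filled only up to a given code point.
def build_alpha_class(limit):
    """Collect every alphabetic code point from 0 through limit inclusive."""
    return {chr(cp) for cp in range(limit + 1) if chr(cp).isalpha()}


def matches_as_whole_word(text, alpha):
    """Simplified stand-in for '\\m...\\M': both end characters must be letters."""
    return bool(text) and text[0] in alpha and text[-1] in alpha


# Table that stops at U+00FF, as described in the message above.
truncated = build_alpha_class(0xFF)
# Assumed wider table, extended through Latin Extended-A (U+017F).
full = build_alpha_class(0x17F)

# "Świnoujście" written with escapes so the composed forms are unambiguous.
word = "\u015awinouj\u015bcie"

print(matches_as_whole_word(word, truncated))  # False: U+015A is not in the table
print(matches_as_whole_word(word, full))       # True once the table reaches U+015A
```

The second table is only an assumption about what "going further" could look like; the thread does not say how far the real fix extends the range.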

Re: [BUGS] BUG #6457: Regexp not processing word (with special characters on ends) correctly (UTF-8)

2012-02-14 Thread Albert Cieszkowski
Hello Tom,
Every lc_x value is pl_PL.UTF8 (corresponding to the word's language). The database was created with --locale=pl_PL.UTF8. The OS (CentOS 5.x) uses en_US.UTF-8.
Best regards,
Albert Cieszkowski
On 2012-02-14 16:27, Tom Lane wrote:

Re: [BUGS] BUG #6457: Regexp not processing word (with special characters on ends) correctly (UTF-8)

2012-02-14 Thread Tom Lane
albert.cieszkow...@cc.com.pl writes:
> OS, base and client encoding UTF-8:
What's your lc_collate/lc_ctype settings?
regards, tom lane

[BUGS] BUG #6457: Regexp not processing word (with special characters on ends) correctly (UTF-8)

2012-02-14 Thread albert . cieszkowski
The following bug has been logged on the website:
Bug reference: 6457
Logged by: Albert Cieszkowski
Email address: albert.cieszkow...@cc.com.pl
PostgreSQL version: 9.0.6
Operating system: CentOS 5.x
Description:
OS, base and client encoding UTF-8:
peimp=> select 'Świ