Re: [GENERAL] russian case-insensitive regexp search not working
Oleg Bartunov wrote:
> alexander,
>
> lc_ctype and lc_collate can be changed only at initdb!
> You need to read the localization chapter:
> http://www.postgresql.org/docs/current/static/charset.html

Yes, I knew about this, but I thought maybe it could somehow be
changed on the fly.

... (10 minutes later)

Yes, now that initdb has been done with --locale=ru_RU.UTF-8,
lower('RussianString') gives me 'russianstring', though
case-insensitive regexp is still not working. I guess I'll stick with
the lower() ~ lower() construction.

And thanks to everybody who replied!

> Oleg
>
> On Thu, 12 Jul 2007, alexander lunyov wrote:
>
>> Tom Lane wrote:
>>> alexander lunyov [EMAIL PROTECTED] writes:
>>>> With this I just wanted to say that lower() doesn't work at all
>>>> on Russian Unicode characters,
>>>
>>> In that case you're using the wrong locale (ie, not russian
>>> unicode). Check show lc_ctype.
>>
>> db=> SHOW LC_CTYPE;
>>  lc_ctype
>> ----------
>>  C
>> (1 запись)
>>
>> db=> SHOW LC_COLLATE;
>>  lc_collate
>> ------------
>>  C
>> (1 запись)
>>
>> Where can I change this? Trying to SET these parameters gives the
>> error: parameter lc_collate cannot be changed
>>
>>> Or [ checks back in thread... ] maybe you're using the wrong
>>> operating system. Not so long ago FreeBSD didn't have Unicode
>>> locale support at all; I'm not sure if 6.2 has that problem but it
>>> is worth checking. Does it work for you to do case-insensitive
>>> russian comparisons in grep, for instance?
>>
>> I put three Russian strings, differing in the case of the first
>> character, into a text file and grepped them all:
>>
>> # cat textfile
>> Зеленая
>> Зеленодольская
>> зеленая
>> # grep -i зелен *
>> textfile:Зеленая
>> textfile:Зеленодольская
>> textfile:зеленая
>>
>> So I think the system is fine with Unicode.
> Regards,
> Oleg
>
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83

--
alexander lunyov
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
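
For readers finding this thread in the archives, the "lower() ~ lower()" construction mentioned above can be sketched as follows. This is only a minimal illustration, assuming the people table and street column from the original post, and assuming the cluster was initialized with a Russian UTF-8 locale (for example initdb --locale=ru_RU.UTF-8), which the thread reports is what makes lower() handle Cyrillic.

```sql
-- Workaround discussed in this thread: fold both the column and the
-- pattern to lowercase, then use the case-sensitive regexp operator (~)
-- instead of the case-insensitive one (~*).
-- Assumption: table "people" and column "street" as in the original post.
SELECT street
FROM people
WHERE lower(street) ~ lower('зелен');
```

One caveat with this approach: lower() is applied to the whole pattern, so it is only safe for patterns made of literal text. Escapes whose meaning depends on case (for example \W versus \w) would be changed by it.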
Re: [GENERAL] russian case-insensitive regexp search not working
Tom Lane wrote:
> alexander lunyov [EMAIL PROTECTED] writes:
>> With this I just wanted to say that lower() doesn't work at all on
>> Russian Unicode characters,
>
> In that case you're using the wrong locale (ie, not russian unicode).
> Check show lc_ctype.

db=> SHOW LC_CTYPE;
 lc_ctype
----------
 C
(1 запись)

db=> SHOW LC_COLLATE;
 lc_collate
------------
 C
(1 запись)

Where can I change this? Trying to SET these parameters gives the
error: parameter lc_collate cannot be changed

> Or [ checks back in thread... ] maybe you're using the wrong
> operating system. Not so long ago FreeBSD didn't have Unicode locale
> support at all; I'm not sure if 6.2 has that problem but it is worth
> checking. Does it work for you to do case-insensitive russian
> comparisons in grep, for instance?

I put three Russian strings, differing in the case of the first
character, into a text file and grepped them all:

# cat textfile
Зеленая
Зеленодольская
зеленая
# grep -i зелен *
textfile:Зеленая
textfile:Зеленодольская
textfile:зеленая

So I think the system is fine with Unicode.

--
alexander lunyov
[EMAIL PROTECTED]
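
The checks exchanged in this message can be collected into one short psql session. The sketch below is only a diagnostic outline built from statements that already appear in the thread; the column aliases are made up for readability, and no particular output is implied beyond what the posters report.

```sql
-- 1. Which locale was the cluster initialized with?
--    In this thread both settings turned out to be C.
SHOW lc_ctype;
SHOW lc_collate;

-- 2. Does lower() fold Cyrillic? With lc_ctype = C the thread reports
--    that the string comes back unchanged.
SELECT lower('Зелен') AS lowered;

-- 3. Does the case-insensitive regexp operator match across case?
SELECT 'Зеленая' ~* 'зелен' AS ci_regexp_match;
```

If step 1 shows C, the thread's conclusion is that the settings cannot be changed with SET and the cluster has to be re-created with initdb under the desired locale.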
Re: [GENERAL] russian case-insensitive regexp search not working
Karsten Hilbert wrote:
>>> Just to clarify: lower() on both sides of a comparison
>>> should still work as expected on multibyte encodings?
>>> It's been suggested here before.
>>
>> lower() on both sides also does not work in my case; it still
>> searches case-sensitively. The string in this example has its first
>> character capitalized, and the result is the same. It seems that
>> lower() can't lowercase a multibyte character.
>>
>> db=> select lower('Зелен');
>
> Well, no,

With this I just wanted to say that lower() doesn't work at all on
Russian Unicode characters. Even in select lower('String'), 'String'
doesn't become lowercase, and so it doesn't work in a more complex
select statement either.

> select my_string where lower(my_string) ~ lower(search_fragment);
>
> Does that help? (~ does work for eg. German in my experience)

No, for Russian Unicode strings it does not work. I searched the
pgsql-patches@ list and found this thread:

http://archives.postgresql.org/pgsql-patches/2007-06/msg00021.php

I wrote to Andrew (he hasn't answered yet) to ask whether this patch
can help with my problem.

P.S.: If this issue is a known bug (as we discussed earlier), how long
will it take to fix? I know little about the PostgreSQL development
process; maybe you know it a little better?

--
alexander lunyov
[EMAIL PROTECTED]
[GENERAL] russian case-insensitive regexp search not working
Hello, friends.

OS: FreeBSD 6.2, PostgreSQL 8.2.4

PostgreSQL does not match Russian Unicode patterns in case-insensitive
regexp searches.

Postgres runs under the user pgsql with this login class (in
/etc/login.conf):

postgres:\
        :lang=ru_RU.UTF-8:\
        :setenv=LC_COLLATE=C:\
        :tc=default:

In the .profile of the postgres user:

LANG=ru_RU.UTF-8
export LANG
CHARSET=UTF-8
export CHARSET

Then, the database:

db=> \encoding
UTF8

A case-insensitive search for a lowercase pattern returns nothing:

db=> select street from people where street ~* 'зелен';
 street
--------
(0 rows)

There are matching records, but they start with a capital letter:

db=> select street from people where street ~* 'Зелен';
     street
----------------
 Зеленая
 Зеленоградская
(2 rows)

Searches for English values work fine, but searches for Russian values
do not. Why could that be?

--
alexander lunyov
[EMAIL PROTECTED]
Re: [GENERAL] russian case-insensitive regexp search not working
No, ILIKE also does a case-sensitive search. I found this bug report:

http://archives.postgresql.org/pgsql-bugs/2006-09/msg00065.php

Is it about this issue? And will it be fixed someday?

Sergey Levchenko wrote:
> Just use: select street from people where street ILIKE 'зелен%';
>
> select with case-insensitive regexp does not work right now!
>
> On 09/07/07, alexander lunyov [EMAIL PROTECTED] wrote:
>> Hello, friends.
>>
>> OS: FreeBSD 6.2, PostgreSQL 8.2.4
>>
>> PostgreSQL does not match Russian Unicode patterns in
>> case-insensitive regexp searches.
>>
>> Postgres runs under the user pgsql with this login class (in
>> /etc/login.conf):
>>
>> postgres:\
>>         :lang=ru_RU.UTF-8:\
>>         :setenv=LC_COLLATE=C:\
>>         :tc=default:
>>
>> In the .profile of the postgres user:
>>
>> LANG=ru_RU.UTF-8
>> export LANG
>> CHARSET=UTF-8
>> export CHARSET
>>
>> Then, the database:
>>
>> db=> \encoding
>> UTF8
>>
>> A case-insensitive search for a lowercase pattern returns nothing:
>>
>> db=> select street from people where street ~* 'зелен';
>>  street
>> --------
>> (0 rows)
>>
>> There are matching records, but they start with a capital letter:
>>
>> db=> select street from people where street ~* 'Зелен';
>>      street
>> ----------------
>>  Зеленая
>>  Зеленоградская
>> (2 rows)
>>
>> Searches for English values work fine, but searches for Russian
>> values do not. Why could that be?
>>
>> --
>> alexander lunyov
>> [EMAIL PROTECTED]

--
alexander lunyov
[EMAIL PROTECTED]
Re: [GENERAL] russian case-insensitive regexp search not working
Karsten Hilbert wrote:
> Just to clarify: lower() on both sides of a comparison
> should still work as expected on multibyte encodings?
> It's been suggested here before.

lower() on both sides also does not work in my case; it still searches
case-sensitively. The string in this example has its first character
capitalized, and the result is the same. It seems that lower() can't
lowercase a multibyte character.

db=> select lower('Зелен');
 lower
-------
 Зелен
(1 запись)

--
alexander lunyov
[EMAIL PROTECTED]