Re: LC_COLLATE and PostgreSQL
On Jun 24, 2007, at 1:41 PM, bsd_news wrote: Hi I like OpenBSD very much but: I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD 4.0 server. I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2. The PostgreSQL cluster was created by command: initdb --locale=pl_PL.ISO8859-2 -E LATIN2 --lc-messages=C --lc- monetary=C --lc-numeric=C --lc-time=C -D /var/postgresql/data. I do not know is there possibility to fix this problem - maybe OpenBSD now support only C and POSIX collation ? If i understand you correctly, you're having trouble with how Postgresql colates, not OpenBSD. you should consult the Postgresql docs, starting with http://www.postgresql.org/docs/8.1/static/charset.html#AEN22133 Thanks for every help, best regards, Artur ps. sorry for my poor English Ben
Re: LC_COLLATE and PostgreSQL
On 6/24/07, bsd_news <[EMAIL PROTECTED]> wrote: ... I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD 4.0 server. I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2. ... I do not know is there possibility to fix this problem - maybe OpenBSD now support only C and POSIX collation ? Your guess is correct: OpenBSD only supports the default locale, C. To quote the source for the strcoll() function: int strcoll(const char *s1, const char *s2) { /* LC_COLLATE is unimplemented, hence always "C" */ return (strcmp(s1, s2)); } ps. sorry for my poor English Seemed fine to me: I had no problem understanding what you were doing and what wasn't working as you expected. Indeed, you provided better information than some native speakers posting queries here. Philip Guenther
Re: LC_COLLATE and PostgreSQL
Hi Ben, than You for answer. My PostgreSQL instance works fine. The problem is in OpenBSD unfortunately. PostgreSQL uses COLLATION support served by operating system and default BSD has C and POSSIX COLLATION. I am looking for solution for this problem. I need native support for polish language with specific string sorts. Regards, Artur On Sun, 24 Jun 2007 18:47:29 -0700, Ben Calvert <[EMAIL PROTECTED]> wrote: > On Jun 24, 2007, at 1:41 PM, bsd_news wrote: > >> Hi >> I like OpenBSD very much but: >> I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD >> 4.0 >> server. >> I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2. >> The PostgreSQL cluster was created by command: >> initdb --locale=pl_PL.ISO8859-2 -E LATIN2 --lc-messages=C --lc- >> monetary=C >> --lc-numeric=C --lc-time=C -D /var/postgresql/data. >> >> I do not know is there possibility to fix this problem - maybe >> OpenBSD now >> support only C and POSIX collation ? > > If i understand you correctly, you're having trouble with how > Postgresql colates, not OpenBSD. you should consult the Postgresql > docs, starting with > > http://www.postgresql.org/docs/8.1/static/charset.html#AEN22133 > > >> >> Thanks for every help, >> best regards, >> Artur >> >> ps. sorry for my poor English >> > > Ben
Re: LC_COLLATE and PostgreSQL
Hi Philip, thank You match for answer and nice words :). I am starting be concern for my problem but my "love" for OpenBSD is stronger then sort problems (data on my web are sorted not correctly) and I do not want to change system on may server. Maybe in the nearest feature OpenBSD Developers change the COLLATION support or someone know solution for this problem. Best regards, Artur On Sun, 24 Jun 2007 21:07:56 -0600, "Philip Guenther" <[EMAIL PROTECTED]> wrote: > On 6/24/07, bsd_news <[EMAIL PROTECTED]> wrote: > ... >> I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD 4.0 >> server. >> >> I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2. > ... >> I do not know is there possibility to fix this problem - maybe OpenBSD > now >> support only C and POSIX collation ? > > Your guess is correct: OpenBSD only supports the default locale, C. > To quote the source for the strcoll() function: > > int > strcoll(const char *s1, const char *s2) > { > /* LC_COLLATE is unimplemented, hence always "C" */ > return (strcmp(s1, s2)); > } > > >> ps. sorry for my poor English > > Seemed fine to me: I had no problem understanding what you were doing > and what wasn't working as you expected. Indeed, you provided better > information than some native speakers posting queries here. > > > Philip Guenther
Re: LC_COLLATE and PostgreSQL
On Mon, Jun 25, 2007 at 11:50:10AM +0200, Artur Litwinowicz wrote: > Hi Philip, >thank You match for answer and nice words :). > I am starting be concern for my problem but my "love" for OpenBSD is > stronger then sort problems (data on my web are sorted not correctly) and I > do not want to change system on may server. > Maybe in the nearest feature OpenBSD Developers change the COLLATION > support or someone know solution for this problem. Well, if you like to code, you might be able to port strcoll() and the like from another BSD - like FreeBSD. Joachim -- PotD: x11/bbkeys - X keygrabber for blackbox
Re: LC_COLLATE and PostgreSQL
O I think it is not good idea to change the code of OpenBSD by me. Years ago I was coding in C++ (at the University but with best mark ;). Now I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of practice ;) with C/C++. Maybe someone core OpenBSD Developer will agree with me, that the extended COLLATION in OpenBSD will be the strong point in the system functionality ? Best regards, Artur On Mon, 25 Jun 2007 22:47:47 +0200, Joachim Schipper <[EMAIL PROTECTED]> wrote: > On Mon, Jun 25, 2007 at 11:50:10AM +0200, Artur Litwinowicz wrote: >> Hi Philip, >>thank You match for answer and nice words :). >> I am starting be concern for my problem but my "love" for OpenBSD is >> stronger then sort problems (data on my web are sorted not correctly) > and I >> do not want to change system on may server. >> Maybe in the nearest feature OpenBSD Developers change the COLLATION >> support or someone know solution for this problem. > > Well, if you like to code, you might be able to port strcoll() and the > like from another BSD - like FreeBSD. > > Joachim > > -- > PotD: x11/bbkeys - X keygrabber for blackbox
Re: LC_COLLATE and PostgreSQL
On Tue, 26 Jun 2007 at 12:35, Artur Litwinowicz wrote: > O I think it is not good idea to change the code of OpenBSD by me. > Years ago I was coding in C++ (at the University but with best mark ;). Now > I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of > practice ;) with C/C++. > > Maybe someone core OpenBSD Developer will agree with me, that the extended > COLLATION in OpenBSD will be the strong point in the system functionality ? > I had the same problem with PostgreSQL on OpenBSD a few years ago and I've written some C-language extensions (database functions). Be advised that this is a totally non-standard solution. Here you go: The idea was to make simple functions for sorting and upper/lower conversion, that would be faster than typical unicode table lookups. Since I only ever needed a single language at a time, I didn't need all the fancy unicode stuff. In "lang.h", there are lower and upper alphabet strings for each language (currently english and slovenian). You can add your own, of course. If you run "make" (you may need to edit the Makefile first), you'll get the "hash_en.so", "hash_sl.so", "upper_en.so", "upper_sl.so", "lower_en.so" and "lower_sl.so" shared libs containing postgres functions with the same names. To load them into database, use CREATE FUNCTION func_name(TEXT) RETURNS TEXT AS \ 'path/to/func_file.so', 'func_name' LANGUAGE C IMMUTABLE STRICT for each of them. The upper_XX and lower_XX functions return the upper/lowercase version of the input string. The hash_XX function replaces the input string with new string where each letter is replaced with its position in the alphabet. For example, instead of SELECT ... ORDER BY my_column you can use SELECT ... ORDER BY hash_sl(my_column) and you've got slovenian sort order. For performance, create an index on hash_sl(my_column), not my_column. This will only work on "unicode" databases. Of course, there may be bugs. They are quite likely, actually. I remember I wrote all this in a hurry. But it has worked OK for at least three projects now. All the files except for "lang.h" follow below. For "lang.h", go to "http://www.komna.com/tin/lang.h"; (it's UTF-8 encoded, so I can't put it here inline). # Makefile # # Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]> # # Permission to use, copy, modify, and distribute this software for any # purpose with or without fee is hereby granted, provided that the above # copyright notice and this permission notice appear in all copies. # # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES # WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF # MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR # ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES # WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN # ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF # OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. # INCLUDE_DIR=/usr/local/include/postgresql/server build: hash_en.so hash_sl.so lower_en.so lower_sl.so upper_en.so upper_sl.so hash_en.so: hash.c cc -Wall -Werror -fpic -c -o hash_en.o hash.c -I$(INCLUDE_DIR) -DEN ld -Bshareable -o hash_en.so hash_en.o hash_sl.so: hash.c cc -Wall -Werror -fpic -c -o hash_sl.o hash.c -I$(INCLUDE_DIR) -DSL ld -Bshareable -o hash_sl.so hash_sl.o lower_en.so: lower.c cc -Wall -Werror -fpic -c -o lower_en.o lower.c -I$(INCLUDE_DIR) -DEN ld -Bshareable -o lower_en.so lower_en.o lower_sl.so: lower.c cc -Wall -Werror -fpic -c -o lower_sl.o lower.c -I$(INCLUDE_DIR) -DSL ld -Bshareable -o lower_sl.so lower_sl.o upper_en.so: upper.c cc -Wall -Werror -fpic -c -o upper_en.o upper.c -I$(INCLUDE_DIR) -DEN ld -Bshareable -o upper_en.so upper_en.o upper_sl.so: upper.c cc -Wall -Werror -fpic -c -o upper_sl.o upper.c -I$(INCLUDE_DIR) -DSL ld -Bshareable -o upper_sl.so upper_sl.o clean: -rm *.o *.so /* * $Id: utf8.h,v 1.1.1.1 2004/12/14 14:53:28 tin Exp $ * * Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]> * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #include typedef u_int32_t widechar_t; #d
Re: LC_COLLATE and PostgreSQL
Hi Valentin, thank You very, very much for Your answer. Your idea is great !!! I am very happy with this solution :) Of course I have to recall that pleasure with C/C++ coding but for now this is the best and fastest way for me. Have a nice day, Best regards :) Artur On Thu, 28 Jun 2007 11:18:00 +0200, Valentin Kozamernik <[EMAIL PROTECTED]> wrote: > On Tue, 26 Jun 2007 at 12:35, Artur Litwinowicz wrote: >> O I think it is not good idea to change the code of OpenBSD by me. >> Years ago I was coding in C++ (at the University but with best mark ;). > Now >> I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of >> practice ;) with C/C++. >> >> Maybe someone core OpenBSD Developer will agree with me, that the > extended >> COLLATION in OpenBSD will be the strong point in the system > functionality ? >> > > I had the same problem with PostgreSQL on OpenBSD a few years ago and > I've written some C-language extensions (database functions). Be > advised that this is a totally non-standard solution. Here you go: > > The idea was to make simple functions for sorting and upper/lower > conversion, that would be faster than typical unicode table lookups. > Since I only ever needed a single language at a time, I didn't need > all the fancy unicode stuff. > > In "lang.h", there are lower and upper alphabet strings for each > language (currently english and slovenian). You can add your own, of > course. > > If you run "make" (you may need to edit the Makefile first), you'll > get the "hash_en.so", "hash_sl.so", "upper_en.so", "upper_sl.so", > "lower_en.so" and "lower_sl.so" shared libs containing postgres > functions with the same names. > > To load them into database, use > CREATE FUNCTION func_name(TEXT) RETURNS TEXT AS \ > 'path/to/func_file.so', 'func_name' LANGUAGE C IMMUTABLE STRICT > for each of them. > > The upper_XX and lower_XX functions return the upper/lowercase version > of the input string. The hash_XX function replaces the input string > with new string where each letter is replaced with its position in the > alphabet. > > For example, instead of > SELECT ... ORDER BY my_column > you can use > SELECT ... ORDER BY hash_sl(my_column) > and you've got slovenian sort order. > > For performance, create an index on hash_sl(my_column), not my_column. > > This will only work on "unicode" databases. > > Of course, there may be bugs. They are quite likely, actually. I > remember I wrote all this in a hurry. But it has worked OK for at > least three projects now. > > All the files except for "lang.h" follow below. For "lang.h", go to > "http://www.komna.com/tin/lang.h"; (it's UTF-8 encoded, so I can't put > it here inline). > > > > # Makefile > # > # Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]> > # > # Permission to use, copy, modify, and distribute this software for any > # purpose with or without fee is hereby granted, provided that the above > # copyright notice and this permission notice appear in all copies. > # > # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > # WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > # MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > # ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > # WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > # ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > # OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > # > > INCLUDE_DIR=/usr/local/include/postgresql/server > > build: hash_en.so hash_sl.so lower_en.so lower_sl.so upper_en.so > upper_sl.so > > hash_en.so: hash.c > cc -Wall -Werror -fpic -c -o hash_en.o hash.c -I$(INCLUDE_DIR) -DEN > ld -Bshareable -o hash_en.so hash_en.o > > hash_sl.so: hash.c > cc -Wall -Werror -fpic -c -o hash_sl.o hash.c -I$(INCLUDE_DIR) -DSL > ld -Bshareable -o hash_sl.so hash_sl.o > > lower_en.so: lower.c > cc -Wall -Werror -fpic -c -o lower_en.o lower.c -I$(INCLUDE_DIR) -DEN > ld -Bshareable -o lower_en.so lower_en.o > > lower_sl.so: lower.c > cc -Wall -Werror -fpic -c -o lower_sl.o lower.c -I$(INCLUDE_DIR) -DSL > ld -Bshareable -o lower_sl.so lower_sl.o > > upper_en.so: upper.c > cc -Wall -Werror -fpic -c -o upper_en.o upper.c -I$(INCLUDE_DIR) -DEN > ld -Bshareable -o upper_en.so upper_en.o > > upper_sl.so: upper.c > cc -Wall -Werror -fpic -c -o upper_sl.o upper.c -I$(INCLUDE_DIR) -DSL > ld -Bshareable -o upper_sl.so upper_sl.o > > clean: > -rm *.o *.so > > > > /* > * $Id: utf8.h,v 1.1.1.1 2004/12/14 14:53:28 tin Exp $ > * > * Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]> > * > * Permission to use, copy, modify, and distribute this software for any > * purpose wi