Re: LC_COLLATE and PostgreSQL

2007-06-24 Thread Ben Calvert

On Jun 24, 2007, at 1:41 PM, bsd_news wrote:


Hi
I like OpenBSD very much but:
I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD  
4.0

server.
I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2.
The PostgreSQL cluster was created by command:
initdb --locale=pl_PL.ISO8859-2 -E LATIN2 --lc-messages=C --lc- 
monetary=C

--lc-numeric=C --lc-time=C -D /var/postgresql/data.

I do not know is there possibility to fix this problem - maybe  
OpenBSD now

support only C and POSIX collation ?


If i understand you correctly, you're having trouble with how  
Postgresql colates, not OpenBSD.  you should consult the Postgresql  
docs, starting with


http://www.postgresql.org/docs/8.1/static/charset.html#AEN22133




Thanks for every help,
best regards,
Artur

ps. sorry for my poor English



Ben



Re: LC_COLLATE and PostgreSQL

2007-06-24 Thread Philip Guenther

On 6/24/07, bsd_news <[EMAIL PROTECTED]> wrote:
...

I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD 4.0
server.

I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2.

...

I do not know is there possibility to fix this problem - maybe OpenBSD now
support only C and POSIX collation ?


Your guess is correct: OpenBSD only supports the default locale, C.
To quote the source for the strcoll() function:

int
strcoll(const char *s1, const char *s2)
{
   /* LC_COLLATE is unimplemented, hence always "C" */
   return (strcmp(s1, s2));
}



ps. sorry for my poor English


Seemed fine to me: I had no problem understanding what you were doing
and what wasn't working as you expected.  Indeed, you provided better
information than some native speakers posting queries here.


Philip Guenther



Re: LC_COLLATE and PostgreSQL

2007-06-25 Thread Artur Litwinowicz
Hi Ben,

   than You for answer. My PostgreSQL instance works fine. The  problem is

in OpenBSD unfortunately. PostgreSQL uses COLLATION support served by

operating system and default BSD has C and POSSIX COLLATION. I am looking

for solution for this problem. I need native support for polish language

with specific string sorts.



Regards,

Artur



On Sun, 24 Jun 2007 18:47:29 -0700, Ben Calvert <[EMAIL PROTECTED]>

wrote:

> On Jun 24, 2007, at 1:41 PM, bsd_news wrote:

> 

>> Hi

>> I like OpenBSD very much but:

>> I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD

>> 4.0

>> server.

>> I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2.

>> The PostgreSQL cluster was created by command:

>> initdb --locale=pl_PL.ISO8859-2 -E LATIN2 --lc-messages=C --lc-

>> monetary=C

>> --lc-numeric=C --lc-time=C -D /var/postgresql/data.

>>

>> I do not know is there possibility to fix this problem - maybe

>> OpenBSD now

>> support only C and POSIX collation ?

> 

> If i understand you correctly, you're having trouble with how

> Postgresql colates, not OpenBSD.  you should consult the Postgresql

> docs, starting with

> 

> http://www.postgresql.org/docs/8.1/static/charset.html#AEN22133

> 

> 

>>

>> Thanks for every help,

>> best regards,

>> Artur

>>

>> ps. sorry for my poor English

>>

> 

> Ben



Re: LC_COLLATE and PostgreSQL

2007-06-25 Thread Artur Litwinowicz
Hi Philip,

   thank You match for answer and nice words :). 

I am starting be concern for my problem but my "love" for OpenBSD is

stronger then sort problems (data on my web are sorted not correctly) and I

do not want to change system on may server. 

Maybe in the nearest feature OpenBSD Developers change the COLLATION

support or someone know solution for this problem.



Best regards,

Artur



On Sun, 24 Jun 2007 21:07:56 -0600, "Philip Guenther" <[EMAIL PROTECTED]>

wrote:

> On 6/24/07, bsd_news <[EMAIL PROTECTED]> wrote:

> ...

>> I have not proper sorts in my PostgreSQL 8.1 database on my OpenBSD 4.0

>> server.

>>

>> I had set in /etc/profile the LC_COLLATE to pl_PL.ISO8859-2.

> ...

>> I do not know is there possibility to fix this problem - maybe OpenBSD

> now

>> support only C and POSIX collation ?

> 

> Your guess is correct: OpenBSD only supports the default locale, C.

> To quote the source for the strcoll() function:

> 

> int

> strcoll(const char *s1, const char *s2)

> {

> /* LC_COLLATE is unimplemented, hence always "C" */

> return (strcmp(s1, s2));

> }

> 

> 

>> ps. sorry for my poor English

> 

> Seemed fine to me: I had no problem understanding what you were doing

> and what wasn't working as you expected.  Indeed, you provided better

> information than some native speakers posting queries here.

> 

> 

> Philip Guenther



Re: LC_COLLATE and PostgreSQL

2007-06-25 Thread Joachim Schipper
On Mon, Jun 25, 2007 at 11:50:10AM +0200, Artur Litwinowicz wrote:
> Hi Philip,
>thank You match for answer and nice words :). 
> I am starting be concern for my problem but my "love" for OpenBSD is
> stronger then sort problems (data on my web are sorted not correctly) and I
> do not want to change system on may server. 
> Maybe in the nearest feature OpenBSD Developers change the COLLATION
> support or someone know solution for this problem.

Well, if you like to code, you might be able to port strcoll() and the
like from another BSD - like FreeBSD.

Joachim

-- 
PotD: x11/bbkeys - X keygrabber for blackbox



Re: LC_COLLATE and PostgreSQL

2007-06-26 Thread Artur Litwinowicz
O I think it is not good idea to change the code of OpenBSD by me. 

Years ago I was coding in C++ (at the University but with best mark ;). Now

I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of

practice ;) with C/C++.



Maybe someone core OpenBSD Developer will agree with me, that the extended

COLLATION in OpenBSD will be the strong point in the system functionality ?



Best regards,

Artur



On Mon, 25 Jun 2007 22:47:47 +0200, Joachim Schipper

<[EMAIL PROTECTED]> wrote:

> On Mon, Jun 25, 2007 at 11:50:10AM +0200, Artur Litwinowicz wrote:

>> Hi Philip,

>>thank You match for answer and nice words :).

>> I am starting be concern for my problem but my "love" for OpenBSD is

>> stronger then sort problems (data on my web are sorted not correctly)

> and I

>> do not want to change system on may server.

>> Maybe in the nearest feature OpenBSD Developers change the COLLATION

>> support or someone know solution for this problem.

> 

> Well, if you like to code, you might be able to port strcoll() and the

> like from another BSD - like FreeBSD.

> 

>   Joachim

> 

> --

> PotD: x11/bbkeys - X keygrabber for blackbox



Re: LC_COLLATE and PostgreSQL

2007-06-28 Thread Valentin Kozamernik
On Tue, 26 Jun 2007 at 12:35, Artur Litwinowicz wrote:
> O I think it is not good idea to change the code of OpenBSD by me. 
> Years ago I was coding in C++ (at the University but with best mark ;). Now
> I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of
> practice ;) with C/C++.
> 
> Maybe someone core OpenBSD Developer will agree with me, that the extended
> COLLATION in OpenBSD will be the strong point in the system functionality ?
> 

I had the same problem with PostgreSQL on OpenBSD a few years ago and
I've written some C-language extensions (database functions).  Be
advised that this is a totally non-standard solution.  Here you go:

The idea was to make simple functions for sorting and upper/lower
conversion, that would be faster than typical unicode table lookups.
Since I only ever needed a single language at a time, I didn't need
all the fancy unicode stuff.

In "lang.h", there are lower and upper alphabet strings for each
language (currently english and slovenian).  You can add your own, of
course.

If you run "make" (you may need to edit the Makefile first), you'll
get the "hash_en.so", "hash_sl.so", "upper_en.so", "upper_sl.so",
"lower_en.so" and "lower_sl.so" shared libs containing postgres
functions with the same names.

To load them into database, use
CREATE FUNCTION func_name(TEXT) RETURNS TEXT AS \
 'path/to/func_file.so', 'func_name' LANGUAGE C IMMUTABLE STRICT
for each of them.

The upper_XX and lower_XX functions return the upper/lowercase version
of the input string.  The hash_XX function replaces the input string
with new string where each letter is replaced with its position in the
alphabet.

For example, instead of
SELECT ... ORDER BY my_column
you can use
SELECT ... ORDER BY hash_sl(my_column)
and you've got slovenian sort order.

For performance, create an index on hash_sl(my_column), not my_column.

This will only work on "unicode" databases.

Of course, there may be bugs.  They are quite likely, actually.  I
remember I wrote all this in a hurry.  But it has worked OK for at
least three projects now.

All the files except for "lang.h" follow below.  For "lang.h", go to
"http://www.komna.com/tin/lang.h"; (it's UTF-8 encoded, so I can't put
it here inline).



# Makefile
#
# Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]>
#
# Permission to use, copy, modify, and distribute this software for any
# purpose with or without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#

INCLUDE_DIR=/usr/local/include/postgresql/server

build: hash_en.so hash_sl.so lower_en.so lower_sl.so upper_en.so upper_sl.so

hash_en.so: hash.c
cc -Wall -Werror -fpic -c -o hash_en.o hash.c -I$(INCLUDE_DIR) -DEN
ld -Bshareable -o hash_en.so hash_en.o

hash_sl.so: hash.c
cc -Wall -Werror -fpic -c -o hash_sl.o hash.c -I$(INCLUDE_DIR) -DSL
ld -Bshareable -o hash_sl.so hash_sl.o

lower_en.so: lower.c
cc -Wall -Werror -fpic -c -o lower_en.o lower.c -I$(INCLUDE_DIR) -DEN
ld -Bshareable -o lower_en.so lower_en.o

lower_sl.so: lower.c
cc -Wall -Werror -fpic -c -o lower_sl.o lower.c -I$(INCLUDE_DIR) -DSL
ld -Bshareable -o lower_sl.so lower_sl.o

upper_en.so: upper.c
cc -Wall -Werror -fpic -c -o upper_en.o upper.c -I$(INCLUDE_DIR) -DEN
ld -Bshareable -o upper_en.so upper_en.o

upper_sl.so: upper.c
cc -Wall -Werror -fpic -c -o upper_sl.o upper.c -I$(INCLUDE_DIR) -DSL
ld -Bshareable -o upper_sl.so upper_sl.o

clean:
-rm *.o *.so



/*
 * $Id: utf8.h,v 1.1.1.1 2004/12/14 14:53:28 tin Exp $
 *
 * Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]>
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

#include 

typedef u_int32_t widechar_t;

#d

Re: LC_COLLATE and PostgreSQL

2007-06-28 Thread Artur Litwinowicz
Hi Valentin,

   thank You very, very much for Your answer.

Your idea is great !!! I am very happy with this solution :) 

Of course I have to recall that pleasure with C/C++ coding but for now this

is the best and fastest way for me.



Have a nice day,

Best regards :)

Artur



On Thu, 28 Jun 2007 11:18:00 +0200, Valentin Kozamernik <[EMAIL PROTECTED]>

wrote:

> On Tue, 26 Jun 2007 at 12:35, Artur Litwinowicz wrote:

>> O I think it is not good idea to change the code of OpenBSD by me.

>> Years ago I was coding in C++ (at the University but with best mark ;).

> Now

>> I am working for Oracle Corp. (PL/SQL and etc.) and I am a little out of

>> practice ;) with C/C++.

>>

>> Maybe someone core OpenBSD Developer will agree with me, that the

> extended

>> COLLATION in OpenBSD will be the strong point in the system

> functionality ?

>>

> 

> I had the same problem with PostgreSQL on OpenBSD a few years ago and

> I've written some C-language extensions (database functions).  Be

> advised that this is a totally non-standard solution.  Here you go:

> 

> The idea was to make simple functions for sorting and upper/lower

> conversion, that would be faster than typical unicode table lookups.

> Since I only ever needed a single language at a time, I didn't need

> all the fancy unicode stuff.

> 

> In "lang.h", there are lower and upper alphabet strings for each

> language (currently english and slovenian).  You can add your own, of

> course.

> 

> If you run "make" (you may need to edit the Makefile first), you'll

> get the "hash_en.so", "hash_sl.so", "upper_en.so", "upper_sl.so",

> "lower_en.so" and "lower_sl.so" shared libs containing postgres

> functions with the same names.

> 

> To load them into database, use

> CREATE FUNCTION func_name(TEXT) RETURNS TEXT AS \

>  'path/to/func_file.so', 'func_name' LANGUAGE C IMMUTABLE STRICT

> for each of them.

> 

> The upper_XX and lower_XX functions return the upper/lowercase version

> of the input string.  The hash_XX function replaces the input string

> with new string where each letter is replaced with its position in the

> alphabet.

> 

> For example, instead of

> SELECT ... ORDER BY my_column

> you can use

> SELECT ... ORDER BY hash_sl(my_column)

> and you've got slovenian sort order.

> 

> For performance, create an index on hash_sl(my_column), not my_column.

> 

> This will only work on "unicode" databases.

> 

> Of course, there may be bugs.  They are quite likely, actually.  I

> remember I wrote all this in a hurry.  But it has worked OK for at

> least three projects now.

> 

> All the files except for "lang.h" follow below.  For "lang.h", go to

> "http://www.komna.com/tin/lang.h"; (it's UTF-8 encoded, so I can't put

> it here inline).

> 

> 

> 

> # Makefile

> #

> # Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]>

> #

> # Permission to use, copy, modify, and distribute this software for any

> # purpose with or without fee is hereby granted, provided that the above

> # copyright notice and this permission notice appear in all copies.

> #

> # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL

WARRANTIES

> # WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF

> # MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR

> # ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES

> # WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN

> # ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF

> # OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

> #

> 

> INCLUDE_DIR=/usr/local/include/postgresql/server

> 

> build: hash_en.so hash_sl.so lower_en.so lower_sl.so upper_en.so

> upper_sl.so

> 

> hash_en.so: hash.c

>   cc -Wall -Werror -fpic -c -o hash_en.o hash.c -I$(INCLUDE_DIR) -DEN

>   ld -Bshareable -o hash_en.so hash_en.o

> 

> hash_sl.so: hash.c

>   cc -Wall -Werror -fpic -c -o hash_sl.o hash.c -I$(INCLUDE_DIR) -DSL

>   ld -Bshareable -o hash_sl.so hash_sl.o

> 

> lower_en.so: lower.c

>   cc -Wall -Werror -fpic -c -o lower_en.o lower.c -I$(INCLUDE_DIR) -DEN

>   ld -Bshareable -o lower_en.so lower_en.o

> 

> lower_sl.so: lower.c

>   cc -Wall -Werror -fpic -c -o lower_sl.o lower.c -I$(INCLUDE_DIR) -DSL

>   ld -Bshareable -o lower_sl.so lower_sl.o

> 

> upper_en.so: upper.c

>   cc -Wall -Werror -fpic -c -o upper_en.o upper.c -I$(INCLUDE_DIR) -DEN

>   ld -Bshareable -o upper_en.so upper_en.o

> 

> upper_sl.so: upper.c

>   cc -Wall -Werror -fpic -c -o upper_sl.o upper.c -I$(INCLUDE_DIR) -DSL

>   ld -Bshareable -o upper_sl.so upper_sl.o

> 

> clean:

>   -rm *.o *.so

> 

> 

> 

> /*

>  * $Id: utf8.h,v 1.1.1.1 2004/12/14 14:53:28 tin Exp $

>  *

>  * Copyright (c) 2004 Valentin Kozamernik <[EMAIL PROTECTED]>

>  *

>  * Permission to use, copy, modify, and distribute this software for any

>  * purpose wi