On Fri, 30 Mar 2007, Listmail wrote:


        OK, I've solved my problem... thanks for the hint !

Anyway, just to report that tsearch2 crashes if SELECT is not granted on pg_ts_dict (other tables give a proper error message when not GRANTed).
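
For anyone hitting the same crash from a non-superuser account, here is a minimal sketch of granting read access on the tsearch2 configuration tables. Only pg_ts_dict is named in this thread; the other three tables are the rest of the contrib tsearch2 configuration set, and the role name webuser is hypothetical. The function only prints SQL, which can then be piped to psql.

```shell
# Emit GRANT statements for the tsearch2 configuration tables.
# The role name is passed in; "webuser" in the usage line below is a
# hypothetical role, not one taken from this thread.
grant_tsearch2_select() {
    role="$1"
    for tbl in pg_ts_dict pg_ts_parser pg_ts_cfg pg_ts_cfgmap; do
        printf 'GRANT SELECT ON %s TO %s;\n' "$tbl" "$role"
    done
}

# Usage (run as the owner of the tables, e.g. postgres):
#   grant_tsearch2_select webuser | psql -U postgres test
```

Printing the statements first makes it easy to review them before they touch the database.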

I don't understand this. Are you sure about this?
From the prompt in your select examples I see that you have superuser rights,
and you have successfully selected from the pg_ts_dict table.

Oleg

On Fri, 30 Mar 2007 13:20:30 +0200, Listmail <[EMAIL PROTECTED]> wrote:


        Hello,

I have just ditched Gentoo and installed a brand new kubuntu system (was tired of the endless compiles). I have a problem with crashing tsearch2. This appeared both on Gentoo and the brand new kubuntu.

I will describe all my install procedure, maybe I'm doing something wrong.

        Cluster is newly created and empty.

initdb was done with UNICODE encoding & locales.

# from postgresql.conf

# These settings are initialized by initdb -- they might be changed
lc_messages = 'fr_FR.UTF-8'     # locale for system error message strings
lc_monetary = 'fr_FR.UTF-8'     # locale for monetary formatting
lc_numeric = 'fr_FR.UTF-8'      # locale for number formatting
lc_time = 'fr_FR.UTF-8'         # locale for time formatting

[EMAIL PROTECTED]:~$ locale
LANG=fr_FR.UTF-8
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
etc...

First import needed .sql files from contrib and check that the default tsearch2 config works for English

$ createdb -U postgres test
$ psql -U postgres test <tsearch2.sql and other contribs I use
$ psql -U postgres test

test=# select lexize( 'en_stem', 'flying' );
 lexize
--------
 {fli}

test=# select to_tsvector('default', 'flying ducks');
   to_tsvector
------------------
 'fli':1 'duck':2

        OK, seems to work very nicely, now install French.
        Since this is Kubuntu there is no source, so download source, then :

- apply patch_tsearch_snowball_82 from tsearch2 website

./configure --prefix=/usr/lib/postgresql/8.2/ --datadir=/usr/share/postgresql/8.2 --enable-nls=fr --with-python
cd contrib/tsearch2
make
cd gendict
(copy french stem.c and stem.h from the snowball website)
./config.sh -n fr -s -p french_UTF_8 -i -v -c stem.c -h stem.h -C'Snowball stemmer for French'
cd ../../dict_fr
make clean && make
sudo make install

        Now we have :

/bin/sh ../../config/install-sh -c -m 644 dict_fr.sql '/usr/share/postgresql/8.2/contrib'
/bin/sh ../../config/install-sh -c -m 755 libdict_fr.so.0.0 '/usr/lib/postgresql/8.2/lib/dict_fr.so'

        Okay...

- download and install UTF8 french dictionaries from http://www.davidgis.fr/download/tsearch2_french_files.zip and put them in contrib directory
(the files delivered by debian package ifrench are ISO8859, bleh)

- import french shared libs
psql -U postgres test < /usr/share/postgresql/8.2/contrib/dict_fr.sql
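
If only ISO8859 dictionary files are at hand (as with the ifrench package mentioned above), they could probably be recoded instead of downloaded. A minimal sketch, assuming the files really are ISO-8859-1 and that iconv is installed; the function name to_utf8 and the file names in the usage line are placeholders, not names from this thread.

```shell
# Convert one dictionary file from ISO-8859-1 to UTF-8, writing a new file.
# Assumption: the input really is ISO-8859-1; iconv does not detect the
# source encoding by itself, so a wrong guess produces wrong characters.
to_utf8() {
    src="$1"
    dst="$2"
    iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst"
}

# Usage (hypothetical file names):
#   to_utf8 french.dict.latin1 french.dict
```

Writing to a separate output file keeps the original around in case the source encoding was guessed wrong.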

        Then :

test=# select lexize( 'en_stem', 'flying' );
 lexize
--------
 {fli}

        And :

test=# select * from pg_ts_dict where dict_name ~ '^(fr|en)';
 dict_name |       dict_init       |   dict_initoption    |              dict_lexize              |        dict_comment
-----------+-----------------------+----------------------+---------------------------------------+-----------------------------
 en_stem   | snb_en_init(internal) | contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
 fr        | dinit_fr(internal)    |                      | snb_lexize(internal,internal,integer) | Snowball stemmer for French

test=# select lexize( 'fr', 'voyageur' );
server closed the connection unexpectedly

        BLAM ! Try something else :

test=# UPDATE pg_ts_dict SET dict_initoption='/usr/share/postgresql/8.2/contrib/french.stop' WHERE dict_name = 'fr';
UPDATE 1
test=# select lexize( 'fr', 'voyageur' );
server closed the connection unexpectedly

        Try other options :

dict_name       | fr_ispell
dict_init       | spell_init(internal)
dict_initoption | DictFile="/usr/share/postgresql/8.2/contrib/french.dict",AffFile="/usr/share/postgresql/8.2/contrib/french.aff",StopFile="/usr/share/postgresql/8.2/contrib/french.stop"
dict_lexize     | spell_lexize(internal,internal,integer)
dict_comment    |

test=# select lexize( 'en_stem', 'traveler' ), lexize( 'fr_ispell', 'voyageur' );
-[ RECORD 1 ]-------
lexize | {travel}
lexize | {voyageuse}

Now it works (kinda), but it doesn't stem French words (since snowball is out). It should return 'voyage' (=travel) instead of 'voyageuse' (=female traveler).
        That's not what I want; I want to use snowball to stem French words.

I'm going to make a debug build and try to debug it, but if anyone can help, you're really, really welcome.






---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

            http://www.postgresql.org/docs/faq

        Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

