we're using tsearch2 with the German ispell dictionary
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
for full-text search.

the indexing is configured as follows:

CREATE TEXT SEARCH DICTIONARY public.german (
    TEMPLATE = ispell,
    DictFile = german,
    AffFile = german,
    StopWords = german
);

CREATE TEXT SEARCH CONFIGURATION public.default ( COPY = pg_catalog.german );

ALTER TEXT SEARCH CONFIGURATION public.default
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH public.german;
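
to check the dictionary on its own, independent of the parser and the mapping, ts_lexize can be called directly. a minimal sketch, assuming the dictionary was created as public.german as shown above:

```sql
-- ts_lexize returns the lexemes a single dictionary produces for one token,
-- NULL if the dictionary does not know the word,
-- and an empty array if the word is a stop word
SELECT ts_lexize('public.german', 'hundshütte');
SELECT ts_lexize('public.german', 'lovely');
SELECT ts_lexize('public.german', 'und');
```

a NULL result here means the word is passed on to the next dictionary in the mapping, if there is one.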

-------------------------

select * from ts_debug('default', 'hundshütte');
works as expected: creates the two lexemes: "{hund,hütte}"

BUT

SELECT to_tsvector('default','lovely und bauarbeiter/in');
loses a lot of stuff:
"'bauarbeiter/in':2"

some more debugging shows:

SELECT * from ts_debug('default','lovely und bauarbeiter/in');

"asciiword";"Word, all ASCII";"lovely";"{german}";"german";""
"blank";"Space symbols";" ";"{}";"";""
"asciiword";"Word, all ASCII";"und";"{german}";"german";"{}"
"blank";"Space symbols";" ";"{}";"";""
"file";"File or path name";"bauarbeiter/in";"{simple}";"simple";"{bauarbeiter/in}"

a) unknown words are just being dropped
b) words with slashes are interpreted as file paths and the first path
is being dropped.
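
one possible direction for both points, as an untested sketch rather than a confirmed fix. for (a), the ispell dictionary returns NULL for words it does not know, and since the mapping lists no further dictionary, the token is discarded; appending the built-in german_stem snowball dictionary as a fallback should keep such words. for (b), the token type is decided by the parser and not by the dictionary, so one workaround is to replace the slash before the text reaches to_tsvector. the choice of german_stem and of replace() are assumptions for illustration:

```sql
-- (a) fall back to the built-in snowball stemmer when ispell does not know a word
ALTER TEXT SEARCH CONFIGURATION public.default
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH public.german, german_stem;

-- (b) turn the slash into a space so the parser sees two plain words
--     instead of one "file" token
SELECT to_tsvector('default', replace('lovely und bauarbeiter/in', '/', ' '));
```

the same replace() would have to be applied to the query text as well, otherwise documents and queries are tokenized differently.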

any idea how we can fix this?

jodok

-- 
Jodok Batlogg, Vorstand

Lovely Systems AG
Telefon +43 5572 908060, Fax +43 5572 908060-77, Mobil +43 664 9636963
Schmelzhütterstraße 26a, 6850 Dornbirn, Austria

Sitz: Dornbirn, FB: Landesgericht Feldkirch, FN: 208859x, UID: ATU51736705
Aufsichtsratsvorsitzender: Christian Lutz
Vorstand: Jodok Batlogg, Manfred Schwendinger
