Markus, it would be easier for others to help if you showed your tsearch2 configuration. By the way, which versions of pgsql and tsearch2 are you using (any patches applied?) Since I don't know German I can only provide a little help, but I'd like to have some words from you once you get everything working right, so that other people can benefit from your experience.
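
For reference, the configuration asked about here could be shown with queries along the following lines. This is only a sketch: it assumes the catalog tables that contrib/tsearch2 installs (pg_ts_cfg, pg_ts_cfgmap, pg_ts_dict) and its lexize() function, and 'de_ispell' is a hypothetical dictionary name that has to be replaced with whatever name the ISpell dictionary was actually registered under.

```sql
-- Server version and the locale the cluster was initialized with.
SELECT version();
SHOW lc_ctype;
SHOW lc_collate;

-- The tsearch2 configuration and the locale it is bound to.
SELECT * FROM pg_ts_cfg WHERE ts_name = 'default_german';

-- Which dictionaries handle which token types in that configuration.
SELECT tok_alias, dict_name
  FROM pg_ts_cfgmap
 WHERE ts_name = 'default_german'
 ORDER BY tok_alias;

-- Dictionary definitions, including their init options
-- (dictionary, affix and stopword file paths).
SELECT dict_name, dict_initoption FROM pg_ts_dict;

-- Run single words through the dictionary on its own.
-- 'de_ispell' is a placeholder name (assumption), not taken from the thread.
SELECT lexize('de_ispell', 'ein');
SELECT lexize('de_ispell', 'aber');
```

Posting the output of these together with the to_tsvector result should make the report reproducible for others on the list.
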
I wouldn't use tsearch2 in production until you understand your problem and have tsearch2 working correctly.

Oleg

On Wed, 21 Jul 2004, Markus Wollny wrote:

> Hi!
>
> Okay, I changed locale via initdb and I've got it working to some extent now.
>
> Now I've got some problem with the ISpell-dictionary and the stopwords-list.
> Both have been compiled with de_DE.utf8-locale.
>
> When I
> SELECT to_tsvector('default_german',
>     'Jeden Tag wirst Du ein bisschen älter, aber Du lernst');
>
> I get
> 'tag':2 'aber':8 'eint':5 'lernen':10 'älter':7 'bisschen':6
>
> I've got three questions regarding this result:
> 1. both 'ein' and 'aber' are included in the stopwords-file, but they show up
>    in the result, whereas 'jeden', 'wirst', 'du' are removed correctly - why is
>    the stopword-list ignored for the former two?
> 2. why does 'ein' appear as 'eint'?
> 3. is this result actually no cause for alarm, so can I deploy tsearch2 to my
>    production databases nevertheless?
>
> I'm using http://j3e.de/ispell/igerman98/dict/igerman98-20030222.tar.bz2 (the
> latest version of Heinz Knutzen's dictionary) and I've edited its Makefile to
> use de_DE.utf8 in the locale settings; all.words was indeed the file used to
> generate the hash, so I guess that I can now be more or less sure that I've
> actually followed the instructions in the docs precisely. I dropped any
> references to the german snowball stemmer dictionary which I had configured as
> fallback, so currently there's only this one dictionary configured for ts_name
> default_german and tok_alias lhword, lpart_hword, lword (the remaining
> tok_alias entries are set to use the simple dictionary).
>
> Kind regards
>
> Markus
>
> > -----Original Message-----
> > From: Peter Eisentraut [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, 21 July 2004 12:17
> > To: Markus Wollny
> > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> > Subject: Re: AW: [GENERAL] tsearch2, ispell, utf-8 and german
> > special characters
> >
> > On Wednesday, 21 July 2004 09:36, Markus Wollny wrote:
> > > Thanks for your answer. It's probably not sufficient to adjust the
> > > current locale settings of the system, so I'll have to dump, re-initdb
> > > and reload - am I correct or is there some procedure involving less
> > > downtime than that?
> >
> > Sorry, no.
> >
> > --
> > Peter Eisentraut
> > http://developer.postgresql.org/~petere/
> >
> > ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to [EMAIL PROTECTED] so that your
> message can get through to the mailing list cleanly
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83