Re: [GENERAL] TSearch2 / Get all unique lexems

Hannes Dorbath Thu, 08 Dec 2005 00:49:53 -0800

On 07.12.2005 16:13, Oleg Bartunov wrote:

hmm, you could dump tsvector column and use awk+sort+uniq

Thanks. I hoped for something possible inside a pl/pgsql proc. I'm trying to integrate pg_trgm with Tsearch2. I'm still on my UTF-8 database. Yes, I know, there is _NO_ UTF-8 support of any kind in Tsearch2 yet, but I got it working to a degree that is OK for my application (created my own stemmer variant, ispell dict, affix file, etc.). The last missing bit is to get a source for pg_trgm. I cannot use the stat() function, because it breaks as soon as it sees a UTF-8 char.

I thought of using lexize(), casting the text array to rows somehow, writing it to a temp table and using SELECT DISTINCT, but I haven't had any success yet.
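
For reference, the dump + awk+sort+uniq route would look roughly like the sketch below. It is only a rough outline, not something tested against a real Tsearch2 column: it assumes the tsvector is dumped in its usual text form ('cat':3 'fat':2,11), that no lexeme contains an escaped single quote, and the function name unique_lexemes is made up for illustration.

```shell
# unique_lexemes (hypothetical helper): read tsvector text on stdin and
# print each distinct lexeme once, one per line.
unique_lexemes() {
  # Splitting on the single quote puts the lexemes in the even-numbered
  # fields: in "'fat':2 'cat':3" field 2 is fat and field 4 is cat.
  # Assumption: lexemes never contain an escaped quote; those would be
  # split into pieces by this simple approach.
  awk -F"'" '{ for (i = 2; i <= NF; i += 2) print $i }' | sort | uniq
}

# Possible use (table, column and database names are placeholders):
#   psql -At -c "SELECT idxfti FROM mytable" mydb | unique_lexemes > lexemes.txt
```

The resulting file could then be loaded into a one-column table (for example with COPY) to serve as the word list for pg_trgm. Since awk here only splits on an ASCII quote, multi-byte UTF-8 characters inside the lexemes should pass through untouched, though sort order will depend on the locale.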



--
Regards,
Hannes Dorbath

