Hello Teodor,

I recently implemented an advanced full-text search function on top of tsearch2. Searching through the manuals and websites to get the Snowball stemmer and compile my own module took me way too long. I'd rather go fetch a cup of coffee during a 30-minute download...

That said, I don't necessarily mean that all stemmers must be included in CVS or the like. It should just be simpler for the database administrator to install ispell or stemmer 'modules'. The ideal solution would be to provide packages for each language (for Debian, Fedora, etc.).

Perhaps we can put together the source code for all available language modules and provide scripts to fetch ispell data or to generate the Snowball stemmers. A Debian package maintainer would have to fetch all the data to generate all language packages. Someone else might just want to download and compile a Norwegian Snowball stemmer.

I'd be willing to help with such a project. I have experience with tsearch2 as well as with Gentoo and Debian packaging. I can't help with RPM, though.

Regards

Markus

Teodor Sigaev wrote:
We have received a lot of requests to include stemmers and ispell dictionaries for all available languages in tsearch2. I understand that this would bring tsearch2 closer to the end user. But the Snowball stemmer sources are about 800 KB, and each ispell dictionary takes about 0.5-2 MB. These are compressed sizes. I am afraid that is too big...

What are opinions?


