mikkel.kamstrup at gmail.com (Mikkel Kamstrup Erlandsen) writes:
> magnus.bergman at observer.net (Magnus Bergman) writes:
> > One thing that English users seldom consider is the usages of several
> > languages. Which language is being used is important to know in order
> > to decide what stemming rules to use, and which stop-words use (in
> > English "the" is a stop-word while it in Swedish means tea and is
> > something that is adequate to search for). People using other languages
> > are very often multi lingual (using English as well). Therefore it is
> > interesting to know which language the query is in (search engines
> > might also be able to translate queries to search in document written
> > in different languages).
>
> This is a good point. However I suggest leaving this up to the actual
> implementations. After all it is an indexing time question what stemmer to
> use when indexing a document...

This is not true. An indexer can choose to perform stemming at query time.
Recoll is one that does, but I don't think it's the only one. There are
quite good reasons to do so.

jf
-- 
Recoll: desktop search for all Unix environments. http://www.recoll.org
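
To make the query-time point concrete, here is a minimal sketch. It is not Recoll's actual code: the suffix tables and the names `toy_stem`, `build_index` and `search` are all made up for illustration, and the "stemmer" is a crude suffix stripper. The index stores unstemmed terms, and the language is only chosen when the query is run, by expanding the query term to every indexed term that shares its stem.

```python
from collections import defaultdict

# Toy suffix-stripping rules per language. Purely illustrative (an assumption
# for this sketch), not a real stemming algorithm.
SUFFIXES = {
    "en": ("ing", "ed", "es", "s"),
    "sv": ("arna", "erna", "ar", "er", "en", "et"),
}


def toy_stem(word, lang):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES[lang]:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Index unstemmed, lowercased terms: no language decision is made here."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query_term, lang):
    """Match every indexed term whose stem, in the query's language,
    equals the stem of the query term."""
    target = toy_stem(query_term.lower(), lang)
    hits = set()
    for term, doc_ids in index.items():
        if toy_stem(term, lang) == target:
            hits |= doc_ids
    return hits


docs = {
    1: "indexing documents",
    2: "the indexed document",
    3: "bilarna och bilar",
}
index = build_index(docs)

print(search(index, "index", "en"))  # documents 1 and 2
print(search(index, "bil", "sv"))    # document 3
print(search(index, "bil", "en"))    # no match with English rules
```

The same index answers both the English and the Swedish query, which is why the language of the query matters in this kind of design. A real implementation would precompute the stem-to-terms expansion rather than scan the whole term list on each query.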
