eDisMax, multiple language support and stopwords

Tom Mortimer Thu, 07 Nov 2013 03:58:52 -0800

Hi all,

Thanks for the help and advice I've got here so far!


Another question - I want to support stopwords at search time, so that e.g.
the query "oscar and wilde" is equivalent to "oscar wilde" (this is with
lowercaseOperators=false). That part is fine: I have "and" as a stopword in
the query analyser chain.

However, I also need to support French as well as English, so I've got _en
and _fr versions of the text fields, with appropriate stemming and
stopwords. I index French content into the _fr fields and English into the
_en fields. I'm searching with eDisMax over both versions, e.g.:

    <str name="qf">headline_en headline_fr</str>

However, this means I get no results for "oscar and wilde". The parsed
query is:

    (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
        DisjunctionMaxQuery((headline_fr:and))
        DisjunctionMaxQuery((headline_fr:wild | headline_en:wild)))~3))/no_coord

If I add "and" to the French stopwords list, I *do* get results, and the
parsed query is:

    (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
        DisjunctionMaxQuery((headline_fr:wild | headline_en:wild)))~2))/no_coord
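
To make the difference between the two parsed queries concrete, here is a rough toy model in Python of what seems to be happening. It is not Solr code: the stopword sets, the lookup-table "stemmers" and the helper names (analyze, build_clauses, matches) are all made up for illustration, and it assumes every clause is required, as the ~3 and ~2 in the parsed queries suggest.

```python
# Toy model of per-term clause building across the qf fields.
# Everything here is a hypothetical stand-in, not Solr's real analysis chain.

# Lookup-table "stemmers" that mimic the tokens seen in the parsed queries.
EN_STEMS = {"wilde": "wild"}
FR_STEMS = {"oscar": "osca", "wilde": "wild"}


def analyze(term, stopwords, stems):
    """Return the analyzed token, or None if this field's stop filter drops it."""
    term = term.lower()
    if term in stopwords:
        return None
    return stems.get(term, term)


def build_clauses(query, fields):
    """Build one clause per query term, listing only the fields that kept the term."""
    clauses = []
    for term in query.split():
        per_field = {}
        for name, (stopwords, stems) in fields.items():
            token = analyze(term, stopwords, stems)
            if token is not None:
                per_field[name] = token
        if per_field:  # a term dropped by every field yields no clause at all
            clauses.append(per_field)
    return clauses


def matches(doc, clauses):
    """Every clause must match; any one field may satisfy a given clause."""
    return all(
        any(token in doc.get(field, set()) for field, token in clause.items())
        for clause in clauses
    )


# Case 1: "and" is a stopword for English only (assumed French list: just "et").
fields_before = {
    "headline_fr": ({"et"}, FR_STEMS),
    "headline_en": ({"and"}, EN_STEMS),
}
# Case 2: "and" added to the French list as well.
fields_after = {
    "headline_fr": ({"et", "and"}, FR_STEMS),
    "headline_en": ({"and"}, EN_STEMS),
}

# An English document, so only headline_en holds indexed tokens.
doc = {"headline_en": {"oscar", "wild"}}

before = build_clauses("oscar and wilde", fields_before)
after = build_clauses("oscar and wilde", fields_after)

print(len(before), matches(doc, before))
print(len(after), matches(doc, after))
```

In this model the first case produces a middle clause that only headline_fr can satisfy, and since the English document has nothing in that field, the required clause fails and the whole query misses. Once every field drops "and", the clause disappears and the remaining two match.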

This implies that the only solution is to have a minimal, shared stopwords
list for all languages I want to support. Is this correct, or is there a
way of supporting this kind of searching with per-language stopword lists?

Thanks for any ideas!

Tom
