Happy to see someone with a solution similar to ours.

We have a similar multi-language search feature, and we index content in
each language into _fr and _en fields, as you have done.

At search time, however, we require a language code as a parameter that
specifies which language the client wants to search. It is normally
determined by the website being visited, for example:
qf=name description&language=en

Our search component then resolves these to the right fields to search:
name_en and description_en.
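
As a rough illustration of that mapping (a hypothetical sketch in Python,
not the actual search component; the function name, the supported-language
tuple and the boost handling are assumptions made for this example):

```python
def localized_qf(qf: str, language: str, supported=("en", "fr")) -> str:
    """Rewrite a language-neutral qf value into per-language field names.

    Hypothetical helper: `supported` and the "_<language>" suffix scheme
    are assumptions based on the _en / _fr field naming in this thread.
    """
    if language not in supported:
        raise ValueError(f"unsupported language: {language!r}")
    fields = []
    for entry in qf.split():
        # Keep an optional boost such as "name^2" attached to the renamed field.
        name, sep, boost = entry.partition("^")
        fields.append(f"{name}_{language}{sep}{boost}")
    return " ".join(fields)


# qf=name description&language=en
print(localized_qf("name description", "en"))
```

Because only one language's fields end up in qf, every queried field goes
through the same stopword list, so the mm mismatch described below should
not come up.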

We used to support searching across all languages but removed that later,
since the site tells the customer which language is supported. We also don't
think our web sites have many language experts who know more than two
languages and need to search them at the same time.

On 7 November 2013 23:01, Tom Mortimer <tom.m.f...@gmail.com> wrote:

> Ah, thanks Markus. I think I'll just add the Boolean operators to the
> stopwords list in that case.
> Tom
> On 7 November 2013 12:01, Markus Jelsma <markus.jel...@openindex.io>
> wrote:
> > This is an ancient problem. The issue here is your mm-parameter, it gets
> > confused because for separate fields different amount of tokens are
> > filtered/emitted so it is never going to work just like this. The easiest
> > option is not to use the stopfilter.
> >
> >
> >
> http://lucene.472066.n3.nabble.com/Dismax-Minimum-Match-Stopwords-Bug-td493483.html
> > https://issues.apache.org/jira/browse/SOLR-3085
> >
> > -----Original message-----
> > > From:Tom Mortimer <tom.m.f...@gmail.com>
> > > Sent: Thursday 7th November 2013 12:50
> > > To: solr-user@lucene.apache.org
> > > Subject: eDisMax, multiple language support and stopwords
> > >
> > > Hi all,
> > >
> > > Thanks for the help and advice I've got here so far!
> > >
> > > Another question - I want to support stopwords at search time, so that
> > > e.g. the query "oscar and wilde" is equivalent to "oscar wilde" (this is
> > > with lowercaseOperators=false). Fair enough, I have stopword "and" in the
> > > query analyser chain.
> > >
> > > However, I also need to support French as well as English, so I've got
> > > _en and _fr versions of the text fields, with appropriate stemming and
> > > stopwords. I index French content into the _fr fields and English into
> > > the _en fields. I'm searching with eDisMax over both versions, e.g.:
> > >
> > >     <str name="qf">headline_en headline_fr</str>
> > >
> > > However, this means I get no results for "oscar and wilde". The parsed
> > > query is:
> > >
> > >     (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
> > > DisjunctionMaxQuery((headline_fr:and))
> > > DisjunctionMaxQuery((headline_fr:wild | headline_en:wild)))~3))/no_coord
> > >
> > > If I add "and" to the French stopwords list, I *do* get results, and the
> > > parsed query is:
> > >
> > >     (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
> > > DisjunctionMaxQuery((headline_fr:wild | headline_en:wild)))~2))/no_coord
> > >
> > > This implies that the only solution is to have a minimal, shared
> > > stopwords list for all languages I want to support. Is this correct, or
> > > is there a way of supporting this kind of searching with per-language
> > > stopword lists?
> > >
> > > Thanks for any ideas!
> > >
> > > Tom
> > >
> >

All the best

Liu Bo
