Hi - combining two stemmers in one filter chain will lead to unexpected
results, because the second stemmer runs on the output of the first. It's
best to define two different text_ fields, even though you'd like to avoid
setting this up. It's not very hard. You can even use the LangID update
processor to send Spanish text to a Spanish field, so no application logic
would be involved.
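
A rough sketch of what that setup could look like, in case it helps. The
field names (text, text_en, text_es, language), the chain name "langid", the
"string" type and the tokenizer/lowercase choices are my assumptions, not
taken from your schema, so adapt them to what you actually have:

```xml
<!-- schema.xml: one fieldType per language (names are illustrative) -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- exactly one stemmer per chain -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

<fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>

<!-- target fields the language-specific text is mapped to -->
<field name="text_en" type="text_en" indexed="true" stored="true"/>
<field name="text_es" type="text_es" indexed="true" stored="true"/>
<!-- assumes a "string" fieldType already exists in the schema -->
<field name="language" type="string" indexed="true" stored="true"/>

<!-- solrconfig.xml: detect the language of "text" and map it to text_<lang> -->
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">text</str>
    <str name="langid.langField">language</str>
    <str name="langid.whitelist">en,es</str>
    <str name="langid.fallback">en</str>
    <bool name="langid.map">true</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain still has to be selected on your update requests (for example
with the update.chain=langid parameter), the langid contrib jars need to be
on the classpath, and queries would then have to search both text_en and
text_es.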
-----Original message-----
> From:blopez <balo...@hotmail.com>
> Sent: Thu 18-Oct-2012 13:33
> To: solr-user@lucene.apache.org
> Subject: Regional indexing/retrieval
>
> Hi all,
>
> I'm facing some problems with my solr index due to I have English and
> Spanish terms mixed.
>
> Actually I'm using Porter stemmer (works only for English terms). Btw, I've
> seen that I can use the Snowball stemmer with the flag language="English" or
> language="Spanish".
>
> Moreover, I've read something about using different fieldType elements for
> the different languages, for example <fieldType
> name="text_es"...</fieldType>, <fieldType name="text_en"...</fieldType>, BUT
> I'd like to avoid this solution, at least in the short-run.
>
> A fast solution I could find is using the Snowball stemmer twice in the same
> fieldType, I mean:
>
> <fieldType ...>
> <analyzer ...>
> ...
> <filter class="solr.SnowballPorterFilterFactory" language="Spanish" />
> <filter class="solr.SnowballPorterFilterFactory" language="English" />
> ...
> </analyzer>
> </fieldType>
>
> But I do not think it can be a good solution, maybe the Spanish filter
> (applied first) can make some noise to an English word that should only take
> into account of the English filter... and moreover I don't know how bad
> performance it can produce.
>
> What do you think?
>
> Regards,
> Borja.