Re: Sorting non-english text

Ahmet Arslan Thu, 25 Aug 2016 07:44:02 -0700

Hi Vasu,

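Solr supports language-specific sorting through collation. Have a look at
CollationKeyAnalyzer, which Solr exposes as the solr.CollationField field
type (or solr.ICUCollationField for the ICU-based variant).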
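
As a rough sketch of how a collation-based sort field might be declared in
schema.xml, assuming German text: the names collatedGERMAN, title_de and
title_de_sort below are hypothetical, and solr.ICUCollationField needs the
analysis-extras contrib on the classpath (the core solr.CollationField is
configured with language="de" instead of locale="de").

```xml
<!-- Hypothetical field type: German collation rules at primary strength,
     so differences in case and accents are ignored when sorting. -->
<fieldType name="collatedGERMAN" class="solr.ICUCollationField"
           locale="de" strength="primary"/>

<!-- Hypothetical fields: title_de holds the searchable text,
     title_de_sort exists only to be sorted on. -->
<field name="title_de_sort" type="collatedGERMAN" indexed="true" stored="false"/>
<copyField source="title_de" dest="title_de_sort"/>
```

Queries would then sort with something like sort=title_de_sort asc. Since
collation rules differ per locale, each language would likely need its own
field type and sort field rather than one shared "lowercase" type.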


Ahmet



On Thursday, August 25, 2016 12:29 PM, Vasu Y <vya...@gmail.com> wrote:
Hi,
I have a text field that can contain multi-token values in English. To
support sorting, I use a <copyField> in schema.xml to copy it to a new
field of type "lowercase" (defined below).
I also have text fields of type text_de, text_es, text_fr, ja, cn, etc., and
I intend to use <copyField> to copy them to a new field of type "lowercase"
to support sorting as well.

Would this "lowercase" field type work well for sorting non-English fields
that are non-tokenized (or single-term), or would you suggest using a
different tokenizer and filter?

    <!-- Lowercases the entire field value, keeping it as a single token. -->
    <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

Thanks,
Vasu
