You can debug this with the 'Analysis' page in the Solr admin UI. Pick the
'text_general' field type, then enter words with umlauts in the text boxes
for indexing and queries to see whether the stop filter removes them.
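
Independently of the Analysis page, it can help to check the bytes of the
stopwords file itself. The following is only a minimal sketch, assuming the
file is meant to be UTF-8, has one word per line, and uses '#' for comment
lines. The function name diagnose_stopwords is hypothetical and not part of
Solr or ZooKeeper. It reports a leading byte-order mark, invalid UTF-8, and
words stored in decomposed form (e.g. 'u' followed by a combining diaeresis
instead of a single 'ü'), any of which could keep a stopword from matching
an indexed token.

```python
import sys
import unicodedata


def diagnose_stopwords(raw: bytes) -> dict:
    """Report encoding properties of a stopwords file given as raw bytes.

    Assumptions (not taken from the original thread): the file should be
    UTF-8, one word per line, with '#' starting a comment line.
    """
    report = {
        "has_bom": raw.startswith(b"\xef\xbb\xbf"),  # UTF-8 byte-order mark
        "valid_utf8": True,
        "non_ascii": [],  # words containing characters such as 'ü'
        "not_nfc": [],    # words not in composed (NFC) form
    }
    try:
        text = raw.decode("utf-8-sig")  # decodes UTF-8, dropping a BOM if present
    except UnicodeDecodeError:
        report["valid_utf8"] = False
        return report

    for line in text.splitlines():
        word = line.strip()
        if not word or word.startswith("#"):
            continue
        if not word.isascii():
            report["non_ascii"].append(word)
        if unicodedata.normalize("NFC", word) != word:
            report["not_nfc"].append(word)
    return report


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_stopwords.py my_stopwords.txt
    with open(sys.argv[1], "rb") as fh:
        print(diagnose_stopwords(fh.read()))
```

Run it against the same copy of the file that was uploaded to ZooKeeper. If
valid_utf8 is False, or has_bom is True, or not_nfc is non-empty, the
encoding is a likely suspect.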

Lance

----- Original Message -----
| From: "Daniel Brügge" <daniel.brue...@googlemail.com>
| To: solr-user@lucene.apache.org
| Sent: Wednesday, November 7, 2012 8:45:45 AM
| Subject: SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters
| 
| Hi,
| 
| I am running a SolrCloud cluster with version 4.0.0. I have a stopwords
| file which is in the correct encoding. It contains German umlauts such
| as 'ü'. I am also running a standalone Zookeeper which contains this
| stopwords file. In my schema I am using the stopwords file in the
| standard way:
| 
| >
| >     <fieldType name="text_general" class="solr.TextField"
| > positionIncrementGap="100">
| >       <analyzer type="index">
| >                 <tokenizer class="solr.StandardTokenizerFactory"/>
| >                 <filter class="solr.StopFilterFactory"
| >                                 ignoreCase="true"
| >                                 words="my_stopwords.txt"
| >                                 enablePositionIncrements="true" />
| 
| 
| When indexing, I noticed that all stopwords without umlauts are
| correctly removed, but the ones with umlauts are still present.
| 
| Is this a problem with ZK or Solr?
| 
| Thanks & regards
| 
| Daniel
| 
