You can debug this with the 'Analysis' page in the Solr admin UI. Select the 'text_general' field type, then enter words with umlauts in both the index and query text boxes to see what each analysis step does with them.
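If the Analysis page shows the umlaut stopwords passing through the stop filter, one possible (unconfirmed) cause is the encoding of the stopwords file itself. Here is a rough Python sketch for checking that. It assumes Solr reads the file as UTF-8 and treats lines starting with '#' as comments; `describe_stopwords` is a hypothetical helper written for this mail, not part of Solr.

```python
def describe_stopwords(raw: bytes) -> dict:
    """Report whether stopword-file bytes decode as UTF-8 and list non-ASCII entries."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # e.g. a Latin-1 file: 'ü' is the single byte 0xFC, which is invalid UTF-8
        return {"utf8": False, "bom": False, "non_ascii": []}

    # A leading byte-order mark decodes to U+FEFF and would otherwise
    # become part of the first word in the list.
    bom = text.startswith("\ufeff")
    text = text.lstrip("\ufeff")

    words = []
    for line in text.splitlines():
        entry = line.strip()
        # Assumption: blank lines and '#' comment lines are not stopwords.
        if entry and not entry.startswith("#"):
            words.append(entry)

    return {
        "utf8": True,
        "bom": bom,
        "non_ascii": [w for w in words if not w.isascii()],
    }
```

Run it on the raw bytes, e.g. `describe_stopwords(open("my_stopwords.txt", "rb").read())`, and if you can, on the copy stored in ZooKeeper as well, since that is the one SolrCloud loads. If `utf8` is false, or the umlaut words are missing from `non_ascii`, the file is worth a closer look.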
Lance

----- Original Message -----
| From: "Daniel Brügge" <daniel.brue...@googlemail.com>
| To: solr-user@lucene.apache.org
| Sent: Wednesday, November 7, 2012 8:45:45 AM
| Subject: SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters
|
| Hi,
|
| I am running a SolrCloud cluster with version 4.0.0. I have a stopwords
| file which is in the correct encoding. It contains German umlauts,
| e.g. 'ü'. I am also running a standalone Zookeeper which contains this
| stopwords file. In my schema I am using the stopwords file in the
| standard way:
|
| > <fieldType name="text_general" class="solr.TextField"
| >            positionIncrementGap="100">
| >   <analyzer type="index">
| >     <tokenizer class="solr.StandardTokenizerFactory"/>
| >     <filter class="solr.StopFilterFactory"
| >             ignoreCase="true"
| >             words="my_stopwords.txt"
| >             enablePositionIncrements="true" />
|
| While indexing, I noticed that all stopwords without umlauts are
| correctly removed, but the ones with umlauts still exist.
|
| Is this a problem with ZK or Solr?
|
| Thanks & regards
|
| Daniel