Ah, I have fixed it. The config files have to be imported into ZooKeeper with the JVM's file.encoding system property set to UTF-8. Then it worked. Hooray. :)
e.g.

java -Dfile.encoding=UTF-8 -Dbootstrap_confdir=/home/me/myconfdir -Dcollection.configName=config1 -DzkHost="zkhost:2181" -DnumShards=2 -Dsolr.solr.home=/home/me/solr -jar start.jar

On Thu, Nov 8, 2012 at 2:09 PM, Daniel Brügge <daniel.brue...@googlemail.com> wrote:

> Weird, if i return the file contents in ZK with 'get' it returns me
>
> w??????rde | would
> w??????rden | would
>
> for example. So the Umlaute are not shown. Does anyone have an idea if
> this is because of Zookeepers cli or of the file contents itself?
>
> Thanks & regards.
>
> On Thu, Nov 8, 2012 at 12:24 PM, Daniel Brügge
> <daniel.brue...@googlemail.com> wrote:
>
>> I trust the 'file' command output. And if i can read there "UTF-8 Unicode"
>> I believe that this is correct. Don't know if this is the 'correct
>> answer' for you ;)
>>
>> BTW: It works locally, but not with ZK. So it's maybe more a ZK issue,
>> which somehow destroys my file. Will check.
>>
>> On Thu, Nov 8, 2012 at 12:12 PM, Robert Muir <rcm...@gmail.com> wrote:
>>
>>> On Wed, Nov 7, 2012 at 11:45 AM, Daniel Brügge
>>> <daniel.brue...@googlemail.com> wrote:
>>> > Hi,
>>> >
>>> > i am running a SolrCloud cluster with the 4.0.0 version. I have a
>>> > stopwords file which is in the correct encoding.
>>>
>>> What makes you think that?
>>>
>>> Note: "Because I can read it" is not the correct answer.
>>>
>>> Ensure any of your stopwords files etc are in UTF-8. This is often
>>> different from the encoding your computer uses by default if you open
>>> a file, start typing in it, and press save.
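
For anyone who finds this thread later: my guess (not verified against the Solr source) is that the upload to ZK decoded the files with the JVM default charset, which is what -Dfile.encoding changes. Here is a minimal sketch of that effect in plain Java. The class and method names (EncodingDemo, decode) are made up for illustration and are not part of Solr or ZooKeeper.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "würde", written with a Unicode escape so the source file's own
        // encoding cannot interfere with the demonstration.
        String original = "w\u00fcrde";

        // What a UTF-8 stopwords file contains on disk.
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // The charset the JVM falls back to when none is given;
        // it can be overridden with -Dfile.encoding=UTF-8.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Decoding with the matching charset restores the umlaut.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

        // Decoding the same bytes with a mismatched charset garbles it:
        // the two UTF-8 bytes of the umlaut become two separate characters.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

If the garbling only shows up in the ZK cli 'get' output and the decoded stopwords are fine inside Solr, then it is the terminal or cli and not the stored file, so it is worth checking both.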