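If you are running on Windows, the JVM does not default to UTF-8; it picks up the platform's default encoding instead. There is a Java system property (file.encoding) that changes the default to UTF-8. Unfortunately, not all libraries honor that setting, and some of the String conversion methods don't take a character-encoding argument at all, so they silently use the platform default. I learned this the hard way.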
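To make the point above concrete, here is a minimal sketch in Java. It is not taken from Solr or Tomcat; the class and method names (EncodingDemo, decodeExplicit, decodeWrong) are made up for illustration. It takes the UTF-8 bytes of the query term in the message below (%D7%99%D7%AA%D7%99%D7%A8) and decodes them once with an explicit UTF-8 charset and once with a single-byte charset, which is roughly what can happen when a conversion falls back to a non-UTF-8 platform default.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // UTF-8 bytes of the Hebrew query term, i.e. %D7%99%D7%AA%D7%99%D7%A8 in the URL.
    public static final byte[] QUERY_BYTES = {
        (byte) 0xD7, (byte) 0x99, (byte) 0xD7, (byte) 0xAA,
        (byte) 0xD7, (byte) 0x99, (byte) 0xD7, (byte) 0xA8
    };

    // Explicit charset: the result does not depend on the platform default.
    public static String decodeExplicit(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Stand-in for a conversion that ends up using a single-byte default
    // encoding: every byte becomes its own character.
    public static String decodeWrong(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // What new String(bytes) and similar no-argument conversions would use.
        System.out.println("Default charset: " + Charset.defaultCharset());

        String good = decodeExplicit(QUERY_BYTES); // 4 Hebrew characters
        String bad = decodeWrong(QUERY_BYTES);     // 8 unrelated characters
        System.out.println("UTF-8 decode length:      " + good.length());
        System.out.println("ISO-8859-1 decode length: " + bad.length());

        // Percent-decoding with an explicit encoding name yields the same text.
        String fromUrl = URLDecoder.decode("%D7%99%D7%AA%D7%99%D7%A8", "UTF-8");
        System.out.println("URL decode matches: " + fromUrl.equals(good));
    }
}
```

If the text was indexed through one of the default-charset paths and the query goes through a UTF-8 path (or the other way around), the two strings will not be equal, which would be consistent with getting zero matches. That is only a guess about this particular setup.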
_____

From: Ben Shlomo, Yatir [mailto:[EMAIL PROTECTED]
Sent: Monday, August 20, 2007 8:40 AM
To: solr-user@lucene.apache.org
Subject: problem with quering solr after indexing UTF-8 encoded CSV files

Hi!

I have UTF-8 encoded data inside a CSV file (actually it’s a tab-separated file, attached).
I can index it with no apparent errors.
I did not forget to set this in my Tomcat configuration:

<Server ...>
  <Service ...>
    <Connector ... URIEncoding="UTF-8"/>

When I query a document using the UTF-8 text I get zero matches:

http://localhost:8080/apache-solr-1.2.1-dev/select/?q=%D7%99%D7%AA%D7%99%D7%A8&version=2.2&start=0&rows=10&indent=on

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">יתיר</str>  // Note: I can see the correct UTF-8 text here (Hebrew characters)
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0" />
</response>

When I look at this text in the response by querying for *:* I notice that the text does not appear as desired: יתיר instead of יתיר

Do you have any ideas?
Thanks…

Here is the response:

http://localhost:8080/apache-solr-1.2.1-dev/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">*:*</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="country">1</str>
      <str name="desc">desc is a very good camera</str>
      <str name="dispname">display is יתיר ABC res123 </str>
      <str name="form">1</str>
      <str name="lang">1</str>
      <str name="manu">ABC</str>
      <str name="model"> res123 </str>
      <str name="pn">C123</str>
      <str name="productid">123456</str>
      <str name="upc">72900010123</str>
    </doc>
  </result>
</response>

yatir