Re: Solved! Solr interprets UTF-8 as ISO-8859-1
That did the trick. I actually figured it out on my own 10 minutes after I posted to the mailing list. Typical ;-)

Thanks for the help anyway, everybody!

//Daniel

Uwe Klosa wrote:

You should set URIEncoding="UTF-8" in your application server. For Tomcat you can do that in server.xml. For Glassfish you have to create a sun-web.xml containing the corresponding parameters. Your application server should provide a similar mechanism.

Uwe

On Mon, Mar 31, 2008 at 4:32 PM, Daniel Löfquist [EMAIL PROTECTED] wrote:

Hello,

We're building a web application that uses Solr for searching, and I've come upon a problem that I can't seem to get my head around. We have a servlet that accepts input via XML-RPC and, based on that input, constructs the correct URL to perform a search with the Solr servlet.

I know that the call to Solr (the URL) from our servlet looks like this (which is what it should look like):

http://myserver:8080/solrproducts/select/?q=all_SV:ljusblå+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25

But Solr reports the input fields (the GET variables in the URL) as:

INFO: /select/ fl=id,artno,title_SV,titleSort_SV,description_SV,&sort=titleSort_SV+asc,id+asc&start=0&q=all_SV:ljusblÃ¥+status:online&q.op=AND&rows=25

which is all fine except where it says ljusblÃ¥. Apparently Solr is interpreting the UTF-8 string ljusblå as ISO-8859-1 and thus creates this garbage, which makes the search return 0 hits when it should in reality return 3. All other searches that don't use special characters work 100% fine.

I'm new to Solr, so I'm not sure what I'm doing wrong here. Can anybody help me out and point me in the direction of a solution?

Sincerely,
Daniel Löfquist
Application Manager / Software Engineer, CDON.COM
Re: Solr interprets UTF-8 as ISO-8859-1
Send the URL with the å character URL-encoded as %C3%A5. That is the UTF-8 URL encoding.

http://myserver:8080/solrproducts/select/?q=all_SV:ljusbl%C3%A5+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25

-Sean
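
For reference, Sean's suggestion of sending å as %C3%A5 could be done in the calling servlet roughly as follows. This is only a minimal sketch: the class name SolrQueryUrl and the helper encodeParam are made up for illustration, and it assumes the servlet builds the query string by hand rather than through a client library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SolrQueryUrl {

    // Hypothetical helper: percent-encodes a single parameter value as UTF-8.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e5 is the letter å; the escape avoids source-file encoding issues.
        String q = "all_SV:ljusbl\u00e5 status:online";
        // Host and path are taken from the thread; only q and rows are shown.
        String url = "http://myserver:8080/solrproducts/select/?q="
                + encodeParam(q) + "&rows=25";
        System.out.println(url);
    }
}
```

Note that URLEncoder also escapes the colon and turns the space into +, so the resulting URL is not character-for-character identical to the one above, but the å ends up as %C3%A5 either way.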
Re: Solr interprets UTF-8 as ISO-8859-1
You should set URIEncoding="UTF-8" in your application server. For Tomcat you can do that in server.xml. For Glassfish you have to create a sun-web.xml containing the corresponding parameters. Your application server should provide a similar mechanism.

Uwe
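
To illustrate why the URI encoding setting Uwe mentions matters, the sketch below decodes the same percent-encoded bytes with two different charsets. It only imitates what a servlet container does when it parses the query string; the class name UriDecodingDemo and the helper decode are made up for illustration and are not part of Solr or Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodingDemo {

    // Hypothetical helper: decodes a percent-encoded value with the given charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of "ljusblå", percent-encoded as in Sean's reply.
        String raw = "ljusbl%C3%A5";

        // Decoded as UTF-8, the two bytes C3 A5 become the single letter å.
        System.out.println(decode(raw, "UTF-8"));

        // Decoded as ISO-8859-1, each byte becomes its own character,
        // which gives the garbled form seen in the Solr log.
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

When the container is configured to decode URIs as UTF-8, the query reaches Solr in the first form.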