Is the URI encoding set correctly in your servlet container, e.g.
URIEncoding="UTF-8" on the Connector for Tomcat 6?

http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config

Older versions of Jetty use ISO-8859-1 as the default URI encoding,
but Jetty 6 should use UTF-8 by default:

http://docs.codehaus.org/display/JETTY/International+Characters+and+Character+Encodings
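
To see why the container's URI charset matters for this query, here is a
small Java sketch that decodes the q value from the URL quoted below under
both charsets. The class and method names are made up for illustration, and
it only assumes the container's decoding step behaves like
java.net.URLDecoder; it is not taken from Solr, Tomcat or Jetty:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodeDemo {
    // Hypothetical helper: decode a percent-encoded query value with the
    // named charset, wrapping the checked exception for convenience.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The q parameter exactly as it appears in the request URL.
        String raw = "slov%C3%AD%C4%8Dko";

        String utf8 = decode(raw, "UTF-8");
        String latin1 = decode(raw, "ISO-8859-1");

        // Print the lengths so the difference is visible even on a
        // console that cannot render the characters.
        System.out.println("UTF-8:      " + utf8.length() + " chars");
        System.out.println("ISO-8859-1: " + latin1.length() + " chars");
    }
}
```

Decoded as UTF-8, the term is the 8-character word "slovíčko". Decoded as
ISO-8859-1, each of the four escaped bytes becomes its own character, so the
term is 10 characters long and no longer matches what was typed.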

-Peter

On Sat, Apr 30, 2011 at 6:31 AM, Pavel Kukačka <pavel.kuka...@seznam.cz> wrote:
> Hello,
>
> I've hit a (probably trivial) roadblock with Solr 3.1 that I don't know
> how to overcome:
> I have a document with common fields (title, keywords, content) and I'm
> trying to use highlighting.
> With queries using ASCII characters there is no problem; it works
> smoothly. However, when I search using a Czech word that includes
> non-ASCII characters (for example "slovíčko":
> http://localhost:8983/solr/select/?q=slov%C3%AD%C4%8Dko&version=2.2&start=0&rows=10&indent=on&hl=on&hl.fl=*),
> the document is found, but the response doesn't contain the highlighted
> snippet in the highlighting node. There is only an empty node, like this:
> ******************
> .
> .
> .
> <lst name="highlighting">
>  <lst name="2009"/>
> </lst>
> ************************
>
>
> When searching for the other keyword
> (http://localhost:8983/solr/select/?q=slovo&version=2.2&start=0&rows=10&indent=on&hl=on&hl.fl=*),
> the resulting response is fine, like this:
> ************************************
> <lst name="highlighting">
>  <lst name="2009">
>   <arr name="user_keywords">
>     <str>slov&amp;#237;&amp;#269;ko &lt;em id="highlighting"&gt;slovo&lt;/em&gt;</str>
>   </arr>
>  </lst>
> </lst>
>
> ************************************
>
> Has anyone come across this problem?
> Cheers,
> Pavel
>
>
>



-- 
Peter M. Wolanin, Ph.D.      : Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com : 978-296-5247

"Get a free, hosted Drupal 7 site: http://www.drupalgardens.com"
