Re: VelocityResponseWriter/Solritas character encoding issue

Sascha Szott Wed, 18 Nov 2009 08:15:43 -0800

Hi Erik,

Erik Hatcher wrote:

Can you give me a test document that causes an issue? (maybe send me a Solr XML document in private e-mail). I'll see what I can do once I can see the issue first hand.

Thank you! Just try the utf8-example.xml file in the exampledocs directory. After indexing the document, the output of the script test_utf8.sh suggests to me that everything works correctly:


 Solr server is up.
 HTTP GET is accepting UTF-8
 HTTP POST is accepting UTF-8
 HTTP POST does not default to UTF-8
 HTTP GET is accepting UTF-8 beyond the basic multilingual plane
 HTTP POST is accepting UTF-8 beyond the basic multilingual plane
 HTTP POST + URL params is accepting UTF-8 beyond the basic multilingual plane

If I use the standard QueryResponseWriter with the query q=umlauts, the XML response contains properly printed non-ASCII characters. The same query against the VelocityResponseWriter returns a lot of Unicode replacement characters (U+FFFD) instead.
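
To narrow this down, it may help to look at the raw bytes of both responses rather than at what the browser renders. The following is only a rough sketch in Python, not part of Solr or Velocity: the function name diagnose and the sample strings are made up for illustration. It separates two cases: the body is not valid UTF-8 at all (the writer probably emitted another encoding, such as ISO-8859-1), or the body is valid UTF-8 but already contains U+FFFD (the characters were lost on the server before the response was written).

```python
# Hypothetical helper for illustration; not part of Solr or Velocity.
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")  # the bytes EF BF BD


def diagnose(body: bytes) -> str:
    """Classify a raw HTTP response body with respect to UTF-8."""
    try:
        body.decode("utf-8")  # strict decoding: raises on invalid byte sequences
    except UnicodeDecodeError:
        return "not valid UTF-8"
    if REPLACEMENT_UTF8 in body:
        return "valid UTF-8, but contains U+FFFD"
    return "valid UTF-8"


if __name__ == "__main__":
    # Sample bodies standing in for real responses (assumed data).
    print(diagnose("Bücher".encode("utf-8")))
    print(diagnose("Bücher".encode("iso-8859-1")))
    print(diagnose("B\ufffdcher".encode("utf-8")))
```

In practice the bytes would come from saving each response to a file and reading it in binary mode; how the response is fetched is left open here.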


-Sascha

On Nov 18, 2009, at 2:48 PM, Sascha Szott wrote:
Hi,
I've played around with Solr's VelocityResponseWriter (which is indeed a very useful feature for rapid prototyping). I've realized that Velocity uses ISO-8859-1 as its default character encoding. I've changed this setting to UTF-8 in my velocity.properties file (inside the conf directory), i.e.,
  input.encoding=UTF-8
  output.encoding=UTF-8

and checked that the settings were successfully loaded.
Within the main Velocity template, browse.vm, the character encoding is set to UTF-8 as well, i.e.,
  <meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
After starting Solr (which is deployed in a Tomcat 6 server on an Ubuntu machine), I ran into some character encoding problems.
Due to the change of input.encoding to UTF-8, no problems occur when non-ASCII characters are present in the query string, e.g. German umlauts. But unfortunately, something is wrong with the encoding of characters in the HTML page that is generated by VelocityResponseWriter. The non-ASCII characters aren't displayed properly (for example, Firefox prints a black diamond with a white question mark). If I manually set the encoding to ISO-8859-1, the non-ASCII characters are displayed correctly. Does anybody have a clue?
Thanks in advance,
Sascha
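
The symptom described above (black diamonds when the page is read as UTF-8, correct umlauts when the encoding is switched to ISO-8859-1) is what one would expect if the bytes on the wire are ISO-8859-1 while the page declares UTF-8. The Python sketch below only illustrates that mismatch under this assumption; it does not show where the encoding is chosen inside VelocityResponseWriter, and the function names and sample text are hypothetical.

```python
# Illustration of an encoding mismatch; names and sample text are assumptions.

def read_as_utf8(wire: bytes) -> str:
    """Decode the way a browser would when the page claims charset=UTF-8."""
    # Invalid byte sequences become U+FFFD, the "black diamond" character.
    return wire.decode("utf-8", errors="replace")


def read_as_latin1(wire: bytes) -> str:
    """Decode the way a browser would after switching to ISO-8859-1."""
    return wire.decode("iso-8859-1")


if __name__ == "__main__":
    original = "Grüße"
    wire = original.encode("iso-8859-1")  # assumed: output written as ISO-8859-1

    print(read_as_utf8(wire))    # umlauts are replaced by U+FFFD
    print(read_as_latin1(wire))  # umlauts come back intact
```

If the real response behaves like the wire bytes in this sketch, the output encoding of the writer (or the charset of the HTTP response) would be the place to look, rather than the template's meta tag.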
