On Fri, Mar 27, 2009 at 8:41 PM, Rui Pereira <ruipereira...@gmail.com> wrote:

> I'm having problems with encoding in responses from search queries. The
> encoding problem only occurs in the topologyname field; if an instancename
> has accents, it is returned correctly. In all my configurations I have
> UTF-8.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <dataConfig>
>    <document name="topologies">
> <entity query="SELECT DISTINCT '3141-' || Sub0.SUBID as id, 'Inventário' as
> topologyname, 3141 as topologyid, Sub0.SUBID as instancekey, Sub0.NAME as
> instancename FROM ...
>              <field column="INSTANCEKEY" name="instancekey"/>
>              <field column="ID" name="id"/>
>              <field column="TOPOLOGYID" name="topologyid"/>
>              <field column="INSTANCENAME" name="instancename"/>
>              <field column="TOPOLOGYNAME" name="topologyname"/>...
>
>
> As an example, I can have in the response the following result:
>
> <doc>
> <long name="instancekey">285</long>
> <str name="instancename">Informática</str>
> <long name="topologyid">3141</long>
> <str name="topologyname">Inventário</str>
> </doc>
>

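I see that you are specifying the topologyname value as a literal in the
query itself, so it comes from the data-config file rather than from the
database. That would explain why only this field is affected. It might be a
bug in DataImportHandler, because it reads the data-config as a string from
an InputStream, presumably without naming a charset, in which case Java
decodes the bytes with the platform default encoding. If your default
platform encoding is not UTF-8, this may be the cause.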
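
To illustrate the suspected failure mode in isolation, here is a minimal
sketch. It is not DataImportHandler's actual code: the class name
EncodingDemo and the helper readAll are hypothetical, and ISO-8859-1 is only
an assumed stand-in for a non-UTF-8 platform default. It decodes the same
UTF-8 bytes twice, once with the right charset and once with the wrong one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: reads the whole stream as text, decoding the
    // bytes with the charset passed in by the caller.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "Inventário" as it would be stored in a UTF-8 encoded config file.
        // The accent is written as \u00e1 so that the example does not depend
        // on the encoding of this source file.
        byte[] utf8Bytes = "Invent\u00e1rio".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the file was written in round-trips.
        String right = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);

        // Assumption: ISO-8859-1 plays the role of a non-UTF-8 platform
        // default. Each byte becomes one char, so the two-byte sequence for
        // the accented letter turns into two separate characters.
        String wrong = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("Invent\u00e1rio")); // true
        System.out.println(right.length());                  // 10
        System.out.println(wrong.length());                  // 11
    }
}
```

Passing the charset explicitly to InputStreamReader, as readAll does, makes
the result independent of the platform default.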

Can you try running Solr's (or your servlet container's) Java process
with -Dfile.encoding=UTF-8 and see if that fixes the problem?
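
To check which encoding a JVM actually ends up using, a small sketch like
the following can be run with and without the flag. The class name
DefaultCharsetCheck and the method describe are hypothetical; the calls
themselves are standard JDK APIs.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // Hypothetical helper: reports the file.encoding system property and
    // the default charset the JVM resolved at startup.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

For example, compare the output of `java DefaultCharsetCheck` with that of
`java -Dfile.encoding=UTF-8 DefaultCharsetCheck`.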

-- 
Regards,
Shalin Shekhar Mangar.
