From what I can find, the PostgreSQL JDBC driver sets the encoding
itself.  My guess is that it picks the character encoding based on the
database you are connected to.  If so, and the database's encoding
cannot represent the euro symbol, there would be no way to include it
in the query string.  I think...
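
As a rough way to check that guess without touching the database, the
sketch below asks the JVM whether a string can be represented in a
given charset.  It assumes the database encoding maps onto a Java
charset (PostgreSQL's LATIN1 is ISO-8859-1, UTF8 is UTF-8).  The class
and method names are made up for illustration and are not part of
ManifoldCF or the JDBC driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class EncodingCheck {

    // Returns true if every character of text is representable in charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The three euro signs from the query, written as escapes so the
        // source file's own encoding cannot interfere.
        String literal = "\u20AC\u20AC\u20AC";

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8:      " + canEncode(literal, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + canEncode(literal, StandardCharsets.ISO_8859_1));
        System.out.println("US-ASCII:   " + canEncode(literal, StandardCharsets.US_ASCII));
    }
}
```

If the check reports false for the charset that matches your database
encoding, the literal cannot survive the trip whatever the driver does.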


Karl

On Mon, Nov 12, 2012 at 6:22 AM, Karl Wright <daddy...@gmail.com> wrote:
> To clarify, we pass every string to the JDBC driver as a Unicode
> string, but it is up to the JDBC driver to decide how to interpret it.
>  I don't know what exactly the PostgreSQL 9.1 driver does here.  It
> would be interesting to see what is posted to Solr, if you have those
> logs.  It may be that it is picking an encoding that is based on your
> machine's default encoding, which would be unfortunate.
>
> This page apparently indicates that there is a way to set the
> encoding that JDBC uses to communicate with the database:
>
> http://stackoverflow.com/questions/3040597/jdbc-character-encoding
>
> I don't know if this is applicable to us at all though.  You can try:
>
> java -Dfile.encoding=UTF-8 -jar start.jar
>
> ...and see if that changes things - it would be a good hint.
>
> Karl
>
>
> On Mon, Nov 12, 2012 at 6:12 AM, Karl Wright <daddy...@gmail.com> wrote:
>> Hi Abe-san,
>>
>> Quoted strings in SQL queries are not necessarily Unicode.  See this
>> page for details:
>>
>> http://www.postgresql.org/docs/7.3/static/functions-string.html
>>
>> There is nothing you can do in JDBC invocations to control the
>> character set.  This must be done in the query itself, or in the
>> database itself.
>>
>> Karl
>>
>> On Mon, Nov 12, 2012 at 6:03 AM, Shinichiro Abe
>> <shinichiro.ab...@gmail.com> wrote:
>>> Hi,
>>>
>>> I'm using Solr 4.0 and a JDBC connection to PostgreSQL.
>>> The dataQuery is configured as below:
>>>
>>> SELECT idfield AS $(IDCOLUMN), 'http://server?id=' || idfield AS 
>>> $(URLCOLUMN), '12345' AS $(DATACOLUMN) FROM album WHERE idfield IN $(IDLIST)
>>>
>>> On the Solr side, '12345' was indexed and stored.
>>>
>>> But when non-ASCII characters were configured,
>>>
>>> SELECT idfield AS $(IDCOLUMN), 'http://server?id=' || idfield AS 
>>> $(URLCOLUMN), '€€€' AS $(DATACOLUMN) FROM album WHERE idfield IN $(IDLIST)
>>>
>>> On the Solr side, '€€€' was not indexed or stored.
>>>
>>> In practice, I map a column which contains non-ASCII characters to 
>>> DATACOLUMN.
>>> It seems the content-type differs between the two cases.
>>> Can the JDBC connection control the content-type?
>>>
>>> Regards,
>>> Shinichiro Abe
>>>
