Hi Raymond,
I agree with you that 0xfffe is a special character; that is why I was
asking how it's handled in Solr.
In my document, 0xfffe does not appear at the beginning; it's in the
content.
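
To check whether a field value really contains this character, and to drop it before indexing, something like the following could be used. This is only a minimal sketch: the function names are hypothetical, and it assumes that removing every character outside the XML 1.0 "Char" range (which excludes U+FFFE and U+FFFF) is acceptable for the data. It is not part of Solr or the DataImportHandler.

```python
import re

# Characters NOT allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# U+FFFE falls outside these ranges, so it is matched here.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (position, code point) pairs for each disallowed character."""
    return [(m.start(), hex(ord(m.group())))
            for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with all disallowed characters removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    sample = "some\ufffecontent"  # hypothetical field value
    print(find_invalid_xml_chars(sample))
    print(strip_invalid_xml_chars(sample))
```

Running the finder against the raw column values from the DB should show whether 0xfffe is really stored there or is introduced somewhere later in the pipeline.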

Just an update about the testing I'm doing: in a two-shard SolrCloud
environment, if I launch dataimport on a node of the shard that is the
target for that doc, all the docs get written properly; if I launch
dataimport on a node of the other shard, which then forwards the doc to
the target, I get the error.

Thanks
Federico


2013/8/5 Raymond Wiker <rwi...@gmail.com>

> I think #xfffe is special; it is used as a "byte order mark" to identify
> the encoding used. In that case, it should only appear at the beginning of
> the document.
>
> Sent from my iPhone
>
> On 5 Aug 2013, at 17:19, Federico Chiacchiaretta <federico.c...@gmail.com>
> wrote:
>
> > Hi Shawn,
> > thanks for your answer.
> > From the docs you linked I found:
> > "This property is only relevant for server versions less than or equal to
> > 7.2".
> >
> > I'm using version 9.1. I gave it a try anyway, but unfortunately I had no
> > luck. Besides, I checked the encoding settings on the DB and it's UTF-8.
> >
> > Please note that import of data works with a single instance of Solr, but
> > it doesn't on a SolrCloud when the update gets forwarded to another node.
> > Suspecting a Jetty bug (or misconfiguration), I also tried a test
> > environment based on Tomcat, but I got the same result.
> >
> > How is the UTF character 0xfffe supposed to be handled? It seems that Solr
> > can handle it well, while sending it over HTTP to another node breaks
> > things. Could it be an HttpSolrServer bug?
> >
> > Thanks,
> > Federico
> >
> >
> >
> >
> > 2013/8/5 Shawn Heisey <s...@elyograg.org>
> >
> >> On 8/1/2013 7:20 AM, Federico Chiacchiaretta wrote:
> >>> on data import from a PostgreSQL db, I get the following error in
> >>> solr.log:
> >>>
> >>> ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException;
> >>> shard update error RetryNode:
> >>> http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
> >>> Invalid UTF-8 character 0xfffe at char #416, byte #127)
> >>
> >> It sounds like your database is not using the UTF-8 character set, but
> >> the JDBC driver (or the driver-server combination) is not aware that the
> >> character set is different.  Solr expects UTF-8.
> >>
> >> Generally what you want to do is tell the JDBC driver to use the UTF-8
> >> character set, which will hopefully cause either the driver or the DB
> >> server to translate for you.
> >>
> >> There is a charSet parameter for the postgresql jdbc driver:
> >>
> >> http://jdbc.postgresql.org/documentation/80/connect.html
> >>
> >> These are added to the jdbc URL after a ? character, just like
> >> parameters on an http URL.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>
