I managed to resolve this issue. It turned out to be caused by a faulty XML
file generated by the ruby-solr gem. I installed libxml-ruby and rsolr, and
switched from the ruby-solr gem to rsolr.

Also, if you face this kind of issue, the test_utf8.sh script included in
exampledocs is a good way to test Solr's handling of UTF-8 characters.
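
Since the root cause here was malformed XML produced on the client side, it can also help to check the payload for invalid UTF-8 before it is posted to Solr. The following is only a rough sketch using Ruby's standard String methods. The helper name first_invalid_utf8_offset is made up for illustration and is not part of rsolr, ruby-solr or Solr.

```ruby
# Hypothetical helper (not part of any gem): returns the byte offset of the
# first invalid UTF-8 sequence in +payload+, or nil if the payload is valid.
def first_invalid_utf8_offset(payload)
  # Work on a copy so the caller's string and its encoding are left untouched.
  text = payload.dup.force_encoding(Encoding::UTF_8)
  return nil if text.valid_encoding?

  offset = 0
  # each_char yields every invalid byte as its own one-byte "character",
  # so the first chunk that fails valid_encoding? marks the bad offset.
  text.each_char do |ch|
    return offset unless ch.valid_encoding?
    offset += ch.bytesize
  end
  nil
end

# Example: a UTF-8 lead byte (0xC3) followed by "s" (0x73), which is the same
# kind of sequence the parser reports as an invalid middle byte.
bad_offset = first_invalid_utf8_offset("caf\xC3s")
puts "first invalid byte at offset #{bad_offset}" if bad_offset
```

If the check reports an offset, the document can be inspected at that position, or the invalid bytes can be replaced with String#scrub before the update request is built.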

Great work, Solr team, and special thanks to Erik Hatcher.

*Pranav Prakash*

"temet nosce"

Twitter <http://twitter.com/pranavprakash> | Blog <http://blog.myblive.com> |
Google <http://www.google.com/profiles/pranny>


On Mon, Sep 19, 2011 at 15:54, Pranav Prakash <pra...@gmail.com> wrote:

>
> Just in case someone is interested, here is the log:
>
> SEVERE: java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 middle byte 0x73 (at char #66641, byte #65289)
>   at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
>   at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
>   at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
>   at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
>   at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:287)
>   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:146)
>   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
>   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>   at org.mortbay.jetty.Server.handle(Server.java:326)
>   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>   at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
>   at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: java.io.CharConversionException: Invalid UTF-8 middle byte 0x73 (at char #66641, byte #65289)
>   at com.ctc.wstx.io.UTF8Reader.reportInvalidOther(UTF8Reader.java:313)
>   at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:204)
>   at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
>   at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
>   at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
>   at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
>   at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4628)
>   at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
>   at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
>   at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
>   ... 26 more
>
>
> Also, is there a setting that controls how much of the backtrace is logged?
> It would help to see the complete stack instead of "... 26 more".
>
>
> On Mon, Sep 19, 2011 at 14:16, Pranav Prakash <pra...@gmail.com> wrote:
>
>>
>> Hi List,
>>
>> I tried Solr 3.4.0 today and while indexing I got the error
>> java.lang.RuntimeException: [was class java.io.CharConversionException]
>> Invalid UTF-8 middle byte 0x73 (at char #66611, byte #65289)
>>
>> My earlier version was Solr 1.4, and this same document was indexed
>> successfully there. Looking around, I found
>> https://issues.apache.org/jira/browse/SOLR-2381, which seems to fix the
>> issue. I thought this patch had already been applied in Solr 3.4.0. Is
>> there something I am missing?
>>
>> Should I provide anything else, such as logs or details of my document?
>>
>
>
