There are two different sets of readers, one for binary data and one for character-mode data, and I don't remember which is which. You may be reading the PDF binary blob as a character blob, which would likely garble the bytes before Tika gets to parse them.
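To show why the binary/character distinction matters, here is a minimal Python sketch. It is not DIH code. The sample bytes and both function names are made up for illustration, and UTF-8 is only an assumed charset. It pushes the same blob through a byte-oriented path and a character-oriented path and compares the results. (On the DIH side, if I recall correctly, the binary one is FieldStreamDataSource and the character one is FieldReaderDataSource, but please check that against the DIH documentation.)

```python
# Hypothetical stand-in for a PDF stored as a BLOB: the PDF header plus the
# kind of high-bit bytes that real PDFs carry in their compressed streams.
pdf_bytes = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n" + bytes(range(128, 256))


def read_as_binary(blob: bytes) -> bytes:
    """Binary path: the bytes are passed through untouched."""
    return bytes(blob)


def read_as_characters(blob: bytes, charset: str = "utf-8") -> bytes:
    """Character path: bytes are decoded into text, then encoded back.

    Byte sequences that are not valid in the charset are replaced with
    U+FFFD, so the original content cannot be recovered afterwards.
    """
    text = blob.decode(charset, errors="replace")
    return text.encode(charset)


binary_copy = read_as_binary(pdf_bytes)
character_copy = read_as_characters(pdf_bytes)

print("binary path intact:   ", binary_copy == pdf_bytes)
print("character path intact:", character_copy == pdf_bytes)
```

If the character path is what is happening in your import, Tika would be handed a damaged document, which could explain why the extracted text never makes it into the stored field that highlighting needs.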
On Wed, Aug 22, 2012 at 1:34 AM, anarchos78 <rigasathanasio...@hotmail.com> wrote:
> Thanks for your reply,
> I have tried many things (copy field etc.) with no success. Note that the
> "pdfs" are stored as BLOBs in a MySQL database. I am trying to use DIH in order
> to fetch the binaries from the DB. Is it possible?
> Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-search-Tika-extracted-text-from-PDF-not-return-highlighting-snippet-tp3999647p4002587.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Lance Norskog
goks...@gmail.com