Indexing PDF

Héctor Trujillo Tue, 04 Oct 2011 07:49:53 -0700

Hi all, I'm indexing PDF files with SolrJ, and most of them work. But with
some files I have problems because strange characters end up in the stored
content. This is what got stored:
+++++++


Starting a Search Application

Abstract
Starting
a Search Application A Lucid Imagination White Paper ¥ April 2009 Page i

Starting a Search Application A Lucid Imagination White Paper ¥ April 2009
Page ii Do You Need Full-text Search?
∞
∞
∞
Starting
a Search Application A Lucid Imagination White Paper ¥ April 2009 Page 1
Identifying
Ideal Results
Starting
a Search Application A Lucid Imagination White Paper ¥ April 2009 Page 2

Starting
a Search Application A Lucid Imagination White Paper


+++++++

But if I open the PDF file itself, the content displays correctly.

I think this is a charset encoding issue, but I don't know whether I can
avoid this behaviour by applying a different analyzer or tokenizer at
indexing time.
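
To check whether this really is an encoding issue, it may help to list which
non-ASCII characters ended up in the stored text before changing any analyzer.
The following is only a minimal sketch in plain Java (8 or later), not SolrJ or
Tika code: the class ExtractedTextInspector and the method nonAsciiCounts are
hypothetical names, and the sample string is copied from the stored content
shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ or Tika.
final class ExtractedTextInspector {

    // Counts every non-ASCII code point in the text, keyed by "U+XXXX NAME",
    // in order of first appearance.
    static Map<String, Integer> nonAsciiCounts(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        text.codePoints()
            .filter(cp -> cp > 0x7F) // keep only code points outside ASCII
            .forEach(cp -> {
                String key = String.format("U+%04X %s", cp, Character.getName(cp));
                counts.merge(key, 1, Integer::sum);
            });
        return counts;
    }

    public static void main(String[] args) {
        // Sample taken from the stored field content (assumed to be the extracted text).
        String sample = "A Lucid Imagination White Paper \u00A5 April 2009 Page 1 \u221E \u221E";
        nonAsciiCounts(sample).forEach((k, v) -> System.out.println(k + " x" + v));
    }
}
```

If the stray characters turn out to be a small fixed set such as ¥ and ∞, one
possible explanation (an assumption, not confirmed for these files) is that the
PDF's fonts map some glyphs to unexpected code points during text extraction.
In that case a different analyzer or tokenizer would not help with the stored
value, because Solr stores the field text as received and applies analysis only
to the indexed terms.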

I've got this problem with some documents downloaded from Lucid's website.



I don't know if anyone has had the same problem and knows how to solve it.

Thanks

Best regards
