Re: search result problem

Stefan Colella Mon, 21 May 2007 02:25:20 -0700

hello,

Thanks for your reply. I used the explain() method and I now understand why some documents are returned.


I am using the same Analyzer for indexing and searching.

I tried adding only the content of the page where that expression can be found (instead of the whole document), and then the search works.

Do I have to split my PDF text into more fields? Or what else could be the problem?



Grant Ingersoll wrote:

Try using the explain() method to see why the documents that were returned scored the way they did.
If I am understanding correctly, you are saying that Luke shows that those words aren't actually in your index? Can you elaborate on what your analysis process is? Are you searching using the same Analyzer as you are indexing with? I would try to isolate the problem down to some unit tests, if possible.
Cheers,
Grant

On May 18, 2007, at 8:12 AM, Stefan Colella wrote:
Hello,
My application works with PDF files, so I use Lucene with PdfBox to create a little search engine. I am new to Lucene.
Everything seemed to work fine, but after some tests I saw that some expressions like "stock option" were never found (or returned the wrong documents) even though they exist in my PDF files. I searched the mail archive and found that I have to use the "French Analyser", but that didn't work either.
I found that there is a tool named Luke to check the Lucene index. I could see that the original text contains those words, but nothing in the tokenizer.
Can anybody help or explain where I should start looking?

thanks
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

