Hi,
I think the easiest way is to exclude the pages while you are parsing the
PDF document, so that you pass only the necessary pages to Lucene.
Another solution is to create a separate document for each page. Each one
should have a field such as "pagenumber", and you can then delete that
document from the index.
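
Here is a rough sketch of the first suggestion in plain Java, with no Lucene or PDFBox calls. It assumes you have already extracted the text of each page into a list (for example with PDFBox, one page at a time). The names PageFilter, keepPages, pageTexts and excluded are made up for illustration. Each entry it returns could then become one Lucene document carrying a "pagenumber" field, which also fits the second suggestion.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: decides which pages get handed to the indexer.
public class PageFilter {

    // pageTexts holds the extracted text of each page, in page order.
    // excluded holds the 1-based page numbers that must not be indexed.
    // Returns page number -> text for the pages that should be indexed.
    static Map<Integer, String> keepPages(List<String> pageTexts, Set<Integer> excluded) {
        Map<Integer, String> kept = new LinkedHashMap<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            int pageNumber = i + 1; // pages are numbered from 1
            if (!excluded.contains(pageNumber)) {
                kept.put(pageNumber, pageTexts.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for text extracted from a PDF.
        List<String> pages = Arrays.asList("cover", "legal notice", "chapter one");
        Set<Integer> excluded = new HashSet<>(Arrays.asList(2));

        for (Map.Entry<Integer, String> e : keepPages(pages, excluded).entrySet()) {
            // At this point you would build one index document per page,
            // storing e.getKey() in a "pagenumber" field.
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Filtering before indexing keeps unwanted pages out of the index entirely, so nothing needs to be deleted afterwards.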
Shyam - I moderated your message through, so please subscribe to the
list to send to it in the future.
Please provide us with some details - a standalone RAMDirectory-using
JUnit TestCase is the ideal way to share an issue like this and
have someone else take a look at it. And frequen
Hi,
I am working on a search project using Lucene and currently I am working on
parsing PDF documents. I was successful in implementing my parser using
Lucene and PDFBox. I have a question about how to exclude (or maybe delete)
pages from the index. I am not sure how to do this. I mean when exactly it