Hi,
This is the first time I am using Lucene.
I need to index PDFs with very few fields (title, date and body, a long
field) for a web-based search.
The results I need to display have to show not only the documents found but,
for each document, a snapshot of the text where the search term has been
Hi
Lucene can store the original text of the document. You configure the
Lucene fields to do what you need. Have a look at the API docs for
Field.Store and you'll see that you've got three choices: YES, NO or
COMPRESS.
For your display snapshots, have a look at the Lucene highlighter package.
And a
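To make the "snapshot" idea concrete, here is a minimal sketch in plain Java with no Lucene dependency. It is not how the Lucene highlighter works internally; it only shows the basic idea of cutting a window of text around the first match. The class name SnippetSketch, the method snippet and the context parameter are hypothetical names made up for this illustration.

```java
import java.util.Locale;

public class SnippetSketch {

    // Returns up to `context` characters on each side of the first
    // case-insensitive occurrence of `term` in `text`, with "..." added
    // where the text was cut. Returns null if the term does not occur.
    // Assumption: lower-casing does not change the string length
    // (true for ASCII text, not guaranteed for every Unicode character).
    static String snippet(String text, String term, int context) {
        int hit = text.toLowerCase(Locale.ROOT)
                      .indexOf(term.toLowerCase(Locale.ROOT));
        if (hit < 0) {
            return null;
        }
        int start = Math.max(0, hit - context);
        int end = Math.min(text.length(), hit + term.length() + context);
        String prefix = start > 0 ? "..." : "";
        String suffix = end < text.length() ? "..." : "";
        return prefix + text.substring(start, end) + suffix;
    }

    public static void main(String[] args) {
        String body = "Lucene can store the original text of the document.";
        System.out.println(snippet(body, "original", 10));
    }
}
```

Something like this needs the original body text at display time, which is why the stored-field setting matters. A real highlighter would also work on analyzed tokens rather than raw substrings.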
Thanks very much. Looks like Field.Store.COMPRESS is what I want.
I'll also have a look at the search highlight stuff and getting Lucene in
Action.
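As a rough illustration of the trade-off behind storing a long body in compressed form, here is a sketch in plain Java using java.util.zip. It is an assumption-laden stand-in, not Lucene's own code path: the class name CompressSketch and its methods are hypothetical, and it only shows that a long, repetitive text takes fewer bytes when deflated, at the cost of an inflate step every time the text is read back for display.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CompressSketch {

    // Deflates the UTF-8 bytes of the text at the highest compression level.
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflates the bytes produced by compress() back into the original text.
    static String decompress(byte[] data) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && inflater.needsInput()) {
                // Input ran out before the stream ended: treat as corrupt.
                inflater.end();
                throw new DataFormatException("truncated input");
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws DataFormatException {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            body.append("The quick brown fox jumps over the lazy dog. ");
        }
        byte[] packed = compress(body.toString());
        System.out.println("raw bytes:    " + body.length());
        System.out.println("packed bytes: " + packed.length);
        System.out.println("round trip ok: "
                + body.toString().equals(decompress(packed)));
    }
}
```

Real PDF text will compress less well than this repeated sentence, so it is worth measuring on your own documents before deciding.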
Ian Lea wrote:
>
> Hi
>
>
> Lucene can store the original text of the document. You make the
> lucene fields to do what you need. Have a look
I also came across these options of the Field constructor, but I never
found a way to be sure that the field is really not loaded into RAM and
is only returned through Field.reader(). There seems to be no contract in
the javadoc.
Moreover, the reader access methods went away between 1.9 and 2.2 if I
-user@lucene.apache.org
Subject: Beginner: Best way to index and display original text of pdfs in
search results