Hi everyone

When I query a Lucene index, I get back a list of document ids. This index
search is fast. For every document in the result I then need a unique
String field called "id", which is stored in the document. From the
documentation I gather that document ids are internal and that I should not
use them to reference my own data structures. Currently I iterate over all
the hits matching the query and, for each one, load the document with
IndexReader.document() to read the field.
http://lucene.apache.org/core/4_5_0/core/org/apache/lucene/index/IndexReader.html
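
A minimal sketch of the loop described above, assuming Lucene 4.5 on the
classpath. The class name IdLookup, the method name collectIds and the
maxHits parameter are illustrative placeholders, not the actual
application code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;

// Hypothetical helper: names are placeholders for illustration only.
public class IdLookup {

    // Runs the query, then loads every matching document in full
    // just to read its stored "id" field.
    public static List<String> collectIds(IndexSearcher searcher, Query query, int maxHits)
            throws IOException {
        IndexReader reader = searcher.getIndexReader();

        // The index search itself: returns internal document ids.
        TopDocs hits = searcher.search(query, maxHits);

        List<String> ids = new ArrayList<String>(hits.scoreDocs.length);
        for (ScoreDoc hit : hits.scoreDocs) {
            // Loads all stored fields of the document, although only "id" is used.
            Document doc = reader.document(hit.doc);
            ids.add(doc.get("id"));
        }
        return ids;
    }
}
```

The reader.document() call inside the loop is the step that dominates the
running time as the number of hits grows.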

I read the "id" field from the document and then use it further in my
processing logic.
The problem is that loading every document just to get its "id" is turning
out to be very slow; it is the bottleneck in my application. It would be
nice if Lucene could return some metadata along with the internal document
id when I run a search, so that I do not have to read all the documents
just to retrieve this metadata.

The best solution I have come across while searching the net is to use
payloads, which would be returned by the fast index search along with the
document ids.

Is my understanding correct that, using payloads, I can get the "id" String
field for all matching documents faster than by reading each entire document?

I have not been able to find a good example of how to store and retrieve
payloads. Can you please point me to a good resource for learning how to
use payloads and how they will impact performance?
I am using Lucene 4.5.

Thanks
Rohit Banga
http://iamrohitbanga.com/
