Thanks, Michael, for your response. A few questions:
1. Can I expect better performance when retrieving a single NumericDocValue for all hits than when I retrieve the documents for all hits to fetch the field value? As far as I understand, retrieving n documents from the index requires n disk reads. How many disk reads do I do when using NumericDocValues? How are they stored?

2. I tried looking for examples of how to use numeric doc values. I found that in new versions of Lucene we have to use "AtomicReader". I found this:
http://www.gossamer-threads.com/lists/lucene/java-user/182641

So is this the code I am looking for?

    long getNumericDocValueForDocument(IndexSearcher searcher, int docId) {
        IndexReader reader = searcher.getIndexReader();
        long docVal = 0;
        for (AtomicReaderContext rc : reader.leaves()) {
            AtomicReader ar = rc.reader();
            docVal = ar.getNumericDocValues().get(docId);
        }
        return docVal;
    }

How do I know which docVal to return? It appears that each AtomicReader (every iteration of the loop) may return a docVal.

3. Can I only store NumericDocValues? Can I get something like StringDocValues? I have a string "id". I guess I could keep a mapping from numeric doc value (Long) to String, but I want to avoid keeping two sources of information (the Lucene index and a HashMap). I can use SearcherManager to deal with concurrent searches and index updates
(http://blog.mikemccandless.com/2011/09/lucenes-searchermanager-simplifies.html), but how about managing two data sources, the Lucene index and a HashMap<Long, String>, with SearcherManager? Is there a way to achieve this using a custom SearcherFactory?

Thanks
Rohit Banga
http://iamrohitbanga.com/

On Fri, Mar 21, 2014 at 3:26 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> DocValues are better than payloads.
>
> E.g. index a NumericDocValuesField with each doc, holding your id.
>
> Then at search time you can use MultiDocValues.getNumericValues.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Mar 21, 2014 at 4:35 PM, Rohit Banga <iamrohitba...@gmail.com> wrote:
> > Hi everyone
> >
> > When I query a lucene index, I get back a list of document ids. This index
> > search is fast. Now for all documents matching the result I need a unique
> > String field called "id" which is stored in the document. From the
> > documentation I gather that document ids are internal and I should not use
> > them for referencing my own data structures. Currently I iterate over all
> > the hits matching the document and then for each one I get the document to
> > read the field using IndexReader.document().
> >
> > http://lucene.apache.org/core/4_5_0/core/org/apache/lucene/index/IndexReader.html
> >
> > I read the "id" field from the document and then use it further in my
> > processing logic.
> > The problem is that reading all documents to get all "id"'s is turning out
> > to be very slow. It is the bottleneck in my application. It would be nice
> > to have a way if lucene could return some metadata along with the internal
> > document id when I did a search. I do not want to read all documents just
> > to retrieve this metadata.
> >
> > The best solution I have come across searching on the net is to use
> > payloads which will be returned by the fast index search query along with
> > the document ids.
> >
> > Is my understanding correct that using payloads I can get "id" string field
> > for all my documents faster than reading my entire document?
> >
> > I am not able to find a good example of how to store and retrieve payloads?
> > Can you please point me to a good resource to learn how to use payloads and
> > how they will impact performance?
> > I am using Lucene 4.5.
> >
> > Thanks
> > Rohit Banga
> > http://iamrohitbanga.com/
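
A note on question 2 (which per-segment value to return): as far as I understand Lucene 4.x, each AtomicReaderContext covers a contiguous range of top-level doc IDs starting at its docBase, so only one leaf can hold a given hit, and the segment-local ID is docId - docBase (ReaderUtil.subIndex can locate that leaf). The sketch below models only this arithmetic in plain Java. Segment, segmentIndex and valueFor are hypothetical stand-ins invented for illustration, not Lucene classes or methods, and the values are made up.

```java
import java.util.Arrays;
import java.util.List;

public class SegmentDocValues {

    /** Hypothetical stand-in for one index segment (NOT a Lucene class). */
    static final class Segment {
        final int docBase;   // top-level ID of this segment's first document
        final long[] values; // one numeric value per segment-local doc ID

        Segment(int docBase, long[] values) {
            this.docBase = docBase;
            this.values = values;
        }
    }

    /**
     * Binary search for the last segment whose docBase is <= globalDocId.
     * Assumes segments are sorted by docBase and the first docBase is 0.
     */
    static int segmentIndex(int globalDocId, List<Segment> segments) {
        int lo = 0;
        int hi = segments.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always shrinks
            if (segments.get(mid).docBase <= globalDocId) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Looks up the value for a top-level doc ID in exactly one segment. */
    static long valueFor(int globalDocId, List<Segment> segments) {
        Segment seg = segments.get(segmentIndex(globalDocId, segments));
        int localDocId = globalDocId - seg.docBase; // rebase into the segment
        return seg.values[localDocId];
    }

    public static void main(String[] args) {
        // Three made-up segments holding 3, 2 and 1 documents.
        List<Segment> segments = Arrays.asList(
                new Segment(0, new long[] {10, 11, 12}),
                new Segment(3, new long[] {20, 21}),
                new Segment(5, new long[] {30}));

        // Top-level doc 4 lives in the second segment at local ID 1.
        System.out.println(valueFor(4, segments)); // prints 21
    }
}
```

With real readers the same two steps would apply: pick the one leaf that contains the hit, then ask that leaf's NumericDocValues for docId minus docBase, rather than looping over every leaf with the top-level ID. MultiDocValues.getNumericValues, which Mike suggests above, performs this lookup at the top-level reader.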