I will take a look at DocField. Thanks for the suggestion.
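For anyone reading this in the archive: the DocValues family is the usual answer to this problem. Below is a minimal sketch (assuming a Lucene 4.x API, as current at the thread's date, and that the field was indexed as a `NumericDocValuesField` named `record_id`; the class and field names are illustrative, not from the thread) of reading that field for every hit inside a custom `Collector`, so no stored-field or network round trips are needed:

```java
import java.io.IOException;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;

// Collects the "record_id" NumericDocValues value for every matching doc.
// DocValues are column-stride and read per-segment, so this stays close
// to the cost of collecting the raw doc IDs themselves.
public class RecordIdCollector extends Collector {
    private NumericDocValues recordIds;     // per-segment values
    private long[] buffer = new long[1024]; // grows as hits accumulate
    private int count = 0;

    @Override
    public void setScorer(Scorer scorer) {
        // scores are not needed for this use case
    }

    @Override
    public void setNextReader(AtomicReaderContext context) throws IOException {
        // re-acquire the DocValues whenever the search moves to a new segment
        recordIds = context.reader().getNumericDocValues("record_id");
    }

    @Override
    public void collect(int doc) {
        if (count == buffer.length) {
            long[] grown = new long[buffer.length * 2];
            System.arraycopy(buffer, 0, grown, 0, count);
            buffer = grown;
        }
        buffer[count++] = recordIds.get(doc); // doc is segment-local here
    }

    @Override
    public boolean acceptsDocsOutOfOrder() {
        return true; // order is irrelevant; lets Lucene pick faster scorers
    }

    public long[] recordIds() {
        long[] result = new long[count];
        System.arraycopy(buffer, 0, result, 0, count);
        return result;
    }
}
```

Usage would be `searcher.search(query, collector)` followed by `collector.recordIds()`; the returned values are the database primary keys, so nothing depends on Lucene doc IDs surviving merges.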
On Fri, Sep 19, 2014 at 6:30 PM, Neil Bacon <neil.ba...@nicta.com.au> wrote:

> Hi
>
> Have you looked at DocFieldValue / DocField? It's fast for this use case.
>
> Regards
> Neil
>
> Sent from my mobile doovalaki
>
> On 20/09/2014 6:44 am, Shouvik Bardhan <sbard...@gisfederal.com> wrote:
>
> > Sujit, thanks for the response. I have already done what you said. My
> > issue is that after setting up the data in the Lucene index and the DB,
> > when a query comes in and matches, say, 25 million docs in Lucene, I
> > need to get all 25 million values of this field (record_id in your
> > example) quickly. The current approach lets me collect all those Lucene
> > doc IDs (even 25 million of them) very fast, but I don't know a way to
> > get one field (record_id) from all the matched documents equally fast.
> >
> > Thanks
> > Shouvik
> >
> > On Fri, Sep 19, 2014 at 2:26 PM, Sujit Pal <sujit....@comcast.net> wrote:
> >
> > > Hi Shouvik, not sure if you have already considered this, but you
> > > could put the database primary key for the record into the index,
> > > i.e., reverse your insert to do the DB first, get the record_id, and
> > > then add it to the Lucene index as a "record_id" field. During
> > > retrieval you can minimize the network traffic by setting the field
> > > list to only this record_id.
> > >
> > > -sujit
> > >
> > > On Thu, Sep 18, 2014 at 9:23 PM, Shouvik Bardhan <sbard...@gisfederal.com> wrote:
> > >
> > > > Pardon the length of the question. I have an index with 100 million
> > > > docs (Lucene, not Solr), and term queries (A*, A AND B* type
> > > > queries) return pretty quickly (2-4 seconds). I pick up the Lucene
> > > > doc IDs pretty quickly with a collector. This is good for us, since
> > > > we take the doc IDs and do further filtering based on another
> > > > database we maintain, whose record IDs match the stored Lucene doc
> > > > IDs, and we are able to do what we want. I know that depending on
> > > > the Lucene doc ID value is not a good thing, since after
> > > > delete/merge/optimize the doc IDs may change, and if that were to
> > > > happen our other datastore would no longer line up with the Lucene
> > > > index and chaos would ensue. Thus we do not optimize the index, etc.
> > > >
> > > > My question is: what is the fastest way to gather one field value
> > > > from the docs that match the query? Is there any way to do this as
> > > > fast as (or at least not much slower than) I am able to collect the
> > > > Lucene doc IDs? I want to get away from depending on the Lucene doc
> > > > IDs not changing, if possible.
> > > >
> > > > Thanks for any suggestions.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
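The index-time half of Sujit's advice quoted above (insert into the DB first, then put the primary key into the index) could look like the following sketch, assuming Lucene 4.x; the method and field names are illustrative. Adding the key as DocValues in addition to a stored field is what makes it cheap to read back for millions of hits:

```java
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;

public class RecordIndexer {
    // Called after the DB insert has produced recordId.
    // NumericDocValuesField makes record_id cheap to read for every hit;
    // StoredField keeps it retrievable per-document as well.
    static void indexRecord(IndexWriter writer, long recordId, String body)
            throws IOException {
        Document doc = new Document();
        doc.add(new TextField("body", body, Field.Store.NO));
        doc.add(new NumericDocValuesField("record_id", recordId));
        doc.add(new StoredField("record_id", recordId));
        writer.addDocument(doc);
    }
}
```

With the index laid out this way, a search-time collector can read `record_id` directly from DocValues and never needs the internal Lucene doc IDs to stay stable across merges.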