Re: Quickest way to collect one field from the searched docs....

Neil Bacon Fri, 19 Sep 2014 15:30:54 -0700

Hi
Have you looked at DocFieldValue / DocField? It's fast for this use case.
Regards
Neil
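
The reply above presumably refers to Lucene's per-document, column-style field storage (doc values); that reading is an assumption, since the class names are given loosely. The sketch below is a toy illustration in plain Python, not Lucene's API. It shows why a column indexed by internal doc ID makes collecting one field for millions of hits cheap: each hit costs one array lookup rather than loading a whole stored document. The names ColumnStore, add_document and collect are hypothetical.

```python
from array import array


class ColumnStore:
    """Toy per-document column: position i holds the field value of doc i."""

    def __init__(self):
        # Signed 64-bit integers, densely packed (hypothetical record_id column).
        self.record_ids = array("q")

    def add_document(self, record_id):
        """Append a value and return the internal doc ID assigned to it."""
        self.record_ids.append(record_id)
        return len(self.record_ids) - 1

    def collect(self, doc_ids):
        """Return the record_id of every matched doc, one array lookup per hit."""
        column = self.record_ids
        return [column[d] for d in doc_ids]


store = ColumnStore()
for rid in (9001, 9002, 9003, 9004):
    store.add_document(rid)

matched = [0, 2, 3]                  # doc IDs a collector might hand back
record_ids = store.collect(matched)  # values looked up by position
```

Because the external record_id is read from the column at query time, nothing outside the index needs to rely on internal doc IDs staying stable across deletes or merges.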

On 20/09/2014 6:44 am, Shouvik Bardhan <[email protected]> wrote:
Sujit, thanks for the response. I have already done what you said. My issue
is that after setting up the data in the Lucene index and the DB, when a query
comes in and, say, matches 25 million docs in Lucene, I need to get all
25 million values of this field (record_id in your example) quickly.
Currently, I can get all those Lucene doc IDs (even 25 million of
them) very fast. But I don't know a way to get one field (record_id) from all
the matched documents (when 25 million docs have matched) that fast.

thanks
Shouvik

On Fri, Sep 19, 2014 at 2:26 PM, Sujit Pal <[email protected]> wrote:

> Hi Shouvik, not sure if you have already considered this, but you could put
> the database primary key for the record into the index, i.e., reverse your
> insert to do the DB first, get the record_id, and then add this to the Lucene
> index as a "record_id" field. During retrieval you can minimize the network
> traffic by setting the field list to only this record_id.
>
> -sujit
>
>
> On Thu, Sep 18, 2014 at 9:23 PM, Shouvik Bardhan <[email protected]>
> wrote:
>
> > Pardon the length of the question. I have an index with 100 million docs
> > (Lucene, not Solr), and term queries (A*, A AND B* type queries) return
> > pretty quickly (2-4 secs), and I pick up the Lucene doc IDs pretty quickly
> > with a collector. This is good for us, since we take the doc IDs and do
> > further filtering based on another database we maintain, whose record IDs
> > match the stored Lucene doc IDs, and we are able to do what we want. I
> > know that depending on the Lucene doc ID value is not a good thing, since
> > after a delete/merge/optimize the doc IDs may change, and if that were to
> > happen, our other datastore would not line up with the Lucene index and
> > chaos would ensue. Thus we do not optimize the index, etc.
> >
> > My question is: what is the fastest way to gather one field value from the
> > docs that match the query? Is there any way to do this as
> > fast as (or at least not much slower than) I am able to collect the Lucene
> > doc IDs? I want to get away from depending on the "Lucene doc IDs not
> > changing" if possible.
> >
> > Thanks for any suggestions.
> >
>
