Re: Filtering top hits based on stored field? And Lucene 1.x -> 3.x for Dummies

Ian Lea Fri, 25 Jan 2013 13:57:25 -0800

To the best of my understanding you are spot on about the degradation.
Loading stored fields is costly, and loading them for thousands of docs
is liable to be very costly.  You can mitigate it by loading only the
fields you want with (in 4.x) reader.document(id, fields), but it will
still be costly.
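
To make the cost difference concrete, here is a minimal sketch in plain
Java. It makes no Lucene calls: StoredFields, DocFilter, topK and
countLoads are hypothetical names invented for illustration, and the
"category" field and scores are made-up data. It assumes the collector
sees every matching doc, and compares filtering on a stored field per hit
with filtering on an in-memory per-document column (the role FieldCache or
DocValues would play) and loading stored fields only for the final top hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class CollectorCostSketch {

    /** Hypothetical stand-in for stored-field access; counts every load. */
    static class StoredFields {
        private final String[] category;
        int loads = 0;

        StoredFields(String[] category) { this.category = category; }

        String load(int docId) {
            loads++;                       // each call models one costly document load
            return category[docId];
        }
    }

    /** Hypothetical per-hit filter, called once for every candidate hit. */
    interface DocFilter {
        boolean accept(int docId);
    }

    /** Returns the k best-scoring doc ids among the hits that pass the filter. */
    static List<Integer> topK(int[] hits, float[] scores, int k, DocFilter filter) {
        // Min-heap on score, so the weakest kept hit is evicted first.
        PriorityQueue<Integer> queue =
                new PriorityQueue<>((a, b) -> Float.compare(scores[a], scores[b]));
        for (int docId : hits) {
            if (!filter.accept(docId)) {
                continue;
            }
            queue.offer(docId);
            if (queue.size() > k) {
                queue.poll();
            }
        }
        List<Integer> result = new ArrayList<>(queue);
        result.sort((a, b) -> Float.compare(scores[b], scores[a]));
        return result;
    }

    /** Counts stored-field loads needed to return the top k "book" docs. */
    static int countLoads(int numHits, int k, boolean filterOnStoredField) {
        String[] category = new String[numHits];
        float[] scores = new float[numHits];
        int[] hits = new int[numHits];
        for (int i = 0; i < numHits; i++) {
            category[i] = (i % 2 == 0) ? "book" : "film";   // made-up field values
            scores[i] = i;                                  // made-up scores
            hits[i] = i;
        }
        StoredFields stored = new StoredFields(category);
        if (filterOnStoredField) {
            // Loads a document for every candidate hit the collector sees.
            topK(hits, scores, k, docId -> "book".equals(stored.load(docId)));
        } else {
            // Filters on an in-memory column, then loads only the final top hits.
            String[] column = category.clone();
            List<Integer> top = topK(hits, scores, k, docId -> "book".equals(column[docId]));
            for (int docId : top) {
                stored.load(docId);
            }
        }
        return stored.loads;
    }

    public static void main(String[] args) {
        System.out.println("filter on stored field: " + countLoads(5000, 10, true) + " loads");
        System.out.println("filter on cached column: " + countLoads(5000, 10, false) + " loads");
    }
}
```

With 5000 candidate hits and a top 10, the first approach performs 5000
loads and the second performs 10. The sketch only counts loads; it says
nothing about the memory cost of holding the column in RAM.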



--
Ian.


On Fri, Jan 25, 2013 at 9:20 PM, Andrew Gilmartin
<and...@andrewgilmartin.com> wrote:
> Ian Lea wrote:
>
> Thank you for the quick and helpful reply. I had forgotten that Lucene's
> change document was one of the best examples of change documents around. I
> will read it.
>
>> On the specific question, calling doc() is still expensive.  You could
>> look at the FieldCache or the new DocValues stuff. See
>>
>> http://www.searchworkings.org/blog/-/blogs/introducing-lucene-index-doc-values
>> for info on the latter.
>
>
> I will explore that.
>
> It occurred to me that I do not know why the search performance degrades when
> doc() is called within the Collector. Is it simply that Lucene will present,
> for example, thousands of candidate hits (from millions of indexed
> documents) to the Collector even though the collector might only return the
> top handful? And so the Collector will need to load thousands of documents
> and it is this document loading that causes the performance degradation? Or
> is it more complex -- perhaps having to do with caches and other internals?
>
> -- Andrew
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

