Oh - yeah - also - you'll be passed a segment reader if that's what makes
sense. And sense it does, so you will be passed one every time. You can
warm a multireader the same way though, so there's no reason to pin it down.
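
A rough sketch of what such a warmer could look like, for illustration
only. To keep it runnable without the Lucene jar, Reader, Warmer,
FieldCacheSketch and FieldCacheWarmer below are made-up stand-ins, not
Lucene classes. In real code you would presumably extend
IndexWriter.IndexReaderWarmer, pass it to
IndexWriter.setMergedSegmentWarmer, and load the real FieldCache (or run
a search) inside warm().

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types only: nothing in this file is a Lucene class.
final class WarmerSketch {

    /** Stand-in for IndexReader. A segment reader has no sub-readers (null). */
    static final class Reader {
        final String name;
        final List<Reader> subReaders; // null for a single segment

        private Reader(String name, List<Reader> subReaders) {
            this.name = name;
            this.subReaders = subReaders;
        }

        static Reader segment(String name) {
            return new Reader(name, null);
        }

        static Reader composite(Reader... subs) {
            return new Reader("multi", Arrays.asList(subs));
        }
    }

    /** Same shape as the IndexReaderWarmer class quoted in this thread. */
    abstract static class Warmer {
        abstract void warm(Reader reader) throws IOException;
    }

    /** Stand-in field cache: one entry per segment, keyed by reader identity. */
    static final class FieldCacheSketch {
        private final Map<Reader, String[]> entries = new IdentityHashMap<>();
        private int loads = 0;

        String[] get(Reader segment) {
            String[] values = entries.get(segment);
            if (values == null) {
                // Pretend this is the expensive per-segment load.
                values = new String[] { "values-for-" + segment.name };
                entries.put(segment, values);
                loads++;
            }
            return values;
        }

        int loads() {
            return loads;
        }
    }

    /** Warms a segment directly, or every segment under a composite reader. */
    static final class FieldCacheWarmer extends Warmer {
        private final FieldCacheSketch cache;

        FieldCacheWarmer(FieldCacheSketch cache) {
            this.cache = cache;
        }

        @Override
        void warm(Reader reader) throws IOException {
            if (reader.subReaders == null) {
                cache.get(reader); // segment reader: load its cache entry
                return;
            }
            for (Reader sub : reader.subReaders) {
                warm(sub); // multireader: warm each segment in turn
            }
        }
    }

    public static void main(String[] args) throws IOException {
        FieldCacheSketch cache = new FieldCacheSketch();
        Warmer warmer = new FieldCacheWarmer(cache);

        Reader a = Reader.segment("_a");
        Reader b = Reader.segment("_b");
        warmer.warm(Reader.composite(a, b)); // loads _a and _b

        Reader c = Reader.segment("_c");     // e.g. a newly merged segment
        warmer.warm(c);                      // loads only _c

        warmer.warm(Reader.composite(a, c)); // both already cached, no reload
        System.out.println("field cache loads: " + cache.loads()); // prints 3
    }
}
```

Because warm() handles both a single segment and a composite reader,
the caller does not need to know which kind it is handed, and segments
that did not change are never reloaded.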
Mark Miller wrote:
> Come on dude :) Spend a half ounce of effort first. Mike's time is too
> valuable!
>
> Luckily mine is not.
>
> There is no default impl - the class is dead simple (and the class has
> been pointed out like 3 times in this thread - I'm not even fully
> following and I know where to find it):
>
>   public static abstract class IndexReaderWarmer {
>     public abstract void warm(IndexReader reader) throws IOException;
>   }
>
> Now pass something in that warms the reader. Load a FieldCache - do a
> search. Do the hokey pokey and turn yourself around ...
>
> Investigation time: 5 seconds.
>
> John Wang wrote:
>
>> Hi Michael:
>>
>> Thanks for the pointer!
>>
>> Pardon my ignorance, but I am still not seeing the connection
>> between this API and per-segment loading of the FieldCache. (The API
>> takes in an IndexReader instead of, maybe, SegmentReader[].)
>>
>> Can you point me to the default impl of IndexReaderWarmer, if there
>> is one, to help me understand?
>>
>> Thanks
>>
>> -John
>>
>> On Wed, Sep 23, 2009 at 7:17 AM, Michael McCandless
>> <[email protected]> wrote:
>>
>> This is exactly why we added IndexWriter.setMergedSegmentWarmer -- you
>> can warm the reader w/o blocking ongoing updates.
>>
>> Mike
>>
>> On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller
>> <[email protected]> wrote:
>> > Right - when a large segment is invalidated, you will have a bigger
>> > fieldcache piece to reload - pre 2.9, you'd be reloading the *whole*
>> > field cache every time though. Sounds like you are trying to deal with
>> > those large segments changing anyway :) They are always an issue when
>> > doing RT it seems.
>> >
>> > I don't believe deletes invalidate a field cache - terms from deleted
>> > docs stay in a field cache and segmentreaders use their freqStream as
>> > the fieldcache key. Only when the deletes are merged out would they
>> > invalidate - but because you're writing a new segment anyway ...
>> >
>> > - Mark
>> >
>> > John Wang wrote:
>> >> I understand what you are saying. Let me detail what I am trying to say:
>> >>
>> >> When "currently processed segments" are flushed down, a merge may
>> >> happen. When merges happen, some of those "stable segments" will be
>> >> invalidated, and so will the fieldcache data keyed by them.
>> >>
>> >> In a high update environment, such scenarios can happen quite often.
>> >>
>> >> The way the default mergePolicy works is that small segments get
>> >> merged into the larger segments. Eventually, what will be invalidated
>> >> would be a large segment, and when that happens, a large chunk of the
>> >> field cache would be invalidated.
>> >>
>> >> Furthermore, in the case where there are high updates, the stable
>> >> segments can be invalidated much sooner when there are deletes in those
>> >> segments, and I would guess the corresponding FieldCache needs to be
>> >> adjusted. Not sure how it is handled right now.
>> >>
>> >> Just my two cents, and of course when I find the time I will need to
>> >> run some tests to see.
>> >>
>> >> -John
>> >>
>> >> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <[email protected]> wrote:
>> >>
>> >> The NRT reader coming from IndexWriter.getReader() only has
>> >> changes in the currently processed segments; the other segments
>> >> remain stable (and so do their IndexReader keys used for the
>> >> FieldCache). For the consumer it looks like a normal reader (it is
>> >> in fact a ReadOnlyDirectoryReader) supporting
>> >> getSequentialSubReaders() and so on.
>> >>
>> >>
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: [email protected]
>> >>
>> >>
>> >> ------------------------------------------------------------------------
>> >>
>> >> *From:* John Wang [mailto:[email protected]]
>> >> *Sent:* Tuesday, September 22, 2009 9:32 AM
>> >> *To:* [email protected]
>> >> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
>> >>
>> >>
>> >>
>> >> Thanks Mark for the pointer!
>> >>
>> >> I guess my point is with NRT, and when segment files change often,
>> >> this would be an issue, no?
>> >>
>> >> Anyway, I can run some tests.
>> >>
>> >> Thanks
>> >>
>> >> -John
>> >>
>> >> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller
>> >> <[email protected]> wrote:
>> >>
>> >> LUCENE-1483 - IndexSearcher pulls out a reader's subreaders
>> >> (segment readers) and sends a collector over them one by one,
>> >> rather than using the multireader. So only the FieldCache entries
>> >> for segment readers that change need to be reloaded.
>> >>
>> >> - Mark
>> >>
>> >>
>> >>
>> >> http://www.lucidimagination.com (mobile)
>> >>
>> >>
>> >> On Sep 22, 2009, at 1:27 AM, John Wang <[email protected]> wrote:
>> >>
>> >>> Hi Yonik:
>> >>>
>> >>> Actually that is what I am looking for. Can you please point
>> >>> me to where/how sorting is done per-segment?
>> >>>
>> >>> When heavy indexing introduces or modifies segments, would
>> >>> it cause reloading of the FieldCache at query time and thus
>> >>> impact search performance?
>> >>>
>> >>> thanks
>> >>>
>> >>> -John
>> >>>
>> >>> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley
>> >>> <[email protected]> wrote:
>> >>>
>> >>> On Tue, Sep 22, 2009 at 12:56 AM, John Wang <[email protected]> wrote:
>> >>> > Looking at the code, seems there is a disconnect between how/when
>> >>> > field cache is loaded when IndexWriter.getReader() is called.
>> >>>
>> >>> I'm not sure what you mean by "disconnect"
>> >>>
>> >>> > Is FieldCache updated?
>> >>>
>> >>> FieldCache entries are populated on demand, as they always have been.
>> >>>
>> >>>
>> >>> > Otherwise, are we reloading FieldCache for each
>> >>> > reader instance?
>> >>>
>> >>> Searching/sorting is now per-segment, and so is the use of the
>> >>> FieldCache. Segments that don't change shouldn't have to reload
>> >>> their FieldCache entries.
>> >>>
>> >>> -Yonik
>> >>> http://www.lucidimagination.com
>> >>>
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: [email protected]
>> >>> For additional commands, e-mail: [email protected]
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> > --
>> > - Mark
>> >
>> > http://www.lucidimagination.com
>>
>
>
>
--
- Mark
http://www.lucidimagination.com