Thanks Mike for your valuable time! Sorry to be a pest; I am trying to write a fair perf test and to understand the feature. If there are other experts on the subject of index reader warming, please chime in.
I am not seeing the connection between being given an IndexReader and the FieldCacheImpl API, i.e. how do I warm up the FieldCache for this particular segment? Are you suggesting that I just do an IndexSearcher.search on the given reader to warm it up within the IndexReaderWarmer impl? In that case the searcher would need to know the incoming searches pretty well, I guess.

Thanks

-John

On Wed, Sep 23, 2009 at 7:57 AM, Mark Miller <markrmil...@gmail.com> wrote:

> Oh - yeah - also - you'll be passed a segment reader if that's what makes
> sense. And sense it does, you will be passed one every time. You can
> warm a multireader the same way though, so no reason to pin it down.
>
> Mark Miller wrote:
> > Come on dude :) Spend a half ounce of effort first. Mike's time is too
> > valuable !
> >
> > Luckily mine is not.
> >
> > There is no default impl - the class is dead simple (and the class has
> > been pointed out like 3 times in this thread - I'm not even fully
> > following and I know where to find it):
> >
> > public static abstract class IndexReaderWarmer {
> >   public abstract void warm(IndexReader reader) throws IOException;
> > }
> >
> > Now pass something in that warms the reader. Load a fieldcache - do a
> > search. Do the hokey pokey and turn your self around ...
> >
> > Investigation time: 5 seconds.
> >
> > John Wang wrote:
> >
> >> Hi Michael:
> >>
> >> Thanks for the pointer!
> >>
> >> Pardon my ignorance, but I am still not seeing the connection
> >> between this api and per-segment loading of FieldCache. (the api takes
> >> in an IndexReader instead of maybe SegmentReader[])
> >>
> >> Can you point me to maybe the default impl of IndexReaderWarmer
> >> to help me understand?
> >>
> >> Thanks
> >>
> >> -John
> >>
> >> On Wed, Sep 23, 2009 at 7:17 AM, Michael McCandless
> >> <luc...@mikemccandless.com> wrote:
> >>
> >> This is exactly why we added IndexWriter.setMergedSegmentWarmer -- you
> >> can warm the reader w/o blocking ongoing updates.
> >>
> >> Mike
> >>
> >> On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller
> >> <markrmil...@gmail.com> wrote:
> >> > Right - when a large segment is invalidated, you will have a bigger
> >> > fieldcache piece to reload - pre 2.9, you'd be reloading the *whole*
> >> > field cache every time though. Sounds like you are trying to deal with
> >> > those large segments changing anyway :) They are always an issue when
> >> > doing RT it seems.
> >> >
> >> > I don't believe deletes invalidate a field cache - terms from deleted
> >> > docs stay in a field cache and segmentreaders use their freqStream as
> >> > the fieldcache key. Only when the deletes are merged out would they
> >> > invalidate - but because you're writing a new segment anyway ...
> >> >
> >> > - Mark
> >> >
> >> > John Wang wrote:
> >> >> I understand what you are saying. Let me detail what I am trying to say:
> >> >>
> >> >> When "currently processed segments" are flushed down, merge may
> >> >> happen. When merges happen, some of those "stable segments" will be
> >> >> invalidated, and so will the fieldcache data keyed by them.
> >> >>
> >> >> In a high update environment, such scenarios can happen quite often.
> >> >>
> >> >> The way the default mergePolicy works is that small segments get
> >> >> merged into the larger segments. Eventually, what will be invalidated
> >> >> would be a large segment, and when that happens, a large chunk of the
> >> >> field cache would be invalidated.
> >> >>
> >> >> Furthermore, in the case where there are high updates, the stable
> >> >> segments can be invalidated much sooner when there are deletes in those
> >> >> segments, and I would guess the corresponding FieldCache needs to be
> >> >> adjusted. Not sure how it is handled right now.
> >> >>
> >> >> Just my two cents, and of course when I find the time I will need to
> >> >> run some tests to see.
> >> >>
> >> >> -John
> >> >>
> >> >> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> >> >>
> >> >> The NRT reader coming from the IndexWriter.getReader() has only
> >> >> changes in the currently processed segments, the other segments
> >> >> keep stable (and even their IndexReader keys used for the
> >> >> FieldCache). The rest of the segments keep stable. For the
> >> >> consumer it looks like a normal reader (it is in fact a
> >> >> ReadOnlyDirectoryReader) supporting getSequentialSubReaders() and
> >> >> so on.
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> >> http://www.thetaphi.de
> >> >> eMail: u...@thetaphi.de
> >> >>
> >> >> ------------------------------------------------------------------------
> >> >>
> >> >> *From:* John Wang [mailto:john.w...@gmail.com]
> >> >> *Sent:* Tuesday, September 22, 2009 9:32 AM
> >> >> *To:* java-dev@lucene.apache.org
> >> >> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
> >> >>
> >> >> Thanks Mark for the pointer!
> >> >>
> >> >> I guess my point is with NRT, and when segment files change often,
> >> >> this would be an issue, no?
> >> >>
> >> >> Anyway, I can run some tests.
> >> >>
> >> >> Thanks
> >> >>
> >> >> -John
> >> >>
> >> >> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller
> >> >> <markrmil...@gmail.com> wrote:
> >> >>
> >> >> 1483 - indexsearcher pulls out a reader's subreaders
> >> >> (segmentreaders) and sends a collector over them one by one,
> >> >> rather than using the multireader. So only fc for seg readers that
> >> >> change need to be reloaded.
> >> >>
> >> >> - Mark
> >> >>
> >> >> http://www.lucidimagination.com (mobile)
> >> >>
> >> >> On Sep 22, 2009, at 1:27 AM, John Wang <john.w...@gmail.com> wrote:
> >> >>
> >> >>> Hi Yonik:
> >> >>>
> >> >>> Actually that is what I am looking for. Can you please point
> >> >>> me to where/how sorting is done per-segment?
> >> >>>
> >> >>> When heavy indexing introduces or modifies segments, would
> >> >>> it cause reloading of FieldCache at query time and thus
> >> >>> impact search performance?
> >> >>>
> >> >>> thanks
> >> >>>
> >> >>> -John
> >> >>>
> >> >>> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley
> >> >>> <yo...@lucidimagination.com> wrote:
> >> >>>
> >> >>> On Tue, Sep 22, 2009 at 12:56 AM, John Wang <john.w...@gmail.com> wrote:
> >> >>> > Looking at the code, seems there is a disconnect between how/when field
> >> >>> > cache is loaded when IndexWriter.getReader() is called.
> >> >>>
> >> >>> I'm not sure what you mean by "disconnect"
> >> >>>
> >> >>> > Is FieldCache updated?
> >> >>>
> >> >>> FieldCache entries are populated on demand, as they always have been.
> >> >>>
> >> >>> > Otherwise, are we reloading FieldCache for each
> >> >>> > reader instance?
> >> >>>
> >> >>> Searching/sorting is now per-segment, and so is the use of the
> >> >>> FieldCache. Segments that don't change shouldn't have to reload their
> >> >>> FieldCache entries.
> >> >>>
> >> >>> -Yonik
> >> >>> http://www.lucidimagination.com
> >> >>>
> >> >>> ---------------------------------------------------------------------
> >> >>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> >> >>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
> >> >
> >> > --
> >> > - Mark
> >> >
> >> > http://www.lucidimagination.com
>
> --
> - Mark
>
> http://www.lucidimagination.com
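
For reference, here is a rough sketch of the two ideas discussed in this thread: cache entries keyed per segment, so unchanged segments never reload, and a warmer callback that runs on a newly merged segment before searches see it. It is a stdlib-only toy model, not Lucene code. Segment, Warmer, PerSegmentCache and merge are hypothetical names made up for illustration; in a real IndexReaderWarmer the warm step would presumably be a FieldCache load or a search against the reader passed in, as Mark suggests above.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model (NOT Lucene code) of a per-segment field cache plus merge-time warming. */
public class SegmentWarmingSketch {

    /** Hypothetical stand-in for a segment reader: just the field values of its docs. */
    static final class Segment {
        final String name;
        final int[] values;
        Segment(String name, int[] values) { this.name = name; this.values = values; }
    }

    /** Hypothetical stand-in for the warmer callback quoted in the thread. */
    interface Warmer {
        void warm(Segment segment);
    }

    /** Entries are populated on demand and keyed by the identity of the segment object. */
    static final class PerSegmentCache {
        private final Map<Segment, int[]> entries = new IdentityHashMap<>();
        int loads = 0; // counts expensive loads so the effect of warming is visible

        synchronized int[] get(Segment segment) {
            int[] cached = entries.get(segment);
            if (cached == null) {
                loads++;
                cached = segment.values.clone(); // stands in for the costly field load
                entries.put(segment, cached);
            }
            return cached;
        }

        synchronized boolean isLoaded(Segment segment) {
            return entries.containsKey(segment);
        }
    }

    /** Merges two segments and runs the warmer before handing the new segment back. */
    static Segment merge(Segment a, Segment b, Warmer warmer) {
        int[] merged = new int[a.values.length + b.values.length];
        System.arraycopy(a.values, 0, merged, 0, a.values.length);
        System.arraycopy(b.values, 0, merged, a.values.length, b.values.length);
        Segment result = new Segment(a.name + "+" + b.name, merged);
        if (warmer != null) {
            warmer.warm(result); // the cost is paid here, not by the first query
        }
        return result;
    }

    public static void main(String[] args) {
        PerSegmentCache cache = new PerSegmentCache();
        Segment stable = new Segment("s0", new int[] {5, 1, 9});
        Segment small1 = new Segment("s1", new int[] {4});
        Segment small2 = new Segment("s2", new int[] {7});

        cache.get(stable); // an earlier search already loaded the stable segment
        Segment merged = merge(small1, small2, seg -> cache.get(seg));

        // A "search" over the new view touches each segment's cache entry.
        int before = cache.loads;
        for (Segment seg : new Segment[] {stable, merged}) {
            cache.get(seg);
        }
        System.out.println("loads caused by the search: " + (cache.loads - before));
    }
}
```

The warmer does not need to know the segment type: it receives whatever reader the merge produced and touches the cache entries that later searches are expected to need. Knowing which fields those are is still up to the application.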