Mark:

I did spend at least a quarter of an ounce. :) And I am sure Mike's time is more valuable than mine, but it was meant to be a "double-check".

I was under the impression from previous email threads that there is a default impl showing how to handle field cache warming; perhaps I misunderstood.

The real question here is what "warms the reader" means. From a public API point of view, I wasn't sure whether passing in an IndexReader impl is something we can use to avoid loading the entire field cache, e.g. would I need to downcast? Can it be a filtered reader? etc.

If you think there is something I could have done within 5 sec, please point me in the right direction.

Thanks

-John

On Wed, Sep 23, 2009 at 7:55 AM, Mark Miller <markrmil...@gmail.com> wrote:

> Come on dude :) Spend a half ounce of effort first. Mike's time is too
> valuable!
>
> Luckily mine is not.
>
> There is no default impl - the class is dead simple (and the class has
> been pointed out like 3 times in this thread - I'm not even fully
> following and I know where to find it):
>
>     public static abstract class IndexReaderWarmer {
>       public abstract void warm(IndexReader reader) throws IOException;
>     }
>
> Now pass something in that warms the reader. Load a fieldcache - do a
> search. Do the hokey pokey and turn yourself around ...
>
> Investigation time: 5 seconds.
>
> John Wang wrote:
> > Hi Michael:
> >
> > Thanks for the pointer!
> >
> > Pardon my ignorance, but I am still not seeing the connection
> > between this api and per-segment loading of FieldCache. (The api takes
> > in an IndexReader instead of maybe SegmentReader[].)
> >
> > Can you point me to maybe the default impl of IndexReaderWarmer
> > to help me understand?
> >
> > Thanks
> >
> > -John
> >
> > On Wed, Sep 23, 2009 at 7:17 AM, Michael McCandless
> > <luc...@mikemccandless.com> wrote:
> >
> > This is exactly why we added IndexWriter.setMergedSegmentWarmer --
> > you can warm the reader w/o blocking ongoing updates.
> >
> > Mike
> >
> > On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller
> > <markrmil...@gmail.com> wrote:
> > > Right - when a large segment is invalidated, you will have a bigger
> > > fieldcache piece to reload - pre 2.9, you'd be reloading the *whole*
> > > field cache every time though. Sounds like you are trying to deal with
> > > those large segments changing anyway :) They are always an issue when
> > > doing RT it seems.
> > >
> > > I don't believe deletes invalidate a field cache - terms from deleted
> > > docs stay in a field cache and segmentreaders use their freqStream as
> > > the fieldcache key. Only when the deletes are merged out would they
> > > invalidate - but because you're writing a new segment anyway ...
> > >
> > > - Mark
> > >
> > > John Wang wrote:
> > >> I understand what you are saying. Let me detail what I am trying to say:
> > >>
> > >> When "currently processed segments" are flushed down, merge may
> > >> happen. When merges happen, some of those "stable segments" will be
> > >> invalidated, and so will the fieldcache data keyed by them.
> > >>
> > >> In a high update environment, such scenarios can happen quite often.
> > >>
> > >> The way the default mergePolicy works is that small segments get
> > >> merged into the larger segments. Eventually, what will be invalidated
> > >> would be a large segment, and when that happens, a large chunk of the
> > >> field cache would be invalidated.
> > >>
> > >> Furthermore, in the case where there are high updates, the stable
> > >> segments can be invalidated much sooner when there are deletes in those
> > >> segments, and I would guess the corresponding FieldCache needs to be
> > >> adjusted. Not sure how it is handled right now.
> > >>
> > >> Just my two cents, and of course when I find the time I will need to
> > >> run some tests to see.
> > >>
> > >> -John
> > >>
> > >> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> > >>
> > >> The NRT reader coming from the IndexWriter.getReader() has only
> > >> changes in the currently processed segments, the other segments
> > >> keep stable (and even their IndexReader keys used for the
> > >> FieldCache). The rest of the segments keep stable. For the
> > >> consumer it looks like a normal reader (it is in fact a
> > >> ReadOnlyDirectoryReader) supporting getSequentialSubReaders() and
> > >> so on.
> > >>
> > >> -----
> > >> Uwe Schindler
> > >> H.-H.-Meier-Allee 63, D-28213 Bremen
> > >> http://www.thetaphi.de
> > >> eMail: u...@thetaphi.de
> > >>
> > >> ------------------------------------------------------------------------
> > >>
> > >> *From:* John Wang [mailto:john.w...@gmail.com]
> > >> *Sent:* Tuesday, September 22, 2009 9:32 AM
> > >> *To:* java-dev@lucene.apache.org
> > >> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
> > >>
> > >> Thanks Mark for the pointer!
> > >>
> > >> I guess my point is with NRT, and when segment files change often,
> > >> this would be an issue, no?
> > >>
> > >> Anyway, I can run some tests.
> > >>
> > >> Thanks
> > >>
> > >> -John
> > >>
> > >> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller
> > >> <markrmil...@gmail.com> wrote:
> > >>
> > >> 1483 - indexsearcher pulls out a reader's subreaders
> > >> (segmentreaders) and sends a collector over them one by one,
> > >> rather than using the multireader. So only fc for seg readers that
> > >> change need to be reloaded.
> > >>
> > >> - Mark
> > >>
> > >> http://www.lucidimagination.com (mobile)
> > >>
> > >> On Sep 22, 2009, at 1:27 AM, John Wang <john.w...@gmail.com> wrote:
> > >>
> > >>> Hi Yonik:
> > >>>
> > >>> Actually that is what I am looking for. Can you please point
> > >>> me to where/how sorting is done per-segment?
> > >>>
> > >>> When heavy indexing introduces or modifies segments, would
> > >>> it cause reloading of FieldCache at query time and thus would
> > >>> impact search performance?
> > >>>
> > >>> thanks
> > >>>
> > >>> -John
> > >>>
> > >>> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley
> > >>> <yo...@lucidimagination.com> wrote:
> > >>>
> > >>> On Tue, Sep 22, 2009 at 12:56 AM, John Wang <john.w...@gmail.com> wrote:
> > >>> > Looking at the code, seems there is a disconnect between how/when field
> > >>> > cache is loaded when IndexWriter.getReader() is called.
> > >>>
> > >>> I'm not sure what you mean by "disconnect"
> > >>>
> > >>> > Is FieldCache updated?
> > >>>
> > >>> FieldCache entries are populated on demand, as they always have been.
> > >>>
> > >>> > Otherwise, are we reloading FieldCache for each
> > >>> > reader instance?
> > >>>
> > >>> Searching/sorting is now per-segment, and so is the use of the
> > >>> FieldCache. Segments that don't change shouldn't have to reload
> > >>> their FieldCache entries.
> > >>>
> > >>> -Yonik
> > >>> http://www.lucidimagination.com
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> > >>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
> > >
> > > --
> > > - Mark
> > >
> > > http://www.lucidimagination.com
>
> --
> - Mark
>
> http://www.lucidimagination.com
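
For readers following the thread, here is a minimal, self-contained sketch of the idea under discussion: a warmer is handed the reader for a newly merged segment and loads that segment's cache entry up front, while entries for unchanged segments are left alone. It does not use Lucene. SegmentStub, CacheStub, Warmer, mergeAndWarm and demo are hypothetical stand-ins invented for illustration, and the checked IOException from the quoted IndexReaderWarmer signature is omitted to keep it short. How Lucene itself keys and loads FieldCache entries may differ in detail.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class WarmerSketch {

    /** Hypothetical stand-in for one segment's reader. NOT Lucene's SegmentReader. */
    static final class SegmentStub {
        final String name;
        final int[] values; // one sortable value per document

        SegmentStub(String name, int... values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Same shape as the class quoted in the thread, minus the checked IOException. */
    static abstract class Warmer {
        abstract void warm(SegmentStub reader);
    }

    /** Hypothetical stand-in for a field cache: one entry per segment, filled on demand. */
    static final class CacheStub {
        private final Map<SegmentStub, int[]> entries = new IdentityHashMap<>();
        int loads = 0; // how many times the (pretend) expensive load ran

        int[] get(SegmentStub segment) {
            int[] cached = entries.get(segment);
            if (cached == null) {
                loads++;
                cached = segment.values.clone(); // stands in for the costly load
                entries.put(segment, cached);
            }
            return cached;
        }
    }

    /** Merges segments into a new one and warms it before handing it back. */
    static SegmentStub mergeAndWarm(List<SegmentStub> toMerge, Warmer warmer) {
        int[] merged = toMerge.stream()
                .flatMapToInt(s -> Arrays.stream(s.values))
                .toArray();
        SegmentStub result = new SegmentStub("merged", merged);
        warmer.warm(result); // only the new segment is passed to the warmer
        return result;
    }

    /** Runs the scenario and returns the number of cache loads. */
    static int demo() {
        final CacheStub cache = new CacheStub();
        SegmentStub stable = new SegmentStub("stable", 5, 3, 9);
        cache.get(stable); // first search loads the stable segment once

        Warmer warmer = new Warmer() {
            @Override
            void warm(SegmentStub reader) {
                cache.get(reader); // load the entry before searches see the segment
            }
        };
        SegmentStub merged = mergeAndWarm(
                Arrays.asList(new SegmentStub("a", 1), new SegmentStub("b", 2)), warmer);

        // A later per-segment search finds both entries already cached.
        cache.get(stable);
        cache.get(merged);
        return cache.loads;
    }

    public static void main(String[] args) {
        System.out.println("field cache loads: " + demo());
    }
}
```

In this model the stable segment is loaded once and the merged segment is loaded once by the warmer, so a search that runs after the merge triggers no further loads. That is the behavior the replies above describe for per-segment caching combined with a merged-segment warmer.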