Right - when a large segment is invalidated, you will have a bigger FieldCache piece to reload. Pre-2.9, though, you'd be reloading the *whole* field cache every time. Sounds like you are trying to deal with those large segments changing anyway :) They always seem to be an issue when doing RT.
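
To put rough numbers on that, here is a toy sketch in plain Java. It is not Lucene code: `Segment`, `PerSegmentCache`, and the doc counts are all made up for illustration, and the "load" is just an array allocation. It only assumes that cache entries are keyed per segment and that a merge replaces the merged segments with a new one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only, not Lucene code. Every name here is hypothetical.
class FieldCacheReloadSketch {

    // Stand-in for a segment reader. It does not override equals/hashCode,
    // so it behaves as an identity key, like a per-segment cache key would.
    static final class Segment {
        final String name;
        final int docCount;
        Segment(String name, int docCount) {
            this.name = name;
            this.docCount = docCount;
        }
    }

    // Cache keyed by segment; keeps one int[] of values per live segment.
    static final class PerSegmentCache {
        private final Map<Segment, int[]> entries = new HashMap<>();

        // Returns how many docs' worth of values had to be (re)loaded.
        long warm(List<Segment> liveSegments) {
            entries.keySet().retainAll(liveSegments); // evict merged-away segments
            long loaded = 0;
            for (Segment s : liveSegments) {
                if (!entries.containsKey(s)) {
                    entries.put(s, new int[s.docCount]); // stands in for the real load
                    loaded += s.docCount;
                }
            }
            return loaded;
        }
    }

    // Walks through four index states and returns the reload cost at each one.
    static long[] simulate() {
        PerSegmentCache cache = new PerSegmentCache();
        Segment big = new Segment("big", 1_000_000);
        Segment mid = new Segment("mid", 10_000);
        Segment small = new Segment("small", 100);
        Segment tiny = new Segment("tiny", 50);
        Segment merged = new Segment("small+tiny", 150);
        Segment all = new Segment("everything", 1_010_150);

        List<List<Segment>> states = new ArrayList<>();
        states.add(Arrays.asList(big, mid, small));       // first search
        states.add(Arrays.asList(big, mid, small, tiny)); // flush adds a new small segment
        states.add(Arrays.asList(big, mid, merged));      // small merge
        states.add(Arrays.asList(all));                   // merge that swallows the big segment

        long[] costs = new long[states.size()];
        for (int i = 0; i < costs.length; i++) {
            costs[i] = cache.warm(states.get(i));
        }
        return costs;
    }

    public static void main(String[] args) {
        String[] labels = {"first search", "after flush", "after small merge", "after big merge"};
        long[] costs = simulate();
        for (int i = 0; i < costs.length; i++) {
            System.out.println(labels[i] + ": reloaded values for " + costs[i] + " docs");
        }
    }
}
```

In this model the flush and the small merge only reload the new small segments, and the expensive step is the merge that replaces the big segment. A cache keyed on the whole reader would pay the full doc count at every one of those steps.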
I don't believe deletes invalidate a field cache - terms from deleted docs stay in a field cache, and SegmentReaders use their freqStream as the FieldCache key. Only when the deletes are merged out would they invalidate - but because you're writing a new segment anyway ...

- Mark

John Wang wrote:
> I understand what you are saying. Let me detail what I am trying to say:
>
> When "currently processed segments" are flushed down, merge may
> happen. When merges happen, some of those "stable segments" will be
> invalidated, and so will the fieldcache data keyed by them.
>
> In a high update environment, such scenarios can happen quite often.
>
> The way the default mergePolicy works is that small segments get
> merged into the larger segments. Eventually, what will be invalidated
> would be a large segment, and when that happens, a large chunk of the
> field cache would be invalidated.
>
> Furthermore, in the case where there are high updates, the stable
> segments can be invalidated much sooner when there are deletes in those
> segments, and I would guess the corresponding FieldCache needs to be
> adjusted. Not sure how it is handled right now.
>
> Just my two cents, and of course when I find the time I will need to
> run some tests to see.
>
> -John
>
> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>
>     The NRT reader coming from the IndexWriter.getReader() has only
>     changes in the currently processed segments, the other segments
>     keep stable (and even their IndexReader keys used for the
>     FieldCache). The rest of the segments keep stable. For the
>     consumer it looks like a normal reader (it is in fact a
>     ReadOnlyDirectoryReader) supporting getSequentialSubReaders() and
>     so on.
>
>     -----
>     Uwe Schindler
>     H.-H.-Meier-Allee 63, D-28213 Bremen
>     http://www.thetaphi.de
>     eMail: u...@thetaphi.de
>
>     ------------------------------------------------------------------------
>
>     *From:* John Wang [mailto:john.w...@gmail.com]
>     *Sent:* Tuesday, September 22, 2009 9:32 AM
>     *To:* java-dev@lucene.apache.org
>     *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
>
>     Thanks Mark for the pointer!
>
>     I guess my point is with NRT, and when segment files change often,
>     this would be an issue, no?
>
>     Anyway, I can run some tests.
>
>     Thanks
>
>     -John
>
>     On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller
>     <markrmil...@gmail.com> wrote:
>
>     1483 - indexsearcher pulls out a readers subreaders
>     (segmentreaders) and sends a collector over them one by one,
>     rather than using the multireader. So only fc for seg readers that
>     change need to be reloaded.
>
>     - Mark
>
>     http://www.lucidimagination.com (mobile)
>
>     On Sep 22, 2009, at 1:27 AM, John Wang <john.w...@gmail.com> wrote:
>
>>     Hi Yonik:
>>
>>     Actually that is what I am looking for. Can you please point
>>     me to where/how sorting is done per-segment?
>>
>>     When heavy indexing introduces or modifies segments, would
>>     it cause reloading of FieldCache at query time and thus would
>>     impact search performance?
>>
>>     thanks
>>
>>     -John
>>
>>     On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley
>>     <yo...@lucidimagination.com> wrote:
>>
>>     On Tue, Sep 22, 2009 at 12:56 AM, John Wang <john.w...@gmail.com> wrote:
>>     > Looking at the code, seems there is a disconnect between how/when field
>>     > cache is loaded when IndexWriter.getReader() is called.
>>
>>     I'm not sure what you mean by "disconnect"
>>
>>     > Is FieldCache updated?
>>
>>     FieldCache entries are populated on demand, as they always have been.
>>
>>     > Otherwise, are we reloading FieldCache for each
>>     > reader instance?
>>
>>     Searching/sorting is now per-segment, and so is the use of the
>>     FieldCache. Segments that don't change shouldn't have to reload
>>     their FieldCache entries.
>>
>>     -Yonik
>>     http://www.lucidimagination.com
>>
>>     ---------------------------------------------------------------------
>>     To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>>     For additional commands, e-mail: java-dev-h...@lucene.apache.org
>

--
- Mark

http://www.lucidimagination.com