This is exactly why we added IndexWriter.setMergedSegmentWarmer -- you can warm the newly merged segment's reader without blocking ongoing updates.
Mike

On Tue, Sep 22, 2009 at 7:15 PM, Mark Miller <markrmil...@gmail.com> wrote:
> Right - when a large segment is invalidated, you will have a bigger
> fieldcache piece to reload - pre 2.9, you'd be reloading the *whole*
> field cache every time though. Sounds like you are trying to deal with
> those large segments changing anyway :) They are always an issue when
> doing RT it seems.
>
> I don't believe deletes invalidate a field cache - terms from deleted
> docs stay in a field cache and segmentreaders use their freqStream as
> the fieldcache key. Only when the deletes are merged out would they
> invalidate - but because you're writing a new segment anyway ...
>
> - Mark
>
> John Wang wrote:
>> I understand what you are saying. Let me detail what I am trying to say:
>>
>> When "currently processed segments" are flushed down, merge may
>> happen. When merges happen, some of those "stable segments" will be
>> invalidated, and so will the fieldcache data keyed by them.
>>
>> In a high update environment, such scenarios can happen quite often.
>>
>> The way the default mergePolicy works is that small segments get
>> merged into the larger segments. Eventually, what will be invalidated
>> would be a large segment, and when that happens, a large chunk of the
>> field cache would be invalidated.
>>
>> Furthermore, in the case where there are high updates, the stable
>> segments can be invalidated much sooner when there are deletes in those
>> segments, and I would guess the corresponding FieldCache needs to be
>> adjusted. Not sure how it is handled right now.
>>
>> Just my two cents, and of course when I find the time I will need to
>> run some tests to see.
>>
>> -John
>>
>> On Tue, Sep 22, 2009 at 3:59 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>
>> The NRT reader coming from the IndexWriter.getReader() has only
>> changes in the currently processed segments, the other segments
>> keep stable (and even their IndexReader keys used for the
>> FieldCache). The rest of the segments keep stable. For the
>> consumer it looks like a normal reader (it is in fact a
>> ReadOnlyDirectoryReader) supporting getSequentialSubReaders() and
>> so on.
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> ------------------------------------------------------------------------
>>
>> *From:* John Wang [mailto:john.w...@gmail.com]
>> *Sent:* Tuesday, September 22, 2009 9:32 AM
>> *To:* java-dev@lucene.apache.org
>> *Subject:* Re: 2.9 NRT w.r.t. sorting and field cache
>>
>> Thanks Mark for the pointer!
>>
>> I guess my point is with NRT, and when segment files change often,
>> this would be an issue, no?
>>
>> Anyway, I can run some tests.
>>
>> Thanks
>>
>> -John
>>
>> On Tue, Sep 22, 2009 at 3:21 PM, Mark Miller
>> <markrmil...@gmail.com> wrote:
>>
>> 1483 - indexsearcher pulls out a reader's subreaders
>> (segmentreaders) and sends a collector over them one by one,
>> rather than using the multireader. So only fc for seg readers that
>> change need to be reloaded.
>>
>> - Mark
>>
>> http://www.lucidimagination.com (mobile)
>>
>> On Sep 22, 2009, at 1:27 AM, John Wang <john.w...@gmail.com> wrote:
>>
>>> Hi Yonik:
>>>
>>> Actually that is what I am looking for. Can you please point
>>> me to where/how sorting is done per-segment?
>>>
>>> When heavy indexing introduces or modifies segments, would
>>> it cause reloading of FieldCache at query time and thus would
>>> impact search performance?
>>>
>>> thanks
>>>
>>> -John
>>>
>>> On Tue, Sep 22, 2009 at 1:05 PM, Yonik Seeley
>>> <yo...@lucidimagination.com> wrote:
>>>
>>> On Tue, Sep 22, 2009 at 12:56 AM, John Wang <john.w...@gmail.com>
>>> wrote:
>>> > Looking at the code, seems there is a disconnect between
>>> > how/when field cache is loaded when IndexWriter.getReader()
>>> > is called.
>>>
>>> I'm not sure what you mean by "disconnect"
>>>
>>> > Is FieldCache updated?
>>>
>>> FieldCache entries are populated on demand, as they always have been.
>>>
>>> > Otherwise, are we reloading FieldCache for each
>>> > reader instance?
>>>
>>> Searching/sorting is now per-segment, and so is the use of the
>>> FieldCache. Segments that don't change shouldn't have to reload
>>> their FieldCache entries.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
> --
> - Mark
>
> http://www.lucidimagination.com
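
To make the per-segment point in this thread concrete, here is a minimal Java sketch. It deliberately does not use Lucene: Segment, SegmentFieldCache, search and merge are all hypothetical stand-ins invented for illustration, and the call inside merge only plays the role that a merged-segment warmer would play. It shows a cache keyed by segment identity, so unchanged segments never reload, and a merged segment that is warmed before searches see it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class PerSegmentCacheSketch {

    /** Hypothetical stand-in for a segment reader; NOT a Lucene class. */
    static final class Segment {
        final String name;
        final String[] values; // one field value per document
        Segment(String name, String... values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Hypothetical cache keyed by segment identity (assumption: strong keys, no eviction). */
    static final class SegmentFieldCache {
        private final Map<Segment, String[]> entries = new IdentityHashMap<>();
        int loads = 0; // number of (expensive) cache loads so far

        synchronized String[] get(Segment seg) {
            String[] cached = entries.get(seg);
            if (cached == null) {
                loads++;
                cached = seg.values.clone(); // stands in for the costly load
                entries.put(seg, cached);
            }
            return cached;
        }
    }

    /** Visits each segment one by one; returns how many cache loads this search triggered. */
    static int search(SegmentFieldCache cache, List<Segment> segments) {
        int before = cache.loads;
        for (Segment seg : segments) {
            cache.get(seg);
        }
        return cache.loads - before;
    }

    /** Merges segments and warms the result before returning it to searchers. */
    static Segment merge(SegmentFieldCache cache, List<Segment> toMerge, String name) {
        List<String> all = new ArrayList<>();
        for (Segment s : toMerge) {
            all.addAll(Arrays.asList(s.values));
        }
        Segment merged = new Segment(name, all.toArray(new String[0]));
        cache.get(merged); // warm-up step: pay the load cost here, not at query time
        return merged;
    }

    public static void main(String[] args) {
        SegmentFieldCache cache = new SegmentFieldCache();
        Segment big = new Segment("big", "a", "b", "c");
        Segment s1 = new Segment("s1", "d");
        Segment s2 = new Segment("s2", "e");

        System.out.println("first search loads: " + search(cache, Arrays.asList(big, s1, s2)));
        Segment merged = merge(cache, Arrays.asList(s1, s2), "merged");
        System.out.println("search after merge loads: " + search(cache, Arrays.asList(big, merged)));
    }
}
```

In this toy model the first search loads three entries and the search after the merge loads none, because "big" is unchanged and "merged" was warmed during the merge. A real cache would also need to release entries for segments that are no longer referenced, which this sketch leaves out.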