I think this (truly componentizing SegmentReader) makes tons of sense. After all, a SegmentReader is just a bunch of separate components handling different parts of the index.
This is really orthogonal to LUCENE-831 (the field cache is just one component). They can land in either order... Earwin do you want to take an initial stab (patch) at this?

I think it'll be interesting how the components API handles near real-time search, because we want/expect components to be able to merge themselves efficiently "in RAM" when possible. EG if field cache already has certain fields loaded, they can be merged in RAM; if not, they should be merged on disk. If field cache has pending changes (in a future world when CSF makes it possible to suddenly change say the price of certain documents), then the components must properly implement clone (ideally incremental copy-on-write cloning).

Mike

On Sun, Apr 12, 2009 at 7:34 PM, Earwin Burrfoot <ear...@gmail.com> wrote:
> To support my dream of kicking fieldCache out of the core and to add
> some extensibility to Lucene, I want to introduce IndexReaderPlugins.
> Rough pseudocode follows:
>
> interface IndexReaderPlugin {
>   void attach(SegmentReader reader);
>   void detach(SegmentReader reader);
>
>   void attach(MultiSegmentReader reader);
>   void detach(MultiSegmentReader reader);
> }
>
> IndexReader.java:
> private Map<Class, IndexReaderPlugin> plugins;
>
> on opening/closing toplevel/segment reader we iterate over plugins:
> for(IndexReaderPlugin plugin : plugins)
>   plugin.attach(reader);
>
> the map is passed to toplevel reader initially, and then shared with
> lowlevel readers, we can also retrieve plugins:
> public <T> T plugin(Class<T> pluginType);
>
> then we can do something like:
> indexReader.plugin(ValueSource.class).doSomething // lucene code
> indexReader.plugin(FieldsCache.class).forField(LAST_UPDATE_TIME).doSomething
> // my code
> filter.apply(indexReader.plugin(FilterCache.class)) // my code
>
> Benefits are numerous.
> We get rid of alien code like:
> +++ src/java/org/apache/lucene/index/SegmentReader.java (working copy)
> @@ -83,6 +86,8 @@
> + protected ValueSource valueSource;
> +
> @@ -555,6 +560,8 @@
> +
> + valueSource = new CachingValueSource(this, new UninversionValueSource(this));
>
> If I don't need ValueSource attached to my readers, I won't have it.
> If I need my custom caches attached to my readers, I can do it in a
> natural way instead of hacking around MergeScheduler, or comparing
> subreader lists.
> If I want, I can replace Lucene's native ValueSource with my own
> implementation, and all Lucene classes that use it will happily accept
> it.
>
> On second thought, we shouldn't share plugin map across subreaders. If
> we allow attach(SegmentReader reader) to return an instance of plugin
> (plugin decides if it is the same instance always, or per-reader), and
> populate the map for subreader with results of attach invoked on
> toplevel reader map, we'll turn this code:
> segmentReader.plugin(SomeClass.class).segmentReaderDependentMethod(segmentReader);
> into:
> segmentReader.plugin(SomeClass.class).segmentReaderDependentMethod();
> which makes more sense
>
> Any way the general idea is still the same.
>
> --
> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
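
For illustration only, here is a rough Java sketch of the "second thought" variant from the quoted proposal, where attach() returns the plugin instance a given segment reader should use and each reader keeps its own map. Every name in it (ReaderPlugin, SegmentReaderStub, FieldsCacheStub, PluginSketch) is a hypothetical stand-in invented for this sketch; none of them is an existing Lucene class, and this is not a proposed patch.

```java
import java.util.HashMap;
import java.util.Map;

// All names below are hypothetical stand-ins, not Lucene classes.

/** A plugin whose attach() returns the instance a given segment reader should use. */
interface ReaderPlugin {
    ReaderPlugin attach(SegmentReaderStub reader);
    void detach(SegmentReaderStub reader);
}

/** Minimal stand-in for a segment reader that owns its own plugin map. */
final class SegmentReaderStub {
    final String name;
    private final Map<Class<?>, ReaderPlugin> plugins = new HashMap<>();

    SegmentReaderStub(String name, Map<Class<?>, ReaderPlugin> toplevelPlugins) {
        this.name = name;
        // Populate this reader's map with the results of attach() invoked on
        // the toplevel plugins, instead of sharing the toplevel map itself.
        for (Map.Entry<Class<?>, ReaderPlugin> e : toplevelPlugins.entrySet()) {
            plugins.put(e.getKey(), e.getValue().attach(this));
        }
    }

    /** Typed lookup; returns null when no plugin is registered for the type. */
    <T> T plugin(Class<T> pluginType) {
        return pluginType.cast(plugins.get(pluginType));
    }

    void close() {
        for (ReaderPlugin p : plugins.values()) {
            p.detach(this);
        }
        plugins.clear();
    }
}

/** Example plugin that chooses to hand out one instance per reader. */
final class FieldsCacheStub implements ReaderPlugin {
    private final SegmentReaderStub owner; // null for the toplevel template

    FieldsCacheStub() { this(null); }
    private FieldsCacheStub(SegmentReaderStub owner) { this.owner = owner; }

    @Override public ReaderPlugin attach(SegmentReaderStub reader) {
        return new FieldsCacheStub(reader);
    }

    @Override public void detach(SegmentReaderStub reader) {
        // A real cache would release its per-segment data here.
    }

    /** No reader argument needed: the instance already knows its owner. */
    String describe() {
        return "cache for " + (owner == null ? "toplevel" : owner.name);
    }
}

final class PluginSketch {
    public static void main(String[] args) {
        Map<Class<?>, ReaderPlugin> toplevel = new HashMap<>();
        toplevel.put(FieldsCacheStub.class, new FieldsCacheStub());
        SegmentReaderStub seg = new SegmentReaderStub("_0", toplevel);
        System.out.println(seg.plugin(FieldsCacheStub.class).describe());
        seg.close();
    }
}
```

The sketch assumes that each plugin's attach() returns an object compatible with the class it was registered under; otherwise the typed lookup fails with a ClassCastException. A plugin that prefers a single shared instance could simply return `this` from attach().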