[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699880#action_12699880 ]
Mark Miller commented on LUCENE-831: ------------------------------------ Okay, now that I half way understand this issue, I think I have to go back to the basic motivations. The original big win was taken away by 1483, so lets see if we really need a new API for the wins we have left. h3. Advantage of new API (kind of as it is in the patch) FieldCache is interface and it would be nice to move to abstract class, ExtendedFieldCache is ugly Avoid global sync by IndexReader to access cache its easier/cleaner to block caching by multireaders (though I am almost thinking I would prefer warnings/advice about performance and encouragement to move to per segment) It becomes easier to share a ValueSource instance across readers. h3. Disadvantages of new API If we want only SegmentReaders to have a ValueSource, you can't efficiently back the old API with the new, causing RAM reqs jumps if you straddle the two APIs and ask for the same array data from each. Its probably a higher barrier to a custom Parser to implement and init a Reader with a ValueSource (presumably that works per field) than to simply pass the Parser on a SortField. However, Parser stops making sense if we end up being able to back ValueSource with column stride fields. We could allow ValueSource to be passed on the SortField (the current incarnation of this patch), but then you have to go back to a global cache by reader the ValueSources passed that way (you would also still have the per segment reader, settable ValueSource). h3. Advantages of staying with old API Avoid forcing large migration for users, with possible RAM req penalties if they don't switch from deprecated code (we are doing something similar with 1483 even without deprecated code though - if you were using an external multireader FieldCache that matched a sort FieldCache key, youd double your RAM reqs). h3. Thoughts If we stayed with the old API, we could still allow a custom FieldCache to be supplied. We could still back FieldCacheImpl with Uninverter to reduce code. We could still have CachingFieldCache. Though CachingValueSource is much better :) FieldCache implies caching, and so the name would be confusing. We could also avoid CachingFieldCache though, as just making a pluggable FieldCache would allow alternate caching implementations (with a bit more effort). We could deprecate the Parser methods and force supplying a new FieldCache impl for custom uninversion to get to an API suitable to be backed by CSF. Or: We could also move to ValueSource, but allow a ValueSource on multi-readers. That would probably make straddling the API's much more possible (and efficient) in the default case. We could advise that its best to work per segment, but leave the option to the user. h3. Conclusion I am not sure. I thought I was convinced we might as well not even move from FieldCache at all, but now that I've written a bit out, I'm thinking it would be worth going to ValueSource. I'm just not positive on what we should support. SortField ValueSource override keyed by reader? ValueSources on MultiReaders? > Complete overhaul of FieldCache API/Implementation > -------------------------------------------------- > > Key: LUCENE-831 > URL: https://issues.apache.org/jira/browse/LUCENE-831 > Project: Lucene - Java > Issue Type: Improvement > Components: Search > Reporter: Hoss Man > Assignee: Mark Miller > Fix For: 3.0 > > Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff, > fieldcache-overhaul.diff, fieldcache-overhaul.diff, > LUCENE-831-trieimpl.patch, LUCENE-831.03.28.2008.diff, > LUCENE-831.03.30.2008.diff, LUCENE-831.03.31.2008.diff, LUCENE-831.patch, > LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, > LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, > LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, > LUCENE-831.patch > > > Motivation: > 1) Complete overhaul the API/implementation of "FieldCache" type things... > a) eliminate global static map keyed on IndexReader (thus > eliminating synch block between completley independent IndexReaders) > b) allow more customization of cache management (ie: use > expiration/replacement strategies, disk backed caches, etc) > c) allow people to define custom cache data logic (ie: custom > parsers, complex datatypes, etc... anything tied to a reader) > d) allow people to inspect what's in a cache (list of CacheKeys) for > an IndexReader so a new IndexReader can be likewise warmed. > e) Lend support for smarter cache management if/when > IndexReader.reopen is added (merging of cached data from subReaders). > 2) Provide backwards compatibility to support existing FieldCache API with > the new implementation, so there is no redundent caching as client code > migrades to new API. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org