[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708307#action_12708307
]
Mark Miller commented on LUCENE-831:
------------------------------------
I likely won't be getting to this anytime soon, so if someone else wants to
work on it, go ahead. If not, I'll get back to it at some point.
I believe the latest patch is a nice base to work from.
It's still not clear to me whether it's best to start merging using the ValueSource
somehow, or do something where the ValueSource has a merge implementation
(allowing for a more efficient private merge). It seems the merge code for
fields, norms, dels, is fairly specialized now, but could become a bit more
generic. Then perhaps you could add any old ValueSource (other than norms,
fields, dels) and easily hook into the merge process. Maybe even in RAM merges
of RAM based ValueSources - FieldCache etc. Of course, I guess you could also
still do things specialized as now, and just provide access to the files
through a ValueSource. That really crimps the pluggability though.
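To make the "ValueSource has a merge implementation" option a bit more concrete, here is a rough sketch of an in-RAM merge for int values that skips deleted docs. Everything in it (IntSegment, ValueMerger, IntValueMerger) is a hypothetical name made up for illustration; none of it is in the patch or in Lucene.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical: one segment's int values plus that segment's deleted docs. */
final class IntSegment {
    final int[] values;
    final BitSet deleted;

    IntSegment(int[] values, BitSet deleted) {
        this.values = values;
        this.deleted = deleted;
    }
}

/** Hypothetical merge hook that a ValueSource implementation could supply. */
interface ValueMerger<S, R> {
    R merge(List<S> segments);
}

/** Merges RAM-resident int values in segment order, dropping deleted docs. */
final class IntValueMerger implements ValueMerger<IntSegment, int[]> {
    @Override
    public int[] merge(List<IntSegment> segments) {
        // First pass: count live docs so the merged array can be sized exactly.
        int live = 0;
        for (IntSegment seg : segments) {
            for (int doc = 0; doc < seg.values.length; doc++) {
                if (!seg.deleted.get(doc)) {
                    live++;
                }
            }
        }
        // Second pass: copy live values; doc ids are renumbered densely.
        int[] merged = new int[live];
        int next = 0;
        for (IntSegment seg : segments) {
            for (int doc = 0; doc < seg.values.length; doc++) {
                if (!seg.deleted.get(doc)) {
                    merged[next++] = seg.values[doc];
                }
            }
        }
        return merged;
    }
}
```

A generic merge process would only need to know about the hook, and each kind of source would bring its own merger. Whether that is efficient enough compared to the specialized code is exactly the open question.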
The next step (in terms of the current patch) seems to be to start working
ValueSource into norms, dels, possibly stored fields. Eventually they should
become pluggable, but I'm not sure how best to plug them in. I was thinking you
could set a default ValueSource by field for the FieldCache using the Reader
open method with a new param. Perhaps it should take a ValueSourceFactory that
can provide a variety of ValueSources based on field, norms, dels, stored
fields, with variations for read-only? The proposed componentization of
IndexReader could be another approach if it materializes, or it could be
worked into this issue.
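As a rough illustration of the factory idea, something along these lines might work. SourceKind, ValueSourceFactory and PerFieldValueSourceFactory are hypothetical names, and the source type is left generic so the sketch does not commit to any particular ValueSource shape. How the factory would actually be passed to the Reader open method is not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical: the kinds of data a factory could be asked to supply. */
enum SourceKind { FIELD, NORMS, DELS, STORED_FIELDS }

/** Hypothetical factory; S stands in for whatever ValueSource type is used. */
interface ValueSourceFactory<S> {
    /** field may be null for kinds that are not per-field, such as DELS. */
    S create(SourceKind kind, String field, boolean readOnly);
}

/** Picks a factory registered for the field, else falls back to a default. */
final class PerFieldValueSourceFactory<S> implements ValueSourceFactory<S> {
    private final ValueSourceFactory<S> fallback;
    private final Map<String, ValueSourceFactory<S>> byField = new HashMap<>();

    PerFieldValueSourceFactory(ValueSourceFactory<S> fallback) {
        this.fallback = fallback;
    }

    PerFieldValueSourceFactory<S> register(String field, ValueSourceFactory<S> factory) {
        byField.put(field, factory);
        return this;
    }

    @Override
    public S create(SourceKind kind, String field, boolean readOnly) {
        // HashMap accepts a null key, so a null field simply hits the fallback.
        return byField.getOrDefault(field, fallback).create(kind, field, readOnly);
    }
}
```

The per-field registration covers the "default ValueSource by field" case, and the readOnly flag is one possible way to ask for the read-only variations.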
I don't think I'll understand what's needed for updatability until I'm in
deeper. It almost seems like something like setInt(int doc, int n), setByte(int
doc, byte b) on the ValueSource might work. They could possibly throw
Unsupported. I know there are a lot of little difficulties involved in all of
this though, so I'm not very sure of anything at the moment. The backing impl
would be free to update in RAM (say synced dels), or do a copy on write, etc. I
guess all methods would throw Unsupported by default, but if you override a
getXXX you would have the option of overriding a setXXX.
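A minimal sketch of that getXXX/setXXX idea, assuming a ValueSource base class shaped like this (the real class in the patch may look quite different, and the two subclasses are hypothetical examples):

```java
/** Hypothetical base class: every accessor throws unless a subclass overrides it. */
abstract class ValueSource {
    public int getInt(int doc) {
        throw new UnsupportedOperationException("getInt");
    }

    public byte getByte(int doc) {
        throw new UnsupportedOperationException("getByte");
    }

    public void setInt(int doc, int n) {
        throw new UnsupportedOperationException("setInt");
    }

    public void setByte(int doc, byte b) {
        throw new UnsupportedOperationException("setByte");
    }
}

/** Read-only source: overrides getInt only, so setInt still throws. */
class ReadOnlyIntValueSource extends ValueSource {
    private final int[] values;

    ReadOnlyIntValueSource(int[] values) {
        this.values = values.clone();
    }

    @Override
    public int getInt(int doc) {
        return values[doc];
    }
}

/** Updatable source: overrides getInt and opts in to setInt, updating in RAM. */
class UpdatableIntValueSource extends ValueSource {
    private final int[] values;

    UpdatableIntValueSource(int maxDoc) {
        this.values = new int[maxDoc];
    }

    @Override
    public synchronized int getInt(int doc) {
        return values[doc];
    }

    @Override
    public synchronized void setInt(int doc, int n) {
        values[doc] = n;
    }
}
```

Here the updatable variant just syncs and writes in place; a different backing impl could do copy on write behind the same setInt.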
ValueSources also need the ability to be sharable across IndexReaders with the
ability to do copy on write if they are shared and updatable.
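One way the sharing plus copy on write could look, as a sketch only. CowByteSource and its owner counting are assumptions for illustration, not anything from the patch: the backing array is shared between readers until one of them writes, at which point the writer detaches onto a private copy.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical byte-valued source (say norms) whose data can be shared. */
final class CowByteSource {
    /** Backing array plus a count of the sources currently pointing at it. */
    private static final class Shared {
        final byte[] data;
        final AtomicInteger owners = new AtomicInteger(1);

        Shared(byte[] data) {
            this.data = data;
        }
    }

    private Shared shared;

    CowByteSource(byte[] initial) {
        this.shared = new Shared(initial.clone());
    }

    private CowByteSource(Shared shared) {
        this.shared = shared;
    }

    /** Hands the same backing array to another reader without copying. */
    synchronized CowByteSource share() {
        shared.owners.incrementAndGet();
        return new CowByteSource(shared);
    }

    synchronized byte getByte(int doc) {
        return shared.data[doc];
    }

    synchronized void setByte(int doc, byte b) {
        if (shared.owners.get() > 1) {
            // Someone else still sees this array: detach onto a private copy.
            Shared old = shared;
            shared = new Shared(old.data.clone());
            old.owners.decrementAndGet();
        }
        // At this point the array is private to this source.
        shared.data[doc] = b;
    }
}
```

The copy only happens on the first write after sharing, so read-only readers never pay for it.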
> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>
> Key: LUCENE-831
> URL: https://issues.apache.org/jira/browse/LUCENE-831
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Reporter: Hoss Man
> Assignee: Mark Miller
> Fix For: 3.0
>
> Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff,
> fieldcache-overhaul.diff, fieldcache-overhaul.diff,
> LUCENE-831-trieimpl.patch, LUCENE-831.03.28.2008.diff,
> LUCENE-831.03.30.2008.diff, LUCENE-831.03.31.2008.diff, LUCENE-831.patch,
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch,
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch,
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch,
> LUCENE-831.patch
>
>
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
> a) eliminate global static map keyed on IndexReader (thus
> eliminating synch block between completely independent IndexReaders)
> b) allow more customization of cache management (ie: use
> expiration/replacement strategies, disk backed caches, etc)
> c) allow people to define custom cache data logic (ie: custom
> parsers, complex datatypes, etc... anything tied to a reader)
> d) allow people to inspect what's in a cache (list of CacheKeys) for
> an IndexReader so a new IndexReader can be likewise warmed.
> e) Lend support for smarter cache management if/when
> IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
> the new implementation, so there is no redundant caching as client code
> migrates to new API.