[
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657401#action_12657401
]
Jeremy Volkman commented on LUCENE-831:
---------------------------------------
A couple of things:
# Looking at the getCachedData method for MultiReader and MultiSegmentReader,
it doesn't appear that the CacheData objects from merge operations are cached.
Is there any reason for this?
# I've written a merge method for StringIndexCacheKey. The process isn't all
that complicated (apart from all of the off-by-ones), but it's expensive.
{code:java}
public boolean isMergable() {
  return true;
}

private static class OrderNode {
  int index;
  OrderNode next;
}

public CacheData mergeData(int[] starts, CacheData[] data)
    throws UnsupportedOperationException {
  int[] mergedOrder = new int[starts[starts.length - 1]];
  // Lookup map is 1-based
  String[] mergedLookup = new String[starts[starts.length - 1] + 1];

  // Unwrap cache payloads and flip order arrays
  StringIndex[] unwrapped = new StringIndex[data.length];
  /* Flip the order arrays (reverse indices and values)
   * Since the ord map has a many-to-one relationship with the lookup table,
   * the flipped structure must be one-to-many which results in an array of
   * linked lists.
   */
  OrderNode[][] flippedOrders = new OrderNode[data.length][];
  for (int i = 0; i < data.length; i++) {
    StringIndex si = (StringIndex) data[i].getCachePayload();
    unwrapped[i] = si;
    flippedOrders[i] = new OrderNode[si.lookup.length];
    for (int j = 0; j < si.order.length; j++) {
      OrderNode a = new OrderNode();
      a.index = j;
      a.next = flippedOrders[i][si.order[j]];
      flippedOrders[i][si.order[j]] = a;
    }
  }

  // Lookup map is 1-based
  int[] lookupIndices = new int[unwrapped.length];
  Arrays.fill(lookupIndices, 1);
  int lookupIndex = 0;
  String currentVal;
  int currentSeg;
  while (true) {
    currentVal = null;
    currentSeg = -1;
    int remaining = 0;
    // Find the next ordered value from all the segments
    for (int i = 0; i < unwrapped.length; i++) {
      if (lookupIndices[i] < unwrapped[i].lookup.length) {
        remaining++;
        String that = unwrapped[i].lookup[lookupIndices[i]];
        if (currentVal == null || currentVal.compareTo(that) > 0) {
          currentVal = that;
          currentSeg = i;
        }
      }
    }
    if (remaining == 1) {
      break;
    } else if (remaining == 0) {
      /* The only way this could happen is if there are 0 segments or if
       * all segments have 0 terms. In either case, we can return
       * early.
       */
      return new CacheData(new StringIndex(
          new int[starts[starts.length - 1]], new String[1]));
    }
    if (!currentVal.equals(mergedLookup[lookupIndex])) {
      lookupIndex++;
      mergedLookup[lookupIndex] = currentVal;
    }
    OrderNode a = flippedOrders[currentSeg][lookupIndices[currentSeg]];
    while (a != null) {
      mergedOrder[a.index + starts[currentSeg]] = lookupIndex;
      a = a.next;
    }
    lookupIndices[currentSeg]++;
  }
{code}
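For comparison, here is a minimal, self-contained sketch of the same kind of merge that avoids the flipped linked lists. It is not the method from the patch: it swaps in a per-segment remap table (old ord to merged ord) built during a k-way merge of the sorted lookup tables, then rewrites each document's ord in a second pass. The class `StringIndexMergeSketch` and its nested `Index` type are hypothetical stand-ins for StringIndex, and the sketch assumes, as the code above does, that `lookup[0]` is unused, that ord 0 means "no term", and that `starts` holds one doc base per segment followed by the total document count.

```java
import java.util.Arrays;

// Hypothetical stand-in types; not part of Lucene or of the LUCENE-831 patch.
final class StringIndexMergeSketch {

    /** Simplified StringIndex: order[doc] is a 1-based ord into lookup, 0 = no term. */
    static final class Index {
        final int[] order;
        final String[] lookup; // sorted distinct terms; slot 0 is unused (null)

        Index(int[] order, String[] lookup) {
            this.order = order;
            this.lookup = lookup;
        }
    }

    /** starts[i] is the doc base of segment i; starts[segs.length] is the total doc count. */
    static Index merge(int[] starts, Index[] segs) {
        int maxDoc = starts[segs.length];
        int totalTerms = 0;
        for (Index s : segs) {
            totalTerms += s.lookup.length - 1;
        }

        // Upper bound on distinct terms; trimmed before returning.
        String[] mergedLookup = new String[totalTerms + 1];
        int[][] remap = new int[segs.length][]; // remap[seg][oldOrd] = mergedOrd
        int[] pos = new int[segs.length];       // next unread slot in each lookup table
        for (int i = 0; i < segs.length; i++) {
            remap[i] = new int[segs[i].lookup.length]; // remap[i][0] stays 0
            pos[i] = 1;
        }

        int mergedOrd = 0;
        while (true) {
            // Find the smallest unread term across all segments.
            String smallest = null;
            for (int i = 0; i < segs.length; i++) {
                if (pos[i] < segs[i].lookup.length) {
                    String candidate = segs[i].lookup[pos[i]];
                    if (smallest == null || candidate.compareTo(smallest) < 0) {
                        smallest = candidate;
                    }
                }
            }
            if (smallest == null) {
                break; // every lookup table is exhausted
            }
            mergedLookup[++mergedOrd] = smallest;
            // Advance every segment whose current term equals it, so duplicates collapse.
            for (int i = 0; i < segs.length; i++) {
                if (pos[i] < segs[i].lookup.length
                        && segs[i].lookup[pos[i]].equals(smallest)) {
                    remap[i][pos[i]] = mergedOrd;
                    pos[i]++;
                }
            }
        }

        // Second pass: translate each document's ord through its segment's remap table.
        int[] mergedOrder = new int[maxDoc];
        for (int i = 0; i < segs.length; i++) {
            for (int doc = 0; doc < segs[i].order.length; doc++) {
                mergedOrder[starts[i] + doc] = remap[i][segs[i].order[doc]];
            }
        }
        return new Index(mergedOrder, Arrays.copyOf(mergedLookup, mergedOrd + 1));
    }
}
```

The remap table costs one int per term per segment instead of one node per document, and the linear scan for the smallest term could be replaced with a priority queue when there are many segments. Whether either change matters in practice would need measuring against real indexes.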
> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>
> Key: LUCENE-831
> URL: https://issues.apache.org/jira/browse/LUCENE-831
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Reporter: Hoss Man
> Fix For: 3.0
>
> Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff,
> fieldcache-overhaul.diff, fieldcache-overhaul.diff,
> LUCENE-831.03.28.2008.diff, LUCENE-831.03.30.2008.diff,
> LUCENE-831.03.31.2008.diff, LUCENE-831.patch, LUCENE-831.patch,
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch
>
>
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
> a) eliminate global static map keyed on IndexReader (thus
> eliminating synch block between completely independent IndexReaders)
> b) allow more customization of cache management (ie: use
> expiration/replacement strategies, disk backed caches, etc)
> c) allow people to define custom cache data logic (ie: custom
> parsers, complex datatypes, etc... anything tied to a reader)
> d) allow people to inspect what's in a cache (list of CacheKeys) for
> an IndexReader so a new IndexReader can be likewise warmed.
> e) Lend support for smarter cache management if/when
> IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
> the new implementation, so there is no redundant caching as client code
> migrates to new API.