[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

Jeremy Volkman (JIRA) Wed, 17 Dec 2008 06:30:08 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657401#action_12657401
 ]


Jeremy Volkman commented on LUCENE-831:
---------------------------------------

A couple of things:

# Looking at the getCachedData method for MultiReader and MultiSegmentReader, 
it doesn't appear that the CacheData objects from merge operations are cached.  
Is there any reason for this?
# I've written a merge method for StringIndexCacheKey. The process isn't all 
that complicated (apart from all of the off-by-ones), but it's expensive.

{code:java}
  public boolean isMergable() {
    return true;
  }

  private static class OrderNode {
      int index;
      OrderNode next;
  }
  
  public CacheData mergeData(int[] starts, CacheData[] data) 
  throws UnsupportedOperationException {
    int[] mergedOrder = new int[starts[starts.length - 1]];
    // Lookup table is 1-based: slot 0 stays null for documents with no term
    String[] mergedLookup = new String[starts[starts.length - 1] + 1];
    
    // Unwrap cache payloads and flip order arrays
    StringIndex[] unwrapped = new StringIndex[data.length];

    /* Flip the order arrays (reverse indices and values)
     * Since the ord map has a many-to-one relationship with the lookup table,
     * the flipped structure must be one-to-many which results in an array of
     * linked lists.
     */
    OrderNode[][] flippedOrders = new OrderNode[data.length][];
    for (int i = 0; i < data.length; i++) {
        StringIndex si = (StringIndex) data[i].getCachePayload();
        unwrapped[i] = si;
        flippedOrders[i] = new OrderNode[si.lookup.length];
        for (int j = 0; j < si.order.length; j++) {
            OrderNode a = new OrderNode();
            a.index = j;
            a.next = flippedOrders[i][si.order[j]];
            flippedOrders[i][si.order[j]] = a;
        }
    }

    // Each segment's lookup table is 1-based, so every cursor starts at slot 1
    int[] lookupIndices = new int[unwrapped.length];
    Arrays.fill(lookupIndices, 1);

    int lookupIndex = 0;
    String currentVal;
    int currentSeg;
    while (true) {
        currentVal = null;
        currentSeg = -1;
        int remaining = 0;
        // Find the next ordered value from all the segments
        for (int i = 0; i < unwrapped.length; i++) {
            if (lookupIndices[i] < unwrapped[i].lookup.length) {
                remaining++;
                String that = unwrapped[i].lookup[lookupIndices[i]];
                if (currentVal == null || currentVal.compareTo(that) > 0) {
                    currentVal = that;
                    currentSeg = i;
                }
            }
        }
        if (remaining == 1) {
            break;
        } else if (remaining == 0) {
            /* The only way this could happen is if there are 0 segments or if
             * all segments have 0 terms. In either case, we can return
             * early.
             */
            return new CacheData(new StringIndex(
                    new int[starts[starts.length - 1]], new String[1]));
        }
        if (!currentVal.equals(mergedLookup[lookupIndex])) {
            lookupIndex++;
            mergedLookup[lookupIndex] = currentVal;
        }
        OrderNode a = flippedOrders[currentSeg][lookupIndices[currentSeg]];
        while (a != null) {
            mergedOrder[a.index + starts[currentSeg]] = lookupIndex;
            a = a.next;
        }
        lookupIndices[currentSeg]++;
    }
{code}
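
For readers who want to experiment with the idea, here is a minimal, self-contained sketch of the same k-way merge. It is not the patch's code: SimpleStringIndex and StringIndexMergeSketch are hypothetical stand-ins for StringIndex and the cache key, and instead of the flipped linked lists it records a per-segment "old ord -> merged ord" table and rewrites the order arrays in a second pass. It assumes each segment's lookup array is sorted with a null in slot 0, and that starts has one more entry than there are segments, the last being the total document count. Because the loop runs until every segment is exhausted, the tail of the last remaining segment needs no special case.

```java
import java.util.Arrays;

/** Hypothetical stand-in for StringIndex: order[doc] is a slot in lookup; lookup[0] is null. */
final class SimpleStringIndex {
    final int[] order;
    final String[] lookup;

    SimpleStringIndex(int[] order, String[] lookup) {
        this.order = order;
        this.lookup = lookup;
    }
}

final class StringIndexMergeSketch {

    /**
     * Merges per-segment indexes into one index over the combined doc space.
     * starts[i] is the doc base of segment i; the final entry is the total doc count.
     */
    static SimpleStringIndex merge(int[] starts, SimpleStringIndex[] segments) {
        int maxDoc = starts[starts.length - 1];

        // Upper bound on distinct terms: every real slot of every segment, plus slot 0.
        int capacity = 1;
        for (SimpleStringIndex si : segments) {
            capacity += si.lookup.length - 1;
        }
        String[] mergedLookup = new String[capacity];

        // ordMap[seg][oldOrd] = merged ord; slot 0 ("no term") maps to 0 by default.
        int[][] ordMap = new int[segments.length][];
        int[] cursor = new int[segments.length];
        for (int i = 0; i < segments.length; i++) {
            ordMap[i] = new int[segments[i].lookup.length];
            cursor[i] = 1; // skip the reserved null slot
        }

        int mergedOrd = 0;
        while (true) {
            // Pick the segment whose current term is smallest.
            int seg = -1;
            for (int i = 0; i < segments.length; i++) {
                if (cursor[i] >= segments[i].lookup.length) {
                    continue; // this segment is exhausted
                }
                String candidate = segments[i].lookup[cursor[i]];
                if (seg < 0 || candidate.compareTo(segments[seg].lookup[cursor[seg]]) < 0) {
                    seg = i;
                }
            }
            if (seg < 0) {
                break; // all segments exhausted
            }
            String term = segments[seg].lookup[cursor[seg]];
            // Only allocate a new merged slot when the term differs from the previous one.
            if (mergedOrd == 0 || !term.equals(mergedLookup[mergedOrd])) {
                mergedLookup[++mergedOrd] = term;
            }
            ordMap[seg][cursor[seg]++] = mergedOrd;
        }

        // Second pass: rewrite each document's ord through its segment's table.
        int[] mergedOrder = new int[maxDoc];
        for (int i = 0; i < segments.length; i++) {
            int[] order = segments[i].order;
            for (int doc = 0; doc < order.length; doc++) {
                mergedOrder[starts[i] + doc] = ordMap[i][order[doc]];
            }
        }
        return new SimpleStringIndex(mergedOrder, Arrays.copyOf(mergedLookup, mergedOrd + 1));
    }
}
```

The remap table costs one int per term per segment rather than one node per document, which may matter for large segments, but the term comparison loop is still linear in the number of segments for every term, so the merge remains expensive.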



> Complete overhaul of FieldCache API/Implementation
> --------------------------------------------------
>
>                 Key: LUCENE-831
>                 URL: https://issues.apache.org/jira/browse/LUCENE-831
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Hoss Man
>             Fix For: 3.0
>
>         Attachments: ExtendedDocument.java, fieldcache-overhaul.032208.diff, 
> fieldcache-overhaul.diff, fieldcache-overhaul.diff, 
> LUCENE-831.03.28.2008.diff, LUCENE-831.03.30.2008.diff, 
> LUCENE-831.03.31.2008.diff, LUCENE-831.patch, LUCENE-831.patch, 
> LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch, LUCENE-831.patch
>
>
> Motivation:
> 1) Completely overhaul the API/implementation of "FieldCache" type things...
>     a) eliminate global static map keyed on IndexReader (thus
>         eliminating synch block between completely independent IndexReaders)
>     b) allow more customization of cache management (ie: use 
>         expiration/replacement strategies, disk backed caches, etc)
>     c) allow people to define custom cache data logic (ie: custom
>         parsers, complex datatypes, etc... anything tied to a reader)
>     d) allow people to inspect what's in a cache (list of CacheKeys) for
>         an IndexReader so a new IndexReader can be likewise warmed. 
>     e) Lend support for smarter cache management if/when
>         IndexReader.reopen is added (merging of cached data from subReaders).
> 2) Provide backwards compatibility to support existing FieldCache API with
>     the new implementation, so there is no redundant caching as client code
>     migrates to new API.

