Hi Folks,

I just tried to index a data set that was probably 2x as large as the previous 
one I'd been using with the same code.  The indexing completed fine, although 
it was slower than I would have liked. ;-)  But the following problem occurs 
when I try to use FieldCache to look up an indexed and stored value:

java.lang.ArrayIndexOutOfBoundsException: -65406
        at org.apache.lucene.util.PagedBytes$Reader.fillUsingLengthPrefix(PagedBytes.java:98)
        at org.apache.lucene.search.FieldCacheImpl$DocTermsImpl.getTerm(FieldCacheImpl.java:918)
        at ...

The code that does this has been working for quite some time and has been 
unmodified:

    /** Find a string field value, given the Lucene ID and field name.
    */
    protected String getStringValue(int luceneID, String fieldName)
      throws IOException
    {
      // Find the right reader
      final int idx = readerIndex(luceneID, starts, readers.length);
      final int docBase = starts[idx];
      final IndexReader reader = readers[idx];

      BytesRef ref =
        FieldCache.DEFAULT.getTerms(reader, fieldName).getTerm(luceneID - docBase, new BytesRef());
      String rval = ref.utf8ToString();
      //System.out.println(" Reading luceneID "+Integer.toString(luceneID)+" field "+fieldName+" with result '"+rval+"'");
      return rval;
    }

  }
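
For context, readerIndex maps a global Lucene doc ID to the sub-reader that holds it. The helper itself is not shown above, so the following is only a minimal sketch of how such a lookup is commonly written, assuming starts[i] holds the first global doc ID of sub-reader i in ascending order. The class name ReaderIndexSketch and the sample values are hypothetical and not taken from the real code.

```java
public class ReaderIndexSketch {

    /**
     * Returns the index of the sub-reader that contains the given global doc ID.
     * Assumption: starts[i] is the first global doc ID of sub-reader i, ascending.
     * If several entries share the same start (empty sub-readers), the last one wins.
     */
    public static int readerIndex(int docID, int[] starts, int numReaders) {
        int lo = 0;
        int hi = numReaders - 1;
        while (lo < hi) {
            // Take the upper middle so the loop always makes progress.
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= docID) {
                lo = mid;       // docID is in sub-reader mid or a later one
            } else {
                hi = mid - 1;   // docID is in an earlier sub-reader
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] starts = {0, 100, 250};   // hypothetical doc bases
        int docID = 120;
        int idx = readerIndex(docID, starts, starts.length);
        int localDoc = docID - starts[idx];   // ID relative to that sub-reader
        System.out.println("idx=" + idx + ", localDoc=" + localDoc);
    }
}
```

With a single reader and starts = {0}, every doc ID maps to idx=0 and docBase=0, so the per-reader ID passed to getTerm is the same as the global luceneID.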

I added a try/catch to see what values were going into the key line:

catch (RuntimeException e)
    {
        System.out.println("LuceneID = "+luceneID+", fieldName='"+fieldName+"', idx="+idx+", docBase="+docBase);
        System.out.println("Readers = "+readers.length);
        int i = 0;
        while (i < readers.length)
            {
                System.out.println(" Reader start "+i+" is "+starts[i]);
                i++;
            }
        throw e;
    }

The resulting output was:

LuceneID = 34466856, fieldName='id', idx=0, docBase=0
Readers = 1
     Reader start 0 is 0

... which looks reasonable on the face of it.  This is a build of trunk from 
approximately 8/12/2010, so it is fairly old.  Was there a fix since then that 
could account for this behavior?  Should I simply sync up to current trunk?  Or 
am I doing something wrong here?  The schema for the id field is:

<fieldType name="string_idx" class="solr.StrField" sortMissingLast="true" 
indexed="true" stored="true"/>
<field name="id" type="string_idx" required="true"/>

Karl
