Unsupported operation in TermDocs.next() when migrating from 2.4 to 2.9

Jerven Bolleman Tue, 29 Jun 2010 02:25:13 -0700

Hi All,

I finally have some time to upgrade our Lucene from the 2.4 series to the 2.9 series. The problem is that, while everything compiles fine, I am now getting a new UnsupportedOperationException.



java.lang.UnsupportedOperationException
        at org.apache.lucene.index.AbstractAllTermDocs.seek(AbstractAllTermDocs.java:42)
        at org.apache.lucene.index.DirectoryReader$MultiTermDocs.termDocs(DirectoryReader.java:1186)
        at org.apache.lucene.index.DirectoryReader$MultiTermDocs.next(DirectoryReader.java:1118)
        at org.expasy.core.index.SubQueryFilter.fastForLargeResultSets(SubQueryFilter.java:129)

Below is the code that triggers it. An explanation of what it tries to achieve follows underneath.

private void fastForLargeResultSets(String foreignField, BitSet bits,
                TermDocs docs, TermDocs foreignDocs, IndexReader foreignReader,
                BitSet queryResults)
        throws IOException
{
        int start = queryResults.nextSetBit(0);
        TermEnum foreignEnum = foreignReader.terms(new Term(foreignField, ""));
        while (foreignEnum.next())
                {
                Term term = foreignEnum.term();
                if (term == null || !term.field().equals(foreignField))
                        break;
                if (!term.text().equals("not_null"))
                {
                        foreignDocs.skipTo(start);
                        foreignDocs.seek(term);
                        // Source of the exception in my code: the next() call below
                        while (foreignDocs.next())
                        {
                                int doc = foreignDocs.doc();
                                if (queryResults.get(doc))
                                {
                                        foreignDocs.skipTo(doc);
                                        if (term != null && term.text() != null)
                                                buffer.add(term.text());
                                }
                                // Use a buffer to avoid jumping around on disk too much.
                                if (buffer.size() >= BUFFERSIZE)
                                {
                                        emptyBuffer(buffer, bits, docs);
                                }
                        }
                }
        }

        if (!buffer.isEmpty())
        {
                emptyBuffer(buffer, bits, docs);
        }
}

The purpose of this code is to fill a bitset that is used as a filter. The filter finds the documents in index a that are referenced by a linking key value from documents in index b.

While resource intensive, this code path was quite fast when you have millions of documents in index b pointing to millions of documents in index a.


i.e. it creates a "join" between two queries on different indexes.

For a live example, see
http://www.uniprot.org/uniprot/?query=citation%3A%28author%3Afink%29
This is a search for "fink" in the field "author" in the "citation" index.

For each document in the "citation" index that matches the term "fink" in the field "author", retrieve the terms that contain a uniquely identifying key value for documents in the "uniprot" index. Then generate a bitset to use in filtering the documents in the "uniprot" index (done in the emptyBuffer method).
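
To make the intended join concrete, here is a minimal in-memory sketch of the idea. It does not use the Lucene API at all: plain maps stand in for the two indexes, the class and method names (JoinFilterSketch, buildFilter) and the key values are made up for illustration, and the buffering done by emptyBuffer is left out.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the two-index join; none of this is Lucene code.
final class JoinFilterSketch {

    /**
     * @param foreignPostings key term -> doc ids in the foreign ("citation") index holding that key
     * @param queryResults    foreign doc ids matched by the query (e.g. author:fink)
     * @param targetDocByKey  key term -> doc id in the target ("uniprot") index
     * @param targetMaxDoc    number of documents in the target index
     * @return bitset over target doc ids, usable as a filter
     */
    static BitSet buildFilter(Map<String, int[]> foreignPostings, BitSet queryResults,
                              Map<String, Integer> targetDocByKey, int targetMaxDoc) {
        BitSet bits = new BitSet(targetMaxDoc);
        for (Map.Entry<String, int[]> entry : foreignPostings.entrySet()) {
            for (int doc : entry.getValue()) {
                if (queryResults.get(doc)) {
                    // At least one matching foreign doc carries this key,
                    // so the target doc identified by the key passes the filter.
                    Integer target = targetDocByKey.get(entry.getKey());
                    if (target != null) {
                        bits.set(target);
                    }
                    break; // one hit per key is enough
                }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, int[]> foreignPostings = new HashMap<>();
        foreignPostings.put("keyA", new int[] {0, 2}); // hypothetical key values
        foreignPostings.put("keyB", new int[] {1});
        foreignPostings.put("keyC", new int[] {3});

        BitSet queryResults = new BitSet();
        queryResults.set(0);
        queryResults.set(3);

        Map<String, Integer> targetDocByKey = new HashMap<>();
        targetDocByKey.put("keyA", 5);
        targetDocByKey.put("keyB", 7);
        targetDocByKey.put("keyC", 9);

        System.out.println(buildFilter(foreignPostings, queryResults, targetDocByKey, 10)); // prints {5, 9}
    }
}
```

The real code walks the term enumeration and term docs of the foreign index instead of a map, and resolves the buffered keys against the target index in batches.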

Is this a bug? And does anyone have ideas for an effective (maybe superior) workaround?


Regards and thanks for a great project!

Jerven
