Well, I know I didn't think of this case back when we were discussing this change. As a recap, the issue was mainly that on some architectures the clock was not granular enough to detect updates reliably, so some test cases were failing some of the time. You are right, Bernhard, we didn't consider longer-running systems where entire indexes might be deleted and recreated while the cache was still around.

I don't know, having the version start out as a date and then get incremented as a version leaves a bad taste in my mouth somehow. At the time, we discussed other ideas that would use the date "most of the time" but would increment it explicitly if the clock was seen as not being granular enough. But the simple 0-based version number was seen as a much cleaner and superior solution when it was proposed.
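
To make the "date most of the time, explicit increment otherwise" idea concrete, here is a rough sketch. MonotonicVersion and its methods are hypothetical names of my own, not Lucene code, and it uses current Java syntax so the clock can be injected and a coarse timer simulated:

```java
import java.util.function.LongSupplier;

/** Hypothetical helper, not part of Lucene: a time-based version that never repeats. */
final class MonotonicVersion {
    private final LongSupplier clock; // injected so a coarse clock can be simulated
    private long last;

    MonotonicVersion(LongSupplier clock) {
        this.clock = clock;
        this.last = clock.getAsLong();
    }

    /** Returns the current version without changing it. */
    synchronized long current() {
        return last;
    }

    /** Call on every index change: take the clock if it advanced, otherwise bump by one. */
    synchronized long next() {
        long now = clock.getAsLong();
        last = (now > last) ? now : last + 1;
        return last;
    }

    public static void main(String[] args) {
        // A clock stuck at one value stands in for a timer that is too coarse.
        MonotonicVersion v = new MonotonicVersion(() -> 1000L);
        System.out.println(v.next()); // 1001
        System.out.println(v.next()); // 1002
    }
}
```

This keeps the value strictly increasing even when the clock stalls, but it is exactly the date-then-counter mix described above, so it shares the same drawback.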

Perhaps it would be cleaner to leave the version number 0-based and add an index creation date that would be explicitly available? This would mean that checking index validity would require checking the date and then the version. I would guess that only some applications or general purpose cache implementations would have to go to such an extent, while the majority can continue using just the getCurrentVersion() method by itself. How does this sound? Is there (should there be) an isCurrent() method on the IndexReader that could encapsulate this process?
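
A minimal sketch of what I mean by checking the date and then the version. IndexStamp is a hypothetical class, not an existing Lucene one, and it assumes the creation date would be stored with the index and readable alongside the version:

```java
/** Hypothetical value object, not a Lucene class: identifies one state of one index. */
final class IndexStamp {
    final long creationTime; // millis at index creation; assumed to be stored with the index
    final long version;      // 0-based change counter, as in the current code

    IndexStamp(long creationTime, long version) {
        this.creationTime = creationTime;
        this.version = version;
    }

    /** True only if the cached stamp belongs to the same index in the same state. */
    boolean isCurrent(IndexStamp onDisk) {
        // Check the creation date first, then the version.
        return creationTime == onDisk.creationTime && version == onDisk.version;
    }

    public static void main(String[] args) {
        IndexStamp cached = new IndexStamp(1000L, 2L);
        IndexStamp recreated = new IndexStamp(2000L, 2L); // same version, new index
        System.out.println(cached.isCurrent(recreated));  // false
    }
}
```

Applications that never delete and recreate an index could keep comparing version numbers alone; only caches that have to survive a recreate would need the two-step check.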

Dmitry.


Bernhard Messer wrote:

Hi,

I'm sending a patch which should help to fix a problem with the new method IndexReader.getCurrentVersion(). As far as I understand the current Lucene documentation, developers should use this new method to verify whether an index is out of date. The older method IndexReader.lastModified() is deprecated and therefore a possible candidate for deletion.

The problem with getCurrentVersion is that its base is 0 when a new index is created. Therefore the version number will be identical if you delete an index and recreate it from the same document set, regardless of whether the document content has changed or a different analyzer is used. The idea of the patch is to initialize the version number with the current time in milliseconds when creating a new SegmentInfos object, so it's "nearly" impossible to get the same version number again.

Without this patch, it's impossible for developers to store an IndexReader in a cache and check its validity through getCurrentVersion.
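
To illustrate the collision, here is a toy model. ToyIndex is a hypothetical class that uses no Lucene code, and it assumes the counter is bumped once per added document, which may not match exactly how Lucene counts changes:

```java
/** Toy model of a per-index version counter; hypothetical, uses no Lucene classes. */
final class ToyIndex {
    private long version;

    ToyIndex(long baseVersion) {
        this.version = baseVersion;
    }

    /** Assumption: every added document bumps the version by one. */
    void addDocument() {
        version++;
    }

    long getVersion() {
        return version;
    }

    /** Creates a fresh index with the given base version and adds docCount documents. */
    static ToyIndex build(long baseVersion, int docCount) {
        ToyIndex index = new ToyIndex(baseVersion);
        for (int i = 0; i < docCount; i++) {
            index.addDocument();
        }
        return index;
    }

    public static void main(String[] args) {
        // Base 0: the recreated index ends up with the same version as the old one.
        long before = build(0L, 2).getVersion();
        long after = build(0L, 2).getVersion();
        System.out.println(before == after); // true, so a cached reader looks valid

        // Time-like base: the two values stand in for different creation times.
        long seededBefore = build(1000L, 2).getVersion();
        long seededAfter = build(2000L, 2).getVersion();
        System.out.println(seededBefore == seededAfter); // false, so the cache is refreshed
    }
}
```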

Attached are the patch and a JUnit TestCase that tests the scenario with a sample implementation of an IndexReader cache.

As far as I can see, there are no negative side effects to this patch. But let's see what the Lucene specialists think ;-)

best regards
Bernhard





------------------------------------------------------------------------

Index: src/java/org/apache/lucene/index/SegmentInfos.java
===================================================================
RCS file: /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/SegmentInfos.java,v
retrieving revision 1.5
diff -r1.5 SegmentInfos.java
32c32,37
< private long version = 0; //counts how often the index has been changed by adding or deleting docs
---
> 
> /**
> * counts how often the index has been changed by adding or deleting docs.
> * starting with the current time in milliseconds forces to create unique version numbers.
> */
> private long version = System.currentTimeMillis();





------------------------------------------------------------------------


package org.apache.lucene.index;

/**
* Copyright 2004 The Apache Software Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
*     http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

import java.io.IOException;
import java.util.Hashtable;

import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

class CachedIndex { // an entry in the cache
        IndexReader reader;
        long version;

        CachedIndex(String name) throws IOException {
                version = IndexReader.getCurrentVersion(name);
                reader = IndexReader.open(name); // open reader
        }
}

public class TestIndexReaderVersion extends TestCase {
        
        public TestIndexReaderVersion (String name) {
                super(name);
        }

        static final Hashtable indexCache = new Hashtable();
                
        public static Test suite () {
                TestSuite suite = new TestSuite(TestIndexReaderVersion.class);
                
                for (int i = 1; i < 100; i++)
                        suite.addTest(new TestSuite(TestIndexReaderVersion.class));
                
                return suite;
        }
        
        public void testVersion() {

                Analyzer analyzer = new SimpleAnalyzer();
                String name = "/tmp/lucy";

                String[] docs = { "a", "a b" };
                String[] titles = docs;
                String q = "+a +b";
                
                testVersionControl(analyzer, name, docs, titles, q);

                String[] docs2 = { "c", "c d" };
                String[] titles2 = docs2;
                q = "+c +d";
                
                testVersionControl(analyzer, name, docs2, titles2, q);

        }

        synchronized private IndexReader getReader(String name) {
                CachedIndex index =
                        (CachedIndex) indexCache.get(name);
                // look in cache

                try {
                        if (index != null
                                // check up-to-date
                                && index.version == IndexReader.getCurrentVersion(name)) {
                                        //System.out.println("IndexReader cache hit (maxDocs=" + index.reader.maxDoc() + ")");
                                return index.reader; // cache hit
                                
                        } else {
                                // Index was open but is not up-to-date, close it before creating a new one
                                if (index != null) {
                                        //System.out.println(
                                        //      "IndexReader not up-to-date, creating new");
                                        try {
                                                index.reader.close();
                                        } catch (IOException ignore) {
                                                System.err.println(
                                                        "IndexReader was already closed by third party.");
                                        }
                                } else {
                                        //System.out.println(
                                        //      "IndexReader does not exist, creating new");
                                }
                                index = new CachedIndex(name); // cache miss
                        }
                } catch (IOException e) {
                        System.err.println(e);
                }

                indexCache.put(name, index); // add to cache
                return index.reader;
        }

        private void testVersionControl(
                Analyzer analyzer,
                String indexName,
                String[] docs,
                String[] titles,
                String queryString) {
                try {

                        assertEquals(docs.length, titles.length);

                        Directory directory = FSDirectory.getDirectory(indexName, true);
                        IndexWriter indexer = new IndexWriter(directory, analyzer, true);
                        indexer.setUseCompoundFile(true);
                        
                        //for (int y = 0; y < 500; y++)
                        for (int z = 0; z < docs.length; z++) {
                                Document d = new Document();

                                Field field = new Field("body", docs[z], true, true, true);
                                d.add(field);

                                field = new Field("title", titles[z], true, true, true);
                                d.add(field);

                                indexer.addDocument(d);
                        }

                        indexer.optimize();
                        indexer.close();
                        
                        Hits hits = null;
                        QueryParser parser = new QueryParser("body", analyzer);
                        
                        /** try to get a reader from cache */
                        IndexReader reader = getReader(indexName);
                        
                        /** create a new searcher */
                        Searcher searcher = new IndexSearcher(reader);
                        
                        Query query = parser.parse(queryString);
                        hits = searcher.search(query);
                        //System.out.println(" doc's found: " + hits.length());
                        
                        assertEquals (1, hits.length());

                        searcher.close();

                } catch (Exception e) {
                        System.out.println(
                                " caught a "
                                        + e.getClass()
                                        + "\n with message: "
                                        + e.getMessage());
                }
        }
}



------------------------------------------------------------------------




