[ 
https://issues.apache.org/jira/browse/LUCENE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138564#comment-14138564
 ] 

Vitaly Funstein commented on LUCENE-5931:
-----------------------------------------

Mike/Robert,

I have a follow-up question. I have backported the fix to 4.6 and now I believe 
I am seeing another serious issue here. :(

If the old reader passed to {{DirectoryReader.openIfChanged(DirectoryReader, 
IndexCommit)}} is an NRT reader, and the associated writer has 
unflushed/uncommitted changes in its buffers (deletes in particular), then the 
returned reader appears to see those changes. That violates the intent of 
opening the index at exactly the commit point we wanted, frozen in time. Here's 
my original test case, modified to show the problem:

{code}
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
import org.apache.lucene.index.ReaderManager;
import org.apache.lucene.index.SnapshotDeletionPolicy;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.File;

public class CommitReuseTest {

  private final File path = new File("indexDir");
  private IndexWriter writer;
  private final SnapshotDeletionPolicy snapshotter =
      new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
 
  @Before
  public void initIndex() throws Exception {
    path.mkdirs();
    IndexWriterConfig idxWriterCfg = new IndexWriterConfig(Version.LUCENE_46, null);
    idxWriterCfg.setIndexDeletionPolicy(snapshotter);
    idxWriterCfg.setInfoStream(System.out);
   
    Directory dir = FSDirectory.open(path);
    writer = new IndexWriter(dir, idxWriterCfg);
   
    writer.commit(); // make sure all index metadata is written out
  }
 
  @After
  public void stop() throws Exception {
    writer.close();
  }

  @Test
  public void test() throws Exception {
    Document doc;
    ReaderManager rm = new ReaderManager(writer, true);
   
    // Index some data
    for (int i = 0; i < 100; i++) {
      doc = new Document();
      doc.add(new StringField("key-" + i, "ABC", Store.YES));
      writer.addDocument(doc);
    }
   
    writer.commit();
   
    IndexCommit ic1 = snapshotter.snapshot();
   
    doc = new Document();
    doc.add(new StringField("key-" + 0, "AAA", Store.YES));
    writer.updateDocument(new Term("key-" + 0, "ABC"), doc);

    rm.maybeRefreshBlocking();
    DirectoryReader latest = rm.acquire();
    assertTrue(latest.hasDeletions());
   
    // This reader will be used for searching against commit point 1
    DirectoryReader searchReader = DirectoryReader.openIfChanged(latest, ic1);
//    assertFalse(searchReader.hasDeletions()); // XXX - this fails too!
    rm.release(latest);
   
    IndexSearcher s = new IndexSearcher(searchReader);
    Query q = new TermQuery(new Term("key-0", "ABC"));
    TopDocs td = s.search(q, 10);
    assertEquals(1, td.totalHits);
       
    searchReader.close();
    rm.close();
    snapshotter.release(ic1);
  }
}
{code}

Note that if I comment out the {{updateDocument()}} call, the test passes. 
Also, if the index contains only a single entry instead of 100, then while the 
NRT reader is being refreshed, the segment holding just that one deleted 
document is dropped, which makes the test appear to pass:

{noformat}
IW 0 [Wed Sep 17 22:32:47 PDT 2014; main]: drop 100% deleted segments: _4(4.6):c1/1
{noformat}

This output does not appear when running the code above unchanged. Hope this 
helps... I can't make further headway myself, though.
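
To state the expectation independently of Lucene, here is a toy model of the 
point-in-time contract the test relies on. Everything in it ({{ToyWriter}}, 
{{commit()}}, {{nrtView()}}, {{run()}}) is hypothetical and only illustrates 
the intended semantics; it says nothing about how Lucene implements commits or 
NRT readers, or about where the bug actually is.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class PointInTimeSketch {

  /** Hypothetical stand-in for a writer; not a Lucene class. */
  static final class ToyWriter {
    // Current state, including changes made since the last commit.
    private final Map<String, String> live = new HashMap<>();

    void update(String key, String value) {
      live.put(key, value);
    }

    /** Returns an immutable copy that plays the role of a commit point. */
    Map<String, String> commit() {
      return Collections.unmodifiableMap(new HashMap<>(live));
    }

    /** Returns a copy of the current state, including uncommitted updates. */
    Map<String, String> nrtView() {
      return Collections.unmodifiableMap(new HashMap<>(live));
    }
  }

  /** Runs the scenario; returns {value at the commit point, value in the NRT view}. */
  public static String[] run() {
    ToyWriter writer = new ToyWriter();
    for (int i = 0; i < 100; i++) {
      writer.update("key-" + i, "ABC");
    }
    Map<String, String> commitView = writer.commit();

    // Uncommitted update made after the commit point was taken.
    writer.update("key-0", "AAA");

    return new String[] {commitView.get("key-0"), writer.nrtView().get("key-0")};
  }

  public static void main(String[] args) {
    String[] seen = run();
    // prints "commit view: ABC, NRT view: AAA"
    System.out.println("commit view: " + seen[0] + ", NRT view: " + seen[1]);
  }
}
```

In these terms, the failing assertion in my test corresponds to the commit 
view returning the updated state for {{key-0}} instead of the original 
{{ABC}}.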

> DirectoryReader.openIfChanged(oldReader, commit) incorrectly assumes given 
> commit point has deletes/field updates
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5931
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5931
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.6.1
>            Reporter: Vitaly Funstein
>            Assignee: Michael McCandless
>            Priority: Critical
>         Attachments: CommitReuseTest.java, LUCENE-5931.patch, 
> LUCENE-5931.patch, LUCENE-5931.patch
>
>
> {{StandardDirectoryReader}} assumes that the segments from commit point have 
> deletes, when they may not, yet the original SegmentReader for the segment 
> that we are trying to reuse does. This is evident when running attached JUnit 
> test case with asserts enabled (default): 
> {noformat}
> java.lang.AssertionError
>       at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:188)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:326)
>       at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:320)
>       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:702)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:315)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:311)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:262)
>       at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:183)
> {noformat}
> or, if asserts are disabled then it falls through into NPE:
> {noformat}
> java.lang.NullPointerException
>       at java.io.File.<init>(File.java:305)
>       at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:80)
>       at org.apache.lucene.codecs.lucene40.BitVector.<init>(BitVector.java:327)
>       at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:90)
>       at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:131)
>       at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:194)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:326)
>       at org.apache.lucene.index.StandardDirectoryReader$2.doBody(StandardDirectoryReader.java:320)
>       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:702)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenFromCommit(StandardDirectoryReader.java:315)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:311)
>       at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:262)
>       at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:183)
> {noformat}


