[ https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Rutherglen updated LUCENE-1314: ------------------------------------- Attachment: LUCENE-1314.patch LUCENE-1314.patch Michael, thanks for reviewing the patch in such detail. All of your comments have been included in the latest version of the patch. M.M: In SegmentReader.java we have "if (doClone) acquireWriteLock();", which isn't right? Ie, if this reader does not currently have the write lock (it has no "local mods") it should not acquire it? One should be allowed to clone a stale reader? Cloning a stale reader is fixed in the patch. The problem is the user may get into trouble by updating the stale reader which was debated before. I got the impression insuring the reader being updated was the latest was important. M.M.: Have you done any tests to see the cost of the copy-on-write cloning of deleted docs BitVector & norms? The first new mod to the cloned reader pays that penalty. Marvin's "tombstone" deletions would bring this penalty to near zero, but it's a big change (and should certainly be decoupled from this!). The cost of cloning them meaning the creating a new byte array or some other cost like memory consumption? I need to reread Marvin's tombstones which at first glance seemed to be an iterative approach to saving deletions that seems like a transaction log. Correct? M.M.: SegmentReader.Norm now has two refCounts, and I think both are necessary. One tracks refs to the Norm instance itself and the other tracks refs to the byte[]. Can you add some comments explaining the difference (because it's confusing at first blush)? Byte[] referencing is used because a new norm object needs to be created for each clone, and the byte array is all that is needed for sharing between cloned readers. The current norm referencing is for sharing between readers whereas the byte[] referencing is for copy on write which is independent of reader references. M.M.: In SegmentReader.doClose() you are failing to call deletedDocsCopyOnWriteRef.decRef(), so you have a refCount leak. Can you create a unit test that 1) opens reader 1, 2) does deletes on reader 1, 3) clones reader 1 --> reader 2, 4) closes reader 1, 5) deletes more docs with reader 1, and 6) asserts that the deletedDocs BitVector did not get cloned? First verify the test fails, then fix the bug... In regards to #5, the test cannot delete from reader 1 once it's closed. A method called TestIndexReaderClone.testSegmentReaderCloseReferencing was added to test this closing use case. > IndexReader.clone > ----------------- > > Key: LUCENE-1314 > URL: https://issues.apache.org/jira/browse/LUCENE-1314 > Project: Lucene - Java > Issue Type: New Feature > Components: Index > Affects Versions: 2.3.1 > Reporter: Jason Rutherglen > Assignee: Michael McCandless > Priority: Minor > Fix For: 2.9 > > Attachments: LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, > LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, > LUCENE-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, > lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, > lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, > lucene-1314.patch > > > Based on discussion > http://www.nabble.com/IndexReader.reopen-issue-td18070256.html. The problem > is reopen returns the same reader if there are no changes, so if docs are > deleted from the new reader, they are also reflected in the previous reader > which is not always desired behavior. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org