[jira] Updated: (LUCENE-1314) IndexReader.clone

Jason Rutherglen (JIRA) Mon, 05 Jan 2009 15:06:13 -0800

     [ 
https://issues.apache.org/jira/browse/LUCENE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jason Rutherglen updated LUCENE-1314:
-------------------------------------

    Attachment: LUCENE-1314.patch

LUCENE-1314.patch

Michael, thanks for reviewing the patch in such detail.  All of your comments 
have been included in the latest version of the patch.

M.M: In SegmentReader.java we have "if (doClone) acquireWriteLock();", which 
isn't right? Ie, if this reader does not currently have the write lock (it has 
no "local mods") it should not acquire it? One should be allowed to clone a 
stale reader?

Cloning a stale reader is fixed in the patch.  The problem is the user may get 
into trouble by updating the stale reader which was debated before.  I got the 
impression insuring the reader being updated was the latest was important.  

M.M.: Have you done any tests to see the cost of the copy-on-write cloning of 
deleted docs BitVector & norms? The first new mod to the cloned reader pays 
that penalty. Marvin's "tombstone" deletions would bring this penalty to near 
zero, but it's a big change (and should certainly be decoupled from this!).

The cost of cloning them meaning the creating a new byte array or some other 
cost like memory consumption?  I need to reread Marvin's tombstones which at 
first glance seemed to be an iterative approach to saving deletions that seems 
like a transaction log.  Correct?

M.M.: SegmentReader.Norm now has two refCounts, and I think both are necessary. 
One tracks refs to the Norm instance itself and the other tracks refs to the 
byte[]. Can you add some comments explaining the difference (because it's 
confusing at first blush)?

Byte[] referencing is used because a new norm object needs to be created for 
each clone, and the byte array is all that is needed for sharing between cloned 
readers.  The current norm referencing is for sharing between readers whereas 
the byte[] referencing is for copy on write which is independent of reader 
references.

M.M.: In SegmentReader.doClose() you are failing to call 
deletedDocsCopyOnWriteRef.decRef(), so you have a refCount leak.  Can you 
create a unit test that 1) opens reader 1, 2) does deletes on reader 1, 3) 
clones reader 1 --> reader 2, 4) closes reader 1, 5) deletes more docs with 
reader 1, and 6) asserts that the
deletedDocs BitVector did not get cloned? First verify the test fails, then fix 
the bug...

In regards to #5, the test cannot delete from reader 1 once it's closed.  A 
method called TestIndexReaderClone.testSegmentReaderCloseReferencing was added 
to test this closing use case.


> IndexReader.clone
> -----------------
>
>                 Key: LUCENE-1314
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1314
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, 
> LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, LUCENE-1314.patch, 
> LUCENE-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, 
> lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, 
> lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, lucene-1314.patch, 
> lucene-1314.patch
>
>
> Based on discussion 
> http://www.nabble.com/IndexReader.reopen-issue-td18070256.html.  The problem 
> is reopen returns the same reader if there are no changes, so if docs are 
> deleted from the new reader, they are also reflected in the previous reader 
> which is not always desired behavior.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Updated: (LUCENE-1314) IndexReader.clone

Reply via email to