[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463663#comment-16463663
 ] 

ASF subversion and git services commented on LUCENE-8293:
-

Commit dad48603aec715063fdcb71e11fe73599d63c3a2 in lucene-solr's branch 
refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dad4860 ]

LUCENE-8293: Ensure only hard deletes are carried over in a merge

Today we carry over hard deletes based on the SegmentReader's liveDocs.
This is not correct if soft-deletes are used, especially with retention
policies. If a soft delete is added while a segment is merged, the document
might end up hard deleted in the target segment. This isn't necessarily a
correctness issue, but it causes unnecessary writes of hard-deletes. The biggest
issue here is that we assert that previously deleted documents are still deleted
in the live-docs we apply, and that might be violated by the retention policy.
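
The scenario in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Lucene's implementation: the class `MergeDeletesSketch`, the method `deletedSinceMergeStart`, and the use of `java.util.BitSet` to stand in for live-docs are all hypothetical. A set bit means "document is live". The combined view clears a bit for both hard and soft deletes, while the hard-only view clears it for hard deletes alone.

```java
import java.util.BitSet;

// Hypothetical model of carrying deletes over to a merged segment.
// None of these names are Lucene APIs.
final class MergeDeletesSketch {

    /** Returns the doc IDs that were live at merge start but are no longer live now. */
    static BitSet deletedSinceMergeStart(BitSet liveAtMergeStart, BitSet liveNow) {
        BitSet newlyDeleted = (BitSet) liveAtMergeStart.clone();
        newlyDeleted.andNot(liveNow); // keep bits that were set then and are clear now
        return newlyDeleted;
    }

    public static void main(String[] args) {
        int maxDoc = 5;

        // Snapshot taken when the merge starts: all five documents are live.
        BitSet hardLiveAtStart = new BitSet(maxDoc);
        hardLiveAtStart.set(0, maxDoc);
        BitSet combinedLiveAtStart = (BitSet) hardLiveAtStart.clone();

        // State after the merge has been running for a while.
        BitSet hardLiveNow = (BitSet) hardLiveAtStart.clone();
        BitSet combinedLiveNow = (BitSet) combinedLiveAtStart.clone();

        // Doc 2 is soft deleted during the merge: only the combined view changes.
        combinedLiveNow.clear(2);
        // Doc 3 is hard deleted during the merge: both views change.
        hardLiveNow.clear(3);
        combinedLiveNow.clear(3);

        // Carrying over from the combined view would also hard delete doc 2.
        System.out.println(deletedSinceMergeStart(combinedLiveAtStart, combinedLiveNow)); // {2, 3}
        // Carrying over from the hard-only view touches doc 3 alone.
        System.out.println(deletedSinceMergeStart(hardLiveAtStart, hardLiveNow)); // {3}
    }
}
```

In this model, the first result corresponds to the behavior described as incorrect (a soft delete turning into a hard delete in the target segment), and the second to carrying over hard deletes only.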


> Ensure only hard deletes are carried over in a merge
> 
>
>                 Key: LUCENE-8293
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8293
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 7.4, master (8.0)
>            Reporter: Simon Willnauer
>            Priority: Major
>             Fix For: 7.4, master (8.0)
>
>         Attachments: LUCENE-8293.patch, LUCENE-8293.patch
>
>
> Today we carry over hard deletes based on the SegmentReader's liveDocs.
> This is not correct if soft-deletes are used, especially with retention
> policies. If a soft delete is added while a segment is merged, the document
> might end up hard deleted in the target segment. This isn't necessarily a
> correctness issue, but it causes unnecessary writes of hard-deletes. The
> biggest issue here is that we assert that previously deleted documents are
> still deleted in the live-docs we apply, and that might be violated by the
> retention policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463664#comment-16463664
 ] 

ASF subversion and git services commented on LUCENE-8293:
-

Commit 3a6f5313d6b4a23dea2030cb5d63ad522536f501 in lucene-solr's branch 
refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a6f531 ]

LUCENE-8293: Ensure only hard deletes are carried over in a merge

Today we carry over hard deletes based on the SegmentReader's liveDocs.
This is not correct if soft-deletes are used, especially with retention
policies. If a soft delete is added while a segment is merged, the document
might end up hard deleted in the target segment. This isn't necessarily a
correctness issue, but it causes unnecessary writes of hard-deletes. The biggest
issue here is that we assert that previously deleted documents are still deleted
in the live-docs we apply, and that might be violated by the retention policy.





[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462795#comment-16462795
 ] 

Michael McCandless commented on LUCENE-8293:


+1

 

I like how you tap into segment warmer in the test cases to sneak in a "delete 
during merge"!




[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-03 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462609#comment-16462609
 ] 

Simon Willnauer commented on LUCENE-8293:
-

[~mikemccand] I added another test and fixed some corner cases with 
soft-deletes. Can you take another look?




[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-03 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462527#comment-16462527
 ] 

Simon Willnauer commented on LUCENE-8293:
-

[~erickerickson] No, it doesn't.




[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-03 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462508#comment-16462508
 ] 

Erick Erickson commented on LUCENE-8293:


Related question: Does this have any implications for TieredMergePolicy? In 
particular TMP relies on:

IndexWriter.numDeletesToMerge(info);
SegmentCommitInfo.info.maxDoc()

in order to score documents to pass off to the merging code. I'm not worried 
about the nuts and bolts of merging you're addressing here, mostly whether 
IndexWriter.numDeletesToMerge(info); will continue to reflect the number of 
docs that will be merged away.
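
To make the concern concrete, here is a rough sketch of how a merge policy could combine the two values named above into a delete-prorated segment size. It is a simplified model under stated assumptions, not the actual TieredMergePolicy code: the class `DeleteProratedSizeSketch`, the method `proratedSize`, and its parameters are hypothetical, and the plain `int`/`long` arguments stand in for the values returned by `numDeletesToMerge(info)` and `maxDoc()`.

```java
// Hypothetical model of weighting a segment's size by the share of documents
// that a merge would drop. None of these names are Lucene APIs.
final class DeleteProratedSizeSketch {

    /**
     * Scales the on-disk size by the fraction of documents that would survive a merge.
     *
     * @param byteSize          size of the segment on disk
     * @param numDeletesToMerge documents the merge is expected to drop
     * @param maxDoc            total documents in the segment, deleted or not
     */
    static long proratedSize(long byteSize, int numDeletesToMerge, int maxDoc) {
        if (maxDoc <= 0) {
            return byteSize; // empty segment: nothing to prorate
        }
        double delRatio = (double) numDeletesToMerge / maxDoc;
        return (long) (byteSize * (1.0 - delRatio));
    }

    public static void main(String[] args) {
        // 25 of 100 documents would be merged away, so 75% of the size remains.
        System.out.println(proratedSize(1000L, 25, 100)); // 750
    }
}
```

In such a model the result is only meaningful if the delete count matches what the merge really drops, which is the property the question asks about.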




[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge

2018-05-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462266#comment-16462266
 ] 

Michael McCandless commented on LUCENE-8293:


+1, tricky ... we were previously just asking the merged reader for its live 
docs, but this included hard and soft deletes, so you fixed it to explicitly 
pull only hard deletes from the {{RLD}}.

I like how you factored out a method from that scary 
{{commitMergedDeletesAndUpdates}} method.
