[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463663#comment-16463663 ]

ASF subversion and git services commented on LUCENE-8293:
---------------------------------------------------------

Commit dad48603aec715063fdcb71e11fe73599d63c3a2 in lucene-solr's branch refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=dad4860 ]

LUCENE-8293: Ensure only hard deletes are carried over in a merge

Today we carry over hard deletes based on the SegmentReader's liveDocs.
This is not correct if soft-deletes are used, especially with retention
policies. If a soft delete is added while a segment is being merged, the
document might end up hard deleted in the target segment. This isn't
necessarily a correctness issue, but it causes unnecessary writes of
hard-deletes. The biggest issue here is that we assert that previously
deleted documents are still deleted in the live-docs we apply, and that
assertion might be violated by the retention policy.

> Ensure only hard deletes are carried over in a merge
> ----------------------------------------------------
>
>                 Key: LUCENE-8293
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8293
>             Project: Lucene - Core
>          Issue Type: Bug
>   Affects Versions: 7.4, master (8.0)
>            Reporter: Simon Willnauer
>            Priority: Major
>             Fix For: 7.4, master (8.0)
>
>         Attachments: LUCENE-8293.patch, LUCENE-8293.patch
>
>
> Today we carry over hard deletes based on the SegmentReader's liveDocs.
> This is not correct if soft-deletes are used, especially with retention
> policies. If a soft delete is added while a segment is being merged, the
> document might end up hard deleted in the target segment. This isn't
> necessarily a correctness issue, but it causes unnecessary writes of
> hard-deletes. The biggest issue here is that we assert that previously
> deleted documents are still deleted in the live-docs we apply, and that
> assertion might be violated by the retention policy.
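To make the problem described above concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: live docs are modelled as java.util.BitSet instances (set bit = live), and the class and method names are hypothetical. It only illustrates how diffing a combined live-docs view (hard and soft deletes together) makes a soft delete that arrives during a merge look like a hard delete to carry over.

```java
import java.util.BitSet;

/** Toy model, not Lucene code: live docs as bitsets, set bit = document is live. */
final class CombinedLiveDocsSketch {

    /** Combined view (assumption): a doc is live only if it is neither hard- nor soft-deleted. */
    static BitSet combinedLive(BitSet hardLive, BitSet softDeleted) {
        BitSet live = (BitSet) hardLive.clone();
        live.andNot(softDeleted);
        return live;
    }

    /** Docs that were live at merge start but are no longer live now. */
    static BitSet deletedDuringMerge(BitSet liveAtMergeStart, BitSet liveNow) {
        BitSet deleted = (BitSet) liveAtMergeStart.clone();
        deleted.andNot(liveNow);
        return deleted;
    }

    public static void main(String[] args) {
        int maxDoc = 5;
        BitSet hardLive = new BitSet(maxDoc);
        hardLive.set(0, maxDoc);                 // all five docs start out live
        BitSet softDeleted = new BitSet(maxDoc); // no soft deletes yet

        // Snapshot of the combined view taken when the merge starts.
        BitSet combinedAtStart = combinedLive(hardLive, softDeleted);

        softDeleted.set(2);  // doc 2 is soft-deleted while the merge runs
        hardLive.clear(3);   // doc 3 is hard-deleted while the merge runs

        BitSet carried = deletedDuringMerge(combinedAtStart, combinedLive(hardLive, softDeleted));
        // Doc 2 was only soft-deleted, yet it shows up next to doc 3.
        System.out.println("carried over as hard deletes: " + carried); // prints {2, 3}
    }
}
```

In this model only doc 3 should be written as a hard delete in the target segment; doc 2 ends up there because the combined view cannot tell the two kinds of delete apart.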
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463664#comment-16463664 ]

ASF subversion and git services commented on LUCENE-8293:
---------------------------------------------------------

Commit 3a6f5313d6b4a23dea2030cb5d63ad522536f501 in lucene-solr's branch refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a6f531 ]

LUCENE-8293: Ensure only hard deletes are carried over in a merge
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462795#comment-16462795 ]

Michael McCandless commented on LUCENE-8293:
--------------------------------------------

+1

I like how you tap into segment warmer in the test cases to sneak in a "delete during merge"!
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462609#comment-16462609 ]

Simon Willnauer commented on LUCENE-8293:
-----------------------------------------

[~mikemccand] I added another test and fixed some corner cases with soft-deletes. Can you take another look?
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462527#comment-16462527 ]

Simon Willnauer commented on LUCENE-8293:
-----------------------------------------

[~erickerickson] no it doesn't
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462508#comment-16462508 ]

Erick Erickson commented on LUCENE-8293:
----------------------------------------

Related question: Does this have any implications for TieredMergePolicy? In particular TMP relies on:

IndexWriter.numDeletesToMerge(info);
SegmentCommitInfo.info.maxDoc()

in order to score documents to pass off to the merging code.

I'm not worried about the nuts and bolts of merging you're addressing here, mostly whether IndexWriter.numDeletesToMerge(info); will continue to reflect the number of docs that will be merged away.
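For readers unfamiliar with why those two numbers matter to a merge policy, the following is a rough sketch in plain Java, not Lucene code. It assumes a policy that prorates a segment's byte size by its delete ratio (deletes to merge divided by maxDoc); the class and method names are hypothetical, and TieredMergePolicy's actual scoring is more involved than this.

```java
/** Toy model, not Lucene code: prorating a segment's size by its delete ratio. */
final class DeleteProratedSizeSketch {

    /** Fraction of the segment's docs that a merge would drop. */
    static double deleteRatio(int numDeletesToMerge, int maxDoc) {
        if (maxDoc <= 0) {
            return 0.0; // empty segment: nothing to reclaim
        }
        return (double) numDeletesToMerge / maxDoc;
    }

    /** Segment size discounted by the share of docs that will be merged away. */
    static long proratedSizeInBytes(long sizeInBytes, int numDeletesToMerge, int maxDoc) {
        return (long) (sizeInBytes * (1.0 - deleteRatio(numDeletesToMerge, maxDoc)));
    }

    public static void main(String[] args) {
        // A 1000-byte segment with 100 docs, 25 of which would be merged away.
        System.out.println(proratedSizeInBytes(1000L, 25, 100)); // prints 750
    }
}
```

The estimate is only as good as the delete count fed into it, which is why the question is whether numDeletesToMerge keeps matching what a merge actually removes.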
[jira] [Commented] (LUCENE-8293) Ensure only hard deletes are carried over in a merge
[ https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462266#comment-16462266 ]

Michael McCandless commented on LUCENE-8293:
--------------------------------------------

+1, tricky ... we were previously just asking the merged reader for its live docs, but this included hard and soft deletes, so you fixed it to explicitly pull only hard deletes from the {{RLD}}.

I like how you factored out a method from that scary {{commitMergedDeletesAndUpdates}} method.
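The approach described in this comment can be sketched roughly as follows. This is a toy model in plain Java, not the patch itself: it assumes the hard live docs at merge start and at commit time are available as java.util.BitSet instances, and that a docMap array gives each source doc's ID in the merged segment (-1 if the doc was not copied). The helper name carryOverHardDeletes is hypothetical.

```java
import java.util.BitSet;

/** Toy model, not Lucene code. Set bit = document is not hard-deleted. */
final class HardDeletesCarryOverSketch {

    /**
     * Hypothetical helper: marks as deleted in the merged segment every source doc
     * that was hard-live at merge start but has been hard-deleted since.
     * Returns the number of deletes carried over.
     */
    static int carryOverHardDeletes(BitSet hardLiveAtStart, BitSet hardLiveNow,
                                    int[] docMap, BitSet mergedLive) {
        int carried = 0;
        for (int doc = 0; doc < docMap.length; doc++) {
            boolean wasLive = hardLiveAtStart.get(doc);
            boolean isLive = hardLiveNow.get(doc);
            // Hard deletes never revert, so this check cannot be tripped by soft deletes.
            if (!wasLive && isLive) {
                throw new IllegalStateException("hard delete reverted for doc " + doc);
            }
            if (wasLive && !isLive && docMap[doc] != -1) {
                mergedLive.clear(docMap[doc]);
                carried++;
            }
        }
        return carried;
    }

    public static void main(String[] args) {
        // Source segment with 5 docs; doc 1 was hard-deleted before the merge
        // started, so it was never copied into the merged segment.
        BitSet atStart = new BitSet(5);
        atStart.set(0, 5);
        atStart.clear(1);
        int[] docMap = {0, -1, 1, 2, 3};

        BitSet mergedLive = new BitSet(4);
        mergedLive.set(0, 4);

        BitSet now = (BitSet) atStart.clone();
        now.clear(3); // doc 3 is hard-deleted while the merge runs
        // Doc 2 is soft-deleted while the merge runs; in this model that never
        // touches the hard live docs, so it is not carried over as a hard delete.

        int carried = carryOverHardDeletes(atStart, now, docMap, mergedLive);
        System.out.println(carried + " carried over, merged live docs: " + mergedLive);
        // prints: 1 carried over, merged live docs: {0, 1, 3}
    }
}
```

Because only hard deletes are compared, a soft delete added during the merge leaves the merged segment's live docs untouched in this model.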