[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131869#comment-13131869 ]

Michael McCandless commented on LUCENE-3246:

bq. I think the javadoc comments for TermsEnum.docs are now incorrect due to this commit.

You're right! I just committed a fix. Thanks Sean.

> Invert IR.getDelDocs -> IR.getLiveDocs
> --------------------------------------
>
> Key: LUCENE-3246
> URL: https://issues.apache.org/jira/browse/LUCENE-3246
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 4.0
>
> Attachments: LUCENE-3246-IndexSplitters.patch, LUCENE-3246.patch, LUCENE-3246.patch
>
>
> Spinoff from LUCENE-1536, where we need to fix the low level filtering
> we do for deleted docs to "match" Filters (ie, a set bit means the doc
> is accepted) so that filters can be pushed all the way down to the
> enums when possible/appropriate.
> This change also inverts the meaning of the first arg to
> TermsEnum.docs/AndPositions (renames from skipDocs to liveDocs).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130945#comment-13130945 ]

Sean Lavelle commented on LUCENE-3246:

I think the javadoc comments for TermsEnum.docs are now incorrect due to this commit. It says "@param liveDocs set bits are documents that should not be returned", which looks backwards. (maybe other places are wrong too; I haven't checked)
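For readers following the thread, here is a rough sketch of the semantics the issue description gives for liveDocs: a set bit means the doc is accepted, which is the inverse of the old deleted-docs bits. It uses plain java.util.BitSet rather than any Lucene class, and the class and method names (LiveDocsSketch, toLiveDocs, acceptedDocs, pushDown) are made up for illustration; this is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
final class LiveDocsSketch {

    // Inverts "set bit = deleted" into "set bit = live" for doc IDs in [0, maxDoc).
    // Assumes delDocs has no bits set at or beyond maxDoc.
    static BitSet toLiveDocs(BitSet delDocs, int maxDoc) {
        BitSet live = (BitSet) delDocs.clone();
        live.flip(0, maxDoc);
        return live;
    }

    // Keeps only the doc IDs whose bit is set, i.e. the docs that should be returned.
    static List<Integer> acceptedDocs(int[] postings, BitSet liveDocs) {
        List<Integer> accepted = new ArrayList<>();
        for (int doc : postings) {
            if (liveDocs.get(doc)) {
                accepted.add(doc);
            }
        }
        return accepted;
    }

    // Because a filter and liveDocs now share the same polarity, they combine with a plain AND.
    static BitSet pushDown(BitSet filter, BitSet liveDocs) {
        BitSet combined = (BitSet) filter.clone();
        combined.and(liveDocs);
        return combined;
    }

    public static void main(String[] args) {
        BitSet delDocs = new BitSet();
        delDocs.set(1);
        delDocs.set(3);
        BitSet liveDocs = toLiveDocs(delDocs, 5);

        System.out.println(acceptedDocs(new int[] {0, 1, 2, 3, 4}, liveDocs)); // prints [0, 2, 4]

        BitSet filter = new BitSet();
        filter.set(2, 5); // filter accepts docs 2, 3 and 4
        System.out.println(pushDown(filter, liveDocs)); // prints {2, 4}
    }
}
```

With this polarity, the corrected javadoc would describe set bits as documents that should be returned, which is what acceptedDocs does above.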
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060583#comment-13060583 ]

Michael McCandless commented on LUCENE-3246:

This commit changed the index format (the *.del), but the change is fully back-compat even with trunk indices.
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058107#comment-13058107 ]

Michael McCandless commented on LUCENE-3246:

bq. As we have now both variants to read/write BitVectors, would it be not a good idea to automatically use the old encoding for liveDocs, if more than 50% of all bits are unset?

That seems like a good idea? Ie, handle both sparse set and sparse unset compactly? Though it should be unusual that you have so many deletes against a segment (esp. because TMP now targets such segs more aggressively).

We should do this under a new issue (the old code also didn't handle the "many deletions" case sparsely either, just the "few deletions" case).
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057979#comment-13057979 ]

Uwe Schindler commented on LUCENE-3246:

Hi Mike,

As we now have both variants to read/write BitVectors, would it not be a good idea to automatically use the old encoding for liveDocs if more than 50% of all bits are unset? This would save disk space if a segment has more deletions than live docs. Not sure if this can easily be implemented and is worth the complexity (that we already have because of both versions)?

The patch looks fine!
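As a rough illustration of the suggestion above, the sketch below picks whichever side of the bit set is the minority and stores only those positions, so both "few deletions" and "many deletions" stay compact. It is not the BitVector on-disk format and not the patch; the class LiveDocsEncodingSketch, the Encoding enum, and the flat int[] position list are all assumptions made for the example, and it assumes no bits are set at or beyond maxDoc.

```java
import java.util.BitSet;

// Hypothetical illustration only; this does not reproduce Lucene's BitVector encoding.
final class LiveDocsEncodingSketch {

    enum Encoding { SPARSE_SET_BITS, SPARSE_UNSET_BITS }

    // If more than 50% of the bits are unset, list the set bits; otherwise list the unset ones.
    static Encoding choose(BitSet liveDocs, int maxDoc) {
        int set = liveDocs.cardinality();
        int unset = maxDoc - set;
        return unset > set ? Encoding.SPARSE_SET_BITS : Encoding.SPARSE_UNSET_BITS;
    }

    // Collects the positions of the bits that the chosen encoding stores explicitly.
    static int[] encode(BitSet liveDocs, int maxDoc, Encoding encoding) {
        boolean listSetBits = encoding == Encoding.SPARSE_SET_BITS;
        int set = liveDocs.cardinality();
        int[] positions = new int[listSetBits ? set : maxDoc - set];
        int next = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (liveDocs.get(doc) == listSetBits) {
                positions[next++] = doc;
            }
        }
        return positions;
    }

    // Rebuilds the bit set: start from all-clear or all-set, then apply the listed positions.
    static BitSet decode(int[] positions, int maxDoc, Encoding encoding) {
        BitSet bits = new BitSet(maxDoc);
        if (encoding == Encoding.SPARSE_UNSET_BITS) {
            bits.set(0, maxDoc);
        }
        for (int doc : positions) {
            if (encoding == Encoding.SPARSE_SET_BITS) {
                bits.set(doc);
            } else {
                bits.clear(doc);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        int maxDoc = 10;
        BitSet mostlyDeleted = new BitSet(maxDoc);
        mostlyDeleted.set(0, 2); // only docs 0 and 1 are live
        Encoding encoding = choose(mostlyDeleted, maxDoc);
        int[] stored = encode(mostlyDeleted, maxDoc, encoding);
        System.out.println(encoding + " stores " + stored.length + " positions");
        System.out.println(decode(stored, maxDoc, encoding).equals(mostlyDeleted));
    }
}
```

Whether the extra branch is worth it depends on how often a segment really has more deletions than live docs, which is the trade-off raised in the comment.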
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055683#comment-13055683 ]

Michael McCandless commented on LUCENE-3246:

Awesome, thanks Uwe! I'll work on SR cutting over to live docs on disk...