[ https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697468#action_12697468 ]
Michael McCandless commented on LUCENE-1539:
--------------------------------------------

This patch still has some noise, e.g., the unused *Property additions to PerfRunData and the nocommit "first" logic in ReadTask.

On DeleteByPercentTask: should it delete a percentage of the undeleted (numDocs()) docs or of the total (maxDoc()) doc space? Right now its implementation is dangerous: e.g., if I delete 5% of the index and then 10%, that 10% delete will do nothing, since the docs it deletes will fall onto the exact docs that the 5% had deleted.

{quote}
It seems a bit awkward that DeleteByPercentTask needs to call IR.undeleteAll before executing the deletes.
{quote}

Oh, I see. I don't think it should do that? I think it should mean "delete XXX% of the remaining undeleted docs"?

{quote}
Also that subsequent delete by percent calls in deletepercent.alg need to open the latest version of the index rather than the original (which does not have deletes)
{quote}

This seems correct? I.e., the purpose of this task is "open the latest commit on the index, delete XXX% of its undeleted docs".

{quote}
This is due to DirectoryIndexReader.acquireWriteLock checking to insure the latest version of the index is locked. Perhaps we can relax this? I would rather be able to open a commit point and delete from the reader, then flush as the latest version.
{quote}

I don't think we can relax that. This (only a single transaction (writer) open at once) is a core assumption in Lucene.
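To make the proposed semantics concrete, here is a minimal sketch of "delete XXX% of the remaining undeleted docs". It is only a model and makes several assumptions: a java.util.BitSet stands in for the reader's deleted docs, and the class DeleteByPercentSketch and its methods are hypothetical names, not Lucene API and not the code in the attached patch.

```java
import java.util.BitSet;

// Hypothetical model, not Lucene code: a BitSet stands in for the
// set of deleted doc IDs that an IndexReader would track.
final class DeleteByPercentSketch {

    // Number of live (undeleted) docs, analogous to IndexReader.numDocs().
    static int numDocs(BitSet deleted, int maxDoc) {
        return maxDoc - deleted.cardinality();
    }

    // Marks `percent` of the currently live docs as deleted, spread
    // evenly over the live docs, and returns how many were newly deleted.
    // Already-deleted docs are skipped, so repeated calls always make progress.
    static int deletePercentOfLive(BitSet deleted, int maxDoc, double percent) {
        int live = numDocs(deleted, maxDoc);
        int target = (int) Math.round(live * percent / 100.0);
        if (target <= 0) {
            return 0;
        }
        int removed = 0;
        int seen = 0; // live docs visited so far
        for (int doc = deleted.nextClearBit(0);
             doc < maxDoc && removed < target;
             doc = deleted.nextClearBit(doc + 1)) {
            seen++;
            // Delete this doc whenever the running quota moves past `removed`.
            if ((long) seen * target / live > removed) {
                deleted.set(doc);
                removed++;
            }
        }
        return removed;
    }
}
```

In this model, on a 1000-doc index a 5% delete followed by a 10% delete removes 50 docs and then 95 more (10% of the 950 still live), leaving 855 live docs. The second call cannot land on docs the first one already deleted, and no undeleteAll step is needed.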
> Improve Benchmark
> -----------------
>
>                 Key: LUCENE-1539
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1539
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/benchmark
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch,
>                      LUCENE-1539.patch, sortBench2.py, sortCollate2.py
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Benchmark can be improved by incorporating recent suggestions posted
> on java-dev. M. McCandless' Python scripts that execute multiple
> rounds of tests can either be incorporated into the codebase or
> converted to Java.