[ https://issues.apache.org/jira/browse/LUCENE-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655849#action_12655849 ]
Jason Rutherglen commented on LUCENE-1485:
------------------------------------------

Grant wrote: "Is the "slightly" in the noise?"

It seems to be. Perhaps it needs more performance tests. That is somewhat surprising, given that OpenBitSet is supposed to be the "fastest" bitset. Lucene should have a way to incorporate new bitset implementations in the future, for example through interfaces.

That said, it would be great if in Lucene 3.0 the entire IndexReader class tree were rewritten so that it is no longer such a mess of locking, reopen, and ref counting. Marvin is proposing some good ideas to make it all more pluggable. I need to spend some time with folks figuring out which APIs would be optimal, so that everything is not tied together in the twisted mess it is now. For example, IndexReader shouldn't have a static open method attached to it. New index features such as column stride fields, if implemented in today's system, would exacerbate the problem, because there would be even more code that is impossible to customize.

SegmentMerger needs to be pluggable: today it cannot be customized without possibly breaking the entirety of Lucene, and the custom code cannot be checked in as a contrib.

There's more to write, but I should save it for a more structured and timely discussion.

> Use OpenBitSet instead of BitVector in SegmentReader
> ----------------------------------------------------
>
>                 Key: LUCENE-1485
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1485
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: TestDeletedDocsSpeed.java
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Tried out BitVector.get vs. OpenBitSet.get. Here are the results, in
> milliseconds, which are about the same after running 25 times. It is assumed
> that implementing DocIdSetIterator in SegmentTermDocs will speed things up more.
> bit set size: 10,485,760
> set bits count: 524,032
> openbitset: 68
> bitvector: 89
> OpenBitSet took about 24% less time.
> I will implement a patch that adds the WriteableBitSet interface and makes a
> subclass of OpenBitSet that is writeable to disk. We're working on an
> isSparse method for OpenBitSet.
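
For readers who want to try the kind of get() comparison described in the issue, here is a minimal, self-contained Java sketch. It is not the attached TestDeletedDocsSpeed.java and it does not use Lucene. The names Bits, LongBits, and ByteBits are hypothetical. LongBits assumes a long[]-backed layout in the style of OpenBitSet, and ByteBits assumes a byte[]-backed layout in the style of BitVector. The bit set size and the 25 runs come from the issue description; the roughly 5% fill density and the random seed are assumptions chosen only to get a similar number of set bits. Putting both implementations behind one small interface also shows, in a rough way, how a new bitset implementation could be swapped in.

```java
import java.util.Random;

public class BitSetGetSketch {

    /** Hypothetical read-only view of a bit set; not a Lucene interface. */
    interface Bits {
        boolean get(int index);
    }

    /** Word-based bit set backed by long[] (assumed OpenBitSet-like layout). */
    static final class LongBits implements Bits {
        private final long[] words;

        LongBits(int size) {
            words = new long[(size + 63) >>> 6];
        }

        void set(int index) {
            words[index >>> 6] |= 1L << (index & 63);
        }

        @Override
        public boolean get(int index) {
            return (words[index >>> 6] & (1L << (index & 63))) != 0;
        }
    }

    /** Byte-based bit set backed by byte[] (assumed BitVector-like layout). */
    static final class ByteBits implements Bits {
        private final byte[] bytes;

        ByteBits(int size) {
            bytes = new byte[(size + 7) >>> 3];
        }

        void set(int index) {
            bytes[index >>> 3] |= (byte) (1 << (index & 7));
        }

        @Override
        public boolean get(int index) {
            return (bytes[index >>> 3] & (1 << (index & 7))) != 0;
        }
    }

    /** Counts set bits by calling get() on every index. */
    static int countSet(Bits bits, int size) {
        int count = 0;
        for (int i = 0; i < size; i++) {
            if (bits.get(i)) {
                count++;
            }
        }
        return count;
    }

    /** Returns the elapsed milliseconds for the given number of full get() scans. */
    static long timeMillis(Bits bits, int size, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            sink += countSet(bits, size);
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        // Use the accumulated count so the scans are not optimized away.
        if (sink < 0) {
            throw new IllegalStateException("count overflowed");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int size = 10_485_760;           // bit set size from the issue
        LongBits longBits = new LongBits(size);
        ByteBits byteBits = new ByteBits(size);
        Random random = new Random(42);  // assumed seed, for repeatable fills
        for (int i = 0; i < size; i++) {
            if (random.nextInt(20) == 0) { // assumed density of about 5%
                longBits.set(i);
                byteBits.set(i);
            }
        }
        System.out.println("set bits count: " + countSet(longBits, size));
        System.out.println("long[] get: " + timeMillis(longBits, size, 25) + " ms");
        System.out.println("byte[] get: " + timeMillis(byteBits, size, 25) + " ms");
    }
}
```

Timings from a loop like this depend heavily on JIT warm-up and on the order in which the two implementations are measured, so a single run is not a reliable basis for choosing one over the other.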