[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Rutherglen updated LUCENE-1476:
-------------------------------------
Attachment: searchdeletes.alg
searchdeletes.alg uses the Reuters collection, deletes many docs, then
performs searches. If the benchmark is working properly, iterating over
the deleted docs rather than calling BitVector.get has a serious
performance cost: search throughput is about 92 times lower.
Compare:
DocIdSet SrchNewRdr_8: 32.0 rec/s
DelDoc.get SrchNewRdr_8: 2,959.5 rec/s
Next step is running JProfiler. Perhaps BitVector needs to be
replaced by OpenBitSet for iterating, or there's some other issue.
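
To make the two access patterns concrete, here is a minimal sketch. It is
not the code from the attached patches: it uses java.util.BitSet as a
stand-in for the deleted-docs bits, and the class and method names
(DeletedDocsAccessSketch, countLiveByGet, countLiveByIteration) are
hypothetical. It only shows the shape of "test each doc with get" versus
"walk the deleted docs with a cursor"; it does not attempt to explain the
slowdown measured here.

```java
import java.util.BitSet;

// Hypothetical illustration only; java.util.BitSet stands in for the
// deleted-docs bit set and is not Lucene's BitVector or DocIdSet.
class DeletedDocsAccessSketch {

    // Random access: one bit test per candidate doc (the "get" style).
    static int countLiveByGet(BitSet deleted, int[] sortedDocs) {
        int live = 0;
        for (int doc : sortedDocs) {
            if (!deleted.get(doc)) {
                live++;
            }
        }
        return live;
    }

    // Iteration: advance a cursor over the deleted docs in lockstep with
    // the sorted candidate docs (the iterator style).
    static int countLiveByIteration(BitSet deleted, int[] sortedDocs) {
        int live = 0;
        int nextDeleted = deleted.nextSetBit(0); // -1 once no deletions remain
        for (int doc : sortedDocs) {
            // Skip deleted docs that lie before the current candidate.
            while (nextDeleted != -1 && nextDeleted < doc) {
                nextDeleted = deleted.nextSetBit(nextDeleted + 1);
            }
            if (nextDeleted != doc) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet(16);
        deleted.set(2);
        deleted.set(5);
        deleted.set(9);
        int[] docs = {1, 2, 3, 5, 8, 9, 12};
        System.out.println(countLiveByGet(deleted, docs));
        System.out.println(countLiveByIteration(deleted, docs));
    }
}
```

Both methods should agree on any sorted doc list. The iterating version
needs the extra cursor state and the inner skip loop, which is the kind of
per-doc overhead worth looking for in the profiler.
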
BitVector.get:
[java] Operation          round  mrg  buf  runCnt  recsPerRun      rec/s  elapsedSec  avgUsedMem  avgTotalMem
[java] CreateIndex            0   10  100       1           1       17.2        0.06   3,953,984    9,072,640
[java] CloseIndex             0   10  100       1           1    1,000.0        0.00   3,953,984    9,072,640
[java] Populate               0   10  100       1      200003    6,539.7       30.58   8,665,528   10,420,224
[java] Deletions              0   10  100       1        8002  533,466.7        0.01   8,665,528   10,420,224
[java] OpenReader(false)      0   10  100       1           1    1,000.0        0.00   8,691,040   10,420,224
[java] Seq_8000               0   10  100       1        8000  800,000.0        0.01   8,833,912   10,420,224
[java] CloseReader            0   10  100       9           1    2,250.0        0.00   7,672,217   10,420,224
[java] SrchNewRdr_8           0   10  100       1        4016    2,959.5        1.36   8,232,384   10,420,224
[java] OpenReader             0   10  100       8           1    1,333.3        0.01   7,579,584   10,420,224
[java] Seq_500                0   10  100       8         500    2,963.0        1.35   8,591,199   10,420,224
DocIdSet:
[java] Operation          round  mrg  buf  runCnt  recsPerRun      rec/s  elapsedSec  avgUsedMem  avgTotalMem
[java] CreateIndex            0   10  100       1           1       17.5        0.06   3,954,376    9,076,736
[java] CloseIndex             0   10  100       1           1    1,000.0        0.00   3,954,376    9,076,736
[java] Populate               0   10  100       1      200003    6,503.1       30.75   5,951,816   10,321,920
[java] Deletions              0   10  100       1        8002  500,125.0        0.02   6,190,816   10,321,920
[java] OpenReader(false)      0   10  100       1           1    1,000.0        0.00   5,976,960   10,321,920
[java] Seq_8000               0   10  100       1        8000  727,272.8        0.01   6,122,904   10,321,920
[java] CloseReader            0   10  100       9           1    3,000.0        0.00   7,727,980   10,321,920
[java] SrchNewRdr_8           0   10  100       1        4016       32.0      125.67   7,960,824   10,321,920
[java] OpenReader             0   10  100       8           1    1,333.3        0.01   7,742,855   10,321,920
[java] Seq_500                0   10  100       8         500       31.8      125.66   8,744,057   10,321,920
> BitVector implement DocIdSet, IndexReader returns DocIdSet deleted docs
> -----------------------------------------------------------------------
>
> Key: LUCENE-1476
> URL: https://issues.apache.org/jira/browse/LUCENE-1476
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4
> Reporter: Jason Rutherglen
> Priority: Trivial
> Attachments: LUCENE-1476.patch, LUCENE-1476.patch, LUCENE-1476.patch,
> quasi_iterator_deletions.diff, quasi_iterator_deletions_r2.diff,
> searchdeletes.alg
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> Update BitVector to implement DocIdSet. Expose deleted docs DocIdSet from
> IndexReader.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.