[jira] Commented: (LUCENE-1526) For near real-time search, use paged copy-on-write BitVector impl

Jake Mannix (JIRA) Wed, 11 Nov 2009 10:49:12 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776568#action_12776568
 ]


Jake Mannix commented on LUCENE-1526:
-------------------------------------

bq. These sound serious - if you can provide any details, that'd help.  I'll do 
some stress testing too. Thanks for testing and reporting 

Out of these, the specific issue of incorrectness of applied deletes is easiest 
to see - we saw it by indexing up to a million docs, then keep adding docs but 
only after doing a delete on the UID where UID instead of increasing, is looped 
around mod 1million.  Calling numDocs (not maxDoc) on the reader with Zoie 
always returns 1M after looping around, but with NRT, it starts slowly growing 
above 1M.

The CPU and memory is undoubtedly due to the constant reopening of these 
readers, and yes, could be aleiviated by not doing this - we're just comparing 
to the zoie case, where we *do* reopen (the RAMDir) on every request (and copy 
the delSet) if there have been modifications since the last update.

bq. Lucene NRT makes a clone of the BitVector for every reader that has new 
deletions. Once this is done, searching is "normal" - it's as if the reader 
were a disk reader. There's no extra checking of deleted docs (unlike Zoie), no 
OR'ing of 2 BitVectors, etc.

Ok, so if this is copy-on-write, it's done every time there is a new delete for 
that segment?  If the disk index is optimized that means it would happen on 
every update, a clone of the full numDocs sized BitVector?  I'm still a little 
unsure of how this happens. 

* somebody calls getReader() - they've got all the SegmentReaders for the disk 
segments, and each of them have BitVectors for deletions.
* IW.update() gets called - the BitVector for the segment which now has a 
deletion is cloned, and set on a new pooled SegmentReader as its deletedSet
* maybe IW.update() gets called a bunch more - do these modify the pooled but 
as-yet-unused SegmentReader?  New readers in the pool?  What?
* another call to getReader() comes in, and gets an IndexReader wrapping the 
pooled SegmentReaders.

bq. Yes, this makes Lucene's reopen more costly. But, then there's no double 
checking for deletions. That's the tradeoff, and this is why the 64 msec is 
added to Zoie's search time. Zoie's searches are slower.

So we re-ran some of our tests last night, commenting out our deleted check to 
measure it's cost in the most extreme case possible: a dead easy query (in that 
it's only one term), but one which yes, hits the entire index (doing a 
MatchAllDocs query is actually special-cased in our code, and is perfectly 
fast, so not a good worst case to check), and as the index grows up above a 
million documents, zoie could shave somewhere from 22-28% of its time off by 
not doing the extra check.  

We haven't re-run the test to see what happens as the index grows to 5M or 10M 
yet, but I can probably run that later today.

bq. The fact that Zoie on the pure indexing case (ie no deletions) was 10X 
faster than Lucene is very weird - that means something else is up,
besides how deletions are carried in RAM. It's entirely possible it's the fact 
that Lucene doesn't flush the tiny segments to a RAMDir (which LUCENE-1313 
addresses).

Yeah, if you call getReader() a bunch of times per second, each one does a 
flush(true,true,true), right?  Without having LUCENE-1313, this kills the 
indexing performance if querying is going on.  If no getReader() is being 
called at all, Zoie is about 10% slower than pure Lucene IndexWriter.add() 
(that's the cost of doing it in two steps - index into *two* RAMDirs [so they 
are hot-swappable] and then writing segments to disk with 
addIndexesNoOptimize() periodically).

I don't think there's any difference in the MergePolicy - I think they're both 
using the BalancedMergePolicy (since that's the one which is optimized for the 
realtime case).

bq. Actually I thought of a simple way to run the "search only" (not reopen) 
test - I'll just augment TopScoreDocCollector to optionally check the 
IntSetAccelerator, and measure the cost in practice, for different numbers of 
docs added to the IntSet.

Due to the bloomfilter living on top of the hashSet, at least at the scales 
we're dealing with, we didn't see any change in cost due to the number of 
deletions (zoie by default keeps no more than 10k modifications in memory 
before flushing to disk, so the biggest the delSet is going to be is that, and 
we don't see the more-than-constant scaling yet at that size).

bq. But your test is missing a dimension: frequency of reopen. If you reopen 
once per second, how do Zoie/Lucene compare? Twice per second? Once every 5 
seconds? Etc.

Yep, this is true.  It's a little more invasive to put this into Zoie, because 
the reopen time is so fast that there's no pooling, so it would need to be 
kinda hacked in, or tacked on to the outside.  Not rocket science, but not just 
the change of a parameter.

LinkedIn doesn't have any hard requirements of having to reopen hundreds of 
times per second, we're just stressing the system, to see what's going on.  As 
you can see, nobody's filing a bug here that Lucene NRT is "broken" because it 
can't handle zero-latency updates.  What we did try to make sure was in the 
system was determinism: not knowing whether an update will be seen because 
there is some background process doing addIndexes from another thread which 
hasn't completed, or not knowing how fresh the pooled reader is, that kind of 
thing.  

This kind of determinism can certainly be gotten with NRT, by locking down the 
IndexWriter wrapped up in another class to keep it from being monkeyed with by 
other threads, and then tuning exactly how often the reader is reopened, and 
then dictate to clients that the freshness is exactly at or better than this 
freshness timeout, sure.  This kind of user-friendliness is one of Zoie's main 
points - it provides an indexing system which manages all this, and certainly 
for some clients, we should add in the ability to pool the readers for less 
real-timeness, that's a good idea.

Of course, if your reopen() time is pretty heavy (lots of FieldCache data / 
other custom faceting data needs to be loaded for a bunch of fields), then at 
least for us, even not needing zero-latency updates means that the more 
realistically 5-10% degredation in query performance for normal queries is 
negligable, and we get deterministic zero-latency updates as a consequence.

This whole discussion reminded me that there's another realtime update case, 
which neither Zoie nor NRT is properly optimized for: the absolutely zero 
deletes case with very fast indexing load and the desire for minimal latency of 
updates (imagine that you're indexing twitter - no changes, just adds), and you 
want to be able to provide a totally stream-oriented view on things as they're 
being added (matching some query, naturally) with sub-second turnaround.  A 
subclass of SegmentReader which is constructed which doesn't even have a 
deletedSet could be instantiated, and the deleted check could be removed 
entirely, speeding things up even further.

> For near real-time search, use paged copy-on-write BitVector impl
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1526
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1526
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: LUCENE-1526.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> SegmentReader currently uses a BitVector to represent deleted docs.
> When performing rapid clone (see LUCENE-1314) and delete operations,
> performing a copy on write of the BitVector can become costly because
> the entire underlying byte array must be created and copied. A way to
> make this clone delete process faster is to implement tombstones, a
> term coined by Marvin Humphrey. Tombstones represent new deletions
> plus the incremental deletions from previously reopened readers in
> the current reader. 
> The proposed implementation of tombstones is to accumulate deletions
> into an int array represented as a DocIdSet. With LUCENE-1476,
> SegmentTermDocs iterates over deleted docs using a DocIdSet rather
> than accessing the BitVector by calling get. This allows a BitVector
> and a set of tombstones to by ANDed together as the current reader's
> delete docs. 
> A tombstone merge policy needs to be defined to determine when to
> merge tombstone DocIdSets into a new deleted docs BitVector as too
> many tombstones would eventually be detrimental to performance. A
> probable implementation will merge tombstones based on the number of
> tombstones and the total number of documents in the tombstones. The
> merge policy may be set in the clone/reopen methods or on the
> IndexReader. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Commented: (LUCENE-1526) For near real-time search, use paged copy-on-write BitVector impl

Reply via email to