> On Mar 30, 2016, at 6:26 AM, Jan Lehnardt <j...@apache.org> wrote:
> 
>> 
>> On 29 Mar 2016, at 20:14, Adam Kocoloski <kocol...@apache.org> wrote:
>> 
>> Neat stuff. Years ago I actually committed this feature to the codebase 
>> using a table scan and then Damien backed it out because of the scalability 
>> concern. Glad to see we’re approaching it in a more considered fashion this 
>> time around :)
>> 
>> One thing we might consider is to maintain a *count* of the number of 
>> conflicted documents in the database automatically. If the count is nonzero 
>> when you expected it to be zero, build the conflicted documents index and do 
>> your inspection. In the happy case where there are no conflicts we just 
>> saved you a bunch of effort.
>> 
>> We don’t really need a separate index to accomplish this; we just need to 
>> modify the reducer function supplied to the by_id btree. We’ve played that 
>> game before to add things like data size accumulators to the DB info object. 
>> There may be a modest hit to the write performance to count the number of 
>> non-deleted leafs in the rev tree on document update, but honestly that says 
>> as much about the inefficiencies in couch_key_tree as anything else - that 
>> quantity ought to be very cheap to uncover.
> 
> Bob Newson and I talked about this on IRC some more and I think this is all 
> similar if not the same thinking: remember how we optimised `skip` in view 
> results? We could keep track of the number of conflicts per b-tree node and 
> then easily skip over the subtrees that don’t have any conflicts, so a 
> table-scan would be relatively cheap.
> 
> Best
> Jan
> —

Yes, even better. I only started with the count, but it’s easy to add the 
efficient scan once each b-tree node carries the conflict count for its subtree.
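
To make the idea concrete, here is a minimal sketch in Python rather than 
Erlang. It is a toy in-memory tree, not the couch_btree code: the names Doc, 
Node, live_leaves and conflicted_docs are all made up for illustration, and 
"conflicted" is assumed to mean more than one non-deleted leaf in the rev 
tree. Each node stores the conflict count of its subtree (the reduction), and 
the scan skips any subtree whose count is zero.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Doc:
    """Hypothetical stand-in for a by_id entry."""
    doc_id: str
    live_leaves: int  # number of non-deleted leaf revisions (assumed known)

    @property
    def conflicted(self) -> bool:
        # Assumption: a document is in conflict when it has >1 live leaf.
        return self.live_leaves > 1


@dataclass
class Node:
    """Hypothetical b-tree node that caches a per-subtree conflict count."""
    children: List[Union["Node", Doc]]
    conflicts: int = field(init=False)

    def __post_init__(self) -> None:
        # "Reduce" step: docs contribute 0 or 1, child nodes contribute the
        # count they already cached (the rereduce case).
        self.conflicts = sum(
            c.conflicts if isinstance(c, Node) else int(c.conflicted)
            for c in self.children
        )


def conflicted_docs(node: Node) -> Iterator[Doc]:
    """Yield conflicted docs in order, skipping conflict-free subtrees."""
    for child in node.children:
        if isinstance(child, Node):
            if child.conflicts:  # zero means nothing to find below here
                yield from conflicted_docs(child)
        elif child.conflicted:
            yield child


# Tiny example tree: only "c" has more than one live leaf.
left = Node([Doc("a", 1), Doc("b", 1)])
right = Node([Doc("c", 3), Doc("d", 1)])
root = Node([left, right])

print(root.conflicts)
print([d.doc_id for d in conflicted_docs(root)])
```

In this toy version the root count answers the "are there any conflicts at 
all?" question without touching the rest of the tree, and the scan never 
descends into the left subtree because its cached count is zero.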

Adam
