Replicated documents do not properly go through validate_doc_update functions
-----------------------------------------------------------------------------

                 Key: COUCHDB-915
                 URL: https://issues.apache.org/jira/browse/COUCHDB-915
             Project: CouchDB
          Issue Type: Bug
          Components: Replication
    Affects Versions: 1.0.1
         Environment: Linux  RedHat AS 4 x86
            Reporter: Oguzhan Eris


Before a replicated document gets written, it undergoes a validation check just 
like a normal doc update would, but in the case of replication, the "oldDoc" 
variable for a validate_doc_update function does not represent the latest 
revision of the document.

Imagine two CouchDB instances, each with the same starting doc:

{_id:"docA", lastModifiedTime:1}

Now nodeA updates the doc to

{_id:"docA", lastModifiedTime:2}

and nodeB independently updates it to

{_id:"docA", lastModifiedTime:30}

Also imagine that both nodes already have a validate_doc_update function that
says

if (oldDoc && Number(oldDoc.lastModifiedTime) >=
        Number(newDoc.lastModifiedTime)) {
    throw({forbidden: "already have a more recent doc"});
}

Each doc has therefore properly passed the validation check to get its second
revision. When we replicate from nodeB to nodeA, we should expect the document
to be updated to lastModifiedTime:30, and when we replicate from nodeA to nodeB
we should NOT get lastModifiedTime:2, since our validation function should
prevent it.

What happens, however, is that when nodeA replicates to nodeB, nodeB's
validation function gets called with Rev1 as the oldDoc argument. In this
situation lastModifiedTime:2 does get the "ok" from the validate function,
since it is compared against the first revision (lastModifiedTime:1) instead
of the latest revision (lastModifiedTime:30).
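
The difference can be reproduced outside CouchDB by calling the validation
function directly with each candidate oldDoc. This is only a minimal sketch in
plain JavaScript: the helper check() and the variable names are made up for
illustration, and the {forbidden: ...} object is the form CouchDB expects a
validation function to throw.

```javascript
// Validation function from the report, written out in full.
// CouchDB rejects a write when this throws {forbidden: "..."}.
function validate(newDoc, oldDoc) {
  if (oldDoc &&
      Number(oldDoc.lastModifiedTime) >= Number(newDoc.lastModifiedTime)) {
    throw { forbidden: "already have a more recent doc" };
  }
}

// Hypothetical helper: returns "ok" or the forbidden message.
function check(newDoc, oldDoc) {
  try {
    validate(newDoc, oldDoc);
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}

const rev1 = { _id: "docA", lastModifiedTime: 1 };       // shared first revision
const nodeARev2 = { _id: "docA", lastModifiedTime: 2 };  // nodeA's update
const nodeBRev2 = { _id: "docA", lastModifiedTime: 30 }; // nodeB's update

// What is observed on nodeB: oldDoc is Rev1, so nodeA's update passes.
const observed = check(nodeARev2, rev1);
// What the report expects: oldDoc is nodeB's latest revision, so it is rejected.
const expected = check(nodeARev2, nodeBRev2);

console.log(observed); // ok
console.log(expected); // already have a more recent doc
```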

This happens in couch_db.erl 

prep_and_validate_replicated_updates(Db, [Bucket|RestBuckets],
        [OldInfo|RestOldInfo], AccPrepped, AccErrors)

more specifically:

lists:foldl(
            fun(#doc{id=Id,revs={Pos, [RevId|_]}}=Doc,
                    {AccValidated, AccErrors2}) ->
                case dict:find({Pos, RevId}, LeafRevsFullDict) of
                {ok, {Start, Path}} ->
                    % our unflushed doc is a leaf node. Go back on the path
                    % to find the previous rev that's on disk.
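
As a rough illustration of what the comment above describes, the lookup can be
modelled as walking the incoming revision's ancestry until a revision stored
locally is found. This is a simplified model in JavaScript, not a translation
of couch_db.erl: findOldDoc(), the Map used as the local store, and the
revision ids are all hypothetical.

```javascript
// Simplified model, not CouchDB code: `path` lists revisions from the
// incoming leaf back to the root, `onDisk` maps stored revision ids to docs.
function findOldDoc(path, onDisk) {
  // Skip the incoming leaf itself (index 0) and walk towards the root.
  for (const rev of path.slice(1)) {
    if (onDisk.has(rev)) {
      return onDisk.get(rev);
    }
  }
  return null; // no ancestor is stored locally
}

// nodeB's local store: the shared first revision and its own second one.
const onDisk = new Map([
  ["1-base", { _id: "docA", lastModifiedTime: 1 }],
  ["2-nodeB", { _id: "docA", lastModifiedTime: 30 }],
]);

// Incoming revision from nodeA together with its ancestry.
const incomingPath = ["2-nodeA", "1-base"];

const oldDoc = findOldDoc(incomingPath, onDisk);
console.log(oldDoc.lastModifiedTime); // 1
```

In this model nodeB's own leaf "2-nodeB" is never consulted, because it is not
an ancestor of the incoming revision, which matches the behaviour reported
here.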


I am not sure what the reasoning behind this is, but to me it would make more
sense to use (or at least have a way to use) the latest local revision as the
oldDoc in a validate_doc_update function.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
