[ https://issues.apache.org/jira/browse/COUCHDB-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034718#comment-13034718 ]

James Howe commented on COUCHDB-1163:
-------------------------------------

I'm also trying to get a simple reproducible case. Here are further details of 
our setup at the time these broken documents turned up.

8 couches, with each couch replicating to 3 others (continuous remote-remote 
replication with trivial filters).
Validators present for all classes of document.
Attachments being added to existing documents, creating conflicts (see 
COUCHDB-885).
Due to a bug on our end, many documents were repeatedly updated on every couch 
at the same revision, causing many more conflicts.
At the same time, every 30 seconds, we queried for all conflicts (using a view 
on doc._conflicts) and did a _bulk_docs POST for each doc, performing a 
no-change update to the deterministic couch winner and deleting all other 
revisions (i.e. {_id: foo, _rev: bar, _deleted: true}).
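
For illustration, the per-document cleanup described above can be sketched 
roughly as follows. This is a minimal sketch, not our actual code: the function 
name build_resolution_payload is hypothetical, and it assumes the winner has 
already been fetched and the losing revisions were read from its _conflicts 
list. It only builds the JSON body for the _bulk_docs POST and does not send it.

```python
def build_resolution_payload(winner, losing_revs):
    """Build a _bulk_docs request body for one conflicted document.

    winner      -- the winning document as a dict (must carry _id and _rev)
    losing_revs -- revision strings of the conflicting leaves to delete
    (Both parameter names are assumptions made for this sketch.)
    """
    # No-change update: resubmit the winner as-is, minus the read-only
    # _conflicts field that a conflicts query may have attached.
    update = {k: v for k, v in winner.items() if k != "_conflicts"}
    docs = [update]

    # One deletion stub per losing revision, in the form given above:
    # {_id: foo, _rev: bar, _deleted: true}
    for rev in losing_revs:
        docs.append({"_id": winner["_id"], "_rev": rev, "_deleted": True})

    return {"docs": docs}
```

The returned dict would be serialised to JSON and POSTed to 
/database/_bulk_docs, once per conflicted document.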

This lasted for no more than a day, after which we started noticing all kinds 
of things going wrong (replication getting stuck, documents that are impossible 
to update or delete, etc.).

We're not in a position to run this exact setup again until we are certain 
corruption will not occur.

> Document returned by id, but cannot be found by rev
> ---------------------------------------------------
>
>                 Key: COUCHDB-1163
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1163
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 1.0.1, 1.0.2
>            Reporter: James Howe
>            Priority: Critical
>         Attachments: Couch logging for jira issue
>
>
> Somehow, our cluster has developed the following problem on a handful of 
> documents. Will post reproduction steps if we find them. All properties have 
> been redacted. All the documents this affects also have attachments, if that 
> is significant. Once a document is in this situation, it causes conflict 
> detection, replication and include_docs to behave incorrectly or outright 
> fail.
> GET /database/4cdee83a118ea1cf3050b1d006144d46 returns
> {
>     "_id": "4cdee83a118ea1cf3050b1d006144d46",
>     "_rev": "10-df4bf65a6104ea240f100c30d3cb245d",
>     "foo": "bar"
> }
> GET /database/4cdee83a118ea1cf3050b1d006144d46?open_revs=all returns
> [
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "10-df4bf65a6104ea240f100c30d3cb245d",
>             "foo": "bar"
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "8-eea5e36daee12acd79a127abf36f7720",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "9-2cead1e4c813a4f0d10a9bc4aa28bfda",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "7-c3b44f004660caa496804409089b53d9",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "6-52e978041bb324d19e01a2ac5a243702",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "5-761bf28c6989f0fde41bdd5732c33159",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "4-abb005cf4b2d2dd12880a33af1e7066e",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "3-233e4624e620ec1c8b66f21a051832f8",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "10-55f0cdf9dd95ed230b733a2c826c842c",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "11-264c9d6c249ba2fc9b13df35cb447fd7",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "9-2cead1e4c813a4f0d10a9bc4aa28bfda",
>             _deleted: true
>         }
>     },
>     {
>         "ok": {
>             "_id": "4cdee83a118ea1cf3050b1d006144d46",
>             "_rev": "2-9f2df19059d9a460a12740a63a4d95e9",
>             _deleted: true
>         }
>     }
> ]
> GET /database/4cdee83a118ea1cf3050b1d006144d46?rev=10-df4bf65a6104ea240f100c30d3cb245d returns
> {
>     "error": "not_found",
>     "reason": "missing"
> }
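
To re-run the three requests from the report against another document, the 
URLs can be built as in the sketch below. The helper name diagnostic_urls and 
the base URL used in the usage note are assumptions for illustration only; the 
query parameters (open_revs=all and rev) are the ones shown in the report.

```python
from urllib.parse import quote, urlencode


def diagnostic_urls(base_url, db, doc_id, rev):
    """Return the three GET URLs from the report, in the same order:
    the plain document, all open revisions, and the document at one rev.
    (Hypothetical helper; it builds URLs only and sends no requests.)
    """
    doc_url = "{}/{}/{}".format(
        base_url.rstrip("/"), quote(db, safe=""), quote(doc_id, safe="")
    )
    return [
        doc_url,
        doc_url + "?" + urlencode({"open_revs": "all"}),
        doc_url + "?" + urlencode({"rev": rev}),
    ]
```

For example, diagnostic_urls("http://localhost:5984", "database", 
"4cdee83a118ea1cf3050b1d006144d46", "10-df4bf65a6104ea240f100c30d3cb245d") 
yields the three paths quoted above, prefixed with the assumed host.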

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
