Re: replication usage? creating dupes?

Damien Katz Wed, 16 Jul 2008 08:58:58 -0700

That problem is likely due to the user's HTTP request timing out while waiting for the replication to complete, which in turn kills the underlying replication process. Restarting the replication will usually help, as CouchDB avoids sending the same document twice, but if the replication is exceptionally long it might not get past the point where it finishes examining the documents.

The problem is that it only saves off the replication record once it completes successfully, so until it completes it always examines the same number of documents to see if they exist on the target replica. The fix I need to implement is to have it save off the replication record every x seconds during replication. Then if it dies unexpectedly it will pick back up from the last replication record, reducing the number of documents needing to be reexamined.
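
As a rough illustration of the checkpointing idea above, here is a minimal Python sketch. It is not CouchDB's Erlang code. It assumes each source change carries an increasing sequence number and that the replication record can be reduced to a single `last_seq` value; the names `replicate`, `record`, `interval` and `clock` are hypothetical.

```python
import time


def replicate(changes, target, record, interval=5.0, clock=time.monotonic):
    """Copy changes into target, saving progress every `interval` seconds.

    changes: iterable of (seq, doc_id, doc) tuples in increasing seq order
    target:  dict standing in for the target database (doc_id -> doc)
    record:  dict standing in for the replication record (assumed shape)
    Returns the number of documents that had to be examined.
    """
    start_seq = record.get("last_seq", 0)
    last_seq = start_seq
    last_saved = clock()
    examined = 0
    for seq, doc_id, doc in changes:
        if seq <= start_seq:
            continue  # covered by a previously saved record, skip it
        examined += 1
        if target.get(doc_id) != doc:
            target[doc_id] = doc  # only send documents the target lacks
        last_seq = seq
        now = clock()
        if now - last_saved >= interval:
            # Periodic save: a crash after this point resumes from here.
            record["last_seq"] = last_seq
            last_saved = now
    # Final save, reached only when the run completes without an error.
    record["last_seq"] = last_seq
    return examined
```

If a run dies partway through, the next call with the same `record` skips everything up to the last saved sequence number, so only the documents seen since the last periodic save are examined again.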

Then we need to solve the current problem of the synchronous HTTP request that performs the replication. In Futon, the browser doesn't do the replication, it just sends a single replication request to the CouchDB server. A CouchDB Erlang process then performs the replication, accessing databases either locally or via HTTP on other Erlang servers. Right now, the browser can time out the HTTP request during a long replication, which in turn kills the replication process.

There are two potential solutions here. The first is to send a browser ping to keep the connection alive. Easy to do with HTTP 1.1 I think, just send an empty HTTP chunk. The second is to make it impossible for the broken HTTP request to kill the replication request. They aren't mutually exclusive, but the more I think about it, the more I dislike the second solution.
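
One caveat when sketching the ping: in HTTP/1.1 chunked transfer encoding, a zero-length chunk marks the end of the body, so a keep-alive has to carry at least one byte. A newline is a reasonable candidate, since JSON parsers skip leading whitespace. The Python sketch below shows only the chunk framing, under those assumptions; `chunked_response`, `is_done`, `get_result` and `wait` are hypothetical names, not part of CouchDB.

```python
LAST_CHUNK = b"0\r\n\r\n"  # a zero-length chunk terminates the body


def encode_chunk(data: bytes) -> bytes:
    """Frame data as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"


def chunked_response(is_done, get_result, wait, ping=b"\n"):
    """Yield keep-alive chunks until the work finishes, then the result.

    is_done:    callable returning True once replication has finished
    get_result: callable returning the final response body as bytes
    wait:       callable that pauses between pings (e.g. a sleep)
    """
    while not is_done():
        yield encode_chunk(ping)  # one-byte chunk keeps the connection busy
        wait()
    yield encode_chunk(get_result())
    yield LAST_CHUNK
```

A server would write each yielded value to the socket after sending response headers that include `Transfer-Encoding: chunked`.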


-Damien


On Jul 16, 2008, at 11:13 AM, Chris Anderson wrote:

On Wed, Jul 16, 2008 at 2:18 AM, Jan Lehnardt <[EMAIL PROTECTED]> wrote:

I'm surprised that this wasn't reported earlier. CouchDB replication
is supposed to be reliable (when we got all the bugs out), so an
external replication thing should not be necessary. I would have
guessed that reporting this is easier than writing code to circumvent
the problem. This should be fixed in CouchDB and not worked
around.


My experience with replication has been that it works flawlessly for
smaller datasets, and as the dataset grows, it either starts to take
so long it may as well be broken (but shows no errors in the log) or
occasionally does the =ERROR REPORT==== thing in the log. The latter is
a new symptom in my experience.

I haven't had a chance to bring my install up to latest trunk, so I
hesitated to report it. Today's my only sane day for a couple of weeks
on each side, so I'll see what progress I can make.

Chris


--
Chris Anderson
http://jchris.mfdz.com
