Hi Bob, comments inline:

> On Mar 19, 2016, at 2:36 PM, Robert Samuel Newson <rnew...@apache.org> wrote:
>
> Hi,
>
> The problem is that _db_updates is not guaranteed to see every update, so I
> think it falls at the first hurdle.

Do you mean to say that a listener of _db_updates is not guaranteed to see every updated *database*? I think it would be helpful for the discussion to describe the scenario in which an updated database permanently fails to show up in the feed. My recollection is that it’s quite byzantine.

> What couch_replicator_manager does in couchdb 2.0 (though not in the version
> that Cloudant originally contributed) is to us ecouch_event, notice which are
> to _replicator shards, and trigger management work from that.

Did you mean to say “couch_event”? I assume so. You’re describing how the replicator manager discovers new replication jobs, not how the jobs discover new updates to source databases specified by replication jobs. Seems orthogonal to me unless I missed something.

> Some work I'm embarking on, with a few other devs here at Cloudant, is to
> enhance the replicator manager to not run all jobs at once and it is indeed
> the plan to have each of those jobs run for a while, kill them (they
> checkpoint then close all resources) and reschedule them later. It's TBD
> whether we'd always strip feed=continuous from those. We _could_ let each job
> run to completion (i.e, caught up to the source db as of the start of the
> replication job) but I think we have to be a bit smarter and allow
> replication jobs that constantly have work to do (i.e, the source db is
> always busy), to run as they run today, with feed=continuous, unless forcibly
> ousted by a scheduler due to some configuration concurrency setting.

So I think this is really the crux of the issue. My contention is that permanently occupying a socket for each continuous replication with the same source and mediator is needlessly expensive, and that _db_updates could be an elegant replacement.
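
To make the idea concrete, here is a minimal sketch of the bookkeeping such a listener might do. It is an illustration only, not code from CouchDB or the replicator manager. The event fields `db_name` and `type` and the event types created/updated/deleted follow the documented _db_updates feed; the function name, the sample database names, and the idea of returning a "dirty" set for one-shot replications are assumptions made for this example. It reads events from an in-memory list rather than a socket, and it does nothing about the concern that the feed may miss updates.

```python
from typing import Iterable, Mapping, Set


def dbs_needing_replication(
    events: Iterable[Mapping[str, str]],
    replicated_sources: Set[str],
) -> Set[str]:
    """Collect source databases that changed and have a replication job.

    `events` stands in for entries read from a single _db_updates feed.
    `replicated_sources` is a hypothetical set of source database names
    taken from the configured replication jobs.
    """
    dirty: Set[str] = set()
    for event in events:
        name = event.get("db_name")
        kind = event.get("type")
        if name not in replicated_sources:
            continue  # no replication job cares about this database
        if kind in ("created", "updated"):
            dirty.add(name)  # schedule a one-shot catch-up replication
        elif kind == "deleted":
            dirty.discard(name)  # nothing left to replicate from
    return dirty


# Hypothetical sample of feed entries, in arrival order.
sample_events = [
    {"db_name": "orders", "type": "updated"},
    {"db_name": "scratch", "type": "updated"},
    {"db_name": "users", "type": "created"},
    {"db_name": "orders", "type": "updated"},
    {"db_name": "users", "type": "deleted"},
]

pending = dbs_needing_replication(sample_events, {"orders", "users"})
print(sorted(pending))  # prints ['orders']
```

The point of the sketch is that one feed connection per source server can drive any number of replication jobs: repeated updates to the same database collapse into a single pending entry, and each entry would be served by a short-lived, non-continuous replication.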

> I note for completeness that the work we're planning explicitly includes
> "multi database" strategies, you'll hopefully be able to make a single
> _replicator doc that represents your entire intention (e.g, "replicate _all_
> dbs from server1 to server2").

Nice! It’ll be good to hear more about that design as it evolves, particularly in aspects like discovery of newly created source databases and reporting of 403s and other fatal errors.

Adam