Thanks Adam, that's what I thought as well, but believe me I'm having a
really hard time understanding the explanation of max_jobs and
max_churn from the docs.
I don't exactly get the difference between those two values. My first
guess was that max_jobs was a systemwide max value while max_churn would
define how many jobs would run at the same time.
I tried it and it wasn't working as expected.
Now I just reread it and I'm guessing it works roughly like this:
    while true {
        if (jobs > max_jobs) {
            for (x = 1 to max_churn) {
                kill_or_start(something)
            }
        }
        sleep(interval)
    }
Is this correct?
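
To make the question concrete, here is the same reading as a small runnable Python sketch. It is only my interpretation of the docs, not how the CouchDB scheduler is actually implemented: the function name `reschedule`, the two queues and the "stop the oldest running job first" rule are all assumptions made for illustration.

```python
from collections import deque


def reschedule(running, pending, max_jobs, max_churn):
    """One scheduler cycle under my reading of the docs (an assumption,
    not CouchDB's real code).

    running and pending are deques of job names, oldest first.
    Returns (stopped, started) for this cycle.
    """
    stopped, started = [], []

    # Rotate only when every slot is taken and other jobs are waiting.
    if len(running) >= max_jobs and pending:
        # Stop at most max_churn jobs, oldest first, and never more
        # than the number of jobs waiting for a slot.
        for _ in range(min(max_churn, len(pending), len(running))):
            stopped.append(running.popleft())

    # Start at most max_churn waiting jobs, never exceeding max_jobs.
    while pending and len(running) < max_jobs and len(started) < max_churn:
        job = pending.popleft()
        running.append(job)
        started.append(job)

    # Jobs stopped in this cycle go to the back of the waiting queue.
    pending.extend(stopped)
    return stopped, started
```

In this model, calling `reschedule` once per `interval` with 5 jobs, max_jobs = 3 and max_churn = 1 swaps exactly one job in and one job out per cycle, so max_jobs caps how many run at once and max_churn caps how fast they rotate.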
---
Andrea Brancatelli
On 2018-10-30 17:17, Adam Kocoloski wrote:
> Hi Andrea, your numbers don't sound crazy for an out-of-the-box setup.
>
> Worth noting that in CouchDB 2.1 and above there is a replication scheduler
> which can cycle through an ~unlimited number of continuous replications
> within a defined resource envelope. The scheduler is documented here:
>
> http://docs.couchdb.org/en/stable/replication/replicator.html#replication-scheduler
>
> There are a number of configuration properties that govern the behavior of
> the scheduler and also the default resources allocated to any particular
> replication. These are clustered in the [replicator] configuration block:
>
> http://docs.couchdb.org/en/stable/config/replicator.html#replicator
>
> The `worker_processes` and `http_connections` in particular can have a
> significant impact on the resource consumption of each replication job. If
> your goal is to host a large number of lightweight replications you could
> reduce those settings, and then configure the scheduler to keep a large
> `max_jobs` running. It's also possible to override resource settings on a
> per-replication basis.
>
> Cheers, Adam
>
> On Oct 30, 2018, at 11:52 AM, Stefan Klein <[email protected]> wrote:
>
> Hi,
>
> can't comment on the behavior of recent, 2.x, versions of couchdb.
>
> Long time ago, with couchdb 1.4 or so I ran a similar test.
> Our solution was to:
> * keep a list of "active" users (by our application specific definition)
> * listen to _db_changes
> * run one-shot replications for the changed documents to the per-user dbs
> of the users who got access to the documents and are "active"
> When a user becomes "active" - again determined by application logic - a
> one-shot replication is run to bring the per-user db up to date.
>
> Sadly this logic is deeply integrated in our application code and can't be
> easily extracted to a module (we're using nodejs).
> It's also basically unchanged since then and we have to adapt to couchdb
> 2.x.
>
> regards,
> Stefan
>
> On Tue, 30 Oct 2018 at 16:22, Andrea Brancatelli <
> [email protected]> wrote:
>
> Sorry the attachment got stripped - here it is:
> https://pasteboard.co/HKRwOFy.png
>
> ---
>
> Andrea Brancatelli
>
> On 2018-10-30 15:51, Andrea Brancatelli wrote:
>
> Hi,
>
> I'm just curious - I know it's a pretty vague question, but how many
> continuous replication jobs can one expect to run on a single "common"
> machine?
> By "common" I mean a quad/octa-core with ~16GB RAM...
>
> I don't need an exact number, just the order of it... 1? 10? 100? 1000?
>
> I've read a lot about the per-user approach, the filtered replication and all
> that stuff, but on a test server with 64 replication jobs (1
> central user and 32 test users) the machine is brought to its knees:
> root@bigdata-free-rm-01:~/asd # uptime
> 3:50PM up 5 days, 4:55, 3 users, load averages: 9.28, 9.84, 9.39
>
> I'm attaching a screenshot of current htop output (filtered for CouchDB user,
> but it's the only thing running on the machine)...
> --
>
> Andrea Brancatelli