Amazing, I knew there was a replicator database but I didn't see it in the API
reference. It's working as expected, thanks!
   
 
--
Matthieu Rakotojaona
Research Engineer, Inria <https://www.inria.fr/>
STACK team <https://stack-research-group.gitlabpages.inria.fr/web/>  

 
   

-----Original message-----

From: Nick <[email protected]>
To: user <[email protected]>
Sent: Thursday, 26 October 2023 19:40 CEST
Subject: Re: Replication with unreachable targets are automatically removed

That's currently the expected behavior for _replicate (transient)
replication jobs. There is a retries_per_request parameter
https://docs.couchdb.org/en/stable/config/replicator.html#replicator/retries_per_request
to help configure retries for individual HTTP requests the replication
job makes, but if the whole job fails it will be removed. Transient
jobs are expected to be managed and monitored by some external
application code. However, if you do want the jobs to keep trying
after failure, consider using regular replication jobs backed by a
document in a `_replicator` database.
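
For illustration, here is a minimal sketch of creating such a document
from Python using only the standard library. The local URL, the document
id, the database names and the helper function names are all assumptions
made up for this example; only the `source`, `target` and `continuous`
fields and the `_replicator` database come from CouchDB itself. Writing
to `_replicator` normally requires admin credentials, so the optional
basic-auth arguments are included as placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_replication_doc(source, target, continuous=True):
    """Return the body of a replication document for the _replicator database."""
    return {"source": source, "target": target, "continuous": continuous}


def build_put_request(couch_url, doc_id, doc, user=None, password=None):
    """Build (but do not send) the PUT request that stores the document.

    couch_url and doc_id are hypothetical values chosen by the caller.
    """
    url = "{}/_replicator/{}".format(
        couch_url.rstrip("/"), urllib.parse.quote(doc_id, safe="")
    )
    headers = {"Content-Type": "application/json"}
    if user is not None and password is not None:
        token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    return urllib.request.Request(
        url, data=json.dumps(doc).encode("utf-8"), headers=headers, method="PUT"
    )


def create_replication(couch_url, doc_id, source, target, **auth):
    """Send the request; raises urllib.error.HTTPError on a non-2xx response."""
    doc = build_replication_doc(source, target)
    request = build_put_request(couch_url, doc_id, doc, **auth)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling something like `create_replication("http://localhost:5984",
"db-to-remote", "http://localhost:5984/db",
"http://some.anonymized.host:5984/db", user="admin", password="...")`
would store the document. Deleting that document later is how the job
gets removed, which matches the "a human decides" workflow described in
the original question.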

Cheers, 
-Nick 

On Thu, Oct 26, 2023 at 8:05 AM Matthieu RAKOTOJAONA RAINIMANGAVELO 
<[email protected]> wrote: 
> 
> 
> Hello there, 
> 
> I realized that when a replication, continuous or transient, is run but the
> target host is unreachable, the replication job is deleted. Here are a few
> examples of logs:
> 
> 
> [error] 2023-10-18T12:43:37.394891Z [email protected] <0.582.0> -------- 
> couch_replicator_scheduler : Transient job 
> {"0f63c93e6e24efacede944ce1ed14795","+continuous"} failed, removing. Error: 
> <<"{checkpoint_commit_failure,<<\"instance_start_time on source and target 
> database has changed since last checkpoint.\">>}">> 
> 
> [error] 2023-10-18T12:43:38.679316Z [email protected] <0.582.0> -------- 
> couch_replicator_scheduler : Transient job 
> {"ecc1efdf8f86c4ef626f0ba36766ec56","+continuous"} failed, removing. Error: 
> <<"{checkpoint_commit_failure,<<\"instance_start_time on source and target 
> database has changed since last checkpoint.\">>}">> 
> 
> [error] 2023-10-18T12:43:42.230056Z [email protected] <0.582.0> -------- 
> couch_replicator_scheduler : Transient job 
> {"a3ede72be05b5aaf0da538843928491a","+continuous"} failed, removing. Error: 
> <<"{checkpoint_commit_failure,<<\"instance_start_time on source and target 
> database has changed since last checkpoint.\">>}">> 
> 
> [error] 2023-10-18T12:44:37.178885Z [email protected] <0.582.0> -------- 
> couch_replicator_scheduler : Transient job 
> {"76e126167983ab9e8003853ad5cbcfaa",[]} failed, removing. Error: 
> <<"{http_request_failed,\"GET\",\n 
> \"http://some.anonymized.host:5984/db/\",\n 
> {error,{error,{conn_failed,{error,econnrefused}}}}}">> 
> 
> [error] 2023-10-18T12:44:43.901342Z [email protected] <0.582.0> -------- 
> couch_replicator_scheduler : Transient job 
> {"0f63c93e6e24efacede944ce1ed14795","+continuous"} failed, removing. Error: 
> <<"{checkpoint_commit_failure,<<\"Failure on target commit: {'EXIT',\\n 
> {http_request_failed,\\\"POST\\\",\\n 
> \\\"http://some.anonymized.host:5984/db/_ensure_full_commit\\\",\\n 
> {error,{error,{conn_failed,{error,econnrefused}}}}}}\">>}">> 
> 
> 
> 
> The problem is that in my use case these hosts are expected to be
> unreachable. I want CouchDB to consider this a transient error and
> continue, and a human will tell CouchDB when a replication job should
> actually be removed. Today I need some application code to recreate those
> replication jobs, but I'd rather not.
>
> Is there a way to have those replications persist?
> 
> -- 
> Matthieu Rakotojaona 
> Research Engineer, Inria <https://www.inria.fr/> 
> STACK team <https://stack-research-group.gitlabpages.inria.fr/web/>   
