Sdas0000 commented on issue #4897:
URL: https://github.com/apache/couchdb/issues/4897#issuecomment-3001436948

   Hi @nickva, we upgraded CouchDB to version 3.4.3 and we still see the same 
issue. One thing we noticed:
   
   We periodically see this message for almost all databases: "Jun 24 
14:35:33 item-cache-ro-us-west1-1 couchdb[10011]: 
[email protected] <0.21671.2412> 
-------- Replication `9517e37a183c600d85e42a9cb94ea4b9+continuous` 
(`https://item-cache-master-r.pr-store-monarch.str.xxxxxx.com/item_store-53/` 
-> 
`https://item-cache-ro-us-west1-r.pr-store-monarch.str.xxxxxx.com/item_store-53/`)
 failed: {changes_reader_died,{timeout,ibrowse_stream_cleanup}}" 
   
   We have noticed that once this message appears, the replication scheduler 
history usually shows that the replication crashes and restarts automatically. 
Sometimes, however, the replication does not crash for that database (even 
though the timeout happens), and no entry for that replication exists in 
_active_tasks, so replication for that database appears to be hung. To fix 
this we had to restart CouchDB on the node where that replication was running. 
After 3 to 4 days the same issue appears on some other database.
   
   Example: item_store_replication_53 is in _scheduler/jobs but does not appear 
in _active_tasks (the source has a higher doc count than the target). The doc 
counts matched after we bounced the node where this replication was running. 
   
     {
         "database": "_replicator",
         "id": "9517e37a183c600d85e42a9cb94ea4b9+continuous",
         "pid": "<0.21671.2412>",
         "source": 
"https://item-cache-master-r.pr-store-monarch.str.xxxx.com/item_store-53/",
         "target": 
"https://item-cache-ro-us-west1-r.pr-store-monarch.str.xxxx.com/item_store-53/",
         "user": null,
         "doc_id": "item_store_replication_53",
         "info": {
           "revisions_checked": 6236990,
           "missing_revisions_found": 2288404,
           "docs_read": 2288389,
           "docs_written": 2288389,
           "changes_pending": 0,
           "doc_write_failures": 0,
           "bulk_get_docs": 2288387,
           "bulk_get_attempts": 2288387,
           "checkpointed_source_seq": 
"24857240-g1AAAAJveJzLYWBgYMlgTmEwT84vTc5ISXLILEnN1U1OTM5I1c1NLC5JLdI11EvWKyjSLS7JLwKK5eclFiVn6GXmAaXyEnNygAYwJTIk2f___z8rgzmJIe7Os1ygGLuJUVJKUpo5-SYTcJUJQVclOQDJpHq4wy43gx1mZJqSbJ6WTL7hBBxmRNBheSxAkqEBSAHdth_suHg2f7Dj0owTLZMt8IcaXgsojkuI4w5AHAcNuckPIFFqYW5mkmxJvgVZAIOF1jo",
           "source_seq": 
"24857240-g1AAAAJveJzLYWBgYMlgTmEwT84vTc5ISXLILEnN1U1OTM5I1c1NLC5JLdI11EvWKyjSLS7JLwKK5eclFiVn6GXmAaXyEnNygAYwJTIk2f___z8rgzmJIe7Os1ygGLuJUVJKUpo5-SYTcJUJQVclOQDJpHq4wy43gx1mZJqSbJ6WTL7hBBxmRNBheSxAkqEBSAHdth_suHg2f7Dj0owTLZMt8IcaXgsojkuI4w5AHAcNuckPIFFqYW5mkmxJvgVZAIOF1jo",
           "through_seq": 
"24857240-g1AAAAJveJzLYWBgYMlgTmEwT84vTc5ISXLILEnN1U1OTM5I1c1NLC5JLdI11EvWKyjSLS7JLwKK5eclFiVn6GXmAaXyEnNygAYwJTIk2f___z8rgzmJIe7Os1ygGLuJUVJKUpo5-SYTcJUJQVclOQDJpHq4wy43gx1mZJqSbJ6WTL7hBBxmRNBheSxAkqEBSAHdth_suHg2f7Dj0owTLZMt8IcaXgsojkuI4w5AHAcNuckPIFFqYW5mkmxJvgVZAIOF1jo"
         },
         "history": [
           {
             "timestamp": "2025-06-24T09:24:54Z",
             "type": "started"
           },
           {
             "timestamp": "2025-06-24T09:24:54Z",
             "type": "crashed",
             "reason": "{changes_reader_died,{timeout,ibrowse_stream_cleanup}}"
           },
           {
             "timestamp": "2025-06-24T08:12:16Z",
             "type": "started"
           },
           {
             "timestamp": "2025-06-24T08:12:16Z",
             "type": "crashed",
             "reason": "{changes_reader_died,{timeout,ibrowse_stream_cleanup}}"
           }, 
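
   A rough sketch of how the check described above (job listed in 
_scheduler/jobs with a pid, but no matching entry in _active_tasks) could be 
automated. This is not part of our setup; the function names, the base URL 
argument and the optional auth header are hypothetical. It assumes that 
replication tasks in _active_tasks carry a `replication_id` equal to the job 
`id`, and that jobs without a pid are merely pending rather than hung.

```python
import json
from urllib.request import Request, urlopen


def find_stalled_jobs(scheduler_jobs, active_tasks):
    """Return scheduler jobs that claim a pid but have no replication task."""
    running_ids = {
        task.get("replication_id")
        for task in active_tasks
        if task.get("type") == "replication"
    }
    return [
        job
        for job in scheduler_jobs
        # Assumption: a null pid means the job is waiting to be scheduled,
        # which is normal and should not be reported as hung.
        if job.get("pid") and job.get("id") not in running_ids
    ]


def fetch_json(base_url, path, auth_header=None):
    """GET a CouchDB endpoint and decode the JSON response."""
    headers = {"Accept": "application/json"}
    if auth_header:
        headers["Authorization"] = auth_header
    request = Request(base_url.rstrip("/") + path, headers=headers)
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def check_node(base_url, auth_header=None):
    """Compare _scheduler/jobs with _active_tasks on one CouchDB endpoint."""
    # Note: _scheduler/jobs is paginated, so a large job list needs
    # several requests; this sketch only reads the first page.
    jobs = fetch_json(base_url, "/_scheduler/jobs", auth_header)["jobs"]
    tasks = fetch_json(base_url, "/_active_tasks", auth_header)
    return find_stalled_jobs(jobs, tasks)
```

   A job reported by `check_node` is only a candidate: it may have been 
restarted between the two requests, so it is worth re-checking before 
bouncing a node.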


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.