feliixx commented on issue #5585:
URL: https://github.com/apache/couchdb/issues/5585#issuecomment-3044521597

   Hi @nickva,
   
   Thank you very much for your answer!
   
   > More open dbs is an expected behavior #5402. Set a maximum dbs open that 
suites your case and they'll be replace / cycled in an LRU fashion.
   
   Ah, so that's the explanation, thanks a lot! We further reduced 
`max_dbs_open` to 5000 as it better suits our needs, and this decreased the CPU 
consumption of the v3.5.0 node a little (it is still noticeably higher than 
that of the two other v3.4.3 nodes, though).
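
For reference, a minimal sketch of how `max_dbs_open` could be changed at runtime through CouchDB's node configuration endpoint (`PUT /_node/_local/_config/couchdb/max_dbs_open`). The helper name and the node address are hypothetical, admin credentials are assumed to be supplied separately, and the request is only built here, not sent:

```python
import json
import urllib.request


def build_max_dbs_open_request(base_url: str, value: int) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets [couchdb] max_dbs_open."""
    url = f"{base_url.rstrip('/')}/_node/_local/_config/couchdb/max_dbs_open"
    # The config API expects the new value as a JSON string, e.g. "5000".
    body = json.dumps(str(value)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical node address; an admin Authorization header would also be needed.
    req = build_max_dbs_open_request("http://127.0.0.1:5984", 5000)
    print(req.get_method(), req.full_url, req.data)
```

Passing the resulting request to `urllib.request.urlopen` would apply the change on that node only; the same value can instead be set under `[couchdb]` in `local.ini`.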
   
   > There was a recent issue regarding high memory usage as well related to a 
nofiles ulimit interaction with a large (unlimited ulimit from OS). #5575. But 
that seems mostly docker related, but worth checking out just in case.
   
   We did a quick check and `nofiles` does have a proper limit set in our VMs, 
so unfortunately the problem doesn't come from there.
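
One way such a check could be done on Linux is to read the limits of the running CouchDB process from `/proc/<pid>/limits`. This is only a sketch: the function names are hypothetical, the PID has to be looked up separately, and it assumes the usual `Max open files` row layout of that file:

```python
from pathlib import Path


def parse_max_open_files(limits_text: str) -> tuple[str, str]:
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max open files' line found")


def read_nofile_limits(pid: int) -> tuple[str, str]:
    """Read the open-file limits of a running process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

A value of `unlimited` (or a very large number) in either column would be the case described in #5575; a finite value such as the ones we see rules it out.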
   
   > It could be an Erlang behavior change perhaps, how scheduling works and 
such. To specifically reduce CPU usage could try these settings in vm.args 
(uncomment those):
   
   Interesting, it might indeed help (we haven't tried it yet)!
   But that's more of a general optimization and isn't tied to the 
v3.5.0 release, right?
   
   > I am not entirely sure about the higher CPU usage. The databases being 
open but idle, shouldn't add a lot more CPU usage in general. Do you see any 
other corresponding changes - more throughput, more disk I/O usage. Any change 
or differences with request latency.
   
   There is one major change, namely the number of bytes sent/received by the 
v3.5.0 node:
   
   **Network in - All nodes on 3.4.3**
   
   
![Image](https://github.com/user-attachments/assets/26ef22fd-73a6-47f6-93a6-909bf8f049dd)
   
   **Network in - prod-v7-a on v3.5.0, other two nodes on 3.4.3**
   
   
![Image](https://github.com/user-attachments/assets/16cba031-ba1c-49be-b3ca-8570e68acc52)
   
   **Network out - All nodes on 3.4.3**
   
   
![Image](https://github.com/user-attachments/assets/8887d81c-cda8-4bc2-9aa7-b1b725e08faf)
   
   **Network out - prod-v7-a on v3.5.0, other two nodes on 3.4.3**
   
   
![Image](https://github.com/user-attachments/assets/e9bdd366-d78c-43bf-8e21-b3347cf22649)
   
   
   Notes:
   - traffic was similar on both days
   - network in/out at the load balancer level was similar on both days, so 
this traffic difference is only due to intra-cluster communication (i.e., 
exchange between nodes)
   
   Do you know why the v3.5.0 node sends so many more bytes to the two other 
machines?

