Hi everyone,

I recently noticed a large number of log entries reporting a Broken pipe error
on all RGW nodes.

Log:
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 ====== starting new request 
req=0x7f4fac3d7670 =====
2021-08-04T06:25:05.997+0000 7f4f15f7b700  0 ERROR: 
client_io->complete_request() returned Broken pipe
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 ====== req done req=0x7f4fac3d7670 
op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 beast: 0x7f4fac3d7670: 
10.0.244.246 - - [2021-08-04T06:25:05.997988+0000] "HEAD / HTTP/1.0" 200 0 - - -
2021-08-04T06:25:06.337+0000 7f4ebe6cc700  1 ====== starting new request 
req=0x7f4fac3d7670 =====
2021-08-04T06:25:06.337+0000 7f4f13776700  0 ERROR: 
client_io->complete_request() returned Broken pipe
2021-08-04T06:25:06.337+0000 7f4f13776700  1 ====== req done req=0x7f4fac3d7670 
op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:06.337+0000 7f4f13776700  1 beast: 0x7f4fac3d7670: 
10.0.244.244 - - [2021-08-04T06:25:06.337994+0000] "HEAD / HTTP/1.0" 200 0 - - -
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 ====== starting new request 
req=0x7f4fac3d7670 =====
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 ====== req done req=0x7f4fac3d7670 
op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 beast: 0x7f4fac3d7670: 
10.0.244.245 - - [2021-08-04T06:25:07.994022+0000] "HEAD / HTTP/1.0" 200 5 - - -
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 ====== starting new request 
req=0x7f4fac3d7670 =====
2021-08-04T06:25:08.002+0000 7f4ee1f13700  0 ERROR: 
client_io->complete_request() returned Broken pipe
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 ====== req done req=0x7f4fac3d7670 
op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 beast: 0x7f4fac3d7670: 
10.0.244.246 - - [2021-08-04T06:25:08.002023+0000] "HEAD / HTTP/1.0" 200 0 - - -
….

We set up the cluster using the ceph-ansible script. The cluster currently runs
Octopus (15.2.13). After checking the configuration on the RGW nodes, I found
that HAProxy is configured to send a health-check request to the RGW instances
every 2s.
The errors stop after disabling the check, but I don't think this is a good
way to fix the problem…
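
One way to check that the errors really come only from the health-check requests is to tally the Broken pipe entries by client address and request line. Below is a minimal Python sketch of that idea. It assumes that each record in the actual log file sits on a single line (the records quoted above were wrapped by the mail client) and that the ERROR line and the following beast access line share the same thread id, as they do in the excerpt. The function name is only illustrative.

```python
import re
from collections import Counter

# Assumption: one log record per line, in the format quoted above.
# Matches: "<timestamp> <thread id>  0 ERROR: client_io->complete_request() returned Broken pipe"
ERROR_RE = re.compile(
    r"^\S+\s+(?P<tid>[0-9a-f]+)\s+0\s+ERROR:\s+"
    r"client_io->complete_request\(\)\s+returned\s+Broken pipe"
)
# Matches: "<timestamp> <thread id>  1 beast: <req ptr>: <client> - - [<time>] "<request line>" ..."
BEAST_RE = re.compile(
    r"^\S+\s+(?P<tid>[0-9a-f]+)\s+1\s+beast:\s+\S+\s+"
    r'(?P<client>\S+) - - \[[^\]]*\] "(?P<request>[^"]*)"'
)


def tally_broken_pipes(lines):
    """Count Broken pipe errors per (client address, request line).

    An ERROR record is attributed to the next beast access record
    logged by the same thread id.
    """
    pending = set()    # thread ids that logged an error and await their access line
    counts = Counter()
    for line in lines:
        err = ERROR_RE.match(line)
        if err:
            pending.add(err.group("tid"))
            continue
        acc = BEAST_RE.match(line)
        if acc and acc.group("tid") in pending:
            pending.discard(acc.group("tid"))
            counts[(acc.group("client"), acc.group("request"))] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `tally_broken_pipes` returns a `Counter` keyed by client and request. If every key is a HAProxy address paired with `HEAD / HTTP/1.0`, the errors are tied to the health check rather than to real client traffic.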

Does anyone have experience with this problem?

References
- ceph-ansible commit that added the health check:
https://github.com/ceph/ceph-ansible/commit/a951c1a3f0a34e086964f52b0bbf7a8d89481aad#diff-1ea21f2851186f2a01ff25e715ed670b9d96629c6b7bc385aefd9e4154204bde


Many thanks,
Nghia.