Re: [ceph-users] Troubleshooting remapped PG's + OSD flaps

2017-05-18 Thread David Turner
"340 osds: 101 up, 112 in" This is going to be your culprit. Your CRUSH map is in a really weird state. How many OSDs do you have in this cluster? When OSDs go down, secondary OSDs take over for it, but when OSDs get marked out, the cluster re-balances to distribute the data according to how

[ceph-users] Troubleshooting remapped PG's + OSD flaps

2017-05-18 Thread nokia ceph
Hello,
Env:- Bluestore, EC 4+1, v11.2.0, RHEL 7.3, 16383 PGs
We did our resiliency testing and found that OSDs keep flapping and the cluster went to an error state.
What we did:-
1. We have a 5-node cluster.
2. Powered off / stopped ceph.target on the last node and waited; everything seemed to return to normal.
3.
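
For a planned node poweroff test like step 2, a common pattern is to tell the cluster not to mark the stopped OSDs out, so the cluster only goes degraded rather than starting a full rebalance. A rough sketch, assuming systemd-managed daemons on the node under test:

    # On an admin/monitor node: prevent down OSDs from being marked out
    ceph osd set noout

    # On the node under test: stop all Ceph daemons
    systemctl stop ceph.target

    # ... run the resiliency check, then bring the node back ...
    systemctl start ceph.target

    # Once the OSDs rejoin and PGs recover, clear the flag
    ceph osd unset noout
    ceph -s

Without noout, OSDs that stay down past mon_osd_down_out_interval (600 s by default) get marked out; with an EC 4+1 profile whose failure domain is host, a 5-node cluster then has no spare host to backfill the missing shard onto, so PGs remain degraded/remapped until the node returns.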