med out after 150
2017-09-19 03:06:39.782749 7fdfeae86700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fdfeae86700 time 2017-09-19 03:06:39.778940
common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")
ng due to
an IO error.
Another idea: the OSD daemon keeps running in a defined error state
and stops only its listeners for other OSDs and for clients.
--
*Stanley Zhang | * Senior Operations Engineer
*Telephone:* +64 9 302 0515 *Fax:* +64 9 302 0518
*Mobile:* +64 22 318 3664 *Freephone:* 08
Your bucket index got corrupted. I believe there is no easy way to
restore the index other than downloading the existing objects and
re-uploading them; correct me if anybody else knows a better way.
You can list all the objects in that bucket with:
rados -p .rgw.buckets ls | grep default.3278576
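
The same listing can be scripted, for example to count how many objects
still exist for the bucket before downloading them. This is only a rough
sketch: it assumes the rados CLI is on the PATH, that object names in the
pool contain the bucket marker (the marker string is copied from the
command above), and the helper names filter_bucket_objects and
list_pool_objects are made up for illustration.

```python
import subprocess


def filter_bucket_objects(object_names, marker):
    """Keep the names that contain the bucket marker (same idea as grep)."""
    return [name for name in object_names if marker in name]


def list_pool_objects(pool):
    """Run `rados -p <pool> ls` and return one object name per line.

    Assumes the rados CLI is installed and the caller can read the pool.
    """
    out = subprocess.run(
        ["rados", "-p", pool, "ls"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]
```

On a cluster node this would be used as
filter_bucket_objects(list_pool_objects(".rgw.buckets"), "default.3278576");
the length of the result is the number of RADOS objects that still belong
to the bucket.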
ot-usable before but usable 2
days later? One thing that might fix the index object is a leveldb
compaction, I guess. By the way, the problematic index object above has
~30k keys; the biggest index object in our cluster holds about 300k keys.
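
In case it helps, key counts like the ones above can be obtained by
counting the omap keys of the index object. A minimal sketch, assuming
the rados CLI is available; the pool and object names passed in are
placeholders you have to supply (on our version the index objects live
in the index pool and are named after the bucket id, but that is an
assumption for your setup), and count_keys / index_key_count are
hypothetical helper names.

```python
import subprocess


def count_keys(listomapkeys_output):
    """Count non-empty lines in `rados listomapkeys` output (bytes).

    Bytes are used because index keys are not guaranteed to be valid UTF-8.
    The count is approximate if a key itself contains a newline.
    """
    return sum(1 for line in listomapkeys_output.splitlines() if line.strip())


def index_key_count(pool, index_object):
    """Run `rados -p <pool> listomapkeys <object>` and count the keys."""
    out = subprocess.run(
        ["rados", "-p", pool, "listomapkeys", index_object],
        check=True,
        capture_output=True,
    ).stdout
    return count_keys(out)
```

Listing the keys of a large index object can itself take a while, so
this is better run against one object at a time.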
Regards
Stanley