Hi Guys,

We have recently been experiencing OSD crashes, such as messenger crashes and 
some stranger ones that are still being investigated. These crashes do not seem 
to recur after the OSD is restarted.

So we are considering a strategy of automatically restarting a crashed OSD once 
or twice, and leaving it down if restarting doesn't help. This might reduce the 
impact of PG peering and recovery on online traffic to some extent, since we 
don't mark an OSD out automatically even when it is down, unless we are sure the 
cause is a disk failure.
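
A minimal sketch of the bounded-restart idea, assuming the OSD runs in the 
foreground under a small supervisor. The helper names and the ceph-osd command 
line are illustrative assumptions, not an existing Ceph tool:

```python
import subprocess
from typing import Callable, Tuple


def run_with_bounded_restarts(run_once: Callable[[], int],
                              max_restarts: int = 2) -> Tuple[bool, int]:
    """Run run_once() and rerun it after a non-zero exit, at most max_restarts times.

    Returns (clean_exit, restarts_used). clean_exit is False when the
    process kept crashing and we gave up, i.e. the OSD is left down.
    """
    restarts = 0
    while True:
        exit_code = run_once()
        if exit_code == 0:
            # Clean shutdown (e.g. an operator stopped it): do not restart.
            return True, restarts
        if restarts >= max_restarts:
            # Restart budget exhausted: leave it down for a human to inspect.
            return False, restarts
        restarts += 1


def run_osd_once(osd_id: int) -> int:
    """Hypothetical helper: run one OSD in the foreground and return its exit code.

    Assumption: 'ceph-osd -f -i <id>' is how the daemon is started on this host;
    adjust to match the actual deployment.
    """
    return subprocess.call(["ceph-osd", "-f", "-i", str(osd_id)])
```

It could be used as run_with_bounded_restarts(lambda: run_osd_once(12), 
max_restarts=2), where 12 is a made-up OSD id. The restart count here is per 
supervisor run and is never reset, which is one of the design points we are 
unsure about.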

However, we are aware that this strategy may cause problems of its own. Since 
you have more experience with Ceph, we would like to hear your suggestions.

Thanks.

David Zhang  
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
