Thanks Kyle,
-- I'll look into udev and upstart and try them out.
-- Yes on setting "noout"; definitely a good idea until it is certain that the
OSD is gone for good.

If the OSD disk is totally gone:
Mark the OSD down and out.
Remove it from the crushmap and update the crushmap.
Verify the crushmap.
Then use ceph-deploy to add a replacement OSD with the same osd.num.
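
A rough sketch of the removal steps above as a command list, in case it helps
to see them spelled out. It only builds and prints the commands; it does not
run them. The OSD id 12 is a made-up example, and the "ceph auth del" and
"ceph osd rm" steps are my additions from the usual manual-removal procedure,
not something mentioned in this thread. The ceph-deploy step is left out
because its arguments depend on the host and disk.

```python
from typing import List


def osd_removal_commands(osd_id: int) -> List[List[str]]:
    """Build (but do not run) the commands to retire a dead OSD.

    Assumption: the OSD daemon is already stopped or its disk is gone.
    """
    name = "osd.{}".format(osd_id)
    return [
        ["ceph", "osd", "out", str(osd_id)],       # mark the OSD out
        ["ceph", "osd", "crush", "remove", name],  # remove it from the crushmap
        ["ceph", "auth", "del", name],             # addition: drop its auth key
        ["ceph", "osd", "rm", str(osd_id)],        # addition: free the OSD id
        ["ceph", "osd", "tree"],                   # verify the resulting crushmap
    ]


if __name__ == "__main__":
    # 12 is a hypothetical OSD id used only for illustration.
    for cmd in osd_removal_commands(12):
        print(" ".join(cmd))
```

Printing first and running by hand keeps each step reviewable, which seems
safer while the cluster is already degraded.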

Does this sound about right?

-Ben




-----Original Message-----
From: ceph-users-boun...@lists.ceph.com 
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kyle Bader
Sent: Friday, November 15, 2013 12:58 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Today I’ve encountered multiple OSD down and multiple 
OSD won’t start and OSD disk access “Input/Output” error”

> 3).Comment out,  #hashtag the bad OSD drives in the “/etc/fstab”.

This is unnecessary if you're using the provided upstart and udev scripts: OSD
data devices will be identified by label and mounted. If you choose not to use
the upstart and udev scripts, then you should write init scripts that do
something similar so that you don't need /etc/fstab entries.

>                 3).Login to Ceph Node  with bad OSD net/serial/video.

I'd put "check dmesg" somewhere near the top of this section; often, if you
lose an OSD due to a filesystem hiccup, it will be evident in the dmesg output.

>  4).Stop only this local Ceph node  with “service Ceph stop”

You may want to set "noout" depending on whether you expect it to come back 
online within your "mon osd down out interval" threshold.
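
A minimal sketch of that decision, assuming you can estimate the downtime and
you know your configured "mon osd down out interval" in seconds. The function
names and the example numbers are hypothetical; the two ceph commands are the
standard ones for setting and clearing the flag. It also assumes the OSD is
expected to come back at all; if it is gone for good, letting it be marked out
is what you want.

```python
from typing import List, Tuple


def should_set_noout(expected_downtime_s: float,
                     down_out_interval_s: float) -> bool:
    """Return True when the OSD is expected to stay down long enough
    that the monitors would otherwise mark it out and start rebalancing."""
    return expected_downtime_s >= down_out_interval_s


def noout_commands() -> Tuple[List[str], List[str]]:
    """Commands to set the flag before maintenance and clear it afterwards."""
    return (["ceph", "osd", "set", "noout"],
            ["ceph", "osd", "unset", "noout"])


if __name__ == "__main__":
    # Example numbers only: a 10 minute outage against a 300 second interval.
    if should_set_noout(expected_downtime_s=600, down_out_interval_s=300):
        before, after = noout_commands()
        print("before maintenance:", " ".join(before))
        print("after maintenance: ", " ".join(after))
```

Remember to unset the flag afterwards, otherwise OSDs that fail later will
never be marked out.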
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
