> I'm not sure why the monitor did not mark it _out_ after 600 seconds
> (default)

Well, that part I understand.  The monitor didn't mark the OSD out because the
monitor still considered the OSD up.  No reason to mark an up OSD out.

I think the monitor should have marked the OSD down upon not hearing from it
for 15 minutes ("mon osd report timeout"), then out 10 minutes after that
("mon osd down out interval").
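
To make that arithmetic concrete, here is a small sketch of the worst-case
timeline.  The two constants are assumptions taken from the figures above
(15 minutes and 10 minutes); the names are just labels for this example, and
the values on a real cluster may be configured differently.

```python
# Assumed values in seconds, taken from the figures quoted above.
# Check your own cluster's configuration before relying on them.
MON_OSD_REPORT_TIMEOUT = 900     # 15 minutes of silence before "down"
MON_OSD_DOWN_OUT_INTERVAL = 600  # 10 more minutes before "out"


def worst_case_timeline(report_timeout=MON_OSD_REPORT_TIMEOUT,
                        down_out_interval=MON_OSD_DOWN_OUT_INTERVAL):
    """Return (seconds until down, seconds until out) after the OSD goes silent."""
    marked_down_at = report_timeout
    marked_out_at = marked_down_at + down_out_interval
    return marked_down_at, marked_out_at


down_at, out_at = worst_case_timeline()
print(f"down after {down_at // 60} min, out after {out_at // 60} min")
```

So under those assumptions the OSD would be out 25 minutes after it went
silent, not 10.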

And that's the worst case.  Though details of how OSDs watch each other are
vague, I suspect a surviving OSD was supposed to detect the dead OSDs and
report that to the monitor, which would believe it within about a minute and
mark the OSDs down.  (The relevant settings appear to be "osd heartbeat
interval", "mon osd min down reports", "mon osd min down reporters", and
"mon osd reporter subtree level".)
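
As a rough illustration of what I suspect the reporter-count and subtree-level
settings are for, here is a toy model.  It is not Ceph's actual logic; the
function name, the host mapping, and the threshold of 2 are all hypothetical.
The idea is that the monitor only believes a failure once the reports come
from enough distinct subtrees (hosts, in this sketch), so that one bad host
cannot get healthy OSDs marked down on its own.

```python
def should_mark_down(reports, reporter_host, min_reporters=2):
    """Toy model: accept a failure once reports come from enough distinct hosts.

    reports: iterable of reporting OSD ids
    reporter_host: dict mapping OSD id -> host name (the assumed subtree level)
    min_reporters: assumed threshold, not necessarily any cluster's real value
    """
    distinct_hosts = {reporter_host[osd] for osd in reports}
    return len(distinct_hosts) >= min_reporters


# Hypothetical layout: OSDs 1 and 2 share a host, OSD 3 is on another.
hosts = {1: "node-a", 2: "node-a", 3: "node-b"}
print(should_mark_down([1, 2], hosts))  # both reporters share a host
print(should_mark_down([1, 3], hosts))  # reporters on two distinct hosts
```

In this model the first call is rejected and the second accepted.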

-- 
Bryan Henderson                                   San Jose, California
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
