On 19/09/17 10:40, Wido den Hollander wrote:
> 
>> On 19 September 2017 at 10:24, Adrian Saul 
>> <adrian.s...@tpgtelecom.com.au> wrote:
>>
>>
>>> I understand what you mean and it's indeed dangerous, but see:
>>> https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service
>>>
>>> Looking at the systemd docs it's difficult though:
>>> https://www.freedesktop.org/software/systemd/man/systemd.service.html
>>>
>>> If the OSD crashes due to another bug you do want it to restart.
>>>
>>> But systemd cannot tell whether the crash was due to a disk I/O error,
>>> a bug in the OSD itself, the OOM killer, or something else.
>>
>> Perhaps use something like RestartPreventExitStatus and define a 
>> specific exit code for the OSD to return when it exits due to an I/O 
>> error.
>>
> 
> That's a very, very good idea! I didn't know that one existed.
> 
> That would prevent restarts in case of I/O error indeed.

That would depend on the OSD gracefully handling the I/O failure - IME
they quite often seem to end up abort()ing...
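
To make the distinction concrete, here is a minimal Python sketch (not
Ceph code) of what a supervisor such as systemd observes in the two
cases: a process that exits with a dedicated status versus one that
calls abort(). The exit code 74 is a hypothetical choice for
illustration (borrowed from EX_IOERR in sysexits.h); the OSD does not
define such a code as far as this thread establishes. The sketch
assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical exit status an OSD could use for "fatal disk I/O error".
# Assumption for illustration only; not an existing Ceph convention.
EXIT_IO_ERROR = 74


def run_child(code: str) -> int:
    """Run a short Python snippet in a child process and return its status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode: int) -> str:
    """Describe how the child ended, the way a supervisor would see it."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative number.
        return "killed by " + signal.Signals(-returncode).name
    return f"exited with status {returncode}"


# Case 1: the daemon handles the I/O error and exits with the agreed code.
graceful = run_child(f"import sys; sys.exit({EXIT_IO_ERROR})")

# Case 2: the daemon hits an assertion and abort()s instead.
aborted = run_child("import os; os.abort()")

print(describe(graceful))
print(describe(aborted))
```

Only the first case produces an exit status that a setting like
RestartPreventExitStatus=74 would match. The second ends on SIGABRT;
systemd's documentation says RestartPreventExitStatus also accepts
signal names, but listing SIGABRT there would suppress restarts after
any abort(), including ones caused by unrelated OSD bugs.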

Regards,

Matthew


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 