> On 19 September 2017 at 10:02, Manuel Lausch <manuel.lau...@1und1.de> wrote:
> 
> 
> Hi,
> 
> I see an issue with systemd's restart behaviour and disk I/O errors.
> If a disk fails with I/O errors, ceph-osd stops running. Systemd detects
> this and starts the daemon again. In our cluster I saw several loops
> of OSD crashes caused by disk failure and restarts triggered by
> systemd, each time with peering impact and timeouts for our application,
> until systemd gave up.
> 
> Obviously Ceph needs the restart feature (at least with dmcrypt) to
> avoid race conditions in the startup process. But in the
> case of disk-related failures this is counterproductive.
> 
> What do you think about this? Is this a bug which should be fixed?
> 

I understand what you mean and it's indeed dangerous, but see: 
https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service

Looking at the systemd docs it's difficult though: 
https://www.freedesktop.org/software/systemd/man/systemd.service.html

If the OSD crashes due to another bug you do want it to restart.

But systemd cannot tell whether the crash was due to a disk I/O error,
a bug in the OSD itself, the OOM killer, or something else.
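
To make the trade-off concrete, here is a minimal sketch of a drop-in that
tightens the restart limits, so a failing disk causes fewer restart loops
before systemd gives up. The file name and all values are assumptions for
illustration, not the settings shipped in ceph-osd@.service. It does not let
systemd distinguish a disk failure from any other crash; it only bounds how
often the OSD is restarted.

```ini
# Hypothetical drop-in: /etc/systemd/system/ceph-osd@.service.d/restart-limit.conf
# All values below are illustrative assumptions, not recommended defaults.
[Service]
# Still restart after crashes, so startup races and ordinary bugs recover.
Restart=on-failure
# Wait between restart attempts to reduce peering churn.
RestartSec=30s
# Give up after 3 start attempts within 30 minutes.
StartLimitInterval=30min
StartLimitBurst=3
```

The drop-in takes effect after `systemctl daemon-reload`. On newer systemd
releases the limit settings are named StartLimitIntervalSec= and
StartLimitBurst= and belong in [Unit]; the older spelling in [Service] is kept
for compatibility. systemd also offers RestartPreventExitStatus=, but that
would only help if ceph-osd exited with a status specific to disk errors,
which nothing in this thread establishes.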

Wido

> We use ceph jewel (10.2.9)
> 
> 
> Regards
> Manuel 
> 
> 
> -- 
> Manuel Lausch
> 
> System Administrator
> Cloud Services
> 
> 1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
> 76135 Karlsruhe | Germany Phone: +49 721 91374-1847
> E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de
> 
> Registered at Amtsgericht Montabaur (district court), HRB 5452
> 
> Managing Directors: Thomas Ludwig, Jan Oetjen
> 
> 
> Member of United Internet
> 
> This e-mail may contain confidential and/or privileged information. If
> you are not the intended recipient of this e-mail, you are hereby
> notified that saving, distribution or use of the content of this e-mail
> in any way is prohibited. If you have received this e-mail in error,
> please notify the sender and delete the e-mail.
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
