Hello all,

our RAID storage got stuck last night (CET).  It was likely a deadlock
triggered by high I/O from the periodic "RAID check" process running together
with the Fedora Rawhide => F38 branching task that was done yesterday.  The
actual cause is still unknown, though (if this looks familiar to you, let us
know).

We ended up with about 400 hung processes waiting for disk (kill -9 had no
effect).  This is the second time this has happened on F37, but this time
"echo idle > /sys/block/md127/md/sync_action" did not help (nor did 'frozen');
only a hard reboot from the AWS console eventually helped.
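
For anyone who wants to check a machine for the same symptoms, here is a
minimal sketch (not the exact commands we ran).  It counts processes in
uninterruptible sleep (state "D") and prints the current md sync action.  The
function names and the optional root-directory arguments are made up for
illustration; md127 is simply the array name on our host, yours may differ.

```shell
#!/bin/sh
# Count processes in uninterruptible sleep (state "D"), i.e. blocked on I/O.
# The optional first argument (hypothetical) overrides the /proc location.
count_blocked() {
    proc_root="${1:-/proc}"
    count=0
    for status in "$proc_root"/[0-9]*/status; do
        # Processes may exit while we iterate; skip files we cannot read.
        [ -r "$status" ] || continue
        if grep -q '^State:[[:space:]]*D' "$status" 2>/dev/null; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Print the current sync action of an md array, or "unknown" if the
# array (or the sysfs file) does not exist on this host.
md_sync_action() {
    md_dev="${1:-md127}"
    sys_root="${2:-/sys}"
    f="$sys_root/block/$md_dev/md/sync_action"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

echo "blocked processes: $(count_blocked)"
echo "md127 sync_action: $(md_sync_action md127)"
```

A high and steadily growing count of blocked processes while the sync action
reports "check" would match what we saw, but it does not prove the same cause.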

We were offline for more than 7 hours (the team was asleep):
https://pagure.io/fedora-infrastructure/issue/11120

Sorry for the inconvenience, and thank you for all the reports!

Pavel


