Hi Sandro,

Thanks for the response. I logged onto oVirt this morning, and I see the node is in an “Unassigned” state. I can ping it but cannot SSH to it, so something is causing the host to be unresponsive.
On Saturday, after I sent the mail, I opened a console to the node and saw the entries below before logging in:

    audit: backlog limit exceeded

I then tried increasing the buffer size in the audit.rules file in /etc/audit/rules.d/, as shown below, but it did not resolve the issue.

    ## First rule - delete all
    -D

    ## Increase the buffers to survive stress events.
    ## Make this bigger for busy systems
    -b 8192

    ## Set failure mode to syslog
    -f 1

Is it possible to upgrade the node to 4.4 while the engine is still on 4.3?

Thanks

Anton Louw
Cloud Engineer: Storage and Virtualization
______________________________________
D: 087 805 1572 | M: N/A
A: Rutherford Estate, 1 Scott Street, Waverley, Johannesburg
anton.l...@voxtelecom.co.za
www.vox.co.za

From: Sandro Bonazzola <sbona...@redhat.com>
Sent: 13 November 2020 18:39
To: Anton Louw <anton.l...@voxtelecom.co.za>; Arik Hadas <aha...@redhat.com>; Dominik Holler <dhol...@redhat.com>
Cc: users@ovirt.org; Johan Koen <johan.k...@voxtelecom.co.za>
Subject: Re: [ovirt-users] oVirt Node Crash

On Fri, 13 Nov 2020 at 17:37, Sandro Bonazzola <sbona...@redhat.com> wrote:

On Fri, 13 Nov 2020 at 13:38, Anton Louw via Users <users@ovirt.org> wrote:

Hi Everybody,

I have built a new host which has been running fine for the last couple of days. I noticed today that the host crashed, but it is not giving me a reason as to why. It happened at 13:45 today, but I have included logs from before that time as well. Is there something I am missing here?

Not related to the crash, but I see in the logs that 5 out of 20 guests have a QEMU guest agent that is not responding. You also seem to have some issues with some firewalld rules. (Maybe +Dominik Holler would like to have a look.)

I don't see anything explaining why the host got rebooted.
Still on the guest agent, I find the following lines a bit alarming:

Nov 13 13:29:34 jb2-node03 libvirtd: 2020-11-13 11:29:34.294+0000: 12603: error : qemuDomainAgentAvailable:9144 : Guest agent is not responding: QEMU guest agent is not connected
Nov 13 13:29:34 jb2-node03 vdsm[13843]: ERROR Shutdown by QEMU Guest Agent failed#012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5304, in qemuGuestAgentShutdown#012 self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)#012 File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in f#012 ret = attr(*args, **kwargs)#012 File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper#012 ret = f(*args, **kwargs)#012 File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper#012 return func(inst, *args, **kwargs)#012 File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2517, in shutdownFlags#012 if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed', dom=self)#012libvirtError: Guest agent is not responding: QEMU guest agent is not connected
Nov 13 13:29:42 jb2-node03 kernel: vlan0077: port 11(vnet15) entered disabled state
Nov 13 13:29:42 jb2-node03 kernel: device vnet15 left promiscuous mode
Nov 13 13:29:42 jb2-node03 kernel: vlan0077: port 11(vnet15) entered disabled state
Nov 13 13:29:42 jb2-node03 NetworkManager[6027]: <info> [1605266982.6539] device (vnet15): state change: disconnected -> unmanaged (reason 'unmanaged', sys-iface-state: 'removed')
Nov 13 13:29:42 jb2-node03 NetworkManager[6027]: <info> [1605266982.6550] device (vnet15): released from master device vlan0077
Nov 13 13:29:42 jb2-node03 libvirtd: 2020-11-13 11:29:42.669+0000: 12557: error : qemuMonitorIO:718 : internal error: End of file from qemu monitor

+Arik Hadas, any clue?

About the crash, can you please provide a full sos report from the host?
The log you provided is not enough to understand what caused the reported crash.

Also, given that python2 is used here, I assume you're on 4.3 or older. I would recommend upgrading to 4.4 as soon as practical.

Thanks

Anton Louw
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XMRUDMRBYZKUJQXVPPAEAJIP7N3JPRLY/

--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA
sbona...@redhat.com

Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/G4FI754TMZIPC3KJL62LPOGQ6M363IVV/