On Tue, 17 Nov 2020 at 16:01, Anton Louw <
anton.l...@voxtelecom.co.za> wrote:

>
>
> Hi Sandro,
>
>
>
> Have you perhaps seen anything in the SOS report that could shed some
> light on the issues?
>

Sadly no. I see it's oVirt Node 4.3.8; I suggest upgrading to at least
4.3.10 and considering an upgrade of the whole datacenter to 4.4.3.
I had the feeling that the watchdog triggered the reboot, but I couldn't
find any evidence of it.
I also don't see anything suspicious in the logs.
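
For anyone who wants to repeat the watchdog check on their own logs, a minimal sketch of a keyword scan follows. The keyword list (`watchdog`, `wdmd`, `softdog`) is an assumption about what a watchdog-related entry might contain, and the second sample line is hypothetical, not taken from the SOS report.

```python
import re

# Assumed keywords; adjust them to the watchdog stack the host actually runs.
WATCHDOG_PATTERN = re.compile(r"watchdog|wdmd|softdog", re.IGNORECASE)


def find_watchdog_lines(lines):
    """Return the log lines that mention any of the assumed watchdog keywords."""
    return [line.rstrip("\n") for line in lines if WATCHDOG_PATTERN.search(line)]


sample = [
    # Real line from the log excerpt in this thread.
    "Nov 13 13:29:42 jb2-node03 kernel: device vnet15 left promiscuous mode",
    # Hypothetical line, only here to show a match.
    "Nov 13 13:44:58 example-host wdmd[1234]: hypothetical watchdog message",
]

for hit in find_watchdog_lines(sample):
    print(hit)
```

An empty result does not rule the watchdog out; it only means none of the assumed keywords appear in the lines that were scanned.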




>
>
> Thanks
>
>
>
> *Anton Louw*
> *Cloud Engineer: Storage and Virtualization* at *Vox*
> ------------------------------
> *T:*  087 805 0000 | *D:* 087 805 1572
> *M:* N/A
> *E:* anton.l...@voxtelecom.co.za
> *A:* Rutherford Estate, 1 Scott Street, Waverley, Johannesburg
> www.vox.co.za
>
>
> *From:* Anton Louw
> *Sent:* 16 November 2020 07:30
> *To:* Sandro Bonazzola <sbona...@redhat.com>; Arik Hadas <
> aha...@redhat.com>; Dominik Holler <dhol...@redhat.com>
> *Cc:* users@ovirt.org; Johan Koen <johan.k...@voxtelecom.co.za>
> *Subject:* RE: [ovirt-users] oVirt Node Crash
>
>
>
> I have also attached the SOS report as requested
>
>
>
> *From:* Anton Louw
> *Sent:* 16 November 2020 06:54
> *To:* Sandro Bonazzola <sbona...@redhat.com>; Arik Hadas <
> aha...@redhat.com>; Dominik Holler <dhol...@redhat.com>
> *Cc:* users@ovirt.org; Johan Koen <johan.k...@voxtelecom.co.za>
> *Subject:* RE: [ovirt-users] oVirt Node Crash
>
>
>
> Hi Sandro,
>
>
>
> Thanks for the response. I logged onto oVirt this morning, and I see the
> node is in an “Unassigned” state. I can ping it, but cannot SSH, so
> something is causing the host to be unresponsive.
>
>
>
> On Saturday after I sent the mail, I opened a console to the node, and I
> saw the below entries before logging in:
>
>
>
> audit:backlog limit exceeded
>
>
>
> I then tried the solution of increasing the buffer size in the audit.rules
> file in /etc/audit/rules.d/, as shown below, but it did not resolve the
> issue.
>
>
>
> ## First rule - delete all
> -D
>
> ## Increase the buffers to survive stress events.
> ## Make this bigger for busy systems
> -b 8192
>
> ## Set failure mode to syslog
> -f 1
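
To double-check which backlog limit and failure mode a rules file actually sets, a small parser can help. This is only a sketch: the function name `parse_audit_settings` is made up for illustration, and it reads the rules text quoted above rather than the live kernel state.

```python
def parse_audit_settings(rules_text):
    """Return the last -b (backlog limit) and -f (failure mode) values found."""
    settings = {"backlog_limit": None, "failure_mode": None}
    for raw in rules_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        if tokens[0] == "-b" and len(tokens) > 1:
            settings["backlog_limit"] = int(tokens[1])
        elif tokens[0] == "-f" and len(tokens) > 1:
            settings["failure_mode"] = int(tokens[1])
    return settings


# The rules quoted in this message.
SAMPLE = """\
## First rule - delete all
-D
## Increase the buffers to survive stress events.
## Make this bigger for busy systems
-b 8192
## Set failure mode to syslog
-f 1
"""

print(parse_audit_settings(SAMPLE))
# → {'backlog_limit': 8192, 'failure_mode': 1}
```

The file only describes the intended configuration. The value the kernel is currently using may differ until the rules are reloaded, and `auditctl -s` reports the active `backlog_limit`.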
>
>
>
> Is it possible to upgrade the node to 4.4 while the engine is still on 4.3?
>
>
>
> Thanks
>
>
>
> *From:* Sandro Bonazzola <sbona...@redhat.com>
> *Sent:* 13 November 2020 18:39
> *To:* Anton Louw <anton.l...@voxtelecom.co.za>; Arik Hadas <
> aha...@redhat.com>; Dominik Holler <dhol...@redhat.com>
> *Cc:* users@ovirt.org; Johan Koen <johan.k...@voxtelecom.co.za>
> *Subject:* Re: [ovirt-users] oVirt Node Crash
>
>
>
>
>
>
>
> On Fri, 13 Nov 2020 at 17:37, Sandro Bonazzola <
> sbona...@redhat.com> wrote:
>
>
>
>
>
> On Fri, 13 Nov 2020 at 13:38, Anton Louw via Users <
> users@ovirt.org> wrote:
>
>
>
> Hi Everybody,
>
>
>
> I have built a new host which has been running fine for the last couple of
> days. I noticed today that the host crashed, but it is not giving me a
> reason as to why.
>
>
>
> It happened at 13:45 today, but the logs I have provided also cover the
> period before that.
>
>
>
> Is there something I am missing here?
>
>
>
> Not related to the crash, but I see in the logs that the QEMU guest agent
> is not responding on 5 out of 20 guests.
>
>
>
> Also you seem to have some issues with some firewalld rules. (Maybe +Dominik
> Holler <dhol...@redhat.com> would like to have a look)
>
>
>
> I don't see anything explaining why the host got rebooted.
>
>
>
> Still related to the guest agent, I find the following lines a bit alarming:
>
> Nov 13 13:29:34 jb2-node03 libvirtd: 2020-11-13 11:29:34.294+0000: 12603:
> error : qemuDomainAgentAvailable:9144 : Guest agent is not responding: QEMU
> guest agent is not connected
> Nov 13 13:29:34 jb2-node03 vdsm[13843]: ERROR Shutdown by QEMU Guest Agent
> failed#012Traceback (most recent call last):#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5304, in
> qemuGuestAgentShutdown#012
>  self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)#012  File
> "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 100, in
> f#012    ret = attr(*args, **kwargs)#012  File
> "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line
> 131, in wrapper#012    ret = f(*args, **kwargs)#012  File
> "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in
> wrapper#012    return func(inst, *args, **kwargs)#012  File
> "/usr/lib64/python2.7/site-packages/libvirt.py", line 2517, in
> shutdownFlags#012    if ret == -1: raise libvirtError
> ('virDomainShutdownFlags() failed', dom=self)#012libvirtError: Guest agent
> is not responding: QEMU guest agent is not connected
> Nov 13 13:29:42 jb2-node03 kernel: vlan0077: port 11(vnet15) entered
> disabled state
> Nov 13 13:29:42 jb2-node03 kernel: device vnet15 left promiscuous mode
> Nov 13 13:29:42 jb2-node03 kernel: vlan0077: port 11(vnet15) entered
> disabled state
> Nov 13 13:29:42 jb2-node03 NetworkManager[6027]: <info>  [1605266982.6539]
> device (vnet15): state change: disconnected -> unmanaged (reason
> 'unmanaged', sys-iface-state: 'removed')
> Nov 13 13:29:42 jb2-node03 NetworkManager[6027]: <info>  [1605266982.6550]
> device (vnet15): released from master device vlan0077
> Nov 13 13:29:42 jb2-node03 libvirtd: 2020-11-13 11:29:42.669+0000: 12557:
> error : qemuMonitorIO:718 : internal error: End of file from qemu monitor
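
The `#012` sequences in the vdsm entry are octal escapes for newline characters, which is why the traceback appears on a single line. A minimal sketch for turning such a line back into a readable traceback is shown here; the helper name `unescape_syslog` is hypothetical and the sample string is an abbreviated excerpt of the entry above.

```python
import re

# Matches "#" followed by exactly three octal digits, e.g. "#012" for newline.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(message):
    """Replace #ooo octal escapes with the characters they encode."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), message)


# Abbreviated excerpt of the vdsm log entry quoted above.
line = (
    "ERROR Shutdown by QEMU Guest Agent failed"
    "#012Traceback (most recent call last):"
    "#012libvirtError: Guest agent is not responding: "
    "QEMU guest agent is not connected"
)

print(unescape_syslog(line))
```

This is a heuristic: a message that legitimately contains `#` followed by three octal digits would be decoded as well.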
>
>
>
> +Arik Hadas <aha...@redhat.com> any clue?
>
>
>
> About the crash, can you please provide a full sos report from the host? The
> log you provided is not enough to understand what caused the reported crash.
>
>
> Also, given python2 is used here, I assume you're on 4.3 or older. I would
> recommend upgrading to 4.4 as soon as practical.
>
>
>
>
>
>
>
>
>
>
>
>
>
> Thanks
>
>
>
>
>
>
>
>
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/XMRUDMRBYZKUJQXVPPAEAJIP7N3JPRLY/
>
>
>
>
> --
>
> *Sandro Bonazzola*
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA <https://www.redhat.com/>
>
> sbona...@redhat.com
>
> <https://www.redhat.com/>
>
>
> *Red Hat respects your work life balance. Therefore there is no need to
> answer this email out of your office hours.
> <https://mojo.redhat.com/docs/DOC-1199578>*
>
>
>
>
>
>
>
>
>
>

-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA <https://www.redhat.com/>

sbona...@redhat.com
<https://www.redhat.com/>

*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.
<https://mojo.redhat.com/docs/DOC-1199578>*
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ALICE2DCOBLTDSDAGJEDM6KY36YKJZTS/
