On Tue, Jun 8, 2021 at 8:01 AM Guillaume Pavese <guillaume.pav...@interactiv-group.com> wrote:
> Hello,
>
> I used the cluster upgrade feature that moves hosts into maintenance one
> by one. This is not an HCI cluster; my storage is on iSCSI multipath.
>
> I managed to fully upgrade the first host after rebooting and fixing some
> network/iSCSI errors. However, the second one is now also stuck while
> upgrading the ovirt-node layers, and I cannot manage to upgrade it. On
> this second host, the workaround of removing and reinstalling
> ovirt-node-ng-image-update doesn't work; I only get the following error:
>
> [root@ps-inf-prd-kvm-fr-511 ~]# nodectl check
> Status: OK
> Bootloader ... OK
>   Layer boot entries ... OK
>   Valid boot entries ... OK
> Mount points ... OK
>   Separate /var ... OK
>   Discard is used ... OK
> Basic storage ... OK
>   Initialized VG ... OK
>   Initialized Thin Pool ... OK
>   Initialized LVs ... OK
> Thin storage ... OK
>   Checking available space in thinpool ... OK
>   Checking thinpool auto-extend ... OK
> vdsmd ... OK
>
> [root@ps-inf-prd-kvm-fr-511 ~]# nodectl info
> bootloader:
>   default: ovirt-node-ng-4.4.5.1-0.20210323.0 (4.18.0-240.15.1.el8_3.x86_64)
>   entries:
>     ovirt-node-ng-4.4.5.1-0.20210323.0 (4.18.0-240.15.1.el8_3.x86_64):
>       index: 0
>       kernel: /boot//ovirt-node-ng-4.4.5.1-0.20210323.0+1/vmlinuz-4.18.0-240.15.1.el8_3.x86_64
>       args: resume=/dev/mapper/onn-swap rd.lvm.lv=onn/ovirt-node-ng-4.4.5.1-0.20210323.0+1 rd.lvm.lv=onn/swap rhgb quiet boot=UUID=a676b18f-0f1b-4ad4-88e1-533fe61ff063 rootflags=discard img.bootid=ovirt-node-ng-4.4.5.1-0.20210323.0+1 intel_iommu=on
>       root: /dev/onn/ovirt-node-ng-4.4.5.1-0.20210323.0+1
>       initrd: /boot//ovirt-node-ng-4.4.5.1-0.20210323.0+1/initramfs-4.18.0-240.15.1.el8_3.x86_64.img
>       title: ovirt-node-ng-4.4.5.1-0.20210323.0 (4.18.0-240.15.1.el8_3.x86_64)
>       blsid: ovirt-node-ng-4.4.5.1-0.20210323.0+1-4.18.0-240.15.1.el8_3.x86_64
>   layers:
>     ovirt-node-ng-4.4.5.1-0.20210323.0:
>       ovirt-node-ng-4.4.5.1-0.20210323.0+1
>   current_layer: ovirt-node-ng-4.4.5.1-0.20210323.0+1
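Note that the `nodectl info` output above already shows the symptom: the only layer present, and `current_layer`, are still 4.4.5, so the 4.4.6 image layer was never created even though the package claims to be installed. A minimal sketch of checking this programmatically, assuming output shaped like the above (the here-doc and the `expected` variable are illustrative stand-ins; on a real host you would pipe `nodectl info` instead):

```shell
# Extract current_layer from `nodectl info`-style output and flag a
# version mismatch. The here-doc reproduces the output quoted above.
nodectl_info() {
cat <<'EOF'
  layers:
    ovirt-node-ng-4.4.5.1-0.20210323.0:
      ovirt-node-ng-4.4.5.1-0.20210323.0+1
  current_layer: ovirt-node-ng-4.4.5.1-0.20210323.0+1
EOF
}

expected=4.4.6   # version the image-update package should have laid down
current=$(nodectl_info | awk '/current_layer:/ {print $2}')
echo "current layer: $current"
case "$current" in
  *"$expected"*) echo "new image layer is in place" ;;
  *)             echo "new image layer was NOT created (scriptlet likely failed)" ;;
esac
```

If the installed ovirt-node-ng-image-update version never shows up in the layers list, the failure is inside the package's %post scriptlet, not in yum/dnf itself.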
> [root@ps-inf-prd-kvm-fr-511 ~]# yum remove ovirt-node-ng-image-update
> [...]
> Removing:
>  ovirt-node-ng-image-update  noarch  4.4.6.3-1.el8  @ovirt-4.4  886 M
>   Erasing      : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>   Verifying    : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>   Unpersisting : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch.rpm
>
> Removed:
>   ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
> Complete!
> [root@ps-inf-prd-kvm-fr-511 ~]#
>
> [root@ps-inf-prd-kvm-fr-511 ~]# yum install ovirt-node-ng-image-update
> [...]
> Installing:
>  ovirt-node-ng-image-update  noarch  4.4.6.3-1.el8  ovirt-4.4  887 M
> [...]
> ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch.rpm  23 MB/s | 887 MB  00:39
> Running transaction check
> Transaction check succeeded.
> Running transaction test
> Transaction test succeeded.
> Running transaction
>   Preparing        :
>   Running scriptlet: ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>   Installing       : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>   Running scriptlet: ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
> *warning: %post(ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch) scriptlet failed, exit status 1*
> *Error in POSTIN scriptlet in rpm package ovirt-node-ng-image-update*
>   Verifying        : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>
> Installed:
>   ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
> Complete!
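To narrow down where the %post scriptlet fails, one option is to rerun the transaction with verbose RPM scriptlet logging (dnf's `rpmverbosity` option, also suggested later in this thread). A sketch, with a guard of my own so the dnf line only runs where the package is actually present; the log path is illustrative:

```shell
# Rerun the failing install with verbose RPM scriptlet output so the
# failing %post step is printed.
pkg=ovirt-node-ng-image-update
cmd="dnf reinstall $pkg --setopt=rpmverbosity=debug"

# Guard: only run the real command on a host that has the package
# (i.e. the affected node); otherwise just show what would run.
if command -v rpm >/dev/null 2>&1 && rpm -q "$pkg" >/dev/null 2>&1; then
  $cmd 2>&1 | tee /tmp/node-reinstall.log   # log path is illustrative
else
  echo "not on the affected host; would run: $cmd"
fi
```

With `rpmverbosity=debug`, the scriptlet's own stdout/stderr is printed instead of being reduced to the one-line "exit status 1" warning, which usually points at the failing step.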
> [root@ps-inf-prd-kvm-fr-511 ~]#
>
> [root@ps-inf-prd-kvm-fr-511 ~]# nodectl info
> bootloader:
>   default: ovirt-node-ng-4.4.5.1-0.20210323.0 (4.18.0-240.15.1.el8_3.x86_64)
> ...
>   layers:
>     ovirt-node-ng-4.4.5.1-0.20210323.0:
>       ovirt-node-ng-4.4.5.1-0.20210323.0+1
>   current_layer: ovirt-node-ng-4.4.5.1-0.20210323.0+1
> [root@ps-inf-prd-kvm-fr-511 ~]#
>
> Is there any way to see where in the POSTIN scriptlet the installation
> fails?

You can try using dnf's option '--rpmverbosity'.

> Best regards,
>
> Guillaume Pavese
> Ingénieur Système et Réseau
> Interactiv-Group
>
> On Fri, Jun 4, 2021 at 8:44 PM Lev Veyde <lve...@redhat.com> wrote:
>
>> Hi Guillaume,
>>
>> Did you move the host to maintenance before the upgrade (making sure
>> that the Gluster-related options are unchecked)?
>>
>> Or did you start the upgrade directly?
>>
>> Thanks in advance,
>>
>> On Thu, Jun 3, 2021 at 10:50 AM wodel youchi <wodel.you...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Is this an HCI deployment?
>>>
>>> If yes:
>>>
>>> - try to boot using the old version, 4.4.5
>>> - verify that your network configuration is still intact
>>> - verify that the Gluster part is working properly:
>>>   # gluster peer status
>>>   # gluster volume status
>>> - verify that your engine and your host can resolve each other's hostnames
>>>
>>> If all is OK, try to bring the host back to an available state.
>>>
>>> Regards.
>>>
>>> On Wed, Jun 2, 2021 at 08:21, Guillaume Pavese
>>> <guillaume.pav...@interactiv-group.com> wrote:
>>>
>>>> Maybe my problem is partly linked to an issue Jayme saw earlier, but
>>>> the resolution that worked for him did not work for me:
>>>>
>>>> I first upgraded my Self-Hosted Engine from 4.4.5 to 4.4.6, then
>>>> upgraded it to CentOS Stream and rebooted.
>>>>
>>>> Then I tried to upgrade the cluster (3 ovirt-node hosts on 4.4.5), but
>>>> it failed at the first host. They are all ovirt-node hosts, originally
>>>> installed with 4.4.5.
>>>>
>>>> In the host event log I saw:
>>>>
>>>> ...
>>>> Update of host ps-inf-prd-kvm-fr-510.hostics.fr. Upgrade packages
>>>> Update of host ps-inf-prd-kvm-fr-510.hostics.fr. Check if image was updated.
>>>> Update of host ps-inf-prd-kvm-fr-510.hostics.fr. Check if image was updated.
>>>> Update of host ps-inf-prd-kvm-fr-510.hostics.fr. Check if image-updated file exists.
>>>> Failed to upgrade Host ps-inf-prd-kvm-fr-510.hostics.fr (User: g...@hostics.fr).
>>>>
>>>> ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch was installed according
>>>> to yum. I tried reinstalling it but got "Error in POSTIN scriptlet"
>>>> errors:
>>>>
>>>> Downloading Packages:
>>>> [SKIPPED] ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch.rpm: Already downloaded
>>>> ...
>>>>   Running scriptlet: ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>>>>   Reinstalling     : ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>>>>   Running scriptlet: ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>>>> warning: %post(ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch) scriptlet failed, exit status 1
>>>> Error in POSTIN scriptlet in rpm package ovirt-node-ng-image-update
>>>> ---
>>>> Reinstalled:
>>>>   ovirt-node-ng-image-update-4.4.6.3-1.el8.noarch
>>>>
>>>> nodectl still showed it was on 4.4.5:
>>>>
>>>> [root@ps-inf-prd-kvm-fr-510 ~]# nodectl info
>>>> bootloader:
>>>>   default: ovirt-node-ng-4.4.5.1-0.20210323.0 (4.18.0-240.15.1.el8_3.x86_64)
>>>> ...
>>>>   current_layer: ovirt-node-ng-4.4.5.1-0.20210323.0+1
>>>>
>>>> I tried to upgrade the host again from oVirt; this time there was no
>>>> error and the host rebooted. However, it did not become active after
>>>> rebooting, and nodectl still showed 4.4.5 installed. Similar symptoms
>>>> to the OP's.
>>>>
>>>> So I removed ovirt-node-ng-image-update, then reinstalled it, and got
>>>> no error this time. nodectl info seemed to show that it was installed:
>>>>
>>>> [root@ps-inf-prd-kvm-fr-510 yum.repos.d]# nodectl info
>>>> bootloader:
>>>>   default: ovirt-node-ng-4.4.6.3-0.20210518.0 (4.18.0-301.1.el8.x86_64)
>>>> ...
>>>>   current_layer: ovirt-node-ng-4.4.5.1-0.20210323.0+1
>>>>
>>>> However, after reboot the host was still shown as "unresponsive".
>>>> After marking it as "manually rebooted", putting it in maintenance mode
>>>> and trying to activate it, the host was automatically fenced, and it
>>>> was still unresponsive after this new reboot.
>>>>
>>>> I put it in maintenance mode again and tried to reinstall it with
>>>> "Deploy Hosted Engine" selected. However, it failed:
>>>> "Task Stop services failed to execute."
>>>>
>>>> In /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20210602082519-ps-inf-prd-kvm-fr-510.hostics.fr-0565d681-9406-4fa7-a444-7ee34804579c.log:
>>>>
>>>> "msg" : "Unable to stop service vdsmd.service: Job for vdsmd.service canceled.\n", "failed" : true,
>>>> "msg" : "Unable to stop service supervdsmd.service: Job for supervdsmd.service canceled.\n", "failed" : true,
>>>> "stderr" : "Error: ServiceOperationError: _systemctlStop failed\nb'Job for vdsmd.service canceled.\\n' ",
>>>> "stderr_lines" : [ "Error: ServiceOperationError: _systemctlStop failed", "b'Job for vdsmd.service canceled.\\n' " ],
>>>>
>>>> If I try on the host I get:
>>>>
>>>> [root@ps-inf-prd-kvm-fr-510 ~]# systemctl stop vdsmd
>>>> Job for vdsmd.service canceled.
>>>>
>>>> [root@ps-inf-prd-kvm-fr-510 ~]# systemctl status vdsmd
>>>> ● vdsmd.service - Virtual Desktop Server Manager
>>>>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: disabled)
>>>>    Active: deactivating (stop-sigterm) since Wed 2021-06-02 08:49:21 CEST; 7s ago
>>>>   Process: 54037 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
>>>> ...
>>>> Jun 02 08:47:34 ps-inf-prd-kvm-fr-510.hostics.fr vdsm[54100]: WARN Failed to retrieve Hosted Engine HA info, is Hosted Engine setup finished?
>>>> ...
>>>> Jun 02 08:48:31 ps-inf-prd-kvm-fr-510.hostics.fr vdsm[54100]: WARN Worker blocked: <Worker name=jsonrpc/4 running <Task <JsonRpcTask {'jsonrpc': '2.0', 'method': 'StoragePool.connectStorageServer', 'params': {'storage>
>>>>   File: "/usr/lib64/python3.6/threading.py", line 884, in _bootstrap
>>>>     self._bootstrap_inner()
>>>>
>>>> Retrying to stop vdsmd manually a second time then seems to work...
>>>> I tried rebooting again, but restarting the install always fails at the
>>>> same spot.
>>>>
>>>> What should I try to get this host back up?
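"Job for vdsmd.service canceled." from systemctl means another queued systemd job (for example a restart triggered by a dependent unit) replaced the stop job, which matches the observation that stopping a second time works. As a sketch of a workaround (a retry helper of my own, not part of vdsm; run as root on the host, unit names taken from the log above):

```shell
# Retry `systemctl stop` a few times: a conflicting queued job makes
# systemd cancel the first stop request.
stop_with_retry() {
  unit=$1
  attempts=${2:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Suppress stderr; we only care whether the stop job completed.
    if systemctl stop "$unit" 2>/dev/null; then
      echo "$unit: stopped on attempt $i"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "$unit: still not stopped after $attempts attempts"
  return 1
}

# On the host you would run, e.g.:
#   stop_with_retry supervdsmd && stop_with_retry vdsmd
```

This only works around the race; if the stop keeps being canceled every time, something (the engine, or a unit dependency) is actively restarting vdsmd and needs to be found first.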
>>>>
>>>> Guillaume Pavese
>>>> Ingénieur Système et Réseau
>>>> Interactiv-Group
>>>>
>>>> This message and all its attachments (hereafter "the message") are
>>>> intended exclusively for their addressees and are confidential. If you
>>>> receive this message in error, please destroy it and notify the sender
>>>> immediately. Any use of this message not in accordance with its
>>>> purpose, and any distribution or publication, in whole or in part, is
>>>> prohibited without express authorization. As the internet cannot
>>>> guarantee the integrity of this message, Interactiv-group (and its
>>>> subsidiaries) decline(s) all liability for this message in the event
>>>> that it has been altered. IT, ES, UK.
>>>> <https://interactiv-group.com/disclaimer.html>
>>>> _______________________________________________
>>>> Users mailing list -- users@ovirt.org
>>>> To unsubscribe send an email to users-le...@ovirt.org
>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZB2CLJYXO6SX53XLQAPTXEK7JKZQVPSW/
>>
>> --
>> Lev Veyde
>> Senior Software Engineer, RHCE | RHCVA | MCITP
>> Red Hat Israel
>> <https://www.redhat.com>
>> l...@redhat.com | lve...@redhat.com
>> <https://red.ht/sig>
>> TRIED. TESTED. TRUSTED.
>> <https://redhat.com/trusted>

--
Didi
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/UY32FOSH54SSKVNDTAITJR3KXS6IVRW5/