On Wed, Apr 20, 2016 at 11:40 AM, Martin Sivak <msi...@redhat.com> wrote:
>> Doesn't cleaning the sanlock lockspace also require stopping sanlock
>> itself? I guess it's supposed to be able to handle this, but perhaps
>> users want to clean the lockspace because dirt there also causes
>> problems with sanlock, no?
>
> Sanlock can be up, but the lockspace has to be unused.
>
>> So the only tool we have to clean metadata is '--clean-metadata', which
>> works one-by-one?
>
> Correct, it needs to acquire the lock first to make sure nobody is writing.
>
> The dirty disk issue should not be happening anymore; we added an
> equivalent of the dd to hosted-engine setup. But we might have a bug
> there, of course.
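For a single stale entry, that one-by-one cleanup might look like this
(a sketch only; it assumes the 3.6-era CLI, and the exact option names
are worth checking with 'hosted-engine --help'):

    # Run on a host whose HA agent is stopped, so the lock can be
    # acquired; --host-id picks the metadata slot to clear (52 here,
    # matching the stale entry discussed below).
    hosted-engine --clean-metadata --host-id=52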
And we also do not clean on upgrades... Perhaps we can? Should? Optionally?
>
> Martin
>
> On Wed, Apr 20, 2016 at 10:34 AM, Yedidyah Bar David <d...@redhat.com> wrote:
>> On Wed, Apr 20, 2016 at 11:20 AM, Martin Sivak <msi...@redhat.com> wrote:
>>>> after moving to global maintenance.
>>>
>>> Good point.
>>>
>>>> Martin - any advantage of this over '--reinitialize-lockspace', besides
>>>> working also in older versions? Care to add this to the howto page?
>>>
>>> Reinitialize lockspace clears the sanlock lockspace, not the metadata
>>> file. Those are two different places.
>>
>> So the only tool we have to clean metadata is '--clean-metadata', which
>> works one-by-one?
>>
>> Doesn't cleaning the sanlock lockspace also require stopping sanlock
>> itself? I guess it's supposed to be able to handle this, but perhaps
>> users want to clean the lockspace because dirt there also causes
>> problems with sanlock, no?
>>
>>>
>>>> Care to add this to the howto page?
>>>
>>> Yeah, I can do that.
>>
>> Thanks!
>>
>>>
>>> Martin
>>>
>>> On Wed, Apr 20, 2016 at 10:17 AM, Yedidyah Bar David <d...@redhat.com> wrote:
>>>> On Wed, Apr 20, 2016 at 11:11 AM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>> Assuming you never deployed a host with ID 52, this is likely a
>>>>>> result of corruption or dirt or something like that.
>>>>>
>>>>>> I see that you use FC storage. In previous versions, we did not
>>>>>> clean such storage, so you might have dirt left.
>>>>>
>>>>> This is the exact reason for an error like yours: dirty block
>>>>> storage. Please stop all hosted-engine tooling (both agent and
>>>>> broker) and fill the metadata drive with zeros.
>>>>
>>>> after moving to global maintenance.
>>>>
>>>> Martin - any advantage of this over '--reinitialize-lockspace', besides
>>>> working also in older versions? Care to add this to the howto page?
>>>> Thanks!
>>>>
>>>>>
>>>>> You will have to find the proper hosted-engine.metadata file (which
>>>>> will be a symlink) under /rhev.
>>>>>
>>>>> Example:
>>>>>
>>>>> [root@dev-03 rhev]# find . -name hosted-engine.metadata
>>>>> ./data-center/mnt/str-01.rhev.lab.eng.brq.redhat.com:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>>>
>>>>> [root@dev-03 rhev]# ls -al ./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata
>>>>> lrwxrwxrwx. 1 vdsm kvm 201 Mar 15 15:00 ./data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/ha_agent/hosted-engine.metadata -> /rhev/data-center/mnt/str-01:_mnt_export_nfs_lv2_msivak/868a1a4e-9f94-42f5-af23-8f884b3c53d5/images/6ab3f215-f234-4cd4-b9d4-8680767c3d99/dcbfa48d-8543-42d1-93dc-aa40855c4855
>>>>>
>>>>> And use (for example) 'dd if=/dev/zero of=/path/to/metadata bs=1M'
>>>>> to clean it - but be CAREFUL not to touch any other file or disk you
>>>>> might find.
>>>>>
>>>>> Then restart the hosted-engine tools and all should be fine.
>>>>>
>>>>> Martin
>>>>>
>>>>> On Wed, Apr 20, 2016 at 8:20 AM, Yedidyah Bar David <d...@redhat.com> wrote:
>>>>>> On Wed, Apr 20, 2016 at 7:15 AM, Wee Sritippho <we...@forest.go.th> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I used CentOS-7-x86_64-Minimal-1511.iso to install the hosts and
>>>>>>> the engine.
>>>>>>>
>>>>>>> The 1st host and the hosted-engine were installed successfully, but
>>>>>>> the 2nd host failed with this error message:
>>>>>>>
>>>>>>> "Failed to execute stage 'Setup validation': Metadata version 2 from
>>>>>>> host 52 too new for this agent (highest compatible version: 1)"
>>>>>>
>>>>>> Assuming you never deployed a host with ID 52, this is likely a
>>>>>> result of corruption or dirt or something like that.
>>>>>>
>>>>>> What do you get on host 1 running 'hosted-engine --vm-status'?
>>>>>>
>>>>>> I see that you use FC storage. In previous versions, we did not clean
>>>>>> such storage, so you might have dirt left. See also [1]. You can try
>>>>>> cleaning using [2].
>>>>>>
>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1238823
>>>>>> [2] https://www.ovirt.org/documentation/how-to/hosted-engine/#lockspace-corrupted-recovery-procedure
>>>>>>
>>>>>>>
>>>>>>> Here are the package versions:
>>>>>>>
>>>>>>> [root@host02 ~]# rpm -qa | grep ovirt
>>>>>>> libgovirt-0.3.3-1.el7_2.1.x86_64
>>>>>>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>>>>>> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
>>>>>>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>>>>>>> ovirt-hosted-engine-ha-1.3.5.1-1.el7.centos.noarch
>>>>>>> ovirt-hosted-engine-setup-1.3.4.0-1.el7.centos.noarch
>>>>>>> ovirt-release36-007-1.noarch
>>>>>>> ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
>>>>>>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>>>>>>>
>>>>>>> [root@engine ~]# rpm -qa | grep ovirt
>>>>>>> ovirt-engine-setup-base-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-setup-plugin-ovirt-engine-common-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
>>>>>>> ovirt-engine-tools-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>>>>>>> ovirt-release36-007-1.noarch
>>>>>>> ovirt-engine-sdk-python-3.6.3.0-1.el7.centos.noarch
>>>>>>> ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
>>>>>>> ovirt-engine-extensions-api-impl-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>>>>>>> ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
>>>>>>> ovirt-engine-setup-plugin-websocket-proxy-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>>>>>> ovirt-engine-backend-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-dbscripts-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-webadmin-portal-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-setup-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-guest-agent-common-1.0.11-1.el7.noarch
>>>>>>> ovirt-engine-wildfly-8.2.1-1.el7.x86_64
>>>>>>> ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
>>>>>>> ovirt-engine-websocket-proxy-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-restapi-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-userportal-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-engine-setup-plugin-ovirt-engine-3.6.4.1-1.el7.centos.noarch
>>>>>>> ovirt-image-uploader-3.6.0-1.el7.centos.noarch
>>>>>>> ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
>>>>>>> ovirt-engine-lib-3.6.4.1-1.el7.centos.noarch
>>>>>>>
>>>>>>> Here are the log files:
>>>>>>> https://gist.github.com/weeix/1743f88d3afe1f405889a67ed4011141
>>>>>>>
>>>>>>> --
>>>>>>> Wee
>>>>>>>
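To spell out the recovery described above as one sequence (a sketch
only: it assumes the systemd unit names ovirt-ha-agent/ovirt-ha-broker
and 3.6-era maintenance commands, and the dd is destructive, so verify
the symlink target before running it):

    # From the working host: see what the shared metadata records. An
    # entry for a host ID that was never deployed (like 52) means dirt.
    hosted-engine --vm-status

    # Stop automatic actions cluster-wide, then the HA tooling on each host.
    hosted-engine --set-maintenance --mode=global
    systemctl stop ovirt-ha-agent ovirt-ha-broker

    # Locate the metadata file (a symlink) and confirm where it points.
    find /rhev -name hosted-engine.metadata
    ls -al /rhev/data-center/mnt/.../ha_agent/hosted-engine.metadata

    # Zero it. On block storage dd stops at the end of the device; on
    # file storage consider bounding it with count= sized to the file.
    dd if=/dev/zero of=/rhev/data-center/mnt/.../ha_agent/hosted-engine.metadata bs=1M

    # Restart the tooling on each host and leave global maintenance.
    systemctl start ovirt-ha-agent ovirt-ha-broker
    hosted-engine --set-maintenance --mode=none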
>>>>>>
>>>>>> --
>>>>>> Didi
>>>>
>>>> --
>>>> Didi
>>
>> --
>> Didi

--
Didi
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users