Hi Martin,

Can you please create a cherry-pick patch that is based on 4.2?
Thanks

On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <dan...@redhat.com> wrote:
> On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebena...@redhat.com> wrote:
> > Hi Dan,
> >
> > In the last execution, the success rate was very low due to a large number
> > of failures on start VM caused, according to Michal, by the
> > vdsm-hook-allocate_net that was installed on the host.
> >
> > This is the latest status here, would you like me to re-execute?
>
> yes, of course. but you should rebase Polednik's code on top of
> *current* ovirt-4.2.3 branch.
>
> > If so, with
> > or W/O vdsm-hook-allocate_net installed?
>
> There was NO reason to have that installed. Please keep it (and any
> other needless code) out of the test environment.
>
> > On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <dan...@redhat.com> wrote:
> >>
> >> On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek
> >> <michal.skriva...@redhat.com> wrote:
> >> > Hi Elad,
> >> > why did you install vdsm-hook-allocate_net?
> >> >
> >> > adding Dan as I think the hook is not supposed to fail this badly in any
> >> > case
> >>
> >> yep, this looks bad and deserves a little bug report. Installing this
> >> little hook should not block vm startup.
> >>
> >> But more importantly - what is the conclusion of this thread? Do we
> >> have a green light from QE to take this in?
> >> > >> > >> > > >> > Thanks, > >> > michal > >> > > >> > On 5 May 2018, at 19:22, Elad Ben Aharon <ebena...@redhat.com> wrote: > >> > > >> > Start VM fails on: > >> > > >> > 2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] > >> > (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: > >> > > >> > 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938- > 9a78-bdd13a843c62/images/6cdabfe5- > >> > d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> > >> > > >> > u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938- > 9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63- > 9834f235d1c8/7ef97445-30e6-4435-8425- > >> > f35a01928211' (storagexml:334) > >> > 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START > >> > getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', > >> > options=None) > >> > from=::ffff:10.35.161.127,53512, > >> > task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c > >> > 2758 (api:46) > >> > 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] > >> > /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 > >> > err=vm > >> > net allocation hook: [unexpected error]: Traceback (most recent call > >> > last): > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 105, in <module> > >> > main() > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 93, in main > >> > allocate_random_network(device_xml) > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 62, in allocate_random_network > >> > net = _get_random_network() > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 50, in _get_random_network > >> > available_nets = _parse_nets() > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 46, in _parse_nets > >> > return [net for net in os.environ[AVAIL_NETS_KEY].split()] > >> > File 
"/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__ > >> > raise KeyError(key) > >> > KeyError: 'equivnets' > >> > > >> > > >> > (hooks:110) > >> > 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] > >> > (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process > >> > failed > >> > (vm:943) > >> > Traceback (most recent call last): > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, > in > >> > _startUnderlyingVm > >> > self._run() > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, > in > >> > _run > >> > domxml = hooks.before_vm_start(self._buildDomainXML(), > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, > in > >> > _buildDomainXML > >> > dom, self.id, self._custom['custom']) > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_ > preprocess.py", > >> > line 240, in replace_device_xml_with_hooks_xml > >> > dev_custom) > >> > File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line > 134, > >> > in > >> > before_device_create > >> > params=customProperties) > >> > File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line > 120, > >> > in > >> > _runHooksDir > >> > raise exception.HookError(err) > >> > HookError: Hook Error: ('vm net allocation hook: [unexpected error]: > >> > Traceback (most recent call last):\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line > >> > 105, in > >> > <module>\n main()\n > >> > File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", > >> > line > >> > 93, in main\n allocate_random_network(device_xml)\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line > 62, > >> > i > >> > n allocate_random_network\n net = _get_random_network()\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line > 50, > >> > in > >> > _get_random_network\n available_nets = _parse_nets()\n File "/us > >> > 
r/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, > in > >> > _parse_nets\n return [net for net in > >> > os.environ[AVAIL_NETS_KEY].split()]\n File > >> > "/usr/lib64/python2.7/UserDict.py", line 23, in __getit > >> > em__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',) > >> > > >> > > >> > > >> > Hence, the success rate was 28% against 100% running with d/s (d/s). > If > >> > needed, I'll compare against the latest master, but I think you get > the > >> > picture with d/s. > >> > > >> > vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 > >> > libvirt-3.9.0-14.el7_5.3.x86_64 > >> > qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 > >> > kernel 3.10.0-862.el7.x86_64 > >> > rhel7.5 > >> > > >> > > >> > Logs attached > >> > > >> > On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebena...@redhat.com> > >> > wrote: > >> >> > >> >> nvm, found gluster 3.12 repo, managed to install vdsm > >> >> > >> >> On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebena...@redhat.com > > > >> >> wrote: > >> >>> > >> >>> No, vdsm requires it: > >> >>> > >> >>> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 > >> >>> (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64) > >> >>> Requires: glusterfs-fuse >= 3.12 > >> >>> Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 > (@rhv-4.2.3) > >> >>> > >> >>> Therefore, vdsm package installation is skipped upon force install. > >> >>> > >> >>> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek > >> >>> <michal.skriva...@redhat.com> wrote: > >> >>>> > >> >>>> > >> >>>> > >> >>>> On 5 May 2018, at 00:38, Elad Ben Aharon <ebena...@redhat.com> > wrote: > >> >>>> > >> >>>> Hi guys, > >> >>>> > >> >>>> The vdsm build from the patch requires glusterfs-fuse > 3.12. This > is > >> >>>> while the latest 4.2.3-5 d/s build requires 3.8.4 > (3.4.0.59rhs-1.el7) > >> >>>> > >> >>>> > >> >>>> because it is still oVirt, not a downstream build. 
> >> >>>> We can’t really do
> >> >>>> downstream builds with unmerged changes :/
> >> >>>>
> >> >>>> Trying to get this gluster-fuse build, so far no luck.
> >> >>>> Is this requirement intentional?
> >> >>>>
> >> >>>> it should work regardless, I guess you can force install it without the
> >> >>>> dependency
> >> >>>>
> >> >>>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek
> >> >>>> <michal.skriva...@redhat.com> wrote:
> >> >>>>>
> >> >>>>> Hi Elad,
> >> >>>>> to make it easier to compare, Martin backported the change to 4.2 so it
> >> >>>>> is actually comparable with a run without that patch. Would you please try
> >> >>>>> that out?
> >> >>>>> It would be best to have 4.2 upstream and this[1] run to really
> >> >>>>> minimize the noise.
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>> michal
> >> >>>>>
> >> >>>>> [1]
> >> >>>>> http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/28/
> >> >>>>>
> >> >>>>> On 27 Apr 2018, at 09:23, Martin Polednik <mpoled...@redhat.com> wrote:
> >> >>>>>
> >> >>>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
> >> >>>>>
> >> >>>>> I will update with the results of the next tier1 execution on latest
> >> >>>>> 4.2.3
> >> >>>>>
> >> >>>>> That isn't master but old branch though. Could you run it against
> >> >>>>> *current* VDSM master?
> >> >>>>>
> >> >>>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpoled...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
> >> >>>>>
> >> >>>>> Hi, I've triggered another execution [1] due to some issues I saw in the
> >> >>>>> first which are not related to the patch.
> >> >>>>>
> >> >>>>> The success rate is 78% which is low comparing to tier1 executions with
> >> >>>>> code from downstream builds (95-100% success rates) [2].
> >> >>>>>
> >> >>>>> Could you run the current master (without the dynamic_ownership patch)
> >> >>>>> so that we have viable comparison?
> >> >>>>>
> >> >>>>> From what I could see so far, there is an issue with move and copy
> >> >>>>> operations to and from Gluster domains. For example [3].
> >> >>>>>
> >> >>>>> The logs are attached.
> >> >>>>>
> >> >>>>> [1]
> >> >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
> >> >>>>>
> >> >>>>> [2]
> >> >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
> >> >>>>>
> >> >>>>> [3]
> >> >>>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH
> >> >>>>> deleteImage error=Image does not exist in domain:
> >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
> >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
> >> >>>>> from=::ffff:10.35.161.182,40936,
> >> >>>>> flow_id=disks_syncAction_ba6b2630-5976-4935,
> >> >>>>> task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
> >> >>>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task]
> >> >>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
> >> >>>>> Traceback (most recent call last):
> >> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
> >> >>>>>     return fn(*args, **kargs)
> >> >>>>>   File "<string>", line 2, in deleteImage
> >> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
> >> >>>>>     ret = func(*args, **kwargs)
> >> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
> >> >>>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
> >> >>>>> ImageDoesNotExistInSD: Image does not exist in domain:
> >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
> >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
> >> >>>>>
> >> >>>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task]
> >> >>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted:
> >> >>>>> "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
> >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
> >> >>>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH
> >> >>>>> deleteImage error=Image does not exist in domain:
> >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
> >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)
> >> >>>>>
> >> >>>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <ebena...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> Triggered a sanity tier1 execution [1] using [2], which covers all the
> >> >>>>> requested areas, on iSCSI, NFS and Gluster.
> >> >>>>> I'll update with the results.
> >> >>>>>
> >> >>>>> [1]
> >> >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
> >> >>>>>
> >> >>>>> [2]
> >> >>>>> https://gerrit.ovirt.org/#/c/89830/
> >> >>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64
> >> >>>>>
> >> >>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <mpoled...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
> >> >>>>>
> >> >>>>> Hi Martin,
> >> >>>>>
> >> >>>>> I see [1] requires a rebase, can you please take care?
> >> >>>>>
> >> >>>>> Should be rebased.
> >> >>>>>
> >> >>>>> At the moment, our automation is stable only on iSCSI, NFS, Gluster and
> >> >>>>> FC.
> >> >>>>> Ceph is not supported and Cinder will be stabilized soon; AFAIR, it's not
> >> >>>>> stable enough at the moment.
> >> >>>>>
> >> >>>>> That is still pretty good.
> >> >>>>>
> >> >>>>> [1] https://gerrit.ovirt.org/#/c/89830/
> >> >>>>>
> >> >>>>> Thanks
> >> >>>>>
> >> >>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <mpoled...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
> >> >>>>>
> >> >>>>> Hi, sorry if I misunderstood, I waited for more input regarding what
> >> >>>>> areas have to be tested here.
> >> >>>>>
> >> >>>>> I'd say that you have quite a bit of freedom in this regard. GlusterFS
> >> >>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite
> >> >>>>> that covers basic operations (start & stop VM, migrate it), snapshots
> >> >>>>> and merging them, and whatever else would be important for storage
> >> >>>>> sanity.
> >> >>>>>
> >> >>>>> mpolednik
> >> >>>>>
> >> >>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <mpoled...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
> >> >>>>>
> >> >>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and cinder,
> >> >>>>> will have to check, since usually, we don't execute our automation on
> >> >>>>> them.
> >> >>>>>
> >> >>>>> Any update on this? I believe the gluster tests were successful, OST
> >> >>>>> passes fine and unit tests pass fine, that makes the storage backends
> >> >>>>> test the last required piece.
> >> >>>>>
> >> >>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <rata...@redhat.com> wrote:
> >> >>>>>
> >> >>>>> +Elad
> >> >>>>>
> >> >>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <dan...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsof...@redhat.com> wrote:
> >> >>>>>
> >> >>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <ee...@redhat.com> wrote:
> >> >>>>>
> >> >>>>> Please make sure to run as much OST suites on this patch as possible
> >> >>>>> before merging (using 'ci please build')
> >> >>>>>
> >> >>>>> But note that OST is not a way to verify the patch.
> >> >>>>>
> >> >>>>> Such changes require testing with all storage types we support.
> >> >>>>>
> >> >>>>> Nir
> >> >>>>>
> >> >>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpoled...@redhat.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>> Hey,
> >> >>>>>
> >> >>>>> I've created a patch[0] that is finally able to activate libvirt's
> >> >>>>> dynamic_ownership for VDSM while not negatively affecting
> >> >>>>> functionality of our storage code.
> >> >>>>>
> >> >>>>> That of course comes with quite a bit of code removal, mostly in the
> >> >>>>> area of host devices, hwrng and anything that touches devices; bunch
> >> >>>>> of test changes and one XML generation caveat (storage is handled by
> >> >>>>> VDSM, therefore disk relabelling needs to be disabled on the VDSM
> >> >>>>> level).
> >> >>>>>
> >> >>>>> Because of the scope of the patch, I welcome storage/virt/network
> >> >>>>> people to review the code and consider the implication this change has
> >> >>>>> on current/future features.
> >> >>>>>
> >> >>>>> [0] https://gerrit.ovirt.org/#/c/89830/
> >> >>>>>
> >> >>>>> In particular: dynamic_ownership was set to 0 prehistorically (as part
> >> >>>>> of https://bugzilla.redhat.com/show_bug.cgi?id=554961) because
> >> >>>>> libvirt, running as root, was not able to play properly with root-squash
> >> >>>>> nfs mounts.
> >> >>>>>
> >> >>>>> Have you attempted this use case?
> >> >>>>>
> >> >>>>> I join to Nir's request to run this with storage QE.
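The "XML generation caveat" Polednik mentions — disk relabelling disabled on the VDSM level — corresponds to libvirt's per-device `<seclabel>` override, which tells dynamic_ownership to leave a path alone. A rough illustrative sketch of that transformation (the function name and the sample domain XML are hypothetical, not VDSM's actual code):

```python
import xml.etree.ElementTree as ET


def disable_disk_relabel(dom_xml):
    """Add <seclabel model='dac' relabel='no'/> under each disk <source>
    so libvirt's dynamic_ownership leaves VDSM-managed paths untouched."""
    root = ET.fromstring(dom_xml)
    for source in root.findall('./devices/disk/source'):
        seclabel = ET.SubElement(source, 'seclabel')
        seclabel.set('model', 'dac')
        seclabel.set('relabel', 'no')
    return ET.tostring(root, encoding='unicode')


# Minimal made-up domain XML with one block disk, for illustration only.
DOM = """<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <source dev='/rhev/data-center/mnt/blockSD/sd-uuid/images/img-uuid/vol-uuid'/>
    </disk>
  </devices>
</domain>"""
```

Storage paths stay owned and labelled by VDSM, while libvirt still chowns everything else (host devices, hwrng, ...) — which is what allows the code removal described above.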
> >> >>>>>
> >> >>>>> --
> >> >>>>> Raz Tamir
> >> >>>>> Manager, RHV QE
> >> >>>>>
> >> >>>>> _______________________________________________
> >> >>>>> Devel mailing list
> >> >>>>> Devel@ovirt.org
> >> >>>>> http://lists.ovirt.org/mailman/listinfo/devel
> >> >
> >> > <logs.tar.gz>
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/5VMAQHDLLIGQ53MP6ETWTUZ3LIYISVTT/