[ovirt-devel] Re: Purging inactive maintainers from vdsm-master-maintainers
On 11/27/19 3:25 PM, Nir Soffer wrote: I want to remove inactive contributors from vdsm-master-maintainers. I suggest the simple rule of 2 years of inactivity for removing from this group, based on git log. See the list below for current status: https://gerrit.ovirt.org/#/admin/groups/106,members No objections, keeping the list minimal and current is a good idea. -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-le...@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/EBCSQK5AV75CXQA4THAFYTQ5MP4D4JYJ/
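[Editorial note] The proposed rule is mechanical enough to script. A minimal sketch, assuming the per-maintainer last-commit dates have already been extracted (e.g. from `git log --format='%ae %aI'`); the names and dates below are illustrative, not the real gerrit group membership:

```python
from datetime import datetime, timedelta

def inactive_maintainers(last_commit, now, years=2):
    """Return maintainers whose newest commit is older than `years` years.

    `last_commit` maps maintainer name -> datetime of their latest commit;
    in practice this would be scraped from `git log`.
    """
    cutoff = now - timedelta(days=365 * years)
    return sorted(name for name, when in last_commit.items() if when < cutoff)

# Illustrative data only.
history = {
    "active-dev": datetime(2019, 10, 1),
    "dormant-dev": datetime(2016, 5, 2),
}
print(inactive_maintainers(history, now=datetime(2019, 11, 27)))  # → ['dormant-dev']
```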
[ovirt-devel] Re: Proposing Marcin Sobczyk as VDSM infra maintainer
On 11/7/19 3:13 PM, Martin Perina wrote: Hi, Marcin joined the infra team more than a year ago and during this time he has contributed a lot to VDSM packaging, improved automation, and ported all the infra team parts of VDSM (jsonrpc, ssl, vdsm-client, hooks infra, ...) to Python 3. He is a very nice person to talk to, is usually very responsive, and cares a lot about code quality. So I'd like to propose Marcin as VDSM infra maintainer. Please share your thoughts. +1
[ovirt-devel] Re: [VDSM] CentOS 8 build on Travis - only 3 tests fail
On 10/11/19 10:21 AM, Marcin Sobczyk wrote: On 10/11/19 10:07 AM, Francesco Romani wrote: On 10/10/19 10:42 PM, Nir Soffer wrote: I added a centos-8 docker image: https://hub.docker.com/r/ovirtorg/vdsm-test-centos-8 And a centos-8 build in Travis: https://travis-ci.org/nirs/vdsm/jobs/596270609 After fixing 22 tests trying to run /usr/bin/python2, we have only 3 failing tests in lib-py36: lib/hooks_test.py .x..F.. [ 50%] ssl_test.py ...FF... [100%] Marcin already handled the ssl tests in: https://gerrit.ovirt.org/c/103812/ I'm not sure why the hook tests fail. I *guess* incompatible pickle format between python versions? Nope, the pickling and unpickling are done with the same interpreter version. I already posted that the patch for the issue is here, and the explanation is in the commit message (tl;dr: it's about some environments using an ascii locale): https://gerrit.ovirt.org/#/c/102455/ Yep, I've seen the patch and it LGTM. Thanks!
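[Editorial note] The failure class described in that commit message is easy to demonstrate: under a POSIX/C locale Python 3 can end up selecting the 'ascii' codec for I/O, so any non-ASCII character in hook data fails to encode. A toy illustration of the mechanism, not the actual vdsm hook code path:

```python
def encodable(text, encoding):
    """Check whether `text` survives the given codec, as hook data must
    survive whatever codec the environment's locale selects."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(encodable("plain ascii", "ascii"))      # → True
print(encodable("non-ascii: café", "ascii"))  # → False
print(encodable("non-ascii: café", "utf-8"))  # → True
```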
[ovirt-devel] Re: [VDSM] CentOS 8 build on Travis - only 3 tests fail
On 10/10/19 10:42 PM, Nir Soffer wrote: I added a centos-8 docker image: https://hub.docker.com/r/ovirtorg/vdsm-test-centos-8 And a centos-8 build in Travis: https://travis-ci.org/nirs/vdsm/jobs/596270609 After fixing 22 tests trying to run /usr/bin/python2, we have only 3 failing tests in lib-py36: lib/hooks_test.py .x..F.. [ 50%] ssl_test.py ...FF... [100%] Marcin already handled the ssl tests in: https://gerrit.ovirt.org/c/103812/ I'm not sure why the hook tests fail. I *guess* incompatible pickle format between python versions? Bests,
[ovirt-devel] Re: ovirt-vmconsole package missing on fc29
On 10/9/19 12:56 PM, Evgheni Dereveanchin wrote: This looks like another issue of a package being rebuilt without bumping the version. It typically results in our proxy still delivering the old cached RPM file while yum expects the one with a different size. I've removed this package from cache and remember that this exact issue already happened with ovirt-vmconsole-1.0.7-3 about a month ago: https://ovirt-jira.atlassian.net/browse/OVIRT-2795 To avoid it we need to ensure each build that ends up in tested has a unique file name. In case this issue happens again please reach out to me and I'll purge squid caches again. Thanks. I merged the patch from Sandro (thanks!) to sidestep this issue.
[ovirt-devel] Re: ovirt-vmconsole package missing on fc29
On 10/9/19 10:48 AM, Miguel Duarte de Mora Barroso wrote: Hi, CI is failing for all ovirt-provider-ovn patches, because of an issue concerning ovirt-vmconsole. This log excerpt explains what's wrong:

Downloading Packages: 10:40:26 (1/583): ioprocess-1.3.0-1.201909241116.git3742 182 kB/s | 34 kB 00:00 10:40:26 (2/583): ovirt-imageio-daemon-1.6.0-0.201910071 369 kB/s | 39 kB 00:00 10:40:26 (3/583): ovirt-imageio-common-1.6.0-0.201910071 290 kB/s | 86 kB 00:00 10:40:26 [MIRROR] ovirt-vmconsole-1.0.7-3.fc29.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 38804 but expected size is: 38800 10:40:26 [MIRROR] ovirt-vmconsole-1.0.7-3.fc29.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 38804 but expected size is: 38800 10:40:26 (4/583): mom-0.5.17-0.0.master.20191001143931.g 376 kB/s | 131 kB 00:00 10:40:26 (5/583): python2-ioprocess-1.3.0-1.201909241116 558 kB/s | 29 kB 00:00 10:40:26 [MIRROR] ovirt-vmconsole-1.0.7-3.fc29.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 38804 but expected size is: 38800 10:40:26 [MIRROR] ovirt-vmconsole-1.0.7-3.fc29.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 38804 but expected size is: 38800 10:40:26 [FAILED] ovirt-vmconsole-1.0.7-3.fc29.noarch.rpm: No more mirrors to try - All mirrors were already tried without success

It seems to be a mirror issue; I've experienced it other times with different packages. I don't know how to fix it.
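[Editorial note] On the symptom itself: the downloader compares the file size recorded in the repository metadata against the server's Content-Length, roughly like the sketch below (illustrative, not dnf's actual code). A mismatch under an unchanged file name is the signature of a package rebuilt without a version bump and served from a stale proxy cache:

```python
def check_size(expected_size, content_length):
    """Return a diagnostic string if the server-reported size disagrees
    with the repo metadata, or None if the download looks consistent."""
    if content_length != expected_size:
        return ("likely stale cache: metadata expects %d bytes, "
                "server reports %d" % (expected_size, content_length))
    return None

print(check_size(38800, 38804))  # the sizes from the log excerpt above
print(check_size(38800, 38800))  # → None
```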
[ovirt-devel] Re: Successful 'vdsm-client Host getStats' call on py3
On 7/10/19 11:25 AM, Marcin Sobczyk wrote: Hi, all of the py3-stomp-yajsonrpc patches have now +2 - I know they're not flawless, but they get the job done and we definitely need them. Can someone with merge rights please merge them? https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-encode-decode https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-frame https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-parser https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-COMMANDS https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-http-detector https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-stomp-detector https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-ssl-socket https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:py3-stomp-yajsonrpc-follow-up I'll take care tomorrow(-ish) if there is still need. bests,
[ovirt-devel] Re: Successful 'vdsm-client Host getStats' call on py3
On 6/26/19 2:32 PM, Marcin Sobczyk wrote: Hi, I'm currently working on making yajsonrpc/stomp implementation py3-compatible so we can have basic communication with vdsm running on py3. Today for the first time I was able to run vdsm [1] with py3 on fc29 and do a successful 'vdsm-client Host getStats' call. Kudos for reaching this milestone!
[ovirt-devel] Re: ovirt-vmconsole CI check-patch
On 4/19/19 8:40 AM, Sandro Bonazzola wrote: Hi, looks like ovirt-vmconsole is missing the CI check-patch stage. Is there any reason for not running at least an rpm build in check-patch? I see the project has unit tests, why not run them on check-patch? No reason, we just didn't do the integration. If you can provide instructions/docs, I think I can squeeze this task into my schedule. Bests,
[ovirt-devel] Re: oci: command line tool for oVirt CI
On 3/15/19 11:51 PM, Nir Soffer wrote: I want to share a nice little tool that can make your life easier. https://github.com/nirs/oci VERY nice! kudos!
[ovirt-devel] [hyperv] hvinfo: command line tool to get the HyperV enlightements status from within the guest
Hi all, lately I've been involved again in hyperv support. I was tasked to help improve the testability of the hyperv guest configuration. Hyperv support is a set of optimizations that libvirt/qemu offer to improve the runtime behaviour (stability, performance) of Windows guests. See for example: https://libvirt.org/formatdomain.html#elementsFeatures oVirt does support them (see for example https://github.com/oVirt/vdsm/blob/master/lib/vdsm/virt/libvirtxml.py#L243) and will keep supporting them, because they are a key feature for Windows guests. Up until now the easiest (only?) way to check if a hyperv optimization was really enabled was to inspect the libvirt XML and/or the QEMU command line flags. But we wanted to have another check. Enter hvinfo: https://github.com/fromanirh/hvinfo hvinfo is a simple tool that decodes CPUID information according to the publicly-available HyperV specs (https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/live/tlfs) and reports what the guest sees. It takes no arguments - just run it! - it requires no special privileges and emits easy-to-consume JSON. It is designed to integrate into fully automated CI/QA. Being a command-line tool, it is hard to give a "screenshot", so let me just report sample output (admin privileges not actually required):

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
PS C:\Users\admin> cd .\Downloads\
PS C:\Users\admin\Downloads> .\hvinfo.exe
{
  "HyperVsupport": true,
  "Features": {
    "GuestDebugging": false,
    "PerformanceMonitor": false,
    "PCPUDynamicPartitioningEvents": true,
    "HypercallInputParamsXMM": false,
    "VirtualGuestIdleState": false,
    "HypervisorSleepState": false,
    "NUMADistanceQuery": false,
    "TimerFrequenciesQuery": false,
    "SytheticMCEInjection": false,
    "GuestCrashMSR": false,
    "DebugMSR": false,
    "NPIEP": false,
    "DisableHypervisorAvailable": false,
    "ExtendedGvaRangesForFlushVirtualAddressList": false,
    "HypercallOutputReturnXMM": false,
    "SintPollingMode": false,
    "HypercallMsrLock": false,
    "UseDirectSyntheticTimers": false
  },
  "Recommendations": {
    "HypercallAddressSpaceSwitch": false,
    "HypercallLocalTLBFlush": false,
    "HypercallRemoteTLBFlush": false,
    "MSRAPICRegisters": true,
    "MSRSystemReset": false,
    "RelaxedTiming": true,
    "DMARemapping": false,
    "InterruptRemapping": false,
    "X2APICMSR": false,
    "DeprecatingAutoEOI": false,
    "SyntheticClusterIPI": false,
    "ExProcessorMasks": false,
    "Nested": false,
    "INTForMBECSyscalls": false,
    "NestedEVMCS": false,
    "SyncedTimeline": false,
    "DirectLocalFlushEntire": false,
    "NoNonArchitecturalCoreSharing": false,
    "SpinlockRetries": 8191
  }
}
PS C:\Users\admin\Downloads>

Caveat: the names of the features are the same as in the spec, so we need mappings for oVirt flags, libvirt flags and so on. For example the libvirt xml domain.features.hyperv.relaxed[state="on"] maps to the hvinfo json Recommendations.RelaxedTiming=true. Feel free to test it and report any issue. Bests,
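[Editorial note] The mapping caveat above is straightforward to automate. A minimal sketch of consuming hvinfo's JSON from a CI check: the "relaxed" pairing comes from the example in the mail, while the "vapic" pairing is an assumption for illustration, not taken from the hvinfo docs:

```python
import json

# Illustrative mapping from libvirt <hyperv> flag names to hvinfo keys.
# "relaxed" is from the example above; "vapic" is an assumed pairing.
LIBVIRT_TO_HVINFO = {
    "relaxed": ("Recommendations", "RelaxedTiming"),
    "vapic": ("Recommendations", "MSRAPICRegisters"),
}

def flag_enabled(hvinfo_output, libvirt_flag):
    """Given hvinfo's JSON output, report whether a libvirt flag is on."""
    data = json.loads(hvinfo_output)
    section, key = LIBVIRT_TO_HVINFO[libvirt_flag]
    return bool(data.get(section, {}).get(key, False))

# Trimmed-down sample of the output shown above.
sample = ('{"HyperVsupport": true, "Recommendations": '
          '{"RelaxedTiming": true, "MSRAPICRegisters": true}}')
print(flag_enabled(sample, "relaxed"))  # → True
```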
[ovirt-devel] Re: Python 3 vdsm RPM packages
On 2/28/19 12:30 PM, Dan Kenigsberg wrote: I am afraid of fresh starts; I'd very much prefer to start from the sh*tty thing we have, and evolve it. A lot of the time, re-writing a piece of software is tempting, but existing code is imbued with knowledge of past problems, which is often forgotten when you do a hard cut. In general, this is very true. However, in the *specific case* of the spec file, it is mostly an aggregation of fixes, hacks and workarounds arising from contingent issues. Or at the very least this is my experience with the spec file. My point is that the spec file is mostly a snapshot of the fixes needed for a given set of supported OSes.[1] There isn't much to salvage here, and for what is worth keeping, git history is the best record, and that is not going to be lost. For these reasons, I support Marcin's plan to start afresh side by side with a new spec file, using the old one as a reference. I trust Marcin (and us) to be careful and look back often at the old spec file when writing the new one, to avoid forgetting the lessons of the past. +++ [1] the current spec file looks much more like a scratchpad full of scribbled notes than a carefully curated chronicle imbued with knowledge. Bests,
[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master/4.2 (ovirt-vmconsole) ] [ 15-02-2019 ] [ TEST NAME ]
On 2/15/19 4:05 PM, Dafna Ron wrote: On Fri, Feb 15, 2019 at 2:54 PM Francesco Romani wrote: On 2/15/19 3:49 PM, Dafna Ron wrote: On Fri, Feb 15, 2019 at 2:39 PM Francesco Romani wrote: On 2/15/19 1:40 PM, Dafna Ron wrote: Hi, Hi, We are failing to deploy hosts in upgrade suites on both master and 4.2 for project ovirt-vmconsole. It seems we are missing packages for selinux-policy. Root cause identified by CQ as: https://gerrit.ovirt.org/#/c/97704/ - spec: clean up and reorganize. Can you please take a look at this issue? Sandro requested a bug so I opened one: https://bugzilla.redhat.com/show_bug.cgi?id=1677630 Yep, I replied in https://bugzilla.redhat.com/show_bug.cgi?id=1677630#c2 more discussion follows:

Error: 2019-02-14 12:11:42,063-0500 ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:85 Yum [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,063-0500 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/context.py", line 132, in _executeMethod method['method']() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 248, in _packages self.processTransaction() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 262, in processTransaction if self._miniyum.buildTransaction(): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/miniyum.py", line 920, in buildTransaction raise yum.Errors.YumBaseError(msg) YumBaseError: [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,064-0500 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation':
[u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] Thanks, Dafna

It seems to me this is happening in CentOS. So: the patch https://gerrit.ovirt.org/#/c/97704/ *wants* to use this spec file macro %{?selinux_requires} This macro automatically sets the right dependency for the platform on which the package is being built. From the error above, we can see that the host on which the package, built from master, is going to be installed does *not* have that right package. However, on a test box of mine:

1005 15:07:43 root@kenji:~ $ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
1006 15:07:49 root@kenji:~ $ rpm -qa | grep selinux-policy
selinux-policy-3.13.1-229.el7_6.9.noarch
selinux-policy-targeted-3.13.1-229.el7_6.9.noarch
selinux-policy-devel-3.13.1-229.el7_6.6.noarch
1010 15:08:50 root@kenji:~ $ rpm -q --provides selinux-policy
config(selinux-policy) = 3.13.1-229.el7_6.9
selinux-policy = 3.13.1-229.el7_6.9
1011 15:08:52 root@kenji:~ $ rpm -q --provides selinux-policy-targeted
config(selinux-policy-targeted) = 3.13.1-229.el7_6.9
selinux-policy-base = 3.13.1-229.el7_6.9
selinux-policy-targeted = 3.13.1-229.el7_6.9

so it seems that the package was built on an up-to-date host, while it is being installed on an outdated host. Not sure I understand that. We are running on an isolated environment which is running 7.6 and the package we have available in the centos repo is: selinux-policy-0:3.13.1-229.el7_6.9.noarch When I force the download of the package (i.e. I tell lago to grab that package on deploy of vms) then the package is available and downloaded. So I am not sure what you mean about the package running on an outdated host? I mean that the package is available, so the dependency could be fulfilled (e.g. ovirt-vmconsole does not depend on a bogus, unreleased package).
If the dependency is not being fulfilled, it's an issue of the specific host on which the test fails. It should install cleanly on an up-to-date RHEL/CentOS 7.6 host. But as I was saying, the host is not related to our CI runs as they run in mock in a clean environment (each run is cleaned and re-installed) The vms are created and des
[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master/4.2 (ovirt-vmconsole) ] [ 15-02-2019 ] [ TEST NAME ]
On 2/15/19 3:49 PM, Dafna Ron wrote: On Fri, Feb 15, 2019 at 2:39 PM Francesco Romani wrote: On 2/15/19 1:40 PM, Dafna Ron wrote: Hi, Hi, We are failing to deploy hosts in upgrade suites on both master and 4.2 for project ovirt-vmconsole. It seems we are missing packages for selinux-policy. Root cause identified by CQ as: https://gerrit.ovirt.org/#/c/97704/ - spec: clean up and reorganize. Can you please take a look at this issue? Sandro requested a bug so I opened one: https://bugzilla.redhat.com/show_bug.cgi?id=1677630 Yep, I replied in https://bugzilla.redhat.com/show_bug.cgi?id=1677630#c2 more discussion follows:

Error: 2019-02-14 12:11:42,063-0500 ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:85 Yum [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,063-0500 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/context.py", line 132, in _executeMethod method['method']() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 248, in _packages self.processTransaction() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 262, in processTransaction if self._miniyum.buildTransaction(): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/miniyum.py", line 920, in buildTransaction raise yum.Errors.YumBaseError(msg) YumBaseError: [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,064-0500 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch
requires selinux-policy-base >= 3.13.1-229.el7_6.9'] Thanks, Dafna

It seems to me this is happening in CentOS. So: the patch https://gerrit.ovirt.org/#/c/97704/ *wants* to use this spec file macro %{?selinux_requires} This macro automatically sets the right dependency for the platform on which the package is being built. From the error above, we can see that the host on which the package, built from master, is going to be installed does *not* have that right package. However, on a test box of mine:

1005 15:07:43 root@kenji:~ $ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
1006 15:07:49 root@kenji:~ $ rpm -qa | grep selinux-policy
selinux-policy-3.13.1-229.el7_6.9.noarch
selinux-policy-targeted-3.13.1-229.el7_6.9.noarch
selinux-policy-devel-3.13.1-229.el7_6.6.noarch
1010 15:08:50 root@kenji:~ $ rpm -q --provides selinux-policy
config(selinux-policy) = 3.13.1-229.el7_6.9
selinux-policy = 3.13.1-229.el7_6.9
1011 15:08:52 root@kenji:~ $ rpm -q --provides selinux-policy-targeted
config(selinux-policy-targeted) = 3.13.1-229.el7_6.9
selinux-policy-base = 3.13.1-229.el7_6.9
selinux-policy-targeted = 3.13.1-229.el7_6.9

so it seems that the package was built on an up-to-date host, while it is being installed on an outdated host. Not sure I understand that. We are running on an isolated environment which is running 7.6 and the package we have available in the centos repo is: selinux-policy-0:3.13.1-229.el7_6.9.noarch When I force the download of the package (i.e. I tell lago to grab that package on deploy of vms) then the package is available and downloaded. So I am not sure what you mean about the package running on an outdated host? I mean that the package is available, so the dependency could be fulfilled (e.g. ovirt-vmconsole does not depend on a bogus, unreleased package). If the dependency is not being fulfilled, it's an issue of the specific host on which the test fails. It should install cleanly on an up-to-date RHEL/CentOS 7.6 host.
[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master/4.2 (ovirt-vmconsole) ] [ 15-02-2019 ] [ TEST NAME ]
On 2/15/19 1:40 PM, Dafna Ron wrote: Hi, Hi, We are failing to deploy hosts in upgrade suites on both master and 4.2 for project ovirt-vmconsole. It seems we are missing packages for selinux-policy. Root cause identified by CQ as: https://gerrit.ovirt.org/#/c/97704/ - spec: clean up and reorganize. Can you please take a look at this issue? Sandro requested a bug so I opened one: https://bugzilla.redhat.com/show_bug.cgi?id=1677630 Yep, I replied in https://bugzilla.redhat.com/show_bug.cgi?id=1677630#c2 more discussion follows:

Error: 2019-02-14 12:11:42,063-0500 ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:85 Yum [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,063-0500 DEBUG otopi.context context._executeMethod:142 method exception Traceback (most recent call last): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/context.py", line 132, in _executeMethod method['method']() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 248, in _packages self.processTransaction() File "/tmp/ovirt-8JzESBo7eU/otopi-plugins/otopi/packagers/yumpackager.py", line 262, in processTransaction if self._miniyum.buildTransaction(): File "/tmp/ovirt-8JzESBo7eU/pythonlib/otopi/miniyum.py", line 920, in buildTransaction raise yum.Errors.YumBaseError(msg) YumBaseError: [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] 2019-02-14 12:11:42,064-0500 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': [u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy >= 3.13.1-229.el7_6.9', u'ovirt-vmconsole-1.0.6-3.el7.noarch requires selinux-policy-base >= 3.13.1-229.el7_6.9'] Thanks, Dafna

It seems to me this is happening in CentOS.
So: the patch https://gerrit.ovirt.org/#/c/97704/ *wants* to use this spec file macro %{?selinux_requires} This macro automatically sets the right dependency for the platform on which the package is being built. From the error above, we can see that the host on which the package, built from master, is going to be installed does *not* have that right package. However, on a test box of mine:

1005 15:07:43 root@kenji:~ $ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
1006 15:07:49 root@kenji:~ $ rpm -qa | grep selinux-policy
selinux-policy-3.13.1-229.el7_6.9.noarch
selinux-policy-targeted-3.13.1-229.el7_6.9.noarch
selinux-policy-devel-3.13.1-229.el7_6.6.noarch
1010 15:08:50 root@kenji:~ $ rpm -q --provides selinux-policy
config(selinux-policy) = 3.13.1-229.el7_6.9
selinux-policy = 3.13.1-229.el7_6.9
1011 15:08:52 root@kenji:~ $ rpm -q --provides selinux-policy-targeted
config(selinux-policy-targeted) = 3.13.1-229.el7_6.9
selinux-policy-base = 3.13.1-229.el7_6.9
selinux-policy-targeted = 3.13.1-229.el7_6.9

so it seems that the package was built on an up-to-date host, while it is being installed on an outdated host. For this issue there is no action needed besides making sure that the installation host is up to date. Please note, however, that ovirt-vmconsole >= 1.0.7 should be installed on CentOS/RHEL >= 7.6. If needed, I think it could work on 7.4/7.5 too, but we would need a rebuild of the package and some testing. Last thing, a question: do we need a package build on Fedora? I tested the el7 packages, they work fine on F29.
Bests,
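[Editorial note] On checking version requirements like the ones quoted above by hand: RPM's real comparison is rpmvercmp/labelCompare, but for these dotted version-release strings a naive segment-wise comparison is enough to reproduce the reasoning. A sketch only; do not use it in place of rpm's own comparison:

```python
def _segments(s):
    """Split a string like '229.el7_6.9' into comparable (type, value) pairs."""
    out = []
    for part in s.replace("_", ".").split("."):
        # Tag numeric vs alphabetic segments so mixed pairs never compare
        # int against str directly.
        out.append((0, int(part)) if part.isdigit() else (1, part))
    return out

def satisfies(provided, required):
    """Naive check that `provided` >= `required` for version-release strings."""
    pv, _, pr = provided.partition("-")
    rv, _, rr = required.partition("-")
    return (_segments(pv), _segments(pr)) >= (_segments(rv), _segments(rr))

required = "3.13.1-229.el7_6.9"
print(satisfies("3.13.1-229.el7_6.9", required))  # → True
print(satisfies("3.13.1-229.el7_6.6", required))  # → False (the -devel case above)
```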
[ovirt-devel] Re: Announcing 'COVERAGE' option in OST
On 12/12/18 9:42 PM, Marcin Sobczyk wrote: Hi, I've been working on adding a coverage report for VDSM in OST recently, and I'm happy to announce that the first batch of patches is merged! To run a suite with coverage, look for the 'COVERAGE' drop-down on OST's build parameters page. If you run OST locally, pass a '--coverage' argument to 'run_suite.sh'. Currently, coverage works only for VDSM in basic-suite-master, but adding VDSM support for other suites is now a no-brainer. More patches are on the way! Since the option is named 'COVERAGE', and not 'VDSM_COVERAGE', other projects are welcome to implement their coverage reports on top of it. Cheers, Marcin Kudos! A very nice and very helpful addition.
[ovirt-devel] vdsm has been tagged (v4.30.4)
[ovirt-devel] Re: [VDSM] Proposing Denis as vdsm gluster maintainer
On 11/12/18 1:33 PM, Nir Soffer wrote: Hi all, Denis has practically been maintaining the vdsm gluster code in recent years, and it is time to make this official. Please ack, Nir Obvious +1 from me. Keep up the good work, Denis!
[ovirt-devel] Re: [VDSM] running tests locally
On 10/19/18 3:40 PM, Vojtech Juranek wrote: Hi, Hi, beginner question: I'm trying to run some VDSM tests locally (e.g. blocksd_test.py), using PYTHON_EXE=python3 ./run_tests_local.sh storage/blocksd_test.py Please be aware that some tests were not ported to python3; "blocksd_test.py" is among them. What am I doing wrong and how can I run selected tests locally? I don't know about storage, but virt tests should each be runnable using ./run_tests_local.sh (in general). I think we should preserve this property when we move to py.test. Bests, -- Francesco Romani
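One conventional way to handle tests that were never ported to Python 3 is to skip them on the wrong interpreter instead of letting them crash. The sketch below is hypothetical (VDSM's real blocksd_test.py and its skip mechanics may differ); it just shows the stdlib unittest pattern:

```python
import sys
import unittest

class BlockSDTestSketch(unittest.TestCase):
    # Hypothetical stand-in for an unported test module: the test is
    # reported as skipped on Python 3 rather than failing with a
    # confusing error.
    @unittest.skipIf(sys.version_info[0] >= 3,
                     "not yet ported to Python 3")
    def test_metadata_layout(self):
        self.assertTrue(True)  # python2-only body would go here

suite = unittest.defaultTestLoader.loadTestsFromTestCase(BlockSDTestSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
# On Python 3 the run is successful with one skipped test.
```

A skipped test keeps the suite green while clearly signalling the porting gap to whoever reads the test report.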
[ovirt-devel] vdsm has been tagged (v4.20.39.1)
Vdsm 4.20.39.1 for oVirt 4.2.6 (async) sub-branch ovirt-4.2.6 created -- Francesco Romani (@fromanirh)
[ovirt-devel] Re: Build failed - TestPyWatch.test_kill_grandkids() did someone encounter this failure?
On 07/18/2018 02:32 PM, Nir Soffer wrote: On Wed, Jul 18, 2018 at 2:10 PM Francesco Romani wrote: On 07/18/2018 12:11 PM, Nir Soffer wrote: On Wed, Jul 18, 2018 at 1:00 PM Francesco Romani wrote: On 07/15/2018 05:50 PM, Dan Kenigsberg wrote: May I repeat Nir's question: does it fail consistently? And are you rebased on master? Undefined command: "py-bt" is a known xfail for Fedora. I'd just like to add that I've been facing this error pretty regularly in the past couple of days. I can provide examples if needed, but it seems to me we figured it out. We can disable the relevant pywatch tests in ovirt CI until this is resolved. Done: https://gerrit.ovirt.org/#/c/93108/ Thanks, but the failing test is TestPyWatch.test_timeout_backtrace Interesting, I've seen the other test fail too. Maybe it was just an accident? I will check. -- Francesco Romani
[ovirt-devel] Re: Build failed - TestPyWatch.test_kill_grandkids() did someone encounter this failure?
On 07/18/2018 12:11 PM, Nir Soffer wrote: On Wed, Jul 18, 2018 at 1:00 PM Francesco Romani wrote: On 07/15/2018 05:50 PM, Dan Kenigsberg wrote: May I repeat Nir's question: does it fail consistently? And are you rebased on master? Undefined command: "py-bt" Is a known xfail for Fedora I'd just like to add that I'm facing this error pretty regularly in the past couple of days. I can provide examples if needed but it seems to me we figured it out. We can disable the relevant pywatch tests in ovirt CI until this is resolved. Done: https://gerrit.ovirt.org/#/c/93108/ -- Francesco Romani
[ovirt-devel] Re: Build failed - TestPyWatch.test_kill_grandkids() did someone encounter this failure?
On 07/15/2018 05:50 PM, Dan Kenigsberg wrote: May I repeat Nir's question: does it fail consistently? And are you rebased on master? Undefined command: "py-bt" Is a known xfail for Fedora I'd just like to add that I'm facing this error pretty regularly in the past couple of days. I can provide examples if needed but it seems to me we figured it out. -- Francesco Romani
[ovirt-devel] Propose Milan Zamazal as virt maintainer
Hi all, Milan Zamazal has been working on the oVirt project for more than 2.5 years. Let me highlight some of his many contributions to the project:
- Since January this year (2018), he has been a maintainer for the stable branches.
- He developed important features like memory hotunplug, which required tight cooperation and communication with the other layers of the stack (libvirt, qemu).
- He is a mentor in the Outreachy program, which led to the creation of the oVirt Log Analyzer: https://github.com/mz-pdm/ovirt-log-analyzer
- He contributed more than 290 patches to the Vdsm project in the master branch alone, excluding backports and contributions to Engine.
- He has contributed, and is contributing, test cases and fixes to the oVirt System Test suite, a tool which was already pivotal in ensuring the quality of the oVirt project.
As a reviewer, Milan is responsive and his comments are always comprehensive and well focused, with a strong attitude towards getting things done, and done right. Milan has also demonstrated his ability to adapt to the needs of the project:
- he demonstrated careful and thoughtful patch management while maintaining the stable branches
- he also demonstrated he's not shy to tackle large and needed changes, as during the 4.1 and 4.2 cycles, when we deeply reorganized the XML processing in the virt code.
For those reasons, and many more, I think he will be a good addition to the maintainers team, and I propose him as virt co-maintainer. Please share your thoughts. Bests, -- Francesco Romani
Re: [ovirt-devel] OST: Enabling DEBUG logging level in Vdsm?
On 04/12/2018 03:17 PM, Milan Zamazal wrote: > Hi, > > it's quite inconvenient that DEBUG messages are missing in OST Vdsm > logs. Would it be possible to enable them some way? Maybe just call vdsm-client once Vdsm is up? We can toggle the log verbosity at runtime. I strongly support this idea - OST is probably one of the few places outside developers' environments where DEBUG level by default fully makes sense. However, I'd like to remind everyone that since the addition of the metadata.py module, virt logs can be quite verbose. I'm raising this only because it could eat some storage space on OST workers, so let's keep this item on the watchlist. -- Francesco Romani
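Toggling verbosity at runtime, as suggested above via vdsm-client (the exact verb is not shown here), ultimately relies on the stdlib logging machinery: a logger's level can be changed while the process runs, with no restart. A minimal sketch:

```python
import logging

log = logging.getLogger("vds")
log.setLevel(logging.INFO)   # daemon default: DEBUG suppressed
log.propagate = False

messages = []

class ListHandler(logging.Handler):
    # Collects messages in a list so the effect is easy to observe.
    def emit(self, record):
        messages.append(record.getMessage())

log.addHandler(ListHandler())

log.debug("virt debug detail")   # dropped at INFO level
log.info("operational message")  # kept

log.setLevel(logging.DEBUG)      # the runtime toggle, no restart
log.debug("virt debug detail")   # now kept
```

After the toggle, `messages` holds the INFO record and the second DEBUG record only; the first DEBUG call was filtered out before the level change.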
[ovirt-devel] [vdsm][ci] EL7 failing for timeout lately
Hi all, in the last few days quite a lot of CI tests on EL7 workers have failed on timeout. Random example: http://jenkins.ovirt.org/job/vdsm_4.2_check-patch-el7-x86_64/246/consoleFull This happened both on the 4.2 and on the master test suite, quite often but not always. I wonder if we should just increase the timeout or if something else is going on. Thoughts? -- Francesco Romani
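One way to tell a genuinely slow run from a hang is a per-call deadline. The sketch below is illustrative only (POSIX, main thread); the actual CI timeout is enforced at the Jenkins job level, not like this:

```python
import signal

class Timeout(Exception):
    pass

def run_with_timeout(func, seconds):
    # Raise Timeout if func does not return within the deadline,
    # instead of letting a hung call stall the whole suite.
    def on_alarm(signum, frame):
        raise Timeout("no progress after %ds" % seconds)
    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)                        # cancel pending alarm
        signal.signal(signal.SIGALRM, old_handler)

print(run_with_timeout(lambda: "finished", 5))  # fast call completes
```

With deadlines like this, a timeout failure points at the specific hanging test rather than producing an opaque whole-job timeout, which is useful when deciding whether to raise the global limit.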
[ovirt-devel] [vdsm] network/integration test failures
Hi developers, we had another network CI failure while testing unrelated changes: http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/22708/consoleFull

    def test_bond_clear_arp_ip_target(self):
        OPTIONS_A = {
            'mode': '1',
            'arp_interval': '1000',
            'arp_ip_target': '192.168.122.1'}
        OPTIONS_B = {
            'mode': '1',
            'arp_interval': '1000'}

        with bond_device() as bond:
>           bond.set_options(OPTIONS_A)

network/integration/link_bond_test.py:171

Could some network developer please have a look? Bests, -- Francesco Romani
[ovirt-devel] [vdsm][maintainership] proposal for a new stable branch policy
Hi all, recently Milan, Petr and I discussed the state of ovirt-4.2, considering that release 4.2.2 is still pending and this prevents merging of patches in the sub-branch ovirt-4.2. We agreed we could improve the handling of the stable branch(es), in order to make the process smoother and more predictable. Currently, we avoid creating Z-branches (e.g. ovirt-4.2.2, ovirt-4.2.3...) as much as we can, to avoid the hassle of double-backporting patches to the stable branch. However, if a release hits an unexpected delay, this policy causes different hassles: the Y-branch (ovirt-4.2, ovirt-4.3) is effectively locked, so patches already queued and ready for the next releases can't be merged and need to wait. The new proposed policy is the following:
- we will keep working exactly as now until we hit a certain RC version. We chose RC3, a rule of thumb made out of experience.
- if RC3 is final, everyone's happy and things resume as usual
- if RC3 is NOT final, we will branch out at RC3
-- from that moment on, patches for the next version can be accepted on the Y-branch
-- stabilization of the late Z version will continue on the Z-branch
-- patches will be backported twice
Example using made-up numbers:
- We just released oVirt 4.3.1
- We are working on the ovirt-4.3 branch
- The last tag is v4.30.10, from the ovirt-4.3 branch
- We accept patches for oVirt 4.3.2 on the ovirt-4.3 branch
- We keep collecting patches, until we tag v4.30.11 (oVirt 4.3.2 RC 1). The tag is made from the ovirt-4.3 branch.
- Same for tags v4.30.12 (oVirt 4.3.2 RC 2) and v4.30.13 (oVirt 4.3.2 RC 3). Both tags are made from the ovirt-4.3 branch.
- Damn, RC3 is not final. We branch out ovirt-4.3.2 from branch ovirt-4.3, from the same commit pointed to by tag v4.30.13
- The next tags (v4.30.13.1, ...) for oVirt 4.3.2 will be taken from the ovirt-4.3.2 branch
I believe this approach will make it predictable for everyone if and when the branch will be made, and hence when and where patches can be merged.
The only drawback I can see - and that I realized while writing the example - is that the version numbers can be awkward:
v4.30.11 -> 4.3.2 RC1
v4.30.12 -> 4.3.2 RC2
v4.30.13 -> 4.3.2 RC3
v4.30.13.1 -> 4.3.2 RC4 ?!?!
v4.30.13.5 -> 4.3.2 RC5 ?!?!
Perhaps we should move to four-digit versions by default? So we could have
v4.30.11.0 -> 4.3.2 RC1
v4.30.11.1 -> 4.3.2 RC2
v4.30.11.2 -> 4.3.2 RC3
v4.30.11.3 -> 4.3.2 RC4
v4.30.11.4 -> 4.3.2 RC5
I don't see any real drawback in using 4-digit versions by default, besides a minor increase in complexity, which is balanced by more predictable and consistent versions. Plus, we already had 4-digit versions in Vdsm, so packaging should work just fine. Please share your thoughts, -- Francesco Romani
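The two schemes above can be compared mechanically. The helper below is hypothetical (not VDSM's actual version-handling code); it parses tags into padded tuples, which shows that both schemes sort correctly but only the uniform 4-digit scheme lets you read the RC number straight off the tag:

```python
def parse_tag(tag):
    # Hypothetical helper: "v4.30.13.1" -> (4, 30, 13, 1).
    # Shorter tags are zero-padded so 3- and 4-digit tags compare
    # consistently as tuples.
    parts = [int(p) for p in tag.lstrip("v").split(".")]
    return tuple(parts + [0] * (4 - len(parts)))

# Mixed scheme from the example: RC4 suddenly grows a fourth digit.
mixed = ["v4.30.11", "v4.30.12", "v4.30.13", "v4.30.13.1", "v4.30.13.5"]
# Proposed uniform scheme: the last digit simply counts the RCs.
uniform = ["v4.30.11.0", "v4.30.11.1", "v4.30.11.2", "v4.30.11.3"]

# Both lists are already in release order under tuple comparison...
assert sorted(mixed, key=parse_tag) == mixed
assert sorted(uniform, key=parse_tag) == uniform
# ...but with the uniform scheme, tag -> RC is a trivial mapping:
print(parse_tag("v4.30.11.3")[3] + 1)  # -> 4, i.e. RC4
```

So RPM-style ordering is preserved either way; the argument for four digits is purely about predictability of the tag-to-RC mapping.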
[ovirt-devel] [vdsm] network test failure
Hi all, we had a bogus failure on CI again, some network test failed and it seems totally unrelated to the patch being tested: http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/22410/consoleFull could someone please have a look? Bests, -- Francesco Romani
[ovirt-devel] [vdsm][network] false test failures in CI
Hi there, recently some patches of mine failed in CI with this error, which is totally unrelated to the patches and seems bogus. Could some network developer please have a look and improve the current state? :) https://gerrit.ovirt.org/#/c/87881/

ERROR: test_list_ipv4_ipv6 (network.ip_address_test.IPAddressTest)
Traceback (most recent call last):
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/tests/testValidation.py", line 330, in wrapper
    return f(*args, **kwargs)
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/tests/network/ip_address_test.py", line 289, in test_list_ipv4_ipv6
    ipv6_addresses=[IPV6_B_WITH_PREFIXLEN]
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/tests/network/ip_address_test.py", line 297, in _test_list
    address.IPAddressData(addr, device=nic))
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ip/address/iproute2.py", line 40, in add
    addr_data.address, addr_data.prefixlen, addr_data.family
  File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ip/address/iproute2.py", line 91, in _translate_iproute2_exception
    new_exception, new_exception(str(address_data), error_message), tb)
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ip/address/iproute2.py", line 86, in _translate_iproute2_exception
    yield
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ip/address/iproute2.py", line 40, in add
    addr_data.address, addr_data.prefixlen, addr_data.family
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ipwrapper.py", line 561, in addrAdd
    _exec_cmd(command)
  File "/home/jenkins/workspace/vdsm_4.2_check-patch-fc27-x86_64/vdsm/lib/vdsm/network/ipwrapper.py", line 482, in _exec_cmd
    raise exc(returnCode, error.splitlines())
IPAddressAddError: ("IPAddressData(device='dummy_ZVN3L' address=IPv6Interface(u'2002:99::1/64') scope=None flags=None)", 'RTNETLINK answers: Permission denied')

I had other failures, but they look like known issues (mkimage, no port on protocol detector). -- Francesco Romani
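'RTNETLINK answers: Permission denied' typically means the test tried to modify network configuration without the needed privileges. A common defensive pattern is to skip privileged tests when not running as root; the sketch below is hypothetical (VDSM's testValidation.py module, seen in the traceback above, provides its own decorators for this):

```python
import os
import unittest

def requires_root(test_func):
    # Sketch of a privilege guard: skip, don't fail, when the suite
    # lacks the privileges to touch real network devices. Assumed
    # name; not VDSM's actual decorator.
    return unittest.skipUnless(os.geteuid() == 0,
                               "needs root to modify addresses")(test_func)

class IPAddressTestSketch(unittest.TestCase):
    @requires_root
    def test_add_address(self):
        # The real test would add an address on a dummy device here;
        # without root that fails with 'RTNETLINK answers:
        # Permission denied'.
        self.assertTrue(True)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(IPAddressTestSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Of course this only explains failures on unprivileged runners; if the CI workers do run as root, the permission error points at a sandboxing or namespace issue instead.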
Re: [ovirt-devel] [vdsm] stable branch ovirt-4.2 created
On 02/07/2018 08:46 AM, Dan Kenigsberg wrote: > On Tue, Feb 6, 2018 at 10:28 PM, Francesco Romani wrote: >> Hi all, >> >> With the help of Sandro (many thanks @sbonazzo!), we created minutes >> ago the ovirt-4.2 stable branch. >> >> Steps performed: >> >> 1. merged https://gerrit.ovirt.org/#/c/87070/ >> >> 2. branched out ovirt-4.2 from git master >> >> 3. merged https://gerrit.ovirt.org/#/c/87181/ to add support for the 4.3 level >> >> 4. created and pushed the tag v4.30.0 from master, to make sure the >> version number is greater than the stable versions', and to (somehow :)) >> align with oVirt versioning >> >> 5. tested make dist/make rpm on both the new branch ovirt-4.2 and master; >> both look good and use the right version >> >> Maintainers, please check it looks right for you before merging any new >> patch to the master branch. >> >> Please let me know about any issue! > Thank you Francesco (and Sandro). > > Any idea why > http://plain.resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ > still does not hold any vdsm-4.20, and > http://plain.resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/ > does not have the new vdsm-4.30? Uhm, maybe related to CQ (Change Queue), because the git state looks ok. One data point: http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-on-demand-el7-x86_64/772/artifact/exported-artifacts/ built from this patch https://gerrit.ovirt.org/#/c/87213/ which is in turn based on top of current master. -- Francesco Romani
[ovirt-devel] [vdsm] stable branch ovirt-4.2 created
Hi all, with the help of Sandro (many thanks @sbonazzo!), we created minutes ago the ovirt-4.2 stable branch. Steps performed:
1. merged https://gerrit.ovirt.org/#/c/87070/
2. branched out ovirt-4.2 from git master
3. merged https://gerrit.ovirt.org/#/c/87181/ to add support for the 4.3 level
4. created and pushed the tag v4.30.0 from master, to make sure the version number is greater than the stable versions', and to (somehow :)) align with oVirt versioning
5. tested make dist/make rpm on both the new branch ovirt-4.2 and master; both look good and use the right version.
Maintainers, please check it looks right for you before merging any new patch to the master branch. Please let me know about any issue! Happy hacking, -- Francesco Romani
Re: [ovirt-devel] [ACTION-REQUIRED] Making accurate CI for oVirt 4.2
On 01/24/2018 08:52 AM, Dan Kenigsberg wrote: > On Wed, Jan 24, 2018 at 8:35 AM, Barak Korren wrote: >> On 23 January 2018 at 18:44, Martin Sivak wrote: >>> Hi Barak, >>> >>> can you please add links to the proper repositories and/or >>> directories when you send something like this? It really helps us when >>> we do not have to search through all the jenkins and other infra >>> repositories for which is the correct one. Because I really do not >>> remember all the places that need to change out of my head. >> See below. >> >>> So what you are asking for here is basically that we edit the files >>> here [1] and create a 4.2_build-artifacts job using copy and paste, >>> right? Or is there some other place that needs to change as well? >> Yep. Technically this should amount to a single change to a single >> file (see below). The important part is making the right decision for >> each project, understanding its consequences, and realizing the >> actions that would be needed for changing that decision in the future. >> >>> [1] >>> https://gerrit.ovirt.org/gitweb?p=jenkins.git;a=tree;f=jobs/confs/projects;h=5a59dfea545da98e252eb6c8d95a92d08708a22d;hb=cd75bb9eb3353652384ed89777fc15d71d1f9e36 >> There is only one file** you need to maintain that is (currently) not >> in your own project's repo***. >> Each project has such a file at [1]. >> >> Documentation for the contents of that file can be found here: [2]. >> >> There is no need to copy-paste much - the existing file should contain >> a mapping of project branches to oVirt versions. Typically what would >> be needed is just to add a single entry to the map. For example, for >> engine it would be: >> >> version: >> - master: >> branch: master >> - 4.2: >> branch: master >>... > If project maintainers opt for this "Route 2", it is their personal > responsibility to change the above "master" to "ovirt-4.2" branch > *BEFORE* they create their stable branch ovirt-4.2.
If they fail to do > so, CI would get "dirty" with 4.3 packages. Barak hinted at this a > bit too mildly. OK, so let's get ready. I want to take "Route 1": create the branches and map the new jobs to them. So I posted https://gerrit.ovirt.org/#/c/87159/ Rationale: I want the master branch and the ovirt-4.2 branch to be fully independent, like ovirt-4.1 is. Please let me know if I got it right. Bests, -- Francesco Romani
Re: [ovirt-devel] [vdsm][RFC] reconsidering branching out ovirt-4.2
We agreed to branch out on Tuesday, February 6. After that, we will lift any restriction on patches in the master branch: it is up to each team to decide which patches they want to submit, and which patches they want to backport, following the usual rules. E.g. nothing prevents anyone from submitting refactorings or large changes to master *and* proposing them for backport to the stable branch(es) later, following the usual steps. Bests, On 01/29/2018 08:39 AM, Francesco Romani wrote: > Hi all, > > It is time again to reconsider branching out the 4.2 stable branch. > > So far we decided to *not* branch out, and we are taking tags for ovirt > 4.2 releases from the master branch. > > This means we are merging safe and/or stabilization patches only in master. > > I think it is time to reconsider this decision and branch out for 4.2, for two reasons: > > 1. it sends a clearer signal that 4.2 is going into stabilization mode > > 2. we have requests from the virt team, which wants to start working on the > next cycle's features. > > If we decide to branch out, I'd start the new branch on Monday, February > 5 (1 week from now). > > The discussion is open, please share your acks/nacks for branching out, > and for the branching date. > > For my part, I'm inclined to branch out, so if no one chimes in (!!) I'll > execute the above plan. -- Francesco Romani
[ovirt-devel] [vdsm][RFC] reconsidering branching out ovirt-4.2
Hi all, it is time again to reconsider branching out the 4.2 stable branch. So far we decided to *not* branch out, and we are taking tags for ovirt 4.2 releases from the master branch. This means we are merging safe and/or stabilization patches only in master. I think it is time to reconsider this decision and branch out for 4.2, for two reasons: 1. it sends a clearer signal that 4.2 is going into stabilization mode 2. we have requests from the virt team, which wants to start working on the next cycle's features. If we decide to branch out, I'd start the new branch on Monday, February 5 (1 week from now). The discussion is open, please share your acks/nacks for branching out, and for the branching date. For my part, I'm inclined to branch out, so if no one chimes in (!!) I'll execute the above plan. -- Francesco Romani
Re: [ovirt-devel] vdsm stable branch maintainership
On 01/09/2018 06:54 PM, Michal Skrivanek wrote: >> On 9 Jan 2018, at 18:48, Nir Soffer wrote: >> On Tue, Jan 9, 2018 at 3:55 PM Adam Litke wrote: >> +1 >> On Tue, Jan 9, 2018 at 8:17 AM, Francesco Romani wrote: >> On 01/09/2018 12:43 PM, Dan Kenigsberg wrote: >> > Hello, >> > >> > I would like to nominate Milan Zamazal and Petr Horacek as maintainers >> > of vdsm stable branches. This job requires understanding of vdsm >> > packaging and code, a lot of attention to details and awareness of the >> > requirements of other components and teams. >> > >> > I believe that both Milan and Petr have these qualities. I am certain >> > they would work in responsive caution when merging and tagging patches >> > to the stable branches. >> > >> > vdsm maintainers, please confirm if you approve. >> Why do we need 4 maintainers for the stable branch? >> Currently Yaniv and Francesco maintain this branch. > they both have quite a few other duties recently, and less and less > time to attend to vdsm > if it wasn't so noticeable for Francesco yet, then it is going to be > quite soon. > > I believe it makes sense to ramp up others before it happens. I'd like to stress that I plan to float around as backup and to help people during the ramp-up phase (and for any general advice/help that could be needed). -- Francesco Romani
Re: [ovirt-devel] vdsm stable branch maintainership
On 01/09/2018 12:43 PM, Dan Kenigsberg wrote: > Hello, > > I would like to nominate Milan Zamazal and Petr Horacek as maintainers > of vdsm stable branches. This job requires understanding of vdsm > packaging and code, a lot of attention to details and awareness of the > requirements of other components and teams. > > I believe that both Milan and Petr have these qualities. I am certain > they would work in responsive caution when merging and tagging patches > to the stable branches. > > vdsm maintainers, please confirm if you approve. +1 -- Francesco Romani
Re: [ovirt-devel] fc27 job failing on missing vmconsole
On 01/05/2018 08:16 AM, Dan Kenigsberg wrote: > http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc27-x86_64/ > is failing since yesterday on > > Error: > Problem 1: conflicting requests > - nothing provides ovirt-vmconsole >= 1.0.0-0 needed by > vdsm-4.20.11-7.git4a7f18434.fc27.x86_64 > > after already passing in the past. > > Can this be solved on the ovirt-vmconsole side, or should we disable fc27 (again)? Speaking as maintainer of ovirt-vmconsole, this is a good chance to review the status of ovirt-vmconsole and how we can improve things, both now and for the future. ovirt-vmconsole is pretty stable codewise and featurewise (and, I'd say, bugwise); in the last 12+ months I hardly gathered material for another minor release (it would be 1.0.5), which is not planned yet - further proof that the codebase is stable. Most of the time, the only thing needed is to ship the existing packages for a new distro release. Occasionally a simple rebuild may be needed. I must say I'm not aware of how the current issue was solved for fc26; IIRC some manual intervention was applied along these lines. Can we automate this step somehow? Bests, -- Francesco Romani
Re: [ovirt-devel] [ OST Failure Report ] [ oVirt Master ] [ 14-12-2017 ] [ 004_basic_sanity.disk_operations ]
On 12/14/2017 01:12 PM, Michal Skrivanek wrote: >> On 14 Dec 2017, at 13:00, Dafna Ron wrote: >> >> Hi, >> >> We have a failure on basic suite on test: >> 004_basic_sanity.disk_operations >> >> I think that we query a snapshot that was already deleted >> successfully and report the snapshot as gone, and that is because of >> a different error in an update vm query which happens before. >> >> Link and headline of suspected patches: >> https://gerrit.ovirt.org/#/c/85168/ - core: Prevent retry lease >> hotplug in case of failure. >> >> Link to Job: >> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4393 >> >> Link to all logs: >> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/4393/artifact/ >> >> (Relevant) error snippet from the log: >> >> vdsm: >> 2017-12-14 02:34:45,222-0500 ERROR (jsonrpc/7) >> [jsonrpc.JsonRpcServer] Internal server error (__init__:611) >> Traceback (most recent call last): >> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line >> 606, in _handle_request >> res = method(**params) >> File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line >> 201, in _dynamicMethod >> result = fn(*methodArgs) >> File "", line 2, in getAllVmStats >> File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line >> 48, in method >> ret = func(*args, **kwargs) >> File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1342, in >> getAllVmStats >> statsList = self._cif.getAllVmStats() >> File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 518, >> in getAllVmStats >> return [v.getStats() for v in self.vmContainer.values()] >> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1699, >> in getStats >> oga_stats = self._getGuestStats() >> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1895, >> in _getGuestStats >> self._update_guest_disk_mapping() >> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1909, >> in
_update_guest_disk_mapping >> self._sync_metadata() >> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4995, >> in _sync_metadata >> self._md_desc.dump(self._dom) >> File "/usr/lib/python2.7/site-packages/vdsm/virt/metadata.py", line >> 477, in dump >> dom.setMetadata(libvirt.VIR_DOMAIN_METADATA_ELEMENT, >> File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", >> line 47, in __getattr__ >> % self.vmid) >> NotConnectedError: VM '7cab7e5a-cb12-4977-ac4f-65218532df7e' was not >> defined yet or was undefined > it doesn't seem to be relevant to this failure, but it deserves a fix > nevertheless > Francesco? Smells like a race on shutdown. -- Francesco Romani
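The NotConnectedError in the traceback comes from VDSM's virdomain wrapper: once the VM goes away, the libvirt domain object is replaced by a stand-in that fails loudly on any access, which is exactly what a stats thread racing with shutdown hits. A minimal sketch of that pattern (names and layout simplified; not the real vm.py code):

```python
class NotConnectedError(Exception):
    pass

class Disconnected(object):
    # Stand-in installed before the VM is defined and after it is
    # undefined: any attribute access raises instead of touching a
    # dead libvirt domain.
    def __init__(self, vmid):
        self._vmid = vmid

    def __getattr__(self, name):
        raise NotConnectedError(
            "VM %r was not defined yet or was undefined" % self._vmid)

class Vm(object):
    def __init__(self, vmid):
        self._dom = Disconnected(vmid)  # until libvirt hands us a domain

    def sync_metadata(self):
        # Periodic stats/metadata code racing with shutdown ends up here.
        self._dom.setMetadata("...")

vm = Vm("7cab7e5a-cb12-4977-ac4f-65218532df7e")
try:
    vm.sync_metadata()
except NotConnectedError as e:
    print(e)
```

Failing loudly like this is deliberate; the remaining bug is that the stats path should tolerate (or be serialized against) the swap to the disconnected stand-in during shutdown.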
[ovirt-devel] [vdsm] libvirtconnection test failures on my box
Hi all, since yesterday, running 'make check' on my F26 box I get these errors:

ERROR: libvirtMock will raise an error when nodeDeviceLookupByName is called.
Traceback (most recent call last):
  File "/home/fromani/Projects/upstream/vdsm/tests/monkeypatch.py", line 134, in wrapper
    return f(*args, **kw)
  File "/home/fromani/Projects/upstream/vdsm/tests/monkeypatch.py", line 134, in wrapper
    return f(*args, **kw)
  File "/home/fromani/Projects/upstream/vdsm/tests/monkeypatch.py", line 134, in wrapper
    return f(*args, **kw)
  File "/home/fromani/Projects/upstream/vdsm/tests/common/libvirtconnection_test.py", line 150, in testCallFailedConnectionDown
    connection = libvirtconnection.get(killOnFailure=True)
TypeError: __init__() got an unexpected keyword argument 'killOnFailure'

ERROR: libvirtMock will raise an error when nodeDeviceLookupByName is called.
Traceback (most recent call last):
  [same monkeypatch.py wrapper frames as above]
  File "/home/fromani/Projects/upstream/vdsm/tests/common/libvirtconnection_test.py", line 132, in testCallFailedConnectionUp
    connection = libvirtconnection.get(killOnFailure=True)
TypeError: __init__() got an unexpected keyword argument 'killOnFailure'

ERROR: Positive test - libvirtMock does not raise any errors
Traceback (most recent call last):
  [same monkeypatch.py wrapper frames as above]
  File "/home/fromani/Projects/upstream/vdsm/tests/common/libvirtconnection_test.py", line 118, in testCallSucceeded
    connection.nodeDeviceLookupByName()
TypeError: nodeDeviceLookupByName() takes exactly 2 arguments (1 given)

Smells like incorrect monkeypatching leaking out of a test module. The last one is easy, it seems to be just an incorrect call, and I have a fix pending. However, why is it starting to fail just now? It seems to run fine on CI, which is interesting.
Any help is welcome Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [vdsm] merging patches for 4.2.1
Hi all, up until now we avoided merging patches, besides blockers and critical fixes, to help 4.2.0 stabilize. Now that 4.2.0 RC 1 is out, we will start merging patches targeted for 4.2.1 again. Should more fixes be needed for 4.2.0, we will have one (hopefully) short-lived ovirt-4.2.0 branch created from the last RC tag (4.20.9). Patches not needed for 4.2.1 will not be merged until the stable ovirt-4.2 branch is created, which is expected to happen in a few weeks' time. Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] oVirt 4.2.0 blockers review - Day 2
On 11/29/2017 08:40 AM, Sandro Bonazzola wrote:
> Hi, we had 7 blockers yesterday and the list is now down to 4:
>
> Bug ID | Product | Assignee | Status | Summary | Changed
> 1516113 | cockpit-ovirt | phbai...@redhat.com | POST | Deploy the HostedEngine failed with the default CPU type | 2017-11-27 20:52:27
> 1509629 | ovirt-engine | ah...@redhat.com | POST | Cold merge failed to remove all volumes | 2017-11-28 11:33:16
> 1507277 | ovirt-engine | era...@redhat.com | POST | [RFE][DR] - Vnic Profiles mapping in VMs register from data storage domain should be supported also for templates | 2017-11-28 06:38:34
> 1496719 | vdsm | edwa...@redhat.com | POST | Port mirroring is not set after VM migration | 2017-11-28 11:54:20

We are working hard on 1496719. We fixed the legacy flow; expect the complete fix addressing the domain XML very soon, hopefully today.

Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] oVirt CI now supports Fedora 27 and Fedora Rawhide
; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,726 DEBUG (MainThread) [root] /sbin/tc qdisc add dev dummy_dJrUa parent 1389:1388 handle 1388: fq_codel (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,733 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,734 DEBUG (MainThread) [root] /sbin/tc qdisc show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,745 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,746 DEBUG (MainThread) [root] /sbin/tc filter show dev dummy_dJrUa parent 1389: (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,753 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,754 DEBUG (MainThread) [root] /sbin/tc filter del dev dummy_dJrUa pref 16 parent 1389: (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,760 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,761 DEBUG (MainThread) [root] /sbin/tc class del dev dummy_dJrUa classid 1389:10 (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,768 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,768 DEBUG (MainThread) [root] /sbin/tc class add dev dummy_dJrUa parent 1389: classid 1389:10 hfsc ul m2 800bit ls m1 3200bit d 80us m2 2400bit (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,775 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,812 DEBUG (MainThread) [root] /sbin/tc qdisc show (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,822 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,823 DEBUG (MainThread) [root] /sbin/tc class show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,829 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,830 DEBUG (MainThread) [root] /sbin/tc class show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,837 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,838 DEBUG (MainThread) [root] /sbin/tc class del dev dummy_dJrUa classid 1389:1388 (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,845 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,845 DEBUG (MainThread) [root] /sbin/tc class add dev dummy_dJrUa parent 1389: classid 1389:1388 hfsc ls m1 3200bit d 80us m2 2400bit (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,852 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,853 DEBUG (MainThread) [root] /sbin/tc qdisc add dev dummy_dJrUa parent 1389:1388 handle 1388: fq_codel (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,859 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,860 DEBUG (MainThread) [root] /sbin/tc filter replace dev dummy_dJrUa protocol all parent 1389: pref 16 basic match 'meta(vlan eq 16)' flowid 1389:10 (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,867 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:09* 2017-11-24 11:04:40,867 DEBUG (MainThread) [root] /sbin/tc qdisc add dev dummy_dJrUa parent 1389:10 handle 10: fq_codel (cwd None) (cmdutils:133)
*11:05:09* 2017-11-24 11:04:40,874 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
Coverage.py warning: Module /home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/vdsm was never imported. (module-not-imported)
*11:05:21* 2017-11-24 11:04:40,875 DEBUG (MainThread) [root] /sbin/tc class show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:21* 2017-11-24 11:04:40,881 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:21* 2017-11-24 11:04:40,883 DEBUG (MainThread) [root] /sbin/tc qdisc show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:21* 2017-11-24 11:04:40,889 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:21* 2017-11-24 11:04:40,890 DEBUG (MainThread) [root] /sbin/tc filter show dev dummy_dJrUa (cwd None) (cmdutils:133)
*11:05:21* 2017-11-24 11:04:40,897 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:21* 2017-11-24 11:04:40,899 DEBUG (netlink/events) [root] START thread (func=>, args=(), kwargs={}) (concurrent:189)
*11:05:21* 2017-11-24 11:04:40,900 DEBUG (MainThread) [root] /sbin/ip link set dev dummy_dJrUa.16 down (cwd None) (cmdutils:133)
*11:05:21* 2017-11-24 11:04:40,910 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:21* 2017-11-24 11:04:40,911 DEBUG (netlink/events) [root] FINISH thread (concurrent:192)
*11:05:21* 2017-11-24 11:04:40,912 DEBUG (MainThread) [root] /sbin/ip link del dev dummy_dJrUa.16 (cwd None) (cmdutils:133)
*11:05:21* 2017-11-24 11:04:40,930 DEBUG (MainThread) [root] SUCCESS: <err> = ''; <rc> = 0 (cmdutils:141)
*11:05:21* - >> end captured logging << -

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] oVirt CI now supports Fedora 27 and Fedora Rawhide
On 11/23/2017 04:27 PM, Francesco Romani wrote:
>>> It's probably time to mark the tests that use losetup as broken-on-jenkins
>>
>> loop devices are usually ok on jenkins. We have several tests in storage using them and I don't know about any failures. For example storage/blockdev_test.py.
>>
>> Francesco, do you want to mark them as broken for now?
>
> Yes, because we don't have resources to spare to properly fix the tests. Hopefully next week.

https://gerrit.ovirt.org/#/c/84594/

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] oVirt CI now supports Fedora 27 and Fedora Rawhide
On 11/23/2017 04:23 PM, Nir Soffer wrote:
> On Thu, Nov 23, 2017 at 4:55 PM Dan Kenigsberg wrote:
>> On Thu, Nov 23, 2017 at 4:03 PM, Nir Soffer wrote:
>>> On Thu, Nov 23, 2017 at 3:59 PM Dan Kenigsberg wrote:
>>>> On Thu, Nov 23, 2017 at 1:56 PM, Nir Soffer wrote:
>>>>> On Thu, Nov 23, 2017 at 1:51 PM Edward Haas wrote:
>>>>>> Per what I see, all CI jobs on vdsm/fc27 fail. This is the second time this week, please consider reverting. We should try to avoid such changes before the weekend.
>>>>>
>>>>> Some of the failures, like
>>>>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc27-x86_64/100/console
>>>>> should be fixed by: https://gerrit.ovirt.org/84569/
>>>>> I have seen some network tests failing when running mock_runner locally; you may need to mark them as broken on fc27.
>>>>
>>>> We have a busy couple of weeks until the release of ovirt-4.2.0. As much as I like consuming Fedora early, I'm not sure that enabling it so close to the release was a good idea. Nothing forces us to do it now (and there are a lot of reasons to do it later). Let's give it another go, but let us not keep it on the red for the weekend.
>>>
>>> I'm happy with the storage code being tested on current fedora.
>>
>> I'm more than happy. I'm thrilled for it to be tested and run. I am not happy to give a vote to a job that was never ever successful.
>
> The job is successful; some tests, or maybe the code they test, need work. This is why we have skipif/xfail and broken_on_ci.
>
>> Please make sure the few failing network tests are not breaking the build. It's probably time to mark the tests that use losetup as broken-on-jenkins
>
> loop devices are usually ok on jenkins. We have several tests in storage using them and I don't know about any failures. For example storage/blockdev_test.py.
>
> Francesco, do you want to mark them as broken for now?

Yes, because we don't have resources to spare to properly fix the tests. Hopefully next week.

Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
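For reference, marking a test as "broken on jenkins" (the skipif/xfail/broken_on_ci idea discussed in this thread) usually boils down to a conditional skip keyed on the environment. A hedged sketch using only the standard library; the environment-variable names are the common CI conventions and the decorator name mirrors vdsm's broken_on_ci only in spirit, the real helper may differ:

```python
import os
import unittest

# Illustrative: CI systems export marker variables
# (Jenkins sets JENKINS_URL, Travis sets TRAVIS).
ON_CI = "JENKINS_URL" in os.environ or "TRAVIS" in os.environ

def broken_on_ci(reason):
    """Skip the decorated test when running on a CI slave."""
    return unittest.skipIf(ON_CI, reason)

class LoopDeviceTests(unittest.TestCase):
    @broken_on_ci("loop devices unreliable on this CI environment")
    def test_losetup(self):
        # would exercise losetup / loop devices here
        self.assertTrue(True)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(LoopDeviceTests))
```

The suite passes either way: locally the test runs, on CI it is reported as skipped instead of failing the build, which matches the "give the job a meaningful vote" concern above.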
Re: [ovirt-devel] oVirt CI now supports Fedora 27 and Fedora Rawhide
On 11/23/2017 08:25 AM, Barak Korren wrote:
>>> The package exists:
>>> http://resources.ovirt.org/repos/ovirt/tested/master/rpm/fc27/noarch/ovirt-imageio-common-1.2.0-0.201711212128.git9926984.fc27.noarch.rpm
>>>
>>> repoquery is happy:
>>> $ repoquery --repofrompath=r,http://resources.ovirt.org/repos/ovirt/tested/master/rpm/fc27 --repoid=r list ovirt-imageio-common
>>> ovirt-imageio-common-0:1.2.0-0.201711212128.git9926984.fc27.noarch
>>>
>>> Maybe a mock caching issue?
>>>
>>> 00:00:28.127 Using chroot cache = /var/cache/mock/fedora-27-x86_64-a9934b467f29c7317f7dd8f205d66ddd
>>> 00:00:28.127 Using chroot dir = /var/lib/mock/fedora-27-x86_64-a9934b467f29c7317f7dd8f205d66ddd-7425
>>
>> No. This just tells you where the cache would be, not if it's actually there already. But yeah, this run actually did use a cached chroot:
>>
>> 00:00:28.767 Start: unpacking root cache
>> 00:01:01.624 Finish: unpacking root cache
>>
>> I cleaned it from the node and retriggered, let's see what happens now:
>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc27-x86_64/85/console
>
> Failed again:
>
> 00:11:56.063 -
> 00:11:56.064 TOTAL 49960 25280 49%
> 00:11:56.064 --
> 00:11:56.065 Ran 2547 tests in 307.236s
> 00:11:56.065
> 00:11:56.065 FAILED (SKIP=78, failures=4)
> 00:11:56.065 make[1]: *** [Makefile:1118: check] Error 1
> 00:11:56.066 make[1]: Leaving directory '/home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/tests'
> 00:11:56.066 ERROR: InvocationError: '/usr/bin/make -C tests check'
> 00:11:56.067 storage-py27 create: /home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/.tox/storage-py27
> 00:11:59.803 storage-py27 installdeps: pytest==3.1.2, nose==1.3.7
>
> 00:14:06.238 ___ summary
> 00:14:06.238 ERROR: tests: commands failed
> 00:14:06.238 storage-py27: commands succeeded
> 00:14:06.238 SKIPPED: storage-py35: InterpreterNotFound: python3.5
> 00:14:06.239 storage-py36: commands succeeded
> 00:14:06.239 lib-py27: commands succeeded
> 00:14:06.264 make: *** [Makefile:1019: tests] Error 1
> 00:14:06.270 + collect-logs
> 00:14:06.270 + cp /var/log/vdsm_tests.log /home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/exported-artifacts/
> 00:14:06.280 Took 488 seconds
>
> I can't really tell why this failed. Also, did we use to have the results exported to XUnit?

It seems it failed because:

*07:12:24* ==
*07:12:24* FAIL: test_multiple_vlans (network.tc_test.TestConfigureOutbound)
*07:12:24* --
*07:12:24* Traceback (most recent call last):
*07:12:24*   File "/usr/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
*07:12:24*     return func(*args, **keywargs)
*07:12:24*   File "/home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/tests/network/tc_test.py", line 459, in test_multiple_vlans
*07:12:24*     self._analyse_qos_and_general_assertions()
*07:12:24*   File "/home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/tests/network/tc_test.py", line 531, in _analyse_qos_and_general_assertions
*07:12:24*     tc_filters.tagged_filters)
*07:12:24*   File "/home/jenkins/workspace/vdsm_master_check-patch-fc27-x86_64/vdsm/tests/network/tc_test.py", line 579, in _assertions_on_filters
*07:12:24*     self.assertEqual(len(untagged_filters), 1)
*07:12:24* AssertionError: 0 != 1
*07:12:24* >> begin captured logging <<

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 07-11-2017 ] [ 004_basic_sanity.disk_operations ]
> [storage.Image] Setting parent of volume a6cb7a60-e500-4a75-9da8-9e7cce3e21ac to e77f8714-9e0f-4f39-8437-e75a3fa44c74 (image:1185)
> 2017-11-07 09:20:41,824-0500 INFO (tasks/8) [storage.SANLock] Releasing Lease(name='e77f8714-9e0f-4f39-8437-e75a3fa44c74', path=u'/rhev/data-center/mnt/192.168.204.4:_exports_nfs_share1/4a088fb2-5497-4b4d-adec-4c91d998c56e/images/8401cd0f-419c-417c-8da3-52da285fa47c/e77f8714-9e0f-4f39-8437-e75a3fa44c74.lease', offset=0) (clusterlock:435)
> 2017-11-07 09:20:41,828-0500 INFO (tasks/8) [storage.SANLock] Successfully released Lease(name='e77f8714-9e0f-4f39-8437-e75a3fa44c74', path=u'/rhev/data-center/mnt/192.168.204.4:_exports_nfs_share1/4a088fb2-5497-4b4d-adec-4c91d998c56e/images/8401cd0f-419c-417c-8da3-52da285fa47c/e77f8714-9e0f-4f39-8437-e75a3fa44c74.lease', offset=0) (clusterlock:444)
> 2017-11-07 09:20:41,842-0500 INFO (tasks/8) [storage.ThreadPool.WorkerThread] FINISH task 9ce348bc-e3fb-4f4e-aea0-f8c79e7ae204 (threadPool:210)
> 2017-11-07 09:20:42,388-0500 INFO (jsonrpc/4) [api.virt] START diskReplicateFinish(srcDisk={u'device': u'disk', u'poolID': u'b20f0eb7-6bea-4994-b87b-e1080c4fd9e5', u'volumeID': u'8a1660d7-f153-4aca-9f25-c1184c10841a', u'domainID': u'4a088fb2-5497-4b4d-adec-4c91d998c56e', u'imageID': u'36837a9b-6396-4d65-a354-df97ce808f01'}, dstDisk={u'device': u'disk', u'poolID': u'b20f0eb7-6bea-4994-b87b-e1080c4fd9e5', u'volumeID': u'8a1660d7-f153-4aca-9f25-c1184c10841a', u'domainID': u'4a088fb2-5497-4b4d-adec-4c91d998c56e', u'imageID': u'36837a9b-6396-4d65-a354-df97ce808f01'}) from=:::192.168.204.4,49358, flow_id=4bcdd94e (api:46)
> 2017-11-07 09:20:42,388-0500 ERROR (jsonrpc/4) [virt.vm] (vmId='61d328bc-c920-4691-997f-09bbecb5de10') Drive not found (srcDisk: {u'device': u'disk', u'poolID': u'b20f0eb7-6bea-4994-b87b-e1080c4fd9e5', u'volumeID': u'8a1660d7-f153-4aca-9f25-c1184c10841a', u'domainID': u'4a088fb2-5497-4b4d-adec-4c91d998c56e', u'imageID': u'36837a9b-6396-4d65-a354-df97ce808f01'}) (vm:4386)
> 2017-11-07 09:20:42,389-0500 INFO (jsonrpc/4) [api.virt] FINISH diskReplicateFinish return={'status': {'message': 'Drive image file could not be found', 'code': 13}} from=:::192.168.204.4,49358, flow_id=4bcdd94e (api:52)
> 2017-11-07 09:20:42,389-0500 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call VM.diskReplicateFinish failed (error 13) in 0.01 seconds (__init__:630)

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] [vdsm] s390 draft patches submitted for review
On 09/21/2017 09:52 AM, Viktor Mihajlovski wrote:
> On 20.09.2017 22:04, Michal Skrivanek wrote:
> [...]
>>> thanks for the feedback. I'll have to clean up the vdsm/libvirt stuff first. There's still an issue with NUMA on s390 I have to solve…
>> sure, it's the one to start with. Are there other system dependencies (qemu, libvirt, other)?
> KVM on s390 has been usable for quite some time; for practical purposes I would suggest at least kernel 4.4, QEMU 1.2.5, libvirt 1.3.1.

I think we can take those requirements for granted.

>> We're mostly focusing on the EL platform; it may also make sense to work on top of EL 7.4 with custom QEMU/libvirt (I'm assuming you do need some bleeding-edge changes there)
> that's OK from a development perspective, I am juggling kernels, QEMU and libvirt all the time...
>
> In case you're looking for more information on KVM on s390:
> [1] https://wiki.qemu.org/Documentation/Platforms/S390X
> [2] http://kvmonz.blogspot.co.uk
> [3] https://www.ibm.com/support/knowledgecenter/en/linuxonibm/com.ibm.linux.z.ldva/ldva_c_welcome.html

Thanks! About the patches: they look good, most of them +2'd already. If you are happy with them, I can start triggering CI on them.

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] [review][vdsm] please review https://gerrit.ovirt.org/#/q/topic:drivemonitor_event+status:open
On 09/11/2017 01:16 PM, Eyal Edri wrote:
> On Mon, Sep 11, 2017 at 2:02 PM, Francesco Romani wrote:
>> Hi everyone,
>>
>> https://gerrit.ovirt.org/#/q/topic:drivemonitor_event+status:open is ready for review. It is the first part of the series needed to consume the BLOCK_THRESHOLD event available with libvirt >= 3.2.0 and QEMU >= 2.3.0. Once completed, this patchset will allow Vdsm to avoid polling, thus greatly improving system performance and eventually closing https://bugzilla.redhat.com/show_bug.cgi?id=1181665
>>
>> Please note that:
>> 1. CI fails because the workers are not yet updated to CentOS 7.4 (not yet released AFAIK!), which will provide libvirt >= 3.2.0.
>
> You probably know that already, but just to be sure: please wait for official CentOS 7.4 to be out, and for us to verify that OST works well with it, before merging; otherwise any patch merged afterwards will fail and CI won't work.
>
> AFAIK, it should be out this week.

Sure thing. Will not merge before OST and CI both pass. But it is totally reviewable while we wait! :)

Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
[ovirt-devel] [review][vdsm] please review https://gerrit.ovirt.org/#/q/topic:drivemonitor_event+status:open
Hi everyone,

https://gerrit.ovirt.org/#/q/topic:drivemonitor_event+status:open is ready for review. It is the first part of the series needed to consume the BLOCK_THRESHOLD event available with libvirt >= 3.2.0 and QEMU >= 2.3.0. Once completed, this patchset will allow Vdsm to avoid polling, thus greatly improving system performance and eventually closing https://bugzilla.redhat.com/show_bug.cgi?id=1181665

Please note that:
1. CI fails because the workers are not yet updated to CentOS 7.4 (not yet released AFAIK!), which will provide libvirt >= 3.2.0.
2. A few more simple patches will be needed to enable/disable monitoring in specific flows where we cannot use events (e.g. LSM).
3. I did initial verification successfully, installing Fedora 25 on a thin-provisioned disk without issue.

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] status update: consuming the BLOCK_THRESHOLD event from libvirt (rhbz#1181665)
On 07/24/2017 10:05 AM, Francesco Romani wrote:
> TL;DR: NEWS!
>
> First two patches (https://gerrit.ovirt.org/#/c/79386/5 and https://gerrit.ovirt.org/#/c/79264/14) are now review worthy!

Please review the above two patches: they provide the minimal first step toward consuming the watermark event. We can build on them, and add/refactor what is needed to completely support the event.

https://gerrit.ovirt.org/#/c/79386/ - refactors Vm.extendDrivesIfNeeded to have one method which tries to extend a single drive. We will use this patch both in the existing poll-based flow and in the new event-based flow. Relatively big, but also quite simple and safe.

https://gerrit.ovirt.org/#/c/79264/ - adds the basic work to register and consume the event. This is needed for steady state, and seems to work OK in my initial testing.

Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] status update: consuming the BLOCK_THRESHOLD event from libvirt (rhbz#1181665)
TL;DR: NEWS! First two patches (https://gerrit.ovirt.org/#/c/79386/5 and https://gerrit.ovirt.org/#/c/79264/14) are now review worthy!

On 07/23/2017 10:58 AM, Roy Golan wrote:
> [...]
>> So we can still use the BLOCK_THRESHOLD event for steady state, and avoid polling in the vast majority of the cases.
>> With "steady state" I mean that the VM is running, with no administration (snapshot, live merge, live storage migration...) operation in progress. I think it is fair to assume that VMs are in this state the vast majority of the time.
>> For the very important cases in which we cannot depend on events, we can fall back to polling, but in a smarter way: instead of polling everything every 2s, let's poll just the drives involved in the ongoing operations.
>> Those should be far fewer than the total number of drives, and polled for a far shorter time than today, so polling should be practical.
>> Since the event fires once, we will need to rearm it only if the operation is ongoing, and only just before starting it (both conditions easy to check). We can disable the polling on completion, or on error. This per se is easy, but we will need a careful review of the flows, and perhaps some safety nets in place.
>
> Consider fusing polling and events into a single pipeline of events so they can be used together. If a poll triggers an event (with distinguished origin) then all the handling is done in one place, and it should be easy to stop or start polling, or remove them totally.

Yes, this is the final design I have in mind. I have plans to refactor Vdsm master to make it look like that. It will play nice with the refactorings the storage team has planned. Let's see if the virt refactorings are strictly needed to have the block threshold events, or if we can postpone them.
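The "single pipeline" idea suggested above can be sketched as a queue of uniform notifications tagged with their origin, so the handler does not care whether a drive check came from a libvirt event or from a fallback poll cycle. All names below are illustrative, not vdsm's actual classes:

```python
import queue
from collections import namedtuple

# One notification type for both origins; handlers stay origin-agnostic.
DriveEvent = namedtuple("DriveEvent", ["vm_id", "drive", "origin"])

class DrivePipeline:
    def __init__(self):
        self._events = queue.Queue()
        self.extended = []          # record of handled extensions

    def from_libvirt(self, vm_id, drive):
        # BLOCK_THRESHOLD callback lands here.
        self._events.put(DriveEvent(vm_id, drive, origin="event"))

    def from_poll(self, vm_id, drive):
        # The fallback poller reports through the same funnel.
        self._events.put(DriveEvent(vm_id, drive, origin="poll"))

    def drain(self):
        # Single handling point: extending a drive looks the same
        # regardless of how we learned it needs extension.
        while not self._events.empty():
            ev = self._events.get()
            self.extended.append((ev.vm_id, ev.drive, ev.origin))

p = DrivePipeline()
p.from_libvirt("vm1", "vda")   # threshold event fired
p.from_poll("vm2", "sdb")      # poll cycle noticed high allocation
p.drain()
```

With this shape, "remove polling in 4.3" reduces to deleting the poll producer; the consumer side is untouched, which is what makes the fused design attractive.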
>> On recovery, we will need to make sure to rearm all the relevant events, but we can just plug into the recovery we must do already, so this should be easy as well.
>
> What is needed in order to 'rearm' it? Is there an API to get the state of the event subscription? If we lost an event, how do we know to rearm it? Is it idempotent to rearm?

QEMU supports a single threshold per block device (= node of the backing chain), so rearming a threshold just means setting a new threshold, overwriting the old one. To rearm it, we need to get the highest allocation of the block devices and set the threshold. If we do that among the first things during recovery, there should be little risk, if any. To know if we need to do that, we "just" need to inspect all block devices at recovery. It doesn't come for free, but I believe it is a fair price.

> Remind me, do we extend a disk if the VM paused with an out-of-space event?

Yes we do. We examine the last pause reason in recovery, and do the extension in this case.

> How will we handle 2 subsequent events if we didn't extend between them? (expecting the extend to be an async operation)

At the QEMU level, the event cannot fire twice; it must be rearmed after every firing. In general, should virt code receive two events before the extension completed... I don't know yet :) Perhaps we can start by just handling the first event; I don't think we can easily queue extension requests (and I'm not sure we should).

>> I believe the best route is:
>> 1. offer the new event-based code for 4.2, keep the polling around. Default to events for performance
>> 2. remove the polling completely in 4.3
>
> Still wonder if removing them totally is good. The absence of the events should be supervised somehow - like today, a failure to poll getstats of a domain will result in a VM going unresponsive. Not the most accurate state, but at least it gives some visibility. So polling should cover us where events will fail (similar to engine's VM monitoring).

I don't have strong opinions about polling removal as long as it is disabled by default. Actually, I like having fallbacks and safety nets in place. However, the libvirt event support is here to stay, and as time goes on it should only get better (feature-wise and reliability-wise).

>> I'm currently working on the patches here:
>> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:watermark-event-minimal
>> Even though the basics are in place, I don't think they are ready for review yet.

First two patches (https://gerrit.ovirt.org/#/c/79386/5 and https://gerrit.ovirt.org/#/c/79264/14) are now review worthy!

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] status update: consuming the BLOCK_THRESHOLD event from libvirt (rhbz#1181665)
On 07/19/2017 12:32 PM, Barak Korren wrote:
> On 19 July 2017 at 12:44, Francesco Romani wrote:
>> 2. remove the polling completely in 4.3
>
> This all is way beyond my level of understanding but, I wonder, is there a danger of events failing to be delivered?

Yes, there is always this risk, because of bugs :) Besides that, we must pay attention to recovery. Everything else is "just" a libvirt bug.

> Would there be benefit in keeping the polling around but reducing the frequency to something like once an hour?

The whole point of having such frequent polling is to avoid VMs getting paused because of disk exhaustion while they run on thin-provisioned storage. So infrequent polling is of little help here; better to disable it entirely.

Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
[ovirt-devel] status update: consuming the BLOCK_THRESHOLD event from libvirt (rhbz#1181665)
Hi all,

with libvirt 3.2.0 and onwards, it seems we now have the tools to solve https://bugzilla.redhat.com/show_bug.cgi?id=1181665 and eventually get rid of the disk polling we do. This change is expected to have a huge impact on performance, so I'm working on it. I had plans for a comprehensive refactoring in this area, but it looks like a solution backportable to 4.1.z is appealing, so I started with this first, saving the refactoring (which I still very much want) for later.

So, quick summary: libvirt >= 3.2.0 allows setting a threshold on any node in the backing chain of each drive of a VM (https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainSetBlockThreshold), and fires one event, exactly once, when that threshold is crossed. The event needs to be explicitly rearmed afterwards. This is exactly what we need to get rid of polling in the steady state, so far so good.

The problem is: we can't use this for some important flows we have, which involve the usage of disks not (yet) attached to a given VM. Possibly affected flows:

- live storage migration: we use flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW | libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT | VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB), meaning that Vdsm is in charge of handling the volume
- snapshots: we use snapFlags = (libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT | libvirt.VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA) (same meaning as above)
- live merge: should be OK (according to a glance at the source and a chat with Adam)

So it looks like we will need to bridge this gap. We can still use the BLOCK_THRESHOLD event for steady state, and avoid polling in the vast majority of the cases. With "steady state" I mean that the VM is running, with no administration (snapshot, live merge, live storage migration...) operation in progress. I think it is fair to assume that VMs are in this state the vast majority of the time.
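The one-shot semantics described above (one threshold per device, fires exactly once, rearming means setting a new threshold) can be modeled in a few lines. This is a toy model of the QEMU/libvirt behavior, not vdsm code; in the real API the threshold is armed with virDomainSetBlockThreshold and the firing arrives as a VIR_DOMAIN_EVENT_ID_BLOCK_THRESHOLD callback:

```python
class OneShotThreshold:
    """Toy model of the block-threshold contract: a single threshold
    per device that disarms itself after firing once."""

    def __init__(self):
        self._threshold = None          # None means unarmed

    def set_threshold(self, value):
        # Rearming is just setting again: the old value is overwritten.
        self._threshold = value

    def write(self, allocation):
        """Return True if this allocation crossed the armed threshold."""
        if self._threshold is not None and allocation >= self._threshold:
            self._threshold = None      # fires exactly once, then disarms
            return True
        return False

t = OneShotThreshold()
t.set_threshold(100)
first = t.write(150)     # crosses the threshold: fires
second = t.write(200)    # still above it, but the event was consumed
t.set_threshold(300)     # rearm with a new, higher threshold
third = t.write(350)     # fires again
```

The second write is exactly the "lost event" hazard discussed later in the thread: nothing fires between consumption and rearm, which is why rearming has to happen promptly after every extension and on recovery.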
For the very important cases in which we cannot depend on events, we can fall back to polling, but in a smarter way: instead of polling everything every 2s, let's poll just the drives involved in the ongoing operations. Those should be far fewer than the total number of drives, and polled for a far shorter time than today, so polling should be practical.

Since the event fires once, we will need to rearm it only if the operation is ongoing, and only just before starting it (both conditions easy to check). We can disable the polling on completion, or on error. This per se is easy, but we will need a careful review of the flows, and perhaps some safety nets in place. Anyway, should we miss disabling the polling, we will "just" have some overhead.

On recovery, we will need to make sure to rearm all the relevant events, but we can just plug into the recovery we must do already, so this should be easy as well.

So it seems to me this could fly, and we can actually have the performance benefits of events. However, since we need to review some existing and delicate flows, I think we should still keep the current polling code around for the next release. I believe the best route is:

1. offer the new event-based code for 4.2, keep the polling around; default to events for performance
2. remove the polling completely in 4.3

I'm currently working on the patches here:
https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:watermark-event-minimal

Even though the basics are in place, I don't think they are ready for review yet. Comments welcome, as usual.

-- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh
Re: [ovirt-devel] Subject: [ OST Failure Report ] [ oVirt $VER ] [ $DATE ] [ TEST NAME ]
Please make sure to add me as a reviewer: the little hacks you mention may be out of date, and there could be simpler and safer ways to fix this issue.

Bests,

On 07/06/2017 10:13 PM, Arik Hadas wrote:
> On Thu, Jul 6, 2017 at 8:40 PM, Dafna Ron wrote:
>> Hi,
>>
>> The VM failed to run with a libvirt "unsupported configuration" error (see below). Since the patch is related to the 4.2 XML configuration, and the VM failed to run on an unsupported configuration, I suspect it is related - can you please have a closer look?
>
> Right, good analysis.
> Will be easy to fix, just need to copy one of VDSM's little hacks in:
> https://github.com/oVirt/vdsm/blob/master/lib/vdsm/virt/vmdevices/graphics.py#L87
> To:
> https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L1152
> I'll do that tomorrow.
> Not sure why "ci please build" tests passed on that one though.
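For context on the "little hack" referenced above: the failure is "unknown spice channel name smain", and the legacy API spelled spice channels with a leading 's' (smain, sdisplay, ...) while libvirt expects the bare names. A hedged sketch of the normalization; the channel-name list follows the libvirt domain XML documentation, but the exact shape of vdsm's code in graphics.py may differ:

```python
# Channel names libvirt accepts for <channel name='...'/> under spice
# graphics; the legacy API prefixed each with 's' ('smain', 'sinputs', ...).
SPICE_CHANNEL_NAMES = frozenset((
    'main', 'display', 'inputs', 'cursor', 'playback',
    'record', 'smartcard', 'usbredir',
))

def normalize_spice_channel(name):
    """Map a possibly-legacy channel name to the libvirt spelling."""
    if name in SPICE_CHANNEL_NAMES:
        return name
    # Legacy spelling: drop the leading 's' ('smain' -> 'main').
    if name.startswith('s') and name[1:] in SPICE_CHANNEL_NAMES:
        return name[1:]
    raise ValueError('unknown spice channel name %s' % name)
```

Note the one subtle case: 'smartcard' is already a valid libvirt name, so the membership check must run before the prefix-stripping, otherwise the legacy 'ssmartcard' and the modern 'smartcard' could be confused.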
> > here is a grep from logs on vm id: > http://pastebin.test.redhat.com/500889 > > Test failed: 004_basic_sanity.vm_run > > Link to suspected patches: https://gerrit.ovirt.org/#/c/78955/ > > Link to Job: > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/7491/ > > Link to all logs: > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/7491/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/ > > Error snippet from the log: > > engine log: > > 2017-07-06 11:26:34,473-04 INFO > [org.ovirt.engine.core.bll.RunVmOnceCommand] (default task-31) > [a7b073ec-04ba-434a-95b4-395957cae6dc] Lock freed to object > 'EngineLock:{exclusiveLocks='[d3b1b67d-d2fd-4ed7-86d1-795ba2f10bc0=VM]', > sharedLocks=''}' > {"params": {"d3b1b67d-d2fd-4ed7-86d1-795ba2f10bc0": {"status": > "Down", "timeOffset": "0", "exitReason": 1, "exitMessage": > "unsupported configuration: unknown spice channel name smain", > "exitCode": 1}, "notify_time": 4295594320}, "jsonrpc": "2.0", > "method": "|virt|VM_status|d3b1b67d-d2fd-4ed7-86d1-795ba2f10bc0"} > > vdsm log: > > 2017-07-06 11:26:36,866-0400 ERROR (vm/d3b1b67d) [virt.vm] > (vmId='d3b1b67d-d2fd-4ed7-86d1-795ba2f10bc0') The vm start process > failed (vm:789) > Traceback (most recent call last): > File "/usr/share/vdsm/virt/vm.py", line 723, in _startUnderlyingVm > self._run() > File "/usr/share/vdsm/virt/vm.py", line 2328, in _run > self._connection.createXML(domxml, flags), > File > "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line > 125, in wrapper > ret = 
f(*args, **kwargs) > File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 646, > in wrapper > return func(inst, *args, **kwargs) > File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3782, > in createXML > if ret is None:raise libvirtError('virDomainCreateXML() > failed', conn=self) > libvirtError: unsupported configuration: unknown spice channel > name smain > > ___ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
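For context, the "little hack" in vdsm's graphics.py referenced above deals with legacy spice channel names: oVirt historically used names with a leading "s" ("smain", "sdisplay", ...), while libvirt only accepts the bare names, hence the "unknown spice channel name smain" failure when the prefix leaks through. A rough sketch of that translation (a hypothetical reconstruction, not the actual Vdsm or Engine code):

```python
# Sketch of the kind of channel-name translation the graphics.py
# "little hack" performs (illustrative reconstruction only).
# Legacy oVirt spice channel names carry a leading "s"; libvirt
# expects the bare names.
LEGACY_TO_LIBVIRT = {
    'smain': 'main',
    'sdisplay': 'display',
    'sinputs': 'inputs',
    'scursor': 'cursor',
    'splayback': 'playback',
    'srecord': 'record',
    'ssmartcard': 'smartcard',
    'susbredir': 'usbredir',
}


def adjust_spice_channels(channels):
    """Translate legacy channel names, leaving unknown names untouched."""
    return [LEGACY_TO_LIBVIRT.get(name, name) for name in channels]


print(adjust_spice_channels(['smain', 'sdisplay', 'susbredir']))
# → ['main', 'display', 'usbredir']
```

The fix discussed above would be applying the same mapping on the Engine side, in LibvirtVmXmlBuilder.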
[ovirt-devel] Who cares about 'memGuaranteedSize'?
Hi all, We have this field in the Vdsm API: - description: The amount of memory guaranteed to the VM in MB name: memGuaranteedSize type: uint Available in VmDefinition, VMFullInfo, VmParameters. Vdsm dutifully records and reports this value - but doesn't really use it. It is read exactly once, for the balloon stats: stats['balloonInfo'].update({ 'balloon_max': str(max_mem), 'balloon_min': str( int(vm.conf.get('memGuaranteedSize', '0')) * 1024), 'balloon_cur': str(balloon_cur), 'balloon_target': str(balloon_target) }) Now, a quick git grep in both MOM and Engine reveals no obvious usages. Am I missing something? Can we drop this for 4.2, and deprecate it in next(4.1) ? Thanks, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
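For reference, that single usage boils down to the following self-contained sketch, with vm.conf stood in by a plain dict; it assumes (as the * 1024 suggests) that memGuaranteedSize is expressed in MB while the balloon values are reported in KiB:

```python
def balloon_info(conf, max_mem, balloon_cur, balloon_target):
    # memGuaranteedSize is in MB; balloon values are KiB, hence the
    # * 1024 conversion. All values are reported as strings.
    return {
        'balloon_max': str(max_mem),
        'balloon_min': str(int(conf.get('memGuaranteedSize', '0')) * 1024),
        'balloon_cur': str(balloon_cur),
        'balloon_target': str(balloon_target),
    }


# A VM guaranteed 1024 MB yields a balloon_min of 1048576 KiB;
# with the field absent, balloon_min falls back to '0'.
print(balloon_info({'memGuaranteedSize': '1024'}, 2097152, 2097152, 2097152))
```

So dropping the field would only affect the balloon_min figure, which defaults to '0' anyway when the key is missing.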
Re: [ovirt-devel] high performance VM preset
On 05/24/2017 12:57 PM, Michal Skrivanek wrote: > Hi all, > we plan to work on an improvement in VM definition for high performance > workloads which do not require desktop-class devices and generally favor > the highest possible performance at the expense of some flexibility. > We’re thinking of adding a new VM preset, in addition to the current Desktop and > Server presets in the New VM dialog, which would automatically pre-select existing > options in the right way, and suggest/warn on suboptimal configuration. > All the presets and warnings can be changed or ignored. There are a few things > we already identified as boosting performance and/or minimizing the complexity > of the VM, so we plan the preset to: > - remove all graphical consoles and set the VM as headless, making it > accessible by serial console. > - disable all USB. > - disable the soundcard. > - enable I/O Threads, just one for all disks by default. > - set host CPU passthrough (effectively disabling VM live migration), add I/O > Thread pinning in a similar way to the existing CPU pinning. > We plan the following checks and suggestions: perform CPU pinning; host > topology == guest topology (the number of cores per socket and threads per core > should match); host and guest NUMA topologies match; check and suggest I/O > thread pinning. > A popup on VM dialog save seems suitable. > > The currently identified tasks and status can be followed on the trello card[1] > > Please share your thoughts, questions, any kind of feedback… In order to maximize performance we may also want to limit the number of other VMs (either regular or high performance) running on the same host. This is to minimize interference and resource stealing. In the extreme case, only the selected high performance VM would be allowed to run on a suitable host. Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
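The checks listed above (CPU pinning, host/guest topology match, NUMA match, I/O thread pinning) could be sketched as a simple validation pass producing warnings; all field names here are illustrative, not the actual engine model:

```python
# Illustrative sketch of the suggested "high performance" preset checks;
# field names are made up for the example, not the oVirt engine model.

def preset_warnings(vm, host):
    warnings = []
    if not vm.get('cpu_pinning'):
        warnings.append('consider CPU pinning')
    if vm.get('topology') != host.get('topology'):
        # (sockets, cores per socket, threads per core) should match
        warnings.append('guest topology does not match host topology')
    if vm.get('numa_nodes') != host.get('numa_nodes'):
        warnings.append('guest NUMA topology does not match host')
    if vm.get('iothreads', 0) > 0 and not vm.get('iothread_pinning'):
        warnings.append('consider pinning the I/O thread')
    return warnings


host = {'topology': (2, 4, 2), 'numa_nodes': 2}
vm = {'topology': (1, 4, 1), 'numa_nodes': 2, 'iothreads': 1}
print(preset_warnings(vm, host))  # three warnings for this VM/host pair
```

A popup on VM dialog save, as suggested, would just render this warning list.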
Re: [ovirt-devel] [vdsm] branching out 4.1.2
On 05/22/2017 12:32 PM, Eyal Edri wrote: > > > On Mon, May 22, 2017 at 1:11 PM, Yedidyah Bar David <mailto:d...@redhat.com>> wrote: > > On Mon, May 22, 2017 at 12:52 PM, Yaniv Kaul <mailto:yk...@redhat.com>> wrote: > > > > > > On Mon, May 22, 2017 at 11:23 AM, Francesco Romani > mailto:from...@redhat.com>> > > wrote: > >> > >> Hi all, > >> > >> > >> patches against the 4.1 branch are piling up, so I'm thinking about > >> branching out 4.1.2 tomorrow (20170523) > >> > >> The activity on the 4.1.2 front was quite low lately, so we should > >> expect very few double backports. > >> > >> > >> Thoughts? I'll go forward and branch if no one objects. > > > > > > 1. Go for it. > > 2. Let's see what the outcome is. How many 'merge races' we > have, how many > > regressions (hopefully none), how much work is poured into it, > > Do you want full CI coverage for the new branch? > > > I don't think so, since the stable branch, which is a superset > of it, should get all the patches as well and will fail > before backporting to the new branch. > > Yes, there might be rare occasions where a patch will fail on the version > branch and not the stable branch, but > I'm not sure it's worth the effort of duplicating all CI resources per > branch just for that. > We already don't cover all flows in CI because there are limited > resources, so I don't see a huge difference here. > > We can always run verification on the final bits via a manual job before > releasing to catch such cases. I agree with Eyal. Furthermore, 4.1.2 is released, so I expect low-to-no activity on the ovirt-4.1.2 branch. The merging activity on the ovirt-4.1 branch has resumed; the merge window is now open again for 4.1.3. Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [vdsm] branching out 4.1.2
Hi all, patches against the 4.1 branch are piling up, so I'm thinking about branching out 4.1.2 tomorrow (20170523). The activity on the 4.1.2 front was quite low lately, so we should expect very few double backports. Thoughts? I'll go forward and branch if no one objects. Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [Vdsm] silence existing pylint errors
On 04/23/2017 02:54 AM, Nir Soffer wrote: > On Fri, Apr 21, 2017 at 4:05 PM Nir Soffer <mailto:nsof...@redhat.com>> wrote: > > > I think the last patch is too big, and there are some storage > issues that we > > > can fix > > now. Can you split by vertical? I would like to take over > the storage > > part. > > Please do. > > > I split the patches into: > > - https://gerrit.ovirt.org/75728 pylint: brutally silence two > network-related errors > - https://gerrit.ovirt.org/75730 pylint: Silence pylint errors in > gluster > - https://gerrit.ovirt.org/75748 pylint: Silence pylint errors in > infra > - https://gerrit.ovirt.org/75749 pylint: Silence pylint errors in virt > I'm taking care of this today. Thanks for kickstarting this! -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Gluster/virt ports clarifications.
On 04/02/2017 03:53 PM, Leon Goldberg wrote: > Hey, > > We're gathering information regarding the ports we open as part of the > firewalld migration research. > > We have most of the current ports covered by either firewalld itself > or by 3rd party packages, however some questions remain unanswered: > > > IPTablesConfigForVirt: > > - serial consoles (tcp/2223): Is this required? I can't find a single > reference to a listening entity. Either way, I couldn't find a > relevant service that provides it. It is required: * on each virtualization host (e.g. the same machine that runs Vdsm) * IF the virtual serial console is enabled (it is by default) The listening entity is the external service "ovirt-vmconsole-host-sshd", which is a specially-configured sshd instance. Bests, -- Francesco Romani Senior SW Eng., Virtualization R&D Red Hat IRC: fromani github: @fromanirh ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
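For the firewalld migration, a service definition for this port could look like the sketch below (standard firewalld service file format); this is illustrative only, not necessarily the file the ovirt-vmconsole packaging actually ships:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative firewalld service for the oVirt serial console proxy
     (ovirt-vmconsole-host-sshd); the real packaging may differ. -->
<service>
  <short>ovirt-vmconsole</short>
  <description>Serial console access to oVirt VMs via a dedicated sshd on tcp/2223</description>
  <port protocol="tcp" port="2223"/>
</service>
```

Such a file, dropped under /etc/firewalld/services/, would let the host open the port with a named service instead of a raw port rule.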
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
Please note that the approach described here is outdated, because xmlpickle was too easy to use and too generic: that led to bloated XML and too high a risk of misusing metadata; the new approach (https://gerrit.ovirt.org/#/c/74206/) tries to balance things, making usage convenient while disallowing arbitrary nesting. On 03/18/2017 03:27 PM, Nir Soffer wrote: > On Wed, Mar 15, 2017 at 2:28 PM Francesco Romani <mailto:from...@redhat.com>> wrote: > > Hi everyone, > > This is both a report of the current state of my Vdsm patches for > Engine > XML support, and a proposal on how to move forward and solve > the current open issues. > > TL;DR: > 1. we can and IMO should reuse the current JSON schema to describe the > structure (layout) and the types of the metadata section. > 2. we don't need a priori validation of stuff in the metadata section. > We will just raise in the creation flow if data is missing, or wrong, > according to our schema. > 3. we will add *few* items to the metadata section, only things we > can't > express clearly - or at all - in the libvirt XML. Redundancy and > verbosity > will thus be kept at bay > 4. I believe [3] is the best tool to (de)serialize data to the > metadata section. Existing tools fit poorly in our very specific > use case > > Examples below > > +++ > > Long(er) discussion: > > > I have working code[1][2] to encode any custom, picklable, python > object in the metadata section. > > We should decide which module will do the actual python<=>XML > transformation. > Please note that this actually also influences how the data in the > metadata section looks, so the two things are a bit coupled. > > I'm not eager to reinvent another wheel, but after > initial evaluation I honestly think that my pyxmlpickle[3] is the best > tool for the job over the current alternatives: plistlib[4] and > xmltodict[5]. 
> > I added the initial rationale here: > https://gerrit.ovirt.org/#/c/73790/4//COMMIT_MSG > > I have completed the initial draft of patches to make it possible to > initialize devices from their XML representation [6]. This is the bare > minimum we need to support the Engine XML, and we *need* this > anyway to > unlock the cleanup we planned and I outlined in my google doc. > > So we are progressing, but I'd like to speed things up. Those [6] > patches are not yet complete, many flows are not covered or > tested; but > they are good enough to demonstrate that there *are* pieces of > information we need to properly initialize the devices, but can't > easily extract from the XML. > > First examples that come to my mind are the storage.Drive UUIDs; there > could also be some ambiguity I'm investigating right now for > displayIp/displayNetwork in Graphics devices. In [6] there are various > TODOs to mark more of those cases. Most likely, a few more cases will pop > up as I cover all the flows we support. > > Long story short: it is hard to correctly rebuild the device conf from > the XML. This is why in [6] I added the 'meta' argument to the > from_xml_tree > classmethod in [7]. > > 'meta' is supposed to be the device metadata: extra data related to a > device which doesn't (yet) fit in the libvirt XML representation. > For example, we can store 'displayIp' and 'displayNetwork' here and be > done with that: using both the per-device metadata and the XML > representation of one graphics device, we will have everything we > need to > properly build one graphics.Graphics device. > This example may (hopefully) be bogus, but I'm keeping it because > it is > one easy case to follow. > > The device metadata is going to be stored in the vm metadata for the > short/mid term future. Even if the per-device metadata idea/RFE is > accepted (no answer yet, but we are working on it), we will not > have it in > 7.4, and likely not in 7.5. 
> > As it stands today, I believe there are two open questions: > > 1. do we need a schema for the metadata section? > 2. how do we bind the metadata to the devices? How do we know which > metadata belongs to which device, if we have neither aliases nor > addresses to match? (e.g. the very first time the VM is created!) > > My current stance is the following: > 1. In general, one schema gives us two benefits: 1.a. we document how > the layout of the data should be, including types; 1.b.
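The "no a priori validation" idea from the TL;DR - just raise during the creation flow when required metadata is missing or wrong - could look roughly like this; the names are illustrative, not the actual Vdsm code:

```python
# Illustrative sketch of "validate by raising in the creation flow"
# instead of validating the whole metadata section up front.

class MissingMetadata(Exception):
    pass


def require(meta, key, expected_type):
    """Fetch a required metadata item, raising if absent or mistyped."""
    if key not in meta:
        raise MissingMetadata('missing required metadata: %s' % key)
    value = meta[key]
    if not isinstance(value, expected_type):
        raise MissingMetadata('metadata %s: expected %s, got %r'
                              % (key, expected_type.__name__, value))
    return value


meta = {'domainID': 'c578566d-bc61-420c-8f1e-8dfa0a18efd5'}
print(require(meta, 'domainID', str))  # present and well-typed: returned
try:
    require(meta, 'poolID', str)       # absent: raises in the flow
except MissingMetadata as exc:
    print(exc)
```

This also sidesteps the problem that, reusing the current schema with most data omitted, many elements marked mandatory would be missing from a formally valid document.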
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/18/2017 01:14 PM, Nir Soffer wrote: > > > On Fri, Mar 17, 2017 at 4:58 PM Francesco Romani <mailto:from...@redhat.com>> wrote: > > On 03/16/2017 08:03 PM, Francesco Romani wrote: > > On 03/16/2017 01:26 PM, Francesco Romani wrote: > >> On 03/16/2017 11:47 AM, Michal Skrivanek wrote: > >>>> On 16 Mar 2017, at 09:45, Francesco Romani > mailto:from...@redhat.com>> wrote: > >>>> > >>>> We talked about sending storage device purely on metadata, > letting Vdsm > >>>> rebuild them and getting the XML like today. > >>>> > >>>> In the other direction, Vdsm will pass through the XML > (perhaps only > >>>> parts of it, e.g. the devices subtree) like before. > >>>> > >>>> This way we can minimize the changes we are uncertain of, and > more > >>>> importantly, we can minimize the risky changes. > >>>> > >>>> > >>>> The following is a realistic example of how the XML could > look like if > >>>> we send all but the storage devices. It is built using my > pyxmlpickle > >>>> module (see [3] below). > >>> That’s quite verbose. How much work would it need to actually > minimize it and turn it into something more simple. > >>> Most such stuff should go away and I believe it would be > beneficial to make it difficult to use to discourage using > metadata as a generic junkyard > >> It is verbose because it is generic - indeed perhaps too generic. > >> I can try something else based on a concept from Martin > Polednik. Will > >> follow up soon. > > Early preview: > > > > https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-compact > > > > still plenty of TODOs, I expect to be reviewable material worst case > > monday morning. > This is how typical XML could look like: > > > > > > > > Why do we need this nesting? We use libvirt metadata for elements; so each direct child of metadata is a separate metadata group: ovirt-tune:qos is one, ovirt-vm:vm is one, and so forth. I don't want to mess with the existing elements. 
So we could be backward compatible at the XML level (read: 4.1 XML works without changes in 4.2) > > > > > > What is ovirt-instance? Gone, merged with ovirt-vm. ovirt-vm will hold both per-vm metadata and per-device metadata. > > > > true > true > 192.168.1.51 > en-us > DEFAULT > > > smain,sinputs,scursor,splayback,srecord,sdisplay,ssmartcard,susbredir > ovirtmgmt > > > > > > > Why do we need this nesting? *Here* we had this nesting because the example took vm.conf and brutally translated it to XML. It is a worst-case scenario. Why is it relevant? There was an initial discussion about how to deal with complex devices; to minimize changes, we could marshal the existing vm.conf into the device metadata, then unmarshal it on the Vdsm side and just use it to rebuild the devices with the very same code we have today (yes, this means sneaking/embedding vm.conf into the XML). Should we go that way, it could look like the above. Let's talk in general now. There are three main use cases requiring nesting: 1. per-vm metadata. Attach key/value pairs. We need one level of nesting to avoid messing with other data. So it could look like 1 2 this is simple and nice and I think is not bothering anyone (hopefully :)) 2. per-device metadata: it has to fit into the vm section, and we could possibly have more than one device with metadata, so the simplest format is something like 1 2 true false This is the minimal nesting level we need. We could gather the per-device metadata in a dict and feed devices with it, like with a new "meta" argument to the device constructor, much like "custom" and "specParams". Would that look good? 3. QoS. We need to support the current layout for obvious backward compatibility reasons. We could postpone this and use the existing code for some more time, but ultimately this should be handled by the metadata module, just because it is supposed to be the process-wide metadata gateway.
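The nesting described above - one namespaced ovirt-vm group holding per-VM key/value pairs plus one sub-element per device - can be sketched with ElementTree; the element names are illustrative, not the final Vdsm metadata schema:

```python
import xml.etree.ElementTree as ET

# Illustrative sketch of the proposed metadata nesting: one namespaced
# per-VM group, with per-device sub-elements carrying their own items.
# Element names are made up for the example.
OVIRT_VM = 'http://ovirt.org/vm/1.0'
ET.register_namespace('ovirt-vm', OVIRT_VM)

vm = ET.Element('{%s}vm' % OVIRT_VM)
# 1. per-VM metadata: simple key/value children, one nesting level
ET.SubElement(vm, '{%s}startTime' % OVIRT_VM).text = '1489665123'
# 2. per-device metadata: one child per device, matched e.g. by alias
dev = ET.SubElement(vm, '{%s}device' % OVIRT_VM, alias='ua-1234')
ET.SubElement(dev, '{%s}displayNetwork' % OVIRT_VM).text = 'ovirtmgmt'

print(ET.tostring(vm).decode())
```

On the Vdsm side, the per-device children could then be gathered into a dict and handed to the device constructor as the proposed "meta" argument.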
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/20/2017 09:05 AM, Francesco Romani wrote: > On 03/17/2017 11:07 PM, Michal Skrivanek wrote: >>> On 17 Mar 2017, at 15:57, Francesco Romani wrote: >>> >>> On 03/16/2017 08:03 PM, Francesco Romani wrote: >>>> On 03/16/2017 01:26 PM, Francesco Romani wrote: >>>>> On 03/16/2017 11:47 AM, Michal Skrivanek wrote: >>>>>>> On 16 Mar 2017, at 09:45, Francesco Romani wrote: >>>>>>> >>>>>>> We talked about sending storage device purely on metadata, letting Vdsm >>>>>>> rebuild them and getting the XML like today. >>>>>>> >>>>>>> In the other direction, Vdsm will pass through the XML (perhaps only >>>>>>> parts of it, e.g. the devices subtree) like before. >>>>>>> >>>>>>> This way we can minimize the changes we are uncertain of, and more >>>>>>> importantly, we can minimize the risky changes. >>>>>>> >>>>>>> >>>>>>> The following is a realistic example of how the XML could look like if >>>>>>> we send all but the storage devices. It is built using my pyxmlpickle >>>>>>> module (see [3] below). >>>>>> That’s quite verbose. How much work would it need to actually minimize >>>>>> it and turn it into something more simple. >>>>>> Most such stuff should go away and I believe it would be beneficial to >>>>>> make it difficult to use to discourage using metadata as a generic >>>>>> junkyard >>>>> It is verbose because it is generic - indeed perhaps too generic. >>>>> I can try something else based on a concept from Martin Polednik. Will >>>>> follow up soon. >>>> Early preview: >>>> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-compact >>>> >>>> still plenty of TODOs, I expect to be reviewable material worst case >>>> monday morning. >>> This is how typical XML could look like: >>> >>> >>> >>> >>> >>> >> not under the ? >> any reason? > No reason, I'll move under it > Unfortunately we need to have the prefix for all the elements, not just for the top-level one. Updating. 
-- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/17/2017 11:07 PM, Michal Skrivanek wrote: >> On 17 Mar 2017, at 15:57, Francesco Romani wrote: >> >> On 03/16/2017 08:03 PM, Francesco Romani wrote: >>> On 03/16/2017 01:26 PM, Francesco Romani wrote: >>>> On 03/16/2017 11:47 AM, Michal Skrivanek wrote: >>>>>> On 16 Mar 2017, at 09:45, Francesco Romani wrote: >>>>>> >>>>>> We talked about sending storage device purely on metadata, letting Vdsm >>>>>> rebuild them and getting the XML like today. >>>>>> >>>>>> In the other direction, Vdsm will pass through the XML (perhaps only >>>>>> parts of it, e.g. the devices subtree) like before. >>>>>> >>>>>> This way we can minimize the changes we are uncertain of, and more >>>>>> importantly, we can minimize the risky changes. >>>>>> >>>>>> >>>>>> The following is a realistic example of how the XML could look like if >>>>>> we send all but the storage devices. It is built using my pyxmlpickle >>>>>> module (see [3] below). >>>>> That’s quite verbose. How much work would it need to actually minimize it >>>>> and turn it into something more simple. >>>>> Most such stuff should go away and I believe it would be beneficial to >>>>> make it difficult to use to discourage using metadata as a generic >>>>> junkyard >>>> It is verbose because it is generic - indeed perhaps too generic. >>>> I can try something else based on a concept from Martin Polednik. Will >>>> follow up soon. >>> Early preview: >>> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-compact >>> >>> still plenty of TODOs, I expect to be reviewable material worst case >>> monday morning. >> This is how typical XML could look like: >> >> >> >> >> >> > not under the ? > any reason? No reason, I'll move under it Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/16/2017 08:03 PM, Francesco Romani wrote: > On 03/16/2017 01:26 PM, Francesco Romani wrote: >> On 03/16/2017 11:47 AM, Michal Skrivanek wrote: >>>> On 16 Mar 2017, at 09:45, Francesco Romani wrote: >>>> >>>> We talked about sending storage device purely on metadata, letting Vdsm >>>> rebuild them and getting the XML like today. >>>> >>>> In the other direction, Vdsm will pass through the XML (perhaps only >>>> parts of it, e.g. the devices subtree) like before. >>>> >>>> This way we can minimize the changes we are uncertain of, and more >>>> importantly, we can minimize the risky changes. >>>> >>>> >>>> The following is a realistic example of how the XML could look like if >>>> we send all but the storage devices. It is built using my pyxmlpickle >>>> module (see [3] below). >>> That’s quite verbose. How much work would it need to actually minimize it >>> and turn it into something more simple. >>> Most such stuff should go away and I believe it would be beneficial to make >>> it difficult to use to discourage using metadata as a generic junkyard >> It is verbose because it is generic - indeed perhaps too generic. >> I can try something else based on a concept from Martin Polednik. Will >> follow up soon. > Early preview: > https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-compact > > still plenty of TODOs, I expect to be reviewable material worst case > monday morning. This is how a typical XML could look:

[metadata XML example elided: the element markup was lost in the list archive; the surviving values include the display settings (192.168.1.51, en-us, DEFAULT, the spice channel list smain,sinputs,scursor,splayback,srecord,sdisplay,ssmartcard,susbredir), displayNetwork ovirtmgmt, and the storage domain/pool/image/volume UUIDs]

still working on this -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/16/2017 01:26 PM, Francesco Romani wrote: > On 03/16/2017 11:47 AM, Michal Skrivanek wrote: >>> On 16 Mar 2017, at 09:45, Francesco Romani wrote: >>> >>> We talked about sending storage device purely on metadata, letting Vdsm >>> rebuild them and getting the XML like today. >>> >>> In the other direction, Vdsm will pass through the XML (perhaps only >>> parts of it, e.g. the devices subtree) like before. >>> >>> This way we can minimize the changes we are uncertain of, and more >>> importantly, we can minimize the risky changes. >>> >>> >>> The following is a realistic example of how the XML could look like if >>> we send all but the storage devices. It is built using my pyxmlpickle >>> module (see [3] below). >> That’s quite verbose. How much work would it need to actually minimize it >> and turn it into something more simple. >> Most such stuff should go away and I believe it would be beneficial to make >> it difficult to use to discourage using metadata as a generic junkyard > It is verbose because it is generic - indeed perhaps too generic. > I can try something else based on a concept from Martin Polednik. Will > follow up soon. Early preview: https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-compact still plenty of TODOs, I expect to be reviewable material worst case monday morning. -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
On 03/16/2017 11:47 AM, Michal Skrivanek wrote: >> On 16 Mar 2017, at 09:45, Francesco Romani wrote: >> >> We talked about sending storage device purely on metadata, letting Vdsm >> rebuild them and getting the XML like today. >> >> In the other direction, Vdsm will pass through the XML (perhaps only >> parts of it, e.g. the devices subtree) like before. >> >> This way we can minimize the changes we are uncertain of, and more >> importantly, we can minimize the risky changes. >> >> >> The following is a realistic example of how the XML could look like if >> we send all but the storage devices. It is built using my pyxmlpickle >> module (see [3] below). > That’s quite verbose. How much work would it need to actually minimize it and > turn it into something more simple. > Most such stuff should go away and I believe it would be beneficial to make > it difficult to use to discourage using metadata as a generic junkyard It is verbose because it is generic - indeed perhaps too generic. I can try something else based on a concept from Martin Polednik. Will follow up soon. 
Bests, 

> [quoted domain XML example elided: the element markup was lost in the list archive; what survives are the VM name and UUID, the ovirt metadata namespace declarations (http://ovirt.org/vm/tune/1.0, http://ovirt.org/vm/1.0, http://ovirt.org/vm/instance/1.0), per-device entries with type="str"/type="int" attributes, and the storage paths under /rhev/data-center/ with their image/volume UUIDs]
Re: [ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
We talked about sending storage devices purely as metadata, letting Vdsm rebuild them and getting the XML like today. In the other direction, Vdsm will pass through the XML (perhaps only parts of it, e.g. the devices subtree) like before. This way we can minimize the changes we are uncertain of and, more importantly, the risky changes. The following is a realistic example of how the XML could look if we send all but the storage devices. It is built using my pyxmlpickle module (see [3] below).

[domain XML example elided: the element markup was lost in the list archive; the surviving values include the VM name and UUID, the spice channel settings, the storage image paths and UUIDs under /rhev/data-center/, the memory and lifecycle settings, the emulator path /usr/libexec/qemu-kvm, and the SELinux/DAC security labels]

On 03/15/2017 01:28 PM, Francesco Romani wrote: > Hi everyone, > > This is both a report of the current state of my Vdsm patches for Engine > XML support, and a proposal on how to move forward and solve > the current open issues. > > TL;DR: > 1. we can and IMO should reuse the current JSON schema to describe the > structure (layout) and the types of the metadata section. > 2. we don't need a priori validation of stuff in the metadata section. > We will just raise in the creation flow if data is missing, or wrong, > according to our schema. > 3. we will add *few* items to the metadata section, only things we can't > express clearly - or at all - in the libvirt XML. Redundancy and verbosity > will thus be kept at bay
[ovirt-devel] [vdsm] Engine XML: metadata and devices from XML
Please note that yes, this is still verbose, but we don't want to add much data here: for most of the information the most reliable source will be the domain XML. We will add here only the extra info we can't really fetch from it. 2. I don't think we need explicit validation: we could just raise along the way in the creation flow if we don't find some extra metadata we need. This will also solve the issue that, if we reuse the current schema and omit most of the data, we will lack quite a lot of elements marked mandatory. Once we reach agreement, I will update my document https://docs.google.com/document/d/1eD8KSLwwyo2Sk64MytbmE0wBxxMlpIyEI1GRcHDkp7Y/edit#heading=h.hqdqzmmm9i77 accordingly. Final note: while devices take the lion's share, we will likely need help from the metadata section also to store extra VM info, but all the above discussion applies there too.
+++
[1] https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata3 - uses xmltodict
[2] https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:virt-metadata-pyxmlpickle - ported the 'virt-metadata3' topic to pyxmlpickle
[3] https://github.com/fromanirh/pyxmlpickle
[4] https://docs.python.org/2/library/plistlib.html
[5] https://github.com/martinblech/xmltodict
[6] https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:vm-devs-xml
[7] https://gerrit.ovirt.org/#/c/72880/15/lib/vdsm/virt/vmdevices/core.py
-- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
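The pyxmlpickle approach discussed above — serializing a nested metadata dict into an XML subtree and back — can be sketched with the standard library. The element layout below ("item" elements with a "key" attribute) is illustrative only; the real pyxmlpickle module (see [3]) defines its own format.

```python
# Minimal sketch of dict <-> XML metadata round-tripping, in the spirit of
# pyxmlpickle. Element names ("item", "key") are illustrative, NOT the
# actual pyxmlpickle wire format.
import xml.etree.ElementTree as ET

def dump(value, tag="value"):
    """Encode a (possibly nested) dict of strings as an XML element."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, val in value.items():
            child = dump(val, "item")
            child.set("key", key)
            elem.append(child)
    else:
        elem.text = str(value)
    return elem

def load(elem):
    """Decode an element produced by dump() back into a dict or string."""
    children = list(elem)
    if not children:
        return elem.text
    return {child.get("key"): load(child) for child in children}

# Hypothetical drive metadata, loosely modeled on the fields above.
meta = {"drive": {"domainID": "c578566d", "volumeID": "5c4eeed4", "format": "raw"}}
xml_blob = ET.tostring(dump(meta), encoding="unicode")
assert load(ET.fromstring(xml_blob)) == meta
```

The point of the exercise: the metadata section stays opaque, schema-free XML from libvirt's point of view, while Vdsm can round-trip structured data through it.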
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
No updates yet. I'll move forward filing a github issue, hoping to gather more feedback. Bests, On 03/12/2017 03:38 PM, Yaniv Dary wrote: > Any updates on the usage of collectd with rates? > > Yaniv Dary Technical Product Manager Red Hat Israel Ltd. 34 Jerusalem > Road Building A, 4th floor Ra'anana, Israel 4350109 Tel : +972 (9) > 7692306 8272306 Email: yd...@redhat.com <mailto:yd...@redhat.com> IRC > : ydary > > On Tue, Feb 28, 2017 at 4:17 PM, Yaniv Dary <mailto:yd...@redhat.com>> wrote: > > > > Yaniv Dary Technical Product Manager Red Hat Israel Ltd. 34 > Jerusalem Road Building A, 4th floor Ra'anana, Israel 4350109 Tel > : +972 (9) 7692306 8272306 Email: > yd...@redhat.com <mailto:yd...@redhat.com> IRC : ydary > > > On Tue, Feb 28, 2017 at 4:06 PM, Francesco Romani > mailto:from...@redhat.com>> wrote: > > > On 02/28/2017 12:24 PM, Yaniv Dary wrote: > > We need good answers from them to why they do not support > this use case. > > Maybe a github issue on the use case would get more > attention. They > > should allow us to choose how to present and collect the data. > > Can you open one? > > > > I can, and I will if I get no answer in few more days. > Meantime, among other things, I'm doing my homework to > understand why > they do like that. > > This is the best source of information I found so far (please > check the > whole thread, it's pretty short): > > > https://mailman.verplant.org/pipermail/collectd/2013-September/005924.html > > <https://mailman.verplant.org/pipermail/collectd/2013-September/005924.html> > > Quoting part of the email > > """ > > We only came up with one use case where having the raw counter > values is > beneficial: If you want to calculate the average rate over > arbitrary > time spans, it's easier to look up the raw counter values for > those > points in time and go from there. However, you can also sum up the > individual rates to reach the same result. 
Finally, when handling > counter resets / overflows within this interval, integrating > over / > summing rates is trivial by comparison. > > Do you have any other use-case for raw counter values? > > Pro: > > * Handling of values becomes easier. > * The rate is calculated only once, in contrast to > potentially several > times, which might be more efficient (currently each rate > conversion > involves a lookup call). > * Together with (1), this removes the need for having the > "types.db", > which could be removed then. We were in wild agreement > that this > would be a worthwhile goal. > > > Not for adding units: > https://github.com/collectd/collectd/issues/2047 > <https://github.com/collectd/collectd/issues/2047> > > > > Contra: > > * Original raw value is lost. It can be reconstructed except > for a > (more or less) constant offset, though. > > > How is this done? > > > """ > > > Looks like this change was intentional and implemented after some > discussion. > > > I understand this, but most monitoring system will not know what > to do with this value. > > > > Bests, > > -- > Francesco Romani > Red Hat Engineering Virtualization R & D > IRC: fromani > > > -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
On 02/28/2017 12:24 PM, Yaniv Dary wrote: > We need good answers from them to why they do not support this use case. > Maybe a github issue on the use case would get more attention. They > should allow us to choose how to present and collect the data. > Can you open one? > I can, and I will if I get no answer in few more days. Meantime, among other things, I'm doing my homework to understand why they do like that. This is the best source of information I found so far (please check the whole thread, it's pretty short): https://mailman.verplant.org/pipermail/collectd/2013-September/005924.html Quoting part of the email """ We only came up with one use case where having the raw counter values is beneficial: If you want to calculate the average rate over arbitrary time spans, it's easier to look up the raw counter values for those points in time and go from there. However, you can also sum up the individual rates to reach the same result. Finally, when handling counter resets / overflows within this interval, integrating over / summing rates is trivial by comparison. Do you have any other use-case for raw counter values? Pro: * Handling of values becomes easier. * The rate is calculated only once, in contrast to potentially several times, which might be more efficient (currently each rate conversion involves a lookup call). * Together with (1), this removes the need for having the "types.db", which could be removed then. We were in wild agreement that this would be a worthwhile goal. Contra: * Original raw value is lost. It can be reconstructed except for a (more or less) constant offset, though. """ Looks like this change was intentional and implemented after some discussion. Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
On 02/27/2017 01:32 PM, Yaniv Dary wrote: > This is about accumulative values, I'm also asking about stats like > CPU usage of the VM\Host that is not reported in absolute value. > Can you bump the thread? Done, let's see. Speaking about options: during the reviews of my pull requests we also discussed the (semi?)recommended way to report more values without adding new collectd types, which is something the collectd upstream really tries to avoid. So we could report the current values *and* the absolutes, making everyone happy; but I'm afraid this would require a new plugin, like the one I had in the works (https://github.com/fromanirh/collectd-ovirt). TL;DR: in the worst case, we have one safe fallback option. -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [VDSM] granting network+2 to Eddy
On 02/28/2017 08:32 AM, Dan Kenigsberg wrote: > Hi, > > After more than a year of substantial contribution to Vdsm networking, > and after several months of me upgrading his score, I would like to > nominate Eddy as a maintainer for network-related code in Vdsm, in > master and stable branches. > > Current Vdsm maintainers and others: please approve my suggestion if > you agree with it. Approved -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
On 02/26/2017 03:13 PM, Yaniv Dary wrote: > >> 2. collectd *intentionally* report metrics as rates, not as >> absolute >> values as Vdsm does. This may be one issue in presence of >> restarts/data >> loss in the link between collectd and the metrics store. >> >> >> How does this work? >> If we want to show memory usage over time for example, we need to >> have the usage, not the rate. >> How would this be reported? > > I was imprecise, my fault. > > Let me retry: > collectd intentionally report quite a lot of metrics we care about > as rates, not as absolute values. > Memory is actually ok fine. > > a0/virt/disk_octets-hdc -> rate > a0/virt/disk_octets-vda > a0/virt/disk_ops-hdc -> rate > a0/virt/disk_ops-vda > a0/virt/disk_time-hdc -> rate > a0/virt/disk_time-vda > a0/virt/if_dropped-vnet0 -> rate > a0/virt/if_errors-vnet0 -> rate > a0/virt/if_octets-vnet0 -> rate > a0/virt/if_packets-vnet0 -> rate > a0/virt/memory-actual_balloon -> absolute > a0/virt/memory-rss -> absolute > a0/virt/memory-total -> absolute > a0/virt/ps_cputime -> rate > a0/virt/total_requests-flush-hdc -> rate > a0/virt/total_requests-flush-vda > a0/virt/total_time_in_ms-flush-hdc -> rate > a0/virt/total_time_in_ms-flush-vda > a0/virt/virt_cpu_total -> rate > a0/virt/virt_vcpu-0 -> rate > a0/virt/virt_vcpu-1 > > collectd "just" reports the changes since the last sampling. I'm > not sure which is the best way to handle that; I've sent a mail to > collectd list some time ago, no answer so far. > > > Can you CC on that thread? > I don't know how ES would work with rates at all. > I want to be able to show CPU usage over time and I need to know if > its 80% or 10%. 
> Thanks to the awkward gmail interface I can't reply to myself and CC other people, but I can share the link: https://mailman.verplant.org/pipermail/collectd/2017-January/006965.html -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] oVirt Engine 4.1.1 has been branched out
On 02/23/2017 02:56 PM, Nir Soffer wrote: > On Thu, Feb 23, 2017 at 3:30 PM, Francesco Romani wrote: >> Vdsm will follow up soon. I'll branch out from the branch *tip* (*not* last >> tag) unless anyone objects. > Please avoid branching if we don't have a real need for it. Since things are calm on the ovirt-4.1 patch queue, I will delay branching out until we have a clear need. Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] oVirt Engine 4.1.1 has been branched out
Vdsm will follow up soon. I'll branch out from the branch *tip* (*not* last tag) unless anyone objects. The last tip is tag 4.19.6 plus three patches attached to those three BZs (all merged before Feb, 22) https://bugzilla.redhat.com/show_bug.cgi?id=1376116 https://bugzilla.redhat.com/show_bug.cgi?id=1412583 https://bugzilla.redhat.com/show_bug.cgi?id=1412455 Please let me know if we need any revert On 02/22/2017 10:04 AM, Tal Nisan wrote: > Hi everyone, > Since we've moved to blocker/exception only mode we've branched oVirt > engine 4.1.1 today from the ovirt-engine-4.1 branch to allow work on > 4.1.3 bugs. > > From this moment on 4.1.1 engine bugs have to be merged to: > master > ovirt-engine-4.1 > ovirt-engine-4.1.1.z > > For 4.1.3 only the first 2 branches. > > Thanks. > > > > > > ___ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
On 02/21/2017 02:44 PM, Yaniv Kaul wrote: > > > On Tue, Feb 21, 2017 at 1:06 PM Francesco Romani <mailto:from...@redhat.com>> wrote: > > Hello everyone, > > > in the last weeks I've been submitting PRs to collectd upstream, to > bring the virt plugin up to date with Vdsm and oVirt needs. > > Previously, the collectd virt plugin reported only a subset of metrics > oVirt uses. > > In current collectd master, the collectd virt plugin provides all the > data Vdsm (thus Engine) needs. This means that it is now > > possible for Vdsm or Engine to query collectd, not Vdsm/libvirt, and > have the same data. > > > Do we wish to ship the unixsock collectd plugin? I'm not sure we do > these days (4.1). > We can do that later, of course, when we ship this. > Y. > AFAIR the collectd unixsock plugin is built and shipped by default with collectd (even in the RPMs). It is also the channel the `collectdctl` command line tool uses to talk to the daemon. Our client module is still a work in progress. I'd be happy to just use a third-party client module; the (semi-)official one was not shipped by default last time I checked. Perhaps we should just file an RFE about that? -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
On 02/21/2017 11:55 PM, Yaniv Dary wrote: > > > Yaniv Dary > Technical Product Manager > Red Hat Israel Ltd. > 34 Jerusalem Road > Building A, 4th floor > Ra'anana, Israel 4350109 > > Tel : +972 (9) 7692306 > 8272306 > Email: yd...@redhat.com <mailto:yd...@redhat.com> > IRC : ydary > > On Feb 21, 2017 13:06, "Francesco Romani" <mailto:from...@redhat.com>> wrote: > > Hello everyone, > > > in the last weeks I've been submitting PRs to collectd upstream, to > bring the virt plugin up to date with Vdsm and oVirt needs. > > Previously, the collectd virt plugin reported only a subset of metrics > oVirt uses. > > In current collectd master, the collectd virt plugin provides all the > data Vdsm (thus Engine) needs. This means that it is now > > possible for Vdsm or Engine to query collectd, not Vdsm/libvirt, and > have the same data. > > > There are only two caveats: > > 1. it is yet to be seen which version of collectd will ship all those > enhancements > > 2. collectd *intentionally* report metrics as rates, not as absolute > values as Vdsm does. This may be one issue in presence of > restarts/data > loss in the link between collectd and the metrics store. > > > How does this work? > If we want to show memory usage over time for example, we need to have > the usage, not the rate. > How would this be reported? I was imprecise, my fault. Let me retry: collectd intentionally report quite a lot of metrics we care about as rates, not as absolute values. Memory is actually ok fine. 
a0/virt/disk_octets-hdc -> rate
a0/virt/disk_octets-vda
a0/virt/disk_ops-hdc -> rate
a0/virt/disk_ops-vda
a0/virt/disk_time-hdc -> rate
a0/virt/disk_time-vda
a0/virt/if_dropped-vnet0 -> rate
a0/virt/if_errors-vnet0 -> rate
a0/virt/if_octets-vnet0 -> rate
a0/virt/if_packets-vnet0 -> rate
a0/virt/memory-actual_balloon -> absolute
a0/virt/memory-rss -> absolute
a0/virt/memory-total -> absolute
a0/virt/ps_cputime -> rate
a0/virt/total_requests-flush-hdc -> rate
a0/virt/total_requests-flush-vda
a0/virt/total_time_in_ms-flush-hdc -> rate
a0/virt/total_time_in_ms-flush-vda
a0/virt/virt_cpu_total -> rate
a0/virt/virt_vcpu-0 -> rate
a0/virt/virt_vcpu-1

collectd "just" reports the changes since the last sampling. I'm not sure which is the best way to handle that; I sent a mail to the collectd list some time ago, no answer so far. -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] OST: HE vm does not restart on HC setup
On 02/22/2017 03:42 PM, Yaniv Kaul wrote: > > > On Wed, Feb 22, 2017 at 4:32 PM Francesco Romani <mailto:from...@redhat.com>> wrote: > > On 02/22/2017 01:53 PM, Simone Tiraboschi wrote: >> >> >> On Wed, Feb 22, 2017 at 1:33 PM, Simone Tiraboschi >> mailto:stira...@redhat.com>> wrote: >> >> When ovirt-ha-agent checks the status of the engine VM we get: >> >> 2017-02-21 22:21:14,738-0500 ERROR (jsonrpc/2) [api] FINISH getStats >> error=Virtual machine does not exist: {'vmId': >> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69) >> Traceback (most recent call last): >> File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line >> 67, in method >> ret = func(*args, **kwargs) >> File "/usr/share/vdsm/API.py", line 335, in getStats >> vm = self.vm >> File "/usr/share/vdsm/API.py", line 130, in vm >> raise exception.NoSuchVM(vmId=self._UUID) >> NoSuchVM: Virtual machine does not exist: {'vmId': >> u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} >> >> While in ovirt-ha-agent logs we have: >> >> MainThread::INFO::2017-02-21 >> 22:21:18,583::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >> Current state UnknownLocalVmState (score: 3400) >> >> ... >> >> MainThread::INFO::2017-02-21 >> 22:21:31,199::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) >> Unknown local engine vm status no actions taken >> >> Probably it's a bug or a regression somewhere on master. 
>> >> On ovirt-ha-broker side the detection is based on a strict string >> match on the error message that is expected to be exactly >> 'Virtual machine does not exist' to set down status otherwise we >> set unknown status as in this case: >> >> https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=blob;f=ovirt_hosted_engine_ha/broker/submonitors/engine_health.py;h=d633cb860b811e84021221771bf706a9a4ac1d63;hb=refs/heads/master#l54 >> >> >> Adding Francesco here to understand if something has recently >> changed there on vdsm side. > It has changed indeed; we had a series of changes which added > context to some exceptions. I believe the straw who broke the > camel's back was I32ec3f86f8d53f8412f4c0526fc85e2a42e30ea5 It is > unfortunate that this change broke HA. Could you perhaps fixing it > checking that the message *begins* with that string, and/or > checking the error code. bests, > > > On the bright side, this is exactly why we need o-s-t running > Hosted-Engine - though we probably need to exercise more HE flows > (global and local maint., for example). > On the downside, how come I32ec3f86f8d53f8412f4c0526fc85e2a42e30ea5 > was merged on Jan1st, and we only saw the regression now? Is there > another bug that hid this one until now? > Y. > It was merged on Jan 29 on master, backported on Feb 8 on 4.1 branch (because it was part of the vmleases feature, needed on 4.1.z). Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] OST: HE vm does not restart on HC setup
On 02/22/2017 01:53 PM, Simone Tiraboschi wrote: > > > On Wed, Feb 22, 2017 at 1:33 PM, Simone Tiraboschi > mailto:stira...@redhat.com>> wrote: > > When ovirt-ha-agent checks the status of the engine VM we get: > > 2017-02-21 22:21:14,738-0500 ERROR (jsonrpc/2) [api] FINISH getStats > error=Virtual machine does not exist: {'vmId': > u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} (api:69) > Traceback (most recent call last): > File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 67, in > method > ret = func(*args, **kwargs) > File "/usr/share/vdsm/API.py", line 335, in getStats > vm = self.vm > File "/usr/share/vdsm/API.py", line 130, in vm > raise exception.NoSuchVM(vmId=self._UUID) > NoSuchVM: Virtual machine does not exist: {'vmId': > u'2ccc0ef0-cc31-45b8-8e91-a78fa4cad671'} > > While in ovirt-ha-agent logs we have: > > MainThread::INFO::2017-02-21 > 22:21:18,583::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) > Current state UnknownLocalVmState (score: 3400) > > ... > > MainThread::INFO::2017-02-21 > 22:21:31,199::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) > Unknown local engine vm status no actions taken > > Probably it's a bug or a regression somewhere on master. > > On ovirt-ha-broker side the detection is based on a strict string > match on the error message that is expected to be exactly 'Virtual > machine does not exist' to set down status otherwise we set unknown > status as in this case: > https://gerrit.ovirt.org/gitweb?p=ovirt-hosted-engine-ha.git;a=blob;f=ovirt_hosted_engine_ha/broker/submonitors/engine_health.py;h=d633cb860b811e84021221771bf706a9a4ac1d63;hb=refs/heads/master#l54 > > > Adding Francesco here to understand if something has recently changed > there on vdsm side. It has changed indeed; we had a series of changes which added context to some exceptions. 
I believe the straw that broke the camel's back was I32ec3f86f8d53f8412f4c0526fc85e2a42e30ea5. It is unfortunate that this change broke HA. Could you perhaps fix it by checking that the message *begins* with that string, and/or by checking the error code? Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
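The fix suggested above — deciding on the stable error code, or at worst a message *prefix*, instead of an exact message match — might look like the sketch below. The response shape and the noVM code value are simplified assumptions for illustration, not the actual ovirt-hosted-engine-ha code.

```python
# Sketch of the suggested fix: classify the engine VM as down based on the
# error code (or a message prefix), so extra context appended to the message
# no longer breaks the check. Response layout and code value are assumptions.

NOVM_CODE = 1  # assumed vdsm 'noVM' error code
NOVM_PREFIX = "Virtual machine does not exist"

def vm_is_down(response):
    status = response.get("status", {})
    if status.get("code") == NOVM_CODE:
        return True
    # Fallback: tolerate messages that gained extra context, e.g. the vmId.
    return status.get("message", "").startswith(NOVM_PREFIX)

old_style = {"status": {"code": 1, "message": "Virtual machine does not exist"}}
new_style = {"status": {"code": 1, "message":
             "Virtual machine does not exist: {'vmId': '2ccc0ef0-...'}"}}
assert vm_is_down(old_style) and vm_is_down(new_style)
```

The strict equality match in engine_health.py only accepts `old_style`; matching the code or the prefix accepts both, which is exactly what the regression above required.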
[ovirt-devel] [monitoring][collectd] the collectd virt plugin is now on par with Vdsm needs
Hello everyone,

over the last few weeks I've been submitting PRs to collectd upstream, to bring the virt plugin up to date with Vdsm and oVirt needs. Previously, the collectd virt plugin reported only a subset of the metrics oVirt uses. In current collectd master, the virt plugin provides all the data Vdsm (and thus Engine) needs. This means that it is now possible for Vdsm or Engine to query collectd, not Vdsm/libvirt, and get the same data.

There are only two caveats:
1. it is yet to be seen which version of collectd will ship all these enhancements
2. collectd *intentionally* reports metrics as rates, not as absolute values as Vdsm does. This may be an issue in the presence of restarts or data loss in the link between collectd and the metrics store.

Please keep reading for more details.

How to get the code?
This is somewhat tricky until we get an official release. If one is familiar with the RPM build process, it is easy to build custom packages from a snapshot of collectd master (https://github.com/collectd/collectd) and a recent 5.7.1 RPM (like https://koji.fedoraproject.org/koji/buildinfo?buildID=835669).

How to configure it?
Most things work out of the box. A currently in-progress Vdsm patch ships the recommended configuration: https://gerrit.ovirt.org/#/c/71176/6/static/etc/collectd.d/virt.conf The meaning of the configuration options is documented in man 5 collectd.conf.

What does it look like?
Let me post a "screenshot" :)

$ collectdctl listval | grep a0
a0/virt/disk_octets-hdc
a0/virt/disk_octets-vda
a0/virt/disk_ops-hdc
a0/virt/disk_ops-vda
a0/virt/disk_time-hdc
a0/virt/disk_time-vda
a0/virt/if_dropped-vnet0
a0/virt/if_errors-vnet0
a0/virt/if_octets-vnet0
a0/virt/if_packets-vnet0
a0/virt/memory-actual_balloon
a0/virt/memory-rss
a0/virt/memory-total
a0/virt/ps_cputime
a0/virt/total_requests-flush-hdc
a0/virt/total_requests-flush-vda
a0/virt/total_time_in_ms-flush-hdc
a0/virt/total_time_in_ms-flush-vda
a0/virt/virt_cpu_total
a0/virt/virt_vcpu-0
a0/virt/virt_vcpu-1

How to consume the data?
Among the ways to query collectd, the two most popular (and best fitting for the oVirt use case) are perhaps the network protocol (https://collectd.org/wiki/index.php/Binary_protocol) and the plain text protocol (https://collectd.org/wiki/index.php/Plain_text_protocol). The first could be used by Engine to get the data directly, or to consolidate the metrics in one database (e.g. to run any kind of query, for historical series...). The latter will be used by Vdsm to keep reporting the metrics (again https://gerrit.ovirt.org/#/c/71176/6). Please note that the performance of the plain text protocol is known to be lower than that of the binary protocol.

What about the unresponsive hosts?
We know from experience that hosts may become unresponsive, and this can disrupt monitoring. However, we do want to keep monitoring the responsive hosts, avoiding that one rogue host makes us lose all the monitoring data. To cope with this need, the virt plugin gained support for a "partition tag". With this, we can group VMs together using one arbitrary tag. This is completely transparent to collectd, and also completely optional. oVirt can use this tag to group VMs per storage domain, or however it sees fit, trying to minimize the disruption should one host become unresponsive.
Read the full docs here: https://github.com/collectd/collectd/commit/999efc28d8e2e96bc15f535254d412a79755ca4f

What about the collectd-ovirt plugin?
Some time ago I implemented an out-of-tree collectd plugin leveraging the libvirt bulk stats: https://github.com/fromanirh/collectd-ovirt This plugin was meant to be a modern, drop-in replacement for the existing virt plugin. The development of that out-of-tree plugin is now halted, because we have everything we need in the upstream collectd plugin.

Future work
We believe we have reached feature parity, so we are looking at bugfixes/performance tuning in the near-term future. I'll be happy to provide more patches/PRs about that.

Thanks and bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
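To illustrate the plain text protocol mentioned above, here is a minimal LISTVAL client sketch over the unixsock plugin. The reply framing ("N Values found" followed by one "<epoch> <identifier>" line per value) follows the collectd wiki page linked above; treat this as an illustration rather than a hardened client, and note the socket path is the plugin's documented default, not necessarily what a given deployment uses.

```python
# Sketch of a collectd plain-text-protocol LISTVAL client (unixsock plugin).
# Framing per the collectd wiki; not production code.
import socket

SOCKET_PATH = "/var/run/collectd-unixsock"  # unixsock plugin default

def parse_listval(reply):
    """Parse 'N Values found\\n<epoch> <identifier>\\n...' into (epoch, id) pairs."""
    lines = reply.strip().splitlines()
    count = int(lines[0].split()[0])
    if count < 0:
        raise RuntimeError(lines[0])  # a negative count signals an error
    entries = []
    for line in lines[1:1 + count]:
        epoch, identifier = line.split(None, 1)
        entries.append((float(epoch), identifier))
    return entries

def listval(path=SOCKET_PATH):
    """Ask the collectd daemon for all known value identifiers."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(b"LISTVAL\n")
        fobj = sock.makefile("r")
        header = fobj.readline()            # 'N Values found'
        count = int(header.split()[0])
        body = "".join(fobj.readline() for _ in range(max(count, 0)))
    return parse_listval(header + body)

sample = ("2 Values found\n"
          "1488000000.123 a0/virt/memory-rss\n"
          "1488000000.123 a0/virt/virt_cpu_total\n")
assert [ident for _, ident in parse_listval(sample)] == [
    "a0/virt/memory-rss", "a0/virt/virt_cpu_total"]
```

This is roughly what `collectdctl listval` does under the hood; a Vdsm-side consumer would follow LISTVAL with GETVAL requests for the identifiers it cares about.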
Re: [ovirt-devel] Verify Grade (-1) for check_product and check_target_milestone hooks
On 02/05/2017 12:39 PM, Tal Nisan wrote: > Great, thanks! > I will let you know if something doesn't work as it should > > On Sun, Feb 5, 2017 at 12:09 PM, Shlomo Ben David <mailto:sbend...@redhat.com>> wrote: > > Hi Tal, > > I'm going to apply today the verify grade (-1) for the following > hooks: > > 1. *check_product* - if the patch project is not the same as the > bug product will return a verify grade (-1). > 2. *check_target_milestone* - if the patch branch major version > is not the same as the bug target milestone major version will > return a verify grade (-1). > > Best Regards, > Let's consider this scenario: we have just begun developing the 4.2 version on the master branch; we just released the 4.1.0 version, so we have the 4.1.z branch, and we are still supporting the 4.0.z branch. Let's consider a bug which needs to be backported to the 4.0.z branch from master, of course passing through 4.1.z: 0. bug target milestone is 4.0.N+1 (the last release is 4.0.N) 1. patch on master -> OK 2. patch backported the first time to branch 4.1.z -> check_target_milestone fails! branch major is 4.1, bug target is 4.0 It seems that this flow doesn't cover the case in which we need a double backport. Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
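The double-backport gap can be shown with a toy model of the check_target_milestone rule. The function names and version parsing below are hypothetical; only the major-version comparison matters.

```python
# Toy model of the check_target_milestone logic described above. Names and
# parsing are made up; the real gerrit hook may differ.

def major(version):
    """'ovirt-engine-4.1' or 'ovirt-4.0.7' -> (4, 1) / (4, 0)."""
    digits = [int(p) for p in version.split("-")[-1].split(".") if p.isdigit()]
    return tuple(digits[:2])

def milestone_check_passes(patch_branch, bug_milestone):
    # The hook gives -1 when branch and milestone major versions differ.
    return major(patch_branch) == major(bug_milestone)

# A fix targeted at milestone 4.0.N+1 must travel master -> 4.1.z -> 4.0.z,
# so the intermediate hop on the 4.1 branch is rejected by the rule:
assert milestone_check_passes("ovirt-engine-4.0", "ovirt-4.0.7")      # final hop
assert not milestone_check_passes("ovirt-engine-4.1", "ovirt-4.0.7")  # intermediate hop
```

This makes the objection in the email concrete: the intermediate 4.1.z backport of a 4.0-targeted fix can never pass a check that compares only major versions.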
[ovirt-devel] [RFE][oVirt 4.2] Engine XML - the Vdsm side
Hi everyone, following up on the reports from Arik, I created a document outlining the Vdsm plans, changes and challenges. Please review and comment on the document itself https://docs.google.com/document/d/1eD8KSLwwyo2Sk64MytbmE0wBxxMlpIyEI1GRcHDkp7Y/edit?usp=sharing or here in this thread. Please point out omissions or concepts you want explained in more detail. Once agreement is achieved and all the comments are addressed, I will start creating and posting the implementation patches. I am doing cleanup work in the meantime. Comments welcome, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [Vdsm][Heads up] branching out 4.1.0 on February, 1st.
Hi everyone, I'm planning to branch out 4.1.0 on February 1st, to unlock merges on the 4.1 branches and backporting to 4.0/3.6. This will make it even more important to backport only blocker fixes to 4.1.0. Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] suspend_resume_vm fail on master experimental
On 01/11/2017 09:29 AM, Yaniv Kaul wrote: > > > On Wed, Jan 11, 2017 at 10:26 AM, Francesco Romani <mailto:from...@redhat.com>> wrote: > > Hi all > > > On 01/11/2017 08:52 AM, Eyal Edri wrote: >> Adding Tomas from Virt. >> >> On Tue, Jan 10, 2017 at 10:54 AM, Piotr Kliczewski >> mailto:piotr.kliczew...@gmail.com>> >> wrote: >> >> On Tue, Jan 10, 2017 at 9:29 AM, Daniel Belenky >> mailto:dbele...@redhat.com>> wrote: >> > Hi all, >> > >> > test-repo_ovirt_experimental_master (link to Jenkins) job >> failed on >> > basic_sanity scenario. >> > The job was triggered by >> https://gerrit.ovirt.org/#/c/69845/ >> <https://gerrit.ovirt.org/#/c/69845/> >> > >> > From looking at the logs, it seems that the reason is VDSM. >> > >> > In the VDSM log, i see the following error: >> > >> > 2017-01-09 16:47:41,331 ERROR (JsonRpc (StompReactor)) >> [vds.dispatcher] SSL >> > error receiving from > connected ('::1', >> > 34942, 0, 0) at 0x36b95f0>: unexpected eof (betterAsyncore:119) >> > > Daniel, could you please remind me the jenkins link? I see > something suspicious on the Vdsm log. > > > Please use my live system: > ssh m...@ykaul-mini.tlv.redhat.com > <mailto:m...@ykaul-mini.tlv.redhat.com> (redacted) > then run a console to the VM: > lagocli --prefix-path /dev/shm/run/current shell engine > > (or 'host0' for the host) > > Most notably, Vdsm received SIGTERM. Is this expected and part of > the test? > > > It's not. I fooled myself. We have two hosts here. host0 was holding the VM up until the suspend. Then Engine decided to resume the VM on the other one, host1. While the VM was resuming, host0 experienced network issues, which led to soft-fencing. That explains the mess on host0, even though it is still unclear why host0 had network issues and heartbeat exceeded in the first place. On host1 the waters are even darker. 
The VM was resumed ~02:36 2017-01-11 02:36:04,700-05 INFO [org.ovirt.engine.core.vdsbroker.CreateVmVDSCommand] (default task-17) [72c41f12-649b-4833-8485-44e8d20d2b49] FINISH, CreateVmVDSCommand, return: RestoringState, log id: 378da701 2017-01-11 02:36:04,700-05 INFO [org.ovirt.engine.core.bll.RunVmCommand] (default task-17) [72c41f12-649b-4833-8485-44e8d20d2b49] Lock freed to object 'EngineLock:{exclusiveLocks='[42860011-acc3-44d6-9ddf-dea64 2caf083=]', sharedLocks='null'}' 2017-01-11 02:36:04,704-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-17) [72c41f12-649b-4833-8485-44e8d20d2b49] Correlation ID: 72c41f12-649b-4833-8485-44e8d20d2b 49, Job ID: a93b571e-aed1-40e7-8d71-831f646255fb, Call Stack: null, Custom Event ID: -1, Message: VM vm0 was started by admin@internal-authz (Host: lago-basic-suite-master-host1). While well within the timeout limit (600s), the vm was still restoring its state: 2017-01-11 02:37:31,059-05 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (DefaultQuartzScheduler8) [5582058d] START, GetAllVmStatsVDSCommand(HostName = lago-basic-suite-master-host1, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='0336661b-721f-4c55-9327-8fd2fd3a0542'}), log id: 1803c51b 2017-01-11 02:37:31,059-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (DefaultQuartzScheduler8) [5582058d] SEND destination:jms.topic.vdsm_requests reply-to:jms.topic.vdsm_responses content-length:103 {"jsonrpc":"2.0","method":"Host.getAllVmStats","params":{},"id":"f2997c1d-49cc-4d2e-937c-e4910fbb75df"}^@ 2017-01-11 02:37:31,059-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient] (DefaultQuartzScheduler8) [5582058d] Message sent: SEND destination:jms.topic.vdsm_requests content-length:103 reply-to:jms.topic.vdsm_responses 2017-01-11 02:37:31,062-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.impl.Message] (SSL Stomp Reactor) [5e453618] MESSAGE 
content-length:829 destination:jms.topic.vdsm_responses content-type:application/json subscription:fe930de2-aa67-4bc4-a34c-be22edd1623e {"jsonrpc": "2.0", "id": "f2997c1d-49cc-4d2e-937c-e4910fbb75df", "result": [{"username": "Unknown", "displayInfo": [{"tlsPort": "-1", "ipAddress": "192.168.201.4", "type": "vnc", "port": "-1"}, {"
Re: [ovirt-devel] suspend_resume_vm fail on master experimental
Thanks, sorry, I was silly enough to have it missed before. On 01/11/2017 09:32 AM, Daniel Belenky wrote: > Link to Jenkins > <http://jenkins.ovirt.org/view/experimental%20jobs/job/test-repo_ovirt_experimental_master/4648/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/> > > On Wed, Jan 11, 2017 at 10:26 AM, Francesco Romani <mailto:from...@redhat.com>> wrote: > > Hi all > > > On 01/11/2017 08:52 AM, Eyal Edri wrote: >> Adding Tomas from Virt. >> >> On Tue, Jan 10, 2017 at 10:54 AM, Piotr Kliczewski >> mailto:piotr.kliczew...@gmail.com>> >> wrote: >> >> On Tue, Jan 10, 2017 at 9:29 AM, Daniel Belenky >> mailto:dbele...@redhat.com>> wrote: >> > Hi all, >> > >> > test-repo_ovirt_experimental_master (link to Jenkins) job >> failed on >> > basic_sanity scenario. >> > The job was triggered by >> https://gerrit.ovirt.org/#/c/69845/ >> <https://gerrit.ovirt.org/#/c/69845/> >> > >> > From looking at the logs, it seems that the reason is VDSM. >> > >> > In the VDSM log, i see the following error: >> > >> > 2017-01-09 16:47:41,331 ERROR (JsonRpc (StompReactor)) >> [vds.dispatcher] SSL >> > error receiving from > connected ('::1', >> > 34942, 0, 0) at 0x36b95f0>: unexpected eof (betterAsyncore:119) >> > > Daniel, could you please remind me the jenkins link? I see > something suspicious on the Vdsm log. > Most notably, Vdsm received SIGTERM. Is this expected and part of > the test? > >> > >> >> This issue means that the client closed connection while vdsm was >> replying. It can happen at any time >> when the client is not nice with the connection. As you can >> see the >> client connected locally '::1'. >> >> > >> > Also, when looking at the MOM logs, I see the the following: >> > >> > 2017-01-09 16:43:39,508 - mom.vdsmInterface - ERROR - >> Cannot connect to >> > VDSM! [Errno 111] Connection refused >> > >> >> Looking at the log at this time vdsm had no open socket. 
>> >> > > Correct, but IIRC we have a race on startup - that's the reason > why MOM retries to connect. After the new try, MOM seems to behave > correctly: > > 2017-01-09 16:44:05,672 - mom.RPCServer - INFO - ping() > 2017-01-09 16:44:05,673 - mom.RPCServer - INFO - getStatistics() > > -- > Francesco Romani > Red Hat Engineering Virtualization R & D > IRC: fromani > > > > > -- > /Daniel Belenky > / > /RHV DevOps > / > /Red Hat Israel > / -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
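The startup race described above (MOM connecting before Vdsm's socket exists) is typically absorbed by a retry loop. A minimal generic sketch of such a reconnect helper (illustrative only, not MOM's actual code):

```python
import time

def retry_connect(connect, attempts=5, delay=0.2, backoff=2.0):
    """Call connect() until it succeeds; sleep with exponential backoff
    between failed attempts and re-raise the last error when all
    attempts are exhausted. (Illustrative only, not MOM's real code.)"""
    for attempt in range(attempts):
        try:
            return connect()
        except OSError:  # e.g. ECONNREFUSED while the server starts up
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
            delay *= backoff
```

With this shape, a transient "Connection refused" during startup costs only a short delay instead of a hard failure.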
Re: [ovirt-devel] suspend_resume_vm fail on master experimental
Hi all On 01/11/2017 08:52 AM, Eyal Edri wrote: > Adding Tomas from Virt. > > On Tue, Jan 10, 2017 at 10:54 AM, Piotr Kliczewski > mailto:piotr.kliczew...@gmail.com>> wrote: > > On Tue, Jan 10, 2017 at 9:29 AM, Daniel Belenky > mailto:dbele...@redhat.com>> wrote: > > Hi all, > > > > test-repo_ovirt_experimental_master (link to Jenkins) job failed on > > basic_sanity scenario. > > The job was triggered by https://gerrit.ovirt.org/#/c/69845/ > <https://gerrit.ovirt.org/#/c/69845/> > > > > From looking at the logs, it seems that the reason is VDSM. > > > > In the VDSM log, i see the following error: > > > > 2017-01-09 16:47:41,331 ERROR (JsonRpc (StompReactor)) > [vds.dispatcher] SSL > > error receiving from connected ('::1', > > 34942, 0, 0) at 0x36b95f0>: unexpected eof (betterAsyncore:119) > Daniel, could you please remind me the jenkins link? I see something suspicious on the Vdsm log. Most notably, Vdsm received SIGTERM. Is this expected and part of the test? > > > > This issue means that the client closed connection while vdsm was > replying. It can happen at any time > when the client is not nice with the connection. As you can see the > client connected locally '::1'. > > > > > Also, when looking at the MOM logs, I see the the following: > > > > 2017-01-09 16:43:39,508 - mom.vdsmInterface - ERROR - Cannot > connect to > > VDSM! [Errno 111] Connection refused > > > > Looking at the log at this time vdsm had no open socket. > > Correct, but IIRC we have a race on startup - that's the reason why MOM retries to connect. After the new try, MOM seems to behave correctly: 2017-01-09 16:44:05,672 - mom.RPCServer - INFO - ping() 2017-01-09 16:44:05,673 - mom.RPCServer - INFO - getStatistics() -- Francesco Romani Red Hat Engineering Virtualization R & D IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [Call for Vote] Proposal for new Incubator Project
ial oVirt hosting. The > hosting of oVirt v3 for testing is a laptop on a UPS at my home, and > v4 is also a different pc attached to a UPS. > > External Dependencies > > RHEVM/oVirt REST API - This provider must interact with the API itself > to manage virtual machines. > > Initial Committers > > Marcus Young ( 3vilpenguin at gmail dot com ) > > -- > Brian Proffitt > Principal Community Analyst > Open Source and Standards > @TheTechScribe > 574.383.9BKP > ___ > Board mailing list > bo...@ovirt.org > http://lists.ovirt.org/mailman/listinfo/board > > > ___ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [Call for Vote] moVirt as a Full oVirt Project
- Original Message - > From: "Nir Soffer" > To: "Michal Skrivanek" > Cc: "devel" , "board" > Sent: Monday, November 21, 2016 10:00:05 PM > Subject: Re: [ovirt-devel] [Call for Vote] moVirt as a Full oVirt Project > > On Mon, Nov 21, 2016 at 9:52 PM, Michal Skrivanek > wrote: > > > > > >> On 21 Nov 2016, at 19:48, Vojtech Szocs wrote: > >> > >> > >> > >> - Original Message - > >>> From: "Eyal Edri" > >>> To: "Vojtech Szocs" > >>> Cc: "Barak Korren" , "devel" , > >>> "board" , "Michal Skrivanek" > >>> > >>> Sent: Monday, November 21, 2016 7:23:44 PM > >>> Subject: Re: [ovirt-devel] [Call for Vote] moVirt as a Full oVirt Project > >>> > >>>> On Mon, Nov 21, 2016 at 8:17 PM, Vojtech Szocs > >>>> wrote: > >>>> > >>>> > >>>> > >>>> - Original Message - > >>>>> From: "Barak Korren" > >>>>> To: "Brian Proffitt" > >>>>> Cc: "Michal Skrivanek" , bo...@ovirt.org, "devel" > >>>>> < > >>>> devel@ovirt.org> > >>>>> Sent: Monday, November 21, 2016 7:01:08 PM > >>>>> Subject: Re: [ovirt-devel] [Call for Vote] moVirt as a Full oVirt > >>>>> Project > >>>>> > >>>>> -1 > > > > I wonder if 8x +1 beats one -1 :) > > 9X :-) > > +1 for including the project as is. Same, +1 for moVirt > If someone wants to run the project test or build it, the right way > to vote is by sending patches and making it happen. > > I think we should get out of our gerrit silo and move all ovirt > projects to github. This will give ovirt much better visibility. > > Here are some projects developed on github: > https://github.com/systemd/systemd > https://github.com/rust-lang/rust/ > https://github.com/kubernetes/kubernetes I have a very similar opinion. *Development* could move more towards github. If nothing else, this gives much better visibility and lowers the barrier to contribution. This doesn't mean we should shut down the existing infra: it fully makes sense to have CI, releases and final docs on ovirt.org, though. 
-- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
[ovirt-devel] [vdsm] Requiring virt-preview again on Fedora 24
Hi all, For the upcoming oVirt 4.1 release we are going to use features available only in recent libvirt (>= 2.0.0). Those features are available on CentOS/RHEL, but Fedora 24 was left behind again. For this reason we are going to require the virt-preview repo again [1]. This patch enables it on our CI: https://gerrit.ovirt.org/#/c/66308/1 Comments welcome. [1] https://fedoraproject.org/wiki/Virtualization_Preview_Repository -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
Re: [ovirt-devel] [vdsm] Connection refused when talking to jsonrpc
- Original Message - > From: "Piotr Kliczewski" > To: "Martin Sivak" > Cc: "Michal Skrivanek" , "Francesco Romani" > , "Shira Maximov" > , "devel" > Sent: Wednesday, November 9, 2016 9:54:02 AM > Subject: Re: [ovirt-devel] [vdsm] Connection refused when talking to jsonrpc > > On Wed, Nov 9, 2016 at 9:48 AM, Martin Sivak wrote: > > > > Isn’t the most likely cause by far a simple startup delay? We do open > > the listener “soon” and respond with code 99, but it’s still not instant > > of course > > > > That is possible of course and we handle those "errors" just fine. But > > connection refused never happened with xmlrpc. It might have been > > luck, but it always worked there :) > > > > > There is no difference in how we open the listening socket (it is used by both > protocols) and I have seen the engine attempting to connect using both > protocols > before the socket was open. What is the time difference that you see? Not sure I reproduced this correctly, but it seems to be a race on startup. I got the same error on my box, and here it happens if MOM tries to connect to the unix socket /var/run/vdsm/mom-vdsm.sock *before* Vdsm creates it. Once vdsmd successfully starts, a restart of mom-vdsm seems to fix all the issues. I'm not sure yet if that's all of it, nor how to handle it with systemd dependencies. Perhaps Nir's suggestion earlier in the thread to notify systemd is a good first step in the right direction. HTH, -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
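The "notify systemd" suggestion amounts to making vdsmd a Type=notify unit that announces readiness only after its sockets exist, so dependent units like mom-vdsm are started at the right time. A minimal sketch of the sd_notify(3) protocol in pure Python (assumption: no python-systemd dependency; a real service would more likely use systemd's own library):

```python
import os
import socket

def sd_notify(state="READY=1"):
    """Send a readiness notification to systemd (for Type=notify units).

    Returns True when a notification was sent, False when not running
    under a systemd notify socket. Minimal sketch of the sd_notify(3)
    datagram protocol; real services may use python-systemd instead.
    """
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False  # not started by systemd with NotifyAccess
    if addr.startswith("@"):
        # Abstract namespace socket: leading '@' maps to a NUL byte.
        addr = "\0" + addr[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(state.encode(), addr)
    finally:
        sock.close()
    return True
```

The service would call `sd_notify()` only after binding /var/run/vdsm/mom-vdsm.sock, closing the window in which MOM can see "Connection refused".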
Re: [ovirt-devel] [vdsm] Connection refused when talking to jsonrpc
- Original Message - > From: "Piotr Kliczewski" > To: "Martin Perina" > Cc: "Francesco Romani" , "Shira Maximov" > , "devel" > Sent: Tuesday, November 8, 2016 9:31:43 PM > Subject: Re: [vdsm] Connection refused when talking to jsonrpc > >> 2016-11-08 18:25:30,030 - mom - ERROR - Failed to initialize MOM threads > >> Traceback (most recent call last): > >> File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in > >> run > >> hypervisor_iface = self.get_hypervisor_interface() > >> File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, > >> in get_hypervisor_interface > >> return module.instance(self.config) > >> File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/v > >> dsmjsonrpcbulkInterface.py", > >> line 47, in instance > >> return JsonRpcVdsmBulkInterface() > >> File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/v > >> dsmjsonrpcbulkInterface.py", > >> line 29, in __init__ > >> super(JsonRpcVdsmBulkInterface, self).__init__() > >> File "/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/v > >> dsmjsonrpcInterface.py", > >> line 43, in __init__ > >> .orRaise(RuntimeError, 'No connection to VDSM.') > >> File "/usr/lib/python2.7/site-packages/mom/optional.py", line 28, in > >> orRaise > >> raise exception(*args, **kwargs) > >> RuntimeError: No connection to VDSM. > >> > >> > >> The question here is, how much time does VDSM need to allow jsonrpc to > >> connect and request a ping and list of VMs? > >> > >> > It depends on recovery logic in vdsm and it can take quite some time. > > Please share vdsm logs so I could take a look? +1 the most likely cause is the recovery still in progress, however I was expecting a different error, so worth looking at the logs. Bests, -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [VDSM] Tests failing because of ordering dependencies
- Original Message - > From: "Nir Soffer" > To: "devel" , "Adam Litke" , "Dan > Kenigsberg" , "Francesco > Romani" , "Piotr Kliczewski" , > "Martin Sivak" > Sent: Wednesday, October 12, 2016 8:14:11 PM > Subject: [VDSM] Tests failing because of ordering dependencies > > Hi all, > > Trying to run vdsm tests via tox (so correct nose is used automatically), > some of the tests fail. > > The failure are all about ordering expectations, which look wrong. > > Please check and fix your tests. > > Thanks, > Nir > 18:04:10 >> begin captured logging << > > 18:04:10 2016-10-12 18:01:56,902 INFO(MainThread) [MOM] Preparing > MOM interface (momIF:49) > 18:04:10 2016-10-12 18:01:56,903 INFO(MainThread) [MOM] Using > named unix socket /tmp/tmpqOQZvm/test_mom_vdsm.sock (momIF:58) > 18:04:10 - >> end captured logging << > - > 18:04:10 > 18:04:10 > == > 18:04:10 FAIL: test_disk_virtio_cache (vmStorageTests.DriveXMLTests) > 18:04:10 > -- > 18:04:10 Traceback (most recent call last): > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/vmStorageTests.py", > line 84, in test_disk_virtio_cache > 18:04:10 self.check(vm_conf, conf, xml, is_block_device=False) > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/vmStorageTests.py", > line 222, in check > 18:04:10 self.assertXMLEqual(drive.getXML().toxml(), xml) > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/testlib.py", > line 253, in assertXMLEqual > 18:04:10 (actualXML, expectedXML)) > 18:04:10 AssertionError: XMLs are different: > 18:04:10 Actual: > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 54-a672-23e5b495a9ea > 18:04:10 io="threads" name="qemu" type="qcow2" /> > 18:04:10 > 18:04:10 800 > 18:04:10 612 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 Expected: > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 54-a672-23e5b495a9ea > 18:04:10 io="threads" name="qemu" type="qcow2" /> > 18:04:10 > 18:04:10 612 > 
18:04:10 800 > > Order of these elements differ, need to check why. Most likely because we build the xml like (vdsm/virt/vmdevices/storage.py) def _getIotuneXML(self): iotune = vmxml.Element('iotune') for key, value in self.specParams['ioTune'].iteritems(): iotune.appendChildWithArgs(key, text=str(value)) return iotune and iteritems() ordering is not defined > -- > 18:04:10 Traceback (most recent call last): > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/vmTests.py", > line 434, in testCpuXML > 18:04:10 self.assertXMLEqual(find_xml_element(xml, "./cputune"), > cputuneXML) > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/testlib.py", > line 253, in assertXMLEqual > 18:04:10 (actualXML, expectedXML)) > 18:04:10 AssertionError: XMLs are different: > 18:04:10 Actual: > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 Expected: > 18:04:10 > 18:04:10 > 18:04:10 > > Same (vdsm/virt/vmxml.py) def appendCpus(self): # lot of code # CPU-pinning support # see http://www.ovirt.org/wiki/Features/Design/cpu-pinning if 'cpuPinning' in self.conf: cputune = Element('cputune') cpuPinning = self.conf.get('cpuPinning') for cpuPin in cpuPinning.keys(): cputune.appendChildWithArgs('vcpupin', vcpu=cpuPin, cpuset=cpuPinning[cpuPin]) self.dom.appendChild(cputune) > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > 18:04:10 > == > 18:04:10 FAIL: testSetIoTune (vmTests.TestVm) > 18:04:10 > -- > 18:04:10 Traceback (most recent call last): > 18:04:10 File > "/home/jenkins/workspace/vdsm_master_check-p
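The fix for this class of failure is to emit the child elements in a deterministic order instead of relying on dict iteration order. A sketch of the idea, using xml.etree as a stand-in for vdsm's vmxml helpers (the function name here is hypothetical, not vdsm's actual code):

```python
import xml.etree.ElementTree as ET

def iotune_xml(iotune_params):
    """Build an <iotune> element with children in sorted key order, so
    the serialized XML is deterministic regardless of dict iteration
    order. (Hypothetical stand-in for Drive._getIotuneXML, which
    iterated the dict in undefined order.)"""
    elem = ET.Element("iotune")
    for key, value in sorted(iotune_params.items()):
        ET.SubElement(elem, key).text = str(value)
    return ET.tostring(elem)

# Same output whatever the insertion order:
a = iotune_xml({"read_bytes_sec": 800, "total_bytes_sec": 612})
b = iotune_xml({"total_bytes_sec": 612, "read_bytes_sec": 800})
```

Alternatively, the test helper (assertXMLEqual) could be made order-insensitive, but sorting at build time also makes the XML handed to libvirt reproducible, which helps debugging.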
Re: [ovirt-devel] [vdsm] exploring a possible integration between collectd and Vdsm
- Original Message - > From: "Francesco Romani" > To: "devel" > Sent: Wednesday, October 12, 2016 9:29:37 AM > Subject: Re: [ovirt-devel] [vdsm] exploring a possible integration between > collectd and Vdsm > > - Original Message - > > From: "Yaniv Kaul" > > To: "Francesco Romani" > > Cc: "devel" > > Sent: Tuesday, October 11, 2016 10:31:14 PM > > Subject: Re: [ovirt-devel] [vdsm] exploring a possible integration between > > collectd and Vdsm > > > > On Tue, Oct 11, 2016 at 2:05 PM, Francesco Romani > > wrote: > > > > > Hi all, > > > > > > In the last 2.5 days I was exploring if and how we can integrate collectd > > > and Vdsm. > [...] > > This generally sounds like a good idea - and I hope it is coordinated with > > our efforts for monitoring (see [1], [2]). > > Sure it will. Sure it will be coordinated with the monitoring efforts*. -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] exploring a possible integration between collectd and Vdsm
- Original Message - > From: "Yaniv Kaul" > To: "Francesco Romani" > Cc: "devel" > Sent: Tuesday, October 11, 2016 10:31:14 PM > Subject: Re: [ovirt-devel] [vdsm] exploring a possible integration between > collectd and Vdsm > > On Tue, Oct 11, 2016 at 2:05 PM, Francesco Romani > wrote: > > > Hi all, > > > > In the last 2.5 days I was exploring if and how we can integrate collectd > > and Vdsm. [...] > This generally sounds like a good idea - and I hope it is coordinated with > our efforts for monitoring (see [1], [2]). Sure it will. I played a couple of days with collectd to "just" see: a. how hard it is to write a collectd plugin, and/or if it is feasible to ship it out-of-tree for the initial few releases, until it stabilizes so it can be submitted upstream b. if we can get events/notifications from collectd c. if we can integrate those notifications with Vdsm It turns out we *can* do all of the above, with varying degrees of difficulty. > Few notes: > - I think the most compelling reason to move to collectd is actually to > benefit from the existing plugins that it already has, which will cover > a lot of the missing monitoring requirements and wishes we have (example: > local disk usage on the host), as well as integrate it into > Engine monitoring (example: postgresql performance monitoring). Agreed. > - You can't remove monitoring from VDSM - as a new VDSM may work against > older Engine setups. You can gradually remove them. Yes, for example we can make Vdsm poll collectd and act as a facade to old Engines, while new ones should skip this step and ask collectd or the metrics aggregator service you mention below. > I'd actually begin with cleanup - there are some 'metrics' that are simply > not needed and should not be reported in the first place and > are there for historic reasons only. Remove them - from Engine first, from > the DB and all, then later we can either send fake values or remove > from VDSM.
Yes, this is the first place where we need to coordinate with the metrics effort. > - If you are moving to collectd, as you can see from the metrics effort, > we'd really want to send it elsewhere - and Engine should consume it from > there. > Metrics storages usually have a very nice REST UI with the ability to bring > series with average, with different criteria (such as per hour, per minute > or what not stats), etc. Fully agreed. > - I agree with Nir about separating between our core business and the > monitoring we do for extra. Keep in mind that some of the stats are for SLA > and critical scheduling decisions as well. Yes, of course adding a dependency for core monitoring is risky. So far the bottom line is that relying on collectd for this is just one more option on the table now. [mostly brainstorming from now on] However, I'd like to highlight that it is not just risky: it is a different tradeoff. Doing the core monitoring in Vdsm (so in Python, essentially in a single-threaded server) is not a free lunch, because it carries a high performance cost. If the main Vdsm process is overloaded, the polling cycle can get longer, and the overall response time for processing system events (e.g. disk detected full) can get longer as well. We have observed, in the not-so-distant past, high response times from a heavily loaded Vdsm. I think the idea of having different instances for different monitoring purposes (credit to Nir) is the best shot at the moment. We could maybe have one standard system collectd for regular monitoring, and perhaps one special-purpose, very limited collectd instance for critical information. On top of that, Vdsm could double-check and keep doing the core monitoring itself, albeit at a lower rate (e.g. every 10s instead of every 2s; every 60s instead of every 15s).
Leveraging libvirt events is *the* right thing, no doubt about that, but it would be very nice to have a dependable external service which can generate the events we need based on libvirt data, and move the notification logic on top of it. Something like (final picture, excluding intermediate compatibility layers):

[data source] --> [monitoring/metrics collection] -+--> [metrics store] -{data}-> [Engine]
                                                   |
                                                   `--> [notification service] -{events}-> [Vdsm]

Not all the "boxes" need to be separate processes; for example collectd has some support for thresholds and notifications which is ~80% of what Vdsm needs (again not considering reliability, just feature-wise). [end brainstorm] > - The libvirt collectd plugin is REALLY outdated. I think it may require > significant work t
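For reference, collectd's threshold support mentioned above is configured roughly like the drop-in below (the filename is hypothetical, and the exact syntax varies between collectd versions). Note that the bounds are constants, which is exactly the limitation cited for capacity-vs-allocation (high water mark) checks:

```
# Hypothetical drop-in: /etc/collectd.d/vdsm-thresholds.conf
LoadPlugin threshold

<Plugin "threshold">
  <Plugin "virt">
    <Type "disk_octets">
      # Only constant bounds are supported; a value cannot be
      # compared against another metric (capacity vs allocation).
      WarningMax 100000000
      Persist false
    </Type>
  </Plugin>
</Plugin>
```

When a sample crosses the bound, collectd emits a notification that plugins (or an out-of-tree consumer such as Vdsm) can subscribe to.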
[ovirt-devel] [vdsm] exploring a possible integration between collectd and Vdsm
Hi all,

In the last 2.5 days I was exploring if and how we can integrate collectd and Vdsm. The final picture could look like:
1. collectd does all the monitoring and reporting Vdsm currently does
2. Engine consumes data from collectd
3. Vdsm consumes *notifications* from collectd - for a few but important tasks like drive high water mark monitoring

Benefits (aka: why bother?):
1. less code in Vdsm / long-awaited modularization of Vdsm
2. better integration with the system, reuse of well-known components
3. more flexibility in monitoring/reporting: collectd is an existing, special-purpose solution
4. faster, more scalable operation, because all the monitoring can be done in C

At first glance, collectd seems to have all the tools we need:
1. a plugin interface (https://collectd.org/wiki/index.php/Plugin_architecture and https://collectd.org/wiki/index.php/Table_of_Plugins)
2. support for notifications and thresholds (https://collectd.org/wiki/index.php/Notifications_and_thresholds)
3. a libvirt plugin: https://collectd.org/wiki/index.php/Plugin:virt

So the picture is:
1. we start requiring collectd as a dependency of Vdsm
2. we either configure it appropriately (collectd supports config drop-ins: /etc/collectd.d) or we document our requirements (or both)
3. collectd monitors the hosts and libvirt
4. Engine polls collectd
5. Vdsm listens for notifications

Should libvirt deliver the event we need (see https://bugzilla.redhat.com/show_bug.cgi?id=1181659), we can just stop using collectd notifications; everything else keeps working as before.

Challenges:
1. collectd does NOT consider the plugin API stable (https://collectd.org/wiki/index.php/Plugin_architecture#The_interface.27s_stability), so plugins should be included in the main tree, much like the modules of the Linux kernel. Worth mentioning that the plugin API itself has a good deal of rough edges.
We will need to maintain this plugin ourselves, *and* we need to maintain our thin API layer, to make sure the plugin loads and works with recent versions of collectd.
2. the virt plugin is out of date and doesn't report some data we need: see https://github.com/collectd/collectd/issues/1945
3. the notification message(s) are tailored for human consumption; those messages are not easy to parse for machines.
4. the threshold support in collectd seems to match values against constants; it doesn't seem possible to match a value against another one, as we need to do for high water monitoring (capacity vs allocation).

How I'm addressing, or plan to address, those challenges (aka action items):
1. I've been experimenting with out-of-tree plugins, and I managed to develop, build, install and run one out-of-tree plugin: https://github.com/mojaves/vmon/tree/master/collectd The development pace of collectd looks sustainable, so this doesn't look like such a big deal. Furthermore, we can engage with upstream to merge our plugins, either as-is or by extending existing ones.
2. write another collectd plugin based on the Vdsm python code and/or my past accelerator executable project (https://github.com/mojaves/vmon)
3. patch the collectd notification code. It is yet another plugin. OR
4. send notifications from the new virt module as per #2, bypassing the threshold system. This move could preclude the new virt module from being merged in the collectd tree.

Current status of the action items:
1. done, BUT PoC quality
2. to be done (more work than #1 / possible dupe with the github issue)
3. needs more investigation, conflicts with #4
4. needs more investigation, conflicts with #3

All the code I'm working on can be found at https://github.com/mojaves/vmon

Comments are appreciated.

-- Francesco Romani RedHat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
Re: [ovirt-devel] [vdsm] branch ovirt-4.0.5 created
- Original Message - > From: "Dan Kenigsberg" > To: "Francesco Romani" > Cc: "Nir Soffer" , devel@ovirt.org > Sent: Monday, October 10, 2016 5:11:26 PM > Subject: Re: [vdsm] branch ovirt-4.0.5 created > > On Mon, Oct 10, 2016 at 10:30:49AM -0400, Francesco Romani wrote: > > Hi everyone, > > > > this time I choose to create the ovirt-4.0.5 branch. > > I already merged some patches for 4.0.6. > > > > Unfortunately I branched a bit too early (from last tag :)) > > > > So patches > > https://gerrit.ovirt.org/#/c/65303/1 > > https://gerrit.ovirt.org/#/c/65304/1 > > https://gerrit.ovirt.org/#/c/65305/1 > > > > Should be trivially mergeable - the only thing changed from ovirt-4.0 > > counterpart > > is the change-id. Please have a quick look just to doublecheck. > > Change-Id should be the same for a master patch and all of its backport. > It seems that it was NOT changed, at least for > https://gerrit.ovirt.org/#/q/I5cea6ec71c913d74d95317ff7318259d64b40969 > which is a GOOD thing. Yes, sorry, indeed it is (and indeed it should not change). > I think we want to enable CI on the new 4.0.5 branch, right? Otherwise > we'd need to fake the CI+1 flag until 4.0.5 is shipped. We should, but it is not urgently needed - just regular priority. For the aforementioned first three patches especially I'm just overly cautious. -- Francesco Romani Red Hat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
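The Change-Id rule above is exactly what `git cherry-pick -x` gives you for free: it copies the commit message, including the Change-Id trailer, onto the release branch, so Gerrit keeps the backport linked to its master patch. A throwaway-repo demo (all file and branch names are made up for the demo; the Change-Id is the one from this thread):

```shell
#!/bin/sh
# Demo: cherry-picking preserves the Gerrit Change-Id trailer,
# which is how Gerrit links a backport to its master patch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com
echo base > f.txt
git add f.txt
git commit -q -m "initial"
git branch ovirt-4.0.5             # release branch forks here
echo fix >> f.txt
git commit -q -a -m "core: fix something

Change-Id: I5cea6ec71c913d74d95317ff7318259d64b40969"
sha=$(git rev-parse HEAD)
git checkout -q ovirt-4.0.5
git cherry-pick -x "$sha"          # -x records the original sha too
git log -1 --format=%B | grep Change-Id
```

The backport commit keeps the same Change-Id and additionally records "(cherry picked from commit ...)", which is useful when auditing a release branch.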
Re: [ovirt-devel] Outreachy internship
- Original Message - > From: "Саша Ершова" > To: devel@ovirt.org > Sent: Wednesday, October 5, 2016 7:40:21 PM > Subject: [ovirt-devel] Outreachy internship > > Dear all, > My name is Alexandra Ershova, and I'm a student in Natural Language > Processing in Higher School of Economics, Moscow, Russia. I'd like to take > part in the current round of Outreachy internships. My main programming > language is Python (I have experience with both 2 and 3). Writing system > tests seems like an interesting project to me, and I would like to do it. > Could you please give me an application task, so that I could make my first > contribution? Hello Alexandra, thanks for your interest in oVirt! In addition to what Yaniv already outlined, did you manage to run the Vdsm test suite? A good first step could indeed be to make sure the lago environment is up and running and that it can run the oVirt system tests. Feel free to file issues and/or ask for help on the devel@ovirt.org mailing list. Once you have played a bit with lago, it is a good idea to introduce yourself here; you can find a broader audience and even more mentors for lago itself (the lago developers hang around on that ML). I recommend working on a CentOS 7.2 or Fedora 24 system, either real or virtualized. Should you have any questions, feel free to post a message on devel@ovirt.org, or ping me on IRC (fromani on the #vdsm channel on freenode). Please note the following assumes you have an oVirt installation (of any kind) and basic knowledge of the architecture. The application task for the idea you expressed interest in is: 1. write a system test to make sure one host has the 'ovirtmgmt' network available (defined, and running). Another way to check this is to make sure the host has one active NIC which is part of the aforementioned network. You can check this from the oVirt Engine webadmin UI: select the "host" panel, then check the "network interfaces" subtab in the lower portion of the screen.
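For the "defined, and running" part of the task, one low-level way to probe from the host side is the kernel's sysfs view of network devices. A minimal sketch (the function name is made up; a real oVirt system test would more typically assert this through the Engine REST API or the SDK, as the webadmin hint above suggests):

```python
import os

def bridge_is_active(name="ovirtmgmt", sysfs_root="/sys/class/net"):
    """Report whether a host network device exists and its link is up,
    by reading /sys/class/net/<name>/operstate. Sketch only; not the
    lago/ovirt-system-tests idiom."""
    path = os.path.join(sysfs_root, name)
    if not os.path.isdir(path):
        return False  # network device not defined on this host
    try:
        with open(os.path.join(path, "operstate")) as f:
            return f.read().strip() == "up"
    except IOError:
        return False
```

Parametrizing `sysfs_root` keeps the check unit-testable without a real ovirtmgmt bridge on the test machine.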
Please note the above is a VERY terse introduction. You will likely need clarifications; you are more than welcome to ask for them via mail (here) and/or on the #vdsm IRC channel on the freenode network. -- Francesco Romani RedHat Engineering Virtualization R & D Phone: 8261328 IRC: fromani
[ovirt-devel] [vdsm] branch ovirt-4.0.5 created
Hi everyone, this time I chose to create the ovirt-4.0.5 branch. I already merged some patches for 4.0.6. Unfortunately I branched a bit too early (from the last tag :)) So patches https://gerrit.ovirt.org/#/c/65303/1 https://gerrit.ovirt.org/#/c/65304/1 https://gerrit.ovirt.org/#/c/65305/1 should be trivially mergeable - the only thing changed from their ovirt-4.0 counterparts is the change-id. Please have a quick look just to double-check. Patches https://gerrit.ovirt.org/#/c/65306/1 https://gerrit.ovirt.org/#/c/65307/1 are ready anytime once the three above are merged. They are very simple and safe. Bests, -- Francesco Romani RedHat Engineering Virtualization R & D Phone: 8261328 IRC: fromani