Debugging stuck vdsm jobs
Hi all,

We had 2 issues causing vdsm check-patch and check-merge jobs to get stuck.

I fixed the one that caused the most trouble:
https://gerrit.ovirt.org/57993

The other issue may be related to ioprocess. I fixed a related issue:
https://gerrit.ovirt.org/57473

But I have seen stuck jobs after this change, so the issue may not be fixed yet.

If you see a stuck vdsm job (a job that runs for more than 15 minutes), please get me a backtrace:

1. Locate the testrunner.py process pid:

    $ ps aux | grep testrunner.py | grep -v grep
    nsoffer 26297 82.6 0.9 389592 44 pts/3 R+ 22:52 0:02 /usr/bin/python ../tests/testrunner.py ...

2. Save a backtrace:

    $ gdb attach 26297 --batch -ex "thread apply all py-bt" > py-bt.out

Thanks,
Nir

___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra
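The two steps above can be scripted. The following is only a rough sketch, not something from the original mail: the function names and the per-pid output file name are made up for illustration, it attaches with gdb's -p option rather than the "gdb attach" form used above, and it assumes gdb and the Python gdb extensions that provide py-bt are installed on the slave.

```python
import subprocess


def find_pids(ps_output, needle="testrunner.py"):
    """Return the pids of processes whose `ps aux` line mentions needle."""
    pids = []
    for line in ps_output.splitlines():
        # `ps aux` prints 10 fixed columns followed by the command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or needle not in fields[10]:
            continue
        if fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


def gdb_command(pid):
    """Build a gdb invocation that dumps Python backtraces of all threads."""
    return ["gdb", "-p", str(pid), "--batch", "-ex", "thread apply all py-bt"]


def save_backtraces(out_template="py-bt.{pid}.out"):
    """Write one backtrace file per running testrunner.py process.

    The file name template is hypothetical; adjust it as needed.
    """
    ps = subprocess.run(["ps", "aux"], capture_output=True, text=True, check=True)
    for pid in find_pids(ps.stdout):
        with open(out_template.format(pid=pid), "w") as out:
            subprocess.run(gdb_command(pid), stdout=out, stderr=subprocess.STDOUT)
```

Calling save_backtraces() on the slave running the stuck job would leave one py-bt.<pid>.out file per testrunner.py process. Attaching usually requires running as the same user as the job, or as root.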
[oVirt Jenkins] ovirt-engine_master_upgrade-from-master_el7_merged - Build # 417 - Failure!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/417/
Build Number: 417
Build Status: Failure
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58064

Changes Since Last Success:
Changes for Build #417
No changes

Failed Tests:
No tests ran.
Appliance job build failure because of ovirt-3.6-epel
Hey,

the 3.6 job completes, but without an engine:

http://jenkins.ovirt.org/user/fabiand/my-views/view/appliance/job/ovirt-appliance_ovirt-3.6_build-artifacts-el7-x86_64/lastSuccessfulBuild/artifact/exported-artifacts/anaconda.log/*view*/

The problem should also be present in any following job until it's fixed :)

16:01:06,691 INFO program: + yum install -y ovirt-engine
16:01:06,691 INFO program: Loaded plugins: fastestmirror
16:01:06,692 INFO program: http://download.fedoraproject.org/pub/epel/7/x86_64/repodata/55d4bcbc6bcd8727167925d216c94c7f5217b921d892da747b84d079c5905a7b-updateinfo.xml.bz2: [Errno 14] HTTP Error 404 - Not Found
16:01:06,693 INFO program: Trying other mirror.
16:01:06,694 INFO program: To address this issue please refer to the below knowledge base article
16:01:06,694 INFO program:
16:01:06,695 INFO program: https://access.redhat.com/articles/1320623
16:01:06,695 INFO program:
16:01:06,695 INFO program: If above article doesn't help to resolve this issue please create a bug on https://bugs.centos.org/
16:01:06,696 INFO program:
16:01:06,697 INFO program: http://download.fedoraproject.org/pub/epel/7/x86_64/repodata/3abc3e70be643a17bb37e3f3e1dd057d8c6242c579412fc50de180b9882e0a99-primary.sqlite.xz: [Errno 14] HTTP Error 404 - Not Found
16:01:06,699 INFO program: Trying other mirror.
16:01:06,699 INFO program: Determining fastest mirrors
16:01:06,700 INFO program: * base: centos-distro.cavecreek.net
16:01:06,701 INFO program: * extras: centos.host-engine.com
16:01:06,701 INFO program: * updates: mirror.n5tech.com
16:01:06,702 INFO program: http://download.fedoraproject.org/pub/epel/7/x86_64/repodata/3abc3e70be643a17bb37e3f3e1dd057d8c6242c579412fc50de180b9882e0a99-primary.sqlite.xz: [Errno 14] HTTP Error 404 - Not Found
16:01:06,702 INFO program: Trying other mirror.
16:01:06,702 INFO program: http://download.fedoraproject.org/pub/epel/7/x86_64/repodata/3abc3e70be643a17bb37e3f3e1dd057d8c6242c579412fc50de180b9882e0a99-primary.sqlite.xz: [Errno 14] HTTP Error 404 - Not Found
16:01:06,703 INFO program: Trying other mirror.
16:01:06,703 INFO program:
16:01:06,703 INFO program:
16:01:06,704 INFO program: One of the configured repositories failed (Extra Packages for Enterprise Linux 7 - x86_64),
16:01:06,705 INFO program: and yum doesn't have enough cached data to continue. At this point the only
16:01:06,706 INFO program: safe thing yum can do is fail. There are a few ways to work "fix" this:
16:01:06,706 INFO program:
16:01:06,706 INFO program: 1. Contact the upstream for the repository and get them to fix the problem.
16:01:06,707 INFO program:
16:01:06,707 INFO program: 2. Reconfigure the baseurl/etc. for the repository, to point to a working
16:01:06,707 INFO program: upstream. This is most often useful if you are using a newer
16:01:06,708 INFO program: distribution release than is supported by the repository (and the
16:01:06,708 INFO program: packages for the previous distribution release still work).
16:01:06,708 INFO program:
16:01:06,708 INFO program: 3. Disable the repository, so yum won't use it by default. Yum will then
16:01:06,709 INFO program: just ignore the repository until you permanently enable it again or use
16:01:06,709 INFO program: --enablerepo for temporary usage:
16:01:06,709 INFO program:
16:01:06,710 INFO program: yum-config-manager --disable ovirt-3.6-epel

- fabian

--
Fabian Deutsch
RHEV Hypervisor
Red Hat
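To see quickly which repository files fail in a log like the one above, the 404 lines can be collected and counted. This is a minimal sketch that assumes the anaconda.log line format shown in this mail; the function name and the regular expression are illustrative and not part of any oVirt tooling.

```python
import re

# Matches lines such as:
#   16:01:06,692 INFO program: http://host/path/file.xz: [Errno 14] HTTP Error 404 - Not Found
FAILED_FETCH = re.compile(
    r"INFO program: (?P<url>https?://\S+): \[Errno 14\] HTTP Error (?P<status>\d{3})"
)


def failed_fetches(log_text):
    """Map each failing URL to a (HTTP status, number of attempts) tuple."""
    result = {}
    for match in FAILED_FETCH.finditer(log_text):
        url = match.group("url")
        status = int(match.group("status"))
        _, count = result.get(url, (status, 0))
        result[url] = (status, count + 1)
    return result
```

Run against the log above, this should report two URLs under the EPEL repodata directory, which points at stale repository metadata rather than at a problem in the ovirt-engine package itself.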
Re: Maintenance on the Mailing-Lists
Yes, copy.

On 26 May 2016 17:11, "Marc Dequènes (Duck)" wrote:
>
> On 05/26/2016 11:57 PM, Marc Dequènes (Duck) wrote:
> > Quack,
> >
> > Changes done. Do you copy?
>
> second call for test, sorry for the noise.
Re: Maintenance on the Mailing-Lists
On 05/26 23:57, Marc Dequènes (Duck) wrote:
> Quack,
>
> Changes done. Do you copy?

Yep, copy (and good signature too)

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dc...@redhat.com
IRC: dcaro|dcaroest@{freenode|oftc|redhat}
Web: www.redhat.com
RHT Global #: 82-62605
Re: Maintenance on the Mailing-Lists
On 05/26/2016 11:57 PM, Marc Dequènes (Duck) wrote:
> Quack,
>
> Changes done. Do you copy?

second call for test, sorry for the noise.
Re: Maintenance on the Mailing-Lists
Quack,

Changes done. Do you copy?
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 26 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/26/
Build Number: 26
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58139

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Changes for Build #22
No changes

Changes for Build #23
[Tomas Jelinek] userportal: New VM dialog offers each VM template twice

Changes for Build #24
[Martin Sivak] Fix a policy unit db upgrade script according to oVirt style rules

Changes for Build #25
[Allon Mureinik] core: GetAllAttachableDisksForVmQuery branching

Changes for Build #26
[Allon Mureinik] core: GetAllAttachableDisksForVmQuery's DbFacade

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 25 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/25/
Build Number: 25
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58138

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Changes for Build #22
No changes

Changes for Build #23
[Tomas Jelinek] userportal: New VM dialog offers each VM template twice

Changes for Build #24
[Martin Sivak] Fix a policy unit db upgrade script according to oVirt style rules

Changes for Build #25
[Allon Mureinik] core: GetAllAttachableDisksForVmQuery branching

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 24 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/24/
Build Number: 24
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51361

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Changes for Build #22
No changes

Changes for Build #23
[Tomas Jelinek] userportal: New VM dialog offers each VM template twice

Changes for Build #24
[Martin Sivak] Fix a policy unit db upgrade script according to oVirt style rules

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 23 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/23/
Build Number: 23
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58076

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Changes for Build #22
No changes

Changes for Build #23
[Tomas Jelinek] userportal: New VM dialog offers each VM template twice

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-master_el7_merged - Build # 413 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/413/
Build Number: 413
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58076

Changes Since Last Success:
Changes for Build #412
No changes

Changes for Build #413
No changes

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 22 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/22/
Build Number: 22
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58074

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Changes for Build #22
No changes

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-master_el7_merged - Build # 412 - Failure!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-master_el7_merged/412/
Build Number: 412
Build Status: Failure
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58074

Changes Since Last Success:
Changes for Build #412
No changes

Failed Tests:
No tests ran.
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 21 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/21/
Build Number: 21
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/58100

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Changes for Build #21
[Allon Mureinik] core: PKIResources type inference

Failed Tests:
No tests ran.
Re: IPv6 RR disabled on lists.ovirt.org -- WHY???
Hello All.

Based on the forwarded message and the DNS checks I did, everything should be fine. The only caveat is that having a CNAME for the mail server may not be a good idea; it is better to make 'lists' an A and direct record. The reverses are already fine. As I see it, that's the plan, so it looks good.

Once we change it, we can check the headers of the next message after DNS propagation in Google and see whether it treats it as an SPF pass.

Anton.

On Wed, May 25, 2016 at 10:15 AM, Marc Dequènes (Duck) wrote:
> Quack,
>
> Thanks dneary for coming to this thread.
>
> On 05/25/2016 04:24 AM, Karsten Wade wrote:
>
> > How about experimenting and see what happens (SCIENCE!), maybe with a
> > warning to the two main lists (devel, users) in case anything breaks?
>
> I'm in favor of experimenting too.
>
> I see no reason not to have IPv6 working on the machines' services after
> a look at the configurations. The IPv6 reverse is good, we/I only have
> to re-add the direct and reenable Postfix IPv6, and watch :-).
>
> I will do that tomorrow unless someone raises concerns.
>
> Regards.

--
Anton Marchukov
Senior Software Engineer - RHEV CI - Red Hat
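The header check mentioned above could be done on a saved copy of a received message. This is a small sketch under the assumption that the receiving server records its verdict in a Received-SPF header, as many receivers do; the function name is made up for illustration.

```python
from email import message_from_string


def spf_results(raw_message):
    """Return the SPF verdicts (e.g. 'pass', 'softfail') found in Received-SPF headers."""
    msg = message_from_string(raw_message)
    results = []
    for value in msg.get_all("Received-SPF", []):
        words = value.split()
        if words:
            # The verdict is the first word of the header value.
            results.append(words[0].lower())
    return results
```

If the list returned for a message delivered over IPv6 contains only 'pass', the new records are being picked up as intended.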
[oVirt Jenkins] ovirt-engine_master_upgrade-from-4.0_el7_merged - Build # 20 - Still Failing!
Project: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/
Build: http://jenkins.ovirt.org/job/ovirt-engine_master_upgrade-from-4.0_el7_merged/20/
Build Number: 20
Build Status: Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/57799

Changes Since Last Success:
Changes for Build #19
[Tal Nisan] core: Add tests to OvfManagerTest

Changes for Build #20
[Sandro Bonazzola] ovirt-live: add 4.0 branch
[Martin Perina] aaa-ldap: Add 1.1 branch
[Marek Libra] webadmin: Forbid cluster version change if a VM is active

Failed Tests:
No tests ran.
Re: ngn build jobs take more than twice (x) as long as in the last days
On 05/26 10:20, Barak Korren wrote:
> > I agree a stable distributed storage solution is the way to go if we can
> > find one :)
>
> Distributed storage usually suffers from a large overhead because:
> 1. They try to be resilient to node failure, which means keeping two
> or more copies of the same file, which results in I/O overhead.
> 2. They need to coordinate metadata access for large amounts of files.
> Bottlenecks in the metadata management system are a common issue for
> distributed FS storage.
>
> Since most of our data is ephemeral anyway I don't think we need to
> pay this overhead.

The solution for our current temporary ephemeral data would be for each node to create the VMs locally; that's the scratch disks solution we started with.

The distributed storage would be used to store the Jenkins machine templates. Those would mostly be read by the hosts, and so could be cached locally with a low miss rate (as they don't usually change).

That way we would not use the central storage at all; its extra levels of redundancy are only useful for more critical data (that is, production datacenter machines).

> --
> Barak Korren
> bkor...@redhat.com
> RHEV-CI Team

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dc...@redhat.com
IRC: dcaro|dcaroest@{freenode|oftc|redhat}
Web: www.redhat.com
RHT Global #: 82-62605
Re: Maintenance on the Mailing-Lists
On 05/26/2016 03:41 PM, Sandro Bonazzola wrote:
> Note that we are on #ovirt@oftc :-)

Hum :-/. Well I guess they can auto-correct.

So I've asked IT about the DNS change. I also asked for the 'lists' entry to be made A/, as I remember having problems with CNAMEs myself a long time ago, even without SPF involved.

We could also point the MX to 'lists' instead of 'linode1' later. We should not use the 'linode1' name for external services at all.

I'll send a message when it is done and Postfix's config is changed. I'll ping people on IRC to see if you receive it well.

Regards.
Re: ngn build jobs take more than twice (x) as long as in the last days
> I agree a stable distributed storage solution is the way to go if we can
> find one :)

Distributed storage usually suffers from a large overhead because:
1. They try to be resilient to node failure, which means keeping two or more copies of the same file, which results in I/O overhead.
2. They need to coordinate metadata access for large amounts of files. Bottlenecks in the metadata management system are a common issue for distributed FS storage.

Since most of our data is ephemeral anyway, I don't think we need to pay this overhead.

--
Barak Korren
bkor...@redhat.com
RHEV-CI Team