Re: Tests failing because of missing loop devices

2016-01-06 Thread Fabian Deutsch
On Wed, Jan 6, 2016 at 9:33 AM, Eyal Edri  wrote:
> I vaguely remember that restarting the VM helps, but I don't think we
> know the root cause.
>
> Adding Barak to help with the restart.

Right. If it's really about builders, then I'd favor creating a
dedicated builder for Node.
Currently it looks like the loop issue is caused by at least one job,
and at least two jobs are suffering from it (one of them Node).

But we are blocked by this issue, and having a dedicated builder would
help us move forward.

- fabian

> On Jan 6, 2016 10:20 AM, "Fabian Deutsch"  wrote:
>>
>> Hey,
>>
>> our Node Next builds are also failing with some error around loop devices.
>>
>> This worked just before Christmas, but has been failing constantly this
>> year.
>>
>> Is the root cause already known?
>>
>> Ryan and Tolik were looking into this from the Node side.
>>
>> - fabian
>>
>>
>> On Wed, Dec 23, 2015 at 4:52 PM, Nir Soffer  wrote:
>> > On Wed, Dec 23, 2015 at 5:11 PM, Eyal Edri  wrote:
>> >> I'm guessing this will be solved by running it on lago?
>> >> Isn't that what Yaniv is working on now?
>> >
>> > Yes, this may be more stable, but I heard that lago setup takes about
>> > an hour, and the whole run about 3 hours, so a lot of work is needed
>> > until we can use it.
>> >
>> >> or these are unit tests and not functional?
>> >
>> > That's the problem: these tests fail because they do not test our code,
>> > but the integration of our code in the environment. For example, if the
>> > test cannot find an available loop device, the test will fail.
>> >
>> > I think we must move these tests to the integration test package,
>> > which does not run on the CI. These tests can be run only on a VM using
>> > root privileges, and only a single test per VM at a time, to avoid
>> > races when accessing shared resources (devices, network, etc.).
>> >
>> > The best way to run such tests is to start a stateless VM based on a
>> > template that includes all the requirements, so we don't need to pay
>> > for yum install on each test (may take 2-3 minutes).
>> >
>> > Some of our customers are using similar setups. Using such a setup for
>> > our own tests is the best thing we can do to improve the product.
>> >
>> >>
>> >> e.
>> >>
>> >> On Wed, Dec 23, 2015 at 4:48 PM, Dan Kenigsberg 
>> >> wrote:
>> >>>
>> >>> On Wed, Dec 23, 2015 at 03:21:31AM +0200, Nir Soffer wrote:
>> >>> > Hi all,
>> >>> >
>> >>> > We see too many failures of tests using loop devices. Is it possible
>> >>> > that we run tests concurrently on the same slave, using all the
>> >>> > available loop devices, or maybe creating races between different
>> >>> > tests?
>> >>> >
>> >>> > It seems that we need a new decorator for disabling tests on the CI
>> >>> > slaves, since this environment is too fragile.
>> >>> >
>> >>> > Here are some failures:
>> >>> >
>> >>> > 01:10:33 ======================================================================
>> >>> > 01:10:33 ERROR: testLoopMount (mountTests.MountTests)
>> >>> > 01:10:33 ----------------------------------------------------------------------
>> >>> > 01:10:33 Traceback (most recent call last):
>> >>> > 01:10:33   File "/home/jenkins/workspace/vdsm_master_check-patch-fc23-x86_64/vdsm/tests/mountTests.py", line 128, in testLoopMount
>> >>> > 01:10:33     m.mount(mntOpts="loop")
>> >>> > 01:10:33   File "/home/jenkins/workspace/vdsm_master_check-patch-fc23-x86_64/vdsm/vdsm/storage/mount.py", line 225, in mount
>> >>> > 01:10:33     return self._runcmd(cmd, timeout)
>> >>> > 01:10:33   File "/home/jenkins/workspace/vdsm_master_check-patch-fc23-x86_64/vdsm/vdsm/storage/mount.py", line 241, in _runcmd
>> >>> > 01:10:33     raise MountError(rc, ";".join((out, err)))
>> >>> > 01:10:33 MountError: (32, ';mount: /tmp/tmpZuJRNk: failed to setup loop device: No such file or directory\n')
>> >>> > 01:10:33 -------------------- >> begin captured logging << --------------------
>> >>> > 01:10:33 Storage.Misc.excCmd: DEBUG: /usr/bin/taskset --cpu-list 0-1 /sbin/mkfs.ext2 -F /tmp/tmpZuJRNk (cwd None)
>> >>> > 01:10:33 Storage.Misc.excCmd: DEBUG: SUCCESS: <err> = 'mke2fs 1.42.13 (17-May-2015)\n'; <rc> = 0
>> >>> > 01:10:33 Storage.Misc.excCmd: DEBUG: /usr/bin/taskset --cpu-list 0-1 /usr/bin/mount -o loop /tmp/tmpZuJRNk /var/tmp/tmpJO52Xj (cwd None)
>> >>> > 01:10:33 --------------------- >> end captured logging << ---------------------
>> >>> > 01:10:33
>> >>> > 01:10:33
>> >>> > 01:10:33 ======================================================================
>> >>> > 01:10:33 ERROR: testSymlinkMount (mountTests.MountTests)
>> >>> > 01:10:33 ----------------------------------------------------------------------
>> >>> > 
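
The decorator Nir asks for in the quoted mail (disabling environment-dependent tests on the CI slaves) could look roughly like the sketch below. This is not vdsm code: the names `requires`, `requires_loop_device` and `loop_device_available` are made up for illustration, and the check is only best effort. A visible /dev/loop-control or /dev/loopN node does not guarantee that a free loop device can actually be allocated, so races between concurrent jobs on the same slave remain possible.

```python
import functools
import glob
import os
import unittest


def loop_device_available():
    """Best-effort check (assumption): root plus a visible loop node."""
    if os.geteuid() != 0:
        return False
    return (os.path.exists("/dev/loop-control")
            or bool(glob.glob("/dev/loop[0-9]*")))


def requires(predicate, reason):
    """Skip the decorated test unless predicate() is true at run time."""
    def decorator(test):
        @functools.wraps(test)
        def wrapper(*args, **kwargs):
            # Evaluate lazily, so the check reflects the slave's state
            # when the test runs, not when the module is imported.
            if not predicate():
                raise unittest.SkipTest(reason)
            return test(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical convenience decorator for tests such as testLoopMount.
requires_loop_device = requires(
    loop_device_available, "needs root and a loop device")
```

A test decorated with `@requires_loop_device` is then reported as skipped instead of failed when the slave cannot provide the environment, which keeps the failure signal for real regressions.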

Re: FAIL: testEnablePromisc (ipwrapperTests.TestDrvinfo) - bad test? ci issue?

2016-01-06 Thread Dan Kenigsberg
On Wed, Jan 06, 2016 at 09:29:46AM +0200, Edward Haas wrote:
> Hi,
> 
> Strange, the logging below shows the 'promisc on' command was successful.
> Unfortunately, the logs/run/job archive is no longer available.
> 
> The check itself is asymmetric: we set it using iproute2 (command) and
> read it using netlink.
> At the least, we should add some more info on failure (like link state
> details).
> Adding to my TODO.
> 
> Thanks,
> Edy.
> 
> On 01/05/2016 06:10 PM, Nir Soffer wrote:
> > Hi all,
> > 
> > We see this failure again in the ci - can someone from networking take a 
> > look?

I'm guessing that it is a race due to the asynchrony of netlink.
If so, this patch should solve the issue in one location
https://gerrit.ovirt.org/#/c/51410/
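
For illustration only, and not necessarily what the patch above does: if the flag is set through one channel and read back through an asynchronous one, a test can poll for the expected state with a deadline instead of asserting once. The helper name `wait_for` and its defaults are assumptions; `time.monotonic` needs Python 3.3 or later.

```python
import time


def wait_for(condition, timeout=2.0, interval=0.05):
    """Poll condition() until it is truthy or the timeout expires.

    Returns True if the condition became true in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would then assert `wait_for(lambda: link_is_promisc(dev))`, where `link_is_promisc` stands for whatever netlink-based check the test already uses (a hypothetical name here).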

___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: fyi - vdsm check-patch for el7 has been disabled due to tests errors unattended

2016-01-06 Thread Eyal Edri
Seems to help if all the jobs finished successfully.
Can we merge it?

e.

On Wed, Jan 6, 2016 at 3:17 PM, Dan Kenigsberg  wrote:

> On Tue, Jan 05, 2016 at 03:17:54PM +0200, Eyal Edri wrote:
> > same for http://jenkins.ovirt.org/job/vdsm_3.5_check-patch-fc22-x86_64/
> >
> >
> > On Tue, Jan 5, 2016 at 2:56 PM, Eyal Edri  wrote:
> >
> > > FYI,
> > >
> > > The vdsm job [1] has been failing for quite some time now, without any
> > > resolution so far.
> > > In order to reduce noise and false positives for CI, it was disabled
> > > until the relevant developers ack that it is stable and can be
> > > re-enabled.
> > >
> > > Please contact the infra team if you need any assistance testing it on
> > > a non-production job.
> > >
> > >
> > > [1] http://jenkins.ovirt.org//job/vdsm_3.5_check-patch-el7-x86_64/
>
> https://gerrit.ovirt.org/#/c/51390/ hides some of the problems (most of
> them already solved on the 3.6 branch).
>
> I suggest taking it in instead of turning the job off.
>
> The 3.5 branch is quite quiet these days, but I would like to enjoy the
> benefits of our unit tests as long as it is alive.
>
> Regards,
> Dan.
>



-- 
Eyal Edri
Associate Manager
EMEA ENG Virtualization R&D
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)


Re: fyi - vdsm check-patch for el7 has been disabled due to tests errors unattended

2016-01-06 Thread Dan Kenigsberg
On Tue, Jan 05, 2016 at 03:17:54PM +0200, Eyal Edri wrote:
> same for http://jenkins.ovirt.org/job/vdsm_3.5_check-patch-fc22-x86_64/
> 
> 
> On Tue, Jan 5, 2016 at 2:56 PM, Eyal Edri  wrote:
> 
> > FYI,
> >
> > The vdsm job [1] has been failing for quite some time now, without any
> > resolution so far.
> > In order to reduce noise and false positives for CI, it was disabled
> > until the relevant developers ack that it is stable and can be
> > re-enabled.
> >
> > Please contact the infra team if you need any assistance testing it on a
> > non-production job.
> >
> >
> > [1] http://jenkins.ovirt.org//job/vdsm_3.5_check-patch-el7-x86_64/

https://gerrit.ovirt.org/#/c/51390/ hides some of the problems (most of
them already solved on the 3.6 branch).

I suggest taking it in instead of turning the job off.

The 3.5 branch is quite quiet these days, but I would like to enjoy the
benefits of our unit tests as long as it is alive.

Regards,
Dan.


[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 735 - Failure!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/735/
Build Number: 735
Build Status:  Failure
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51331

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk




-
Failed Tests:
-
No tests ran. 



[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 737 - Still Failing!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/737/
Build Number: 737
Build Status:  Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51333

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk


Changes for Build #736
[Maor Lipchuk] core: CoCo revert tasks of clone Cinder disks.


Changes for Build #737
[Maor Lipchuk] core: Add callback for create VM from Template with Cinder.




-
Failed Tests:
-
No tests ran. 



[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 738 - Still Failing!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/738/
Build Number: 738
Build Status:  Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51334

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk


Changes for Build #736
[Maor Lipchuk] core: CoCo revert tasks of clone Cinder disks.


Changes for Build #737
[Maor Lipchuk] core: Add callback for create VM from Template with Cinder.


Changes for Build #738
[Maor Lipchuk] core: Call concurrent execution callback on add vm command.




-
Failed Tests:
-
No tests ran. 



[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 739 - Still Failing!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/739/
Build Number: 739
Build Status:  Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51335

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk


Changes for Build #736
[Maor Lipchuk] core: CoCo revert tasks of clone Cinder disks.


Changes for Build #737
[Maor Lipchuk] core: Add callback for create VM from Template with Cinder.


Changes for Build #738
[Maor Lipchuk] core: Call concurrent execution callback on add vm command.


Changes for Build #739
[Maor Lipchuk] core: CoCo, prevent using removeFromHirerchy on callback.




-
Failed Tests:
-
No tests ran. 



[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 736 - Still Failing!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/736/
Build Number: 736
Build Status:  Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51332

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk


Changes for Build #736
[Maor Lipchuk] core: CoCo revert tasks of clone Cinder disks.




-
Failed Tests:
-
No tests ran. 



[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.5_el6_merged - Build # 749 - Failure!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.5_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.5_el6_merged/749/
Build Number: 749
Build Status:  Failure
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51330

-
Changes Since Last Success:
-
Changes for Build #749
[Maor Lipchuk] core: Aggregate all commands in call back on failure.




-
Failed Tests:
-
No tests ran. 



junit publishers in the standard job template?

2016-01-06 Thread Fabian Deutsch
Hey,

wouldn't it make sense to add the junit publisher to the standard job template?

That way a job can simply run make check, and if a nosetests.xml file
is left over, it is displayed nicely.

If no nosetests.xml file is there, no results will be shown
and no harm is done.

Thoughts?

- fabian

P.S.: I'd like to have it for imgbased

-- 
Fabian Deutsch 
RHEV Hypervisor
Red Hat
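
The "publish if present, otherwise do nothing" behaviour proposed above can be sketched outside Jenkins as follows. This is only an illustration, not the junit publisher itself: the function name `summarize_xunit` is made up, and it assumes the report has a single testsuite root element carrying `tests`, `errors`, `failures` and `skip` count attributes, as nose's xunit output normally does.

```python
import os
import xml.etree.ElementTree as ET


def summarize_xunit(path="nosetests.xml"):
    """Return test counts from an xunit report, or None if it is absent."""
    if not os.path.exists(path):
        # No report left behind by the job: nothing to show, no harm done.
        return None
    root = ET.parse(path).getroot()
    # Assumed layout: count attributes on the root <testsuite> element.
    return {
        key: int(root.get(key, 0))
        for key in ("tests", "errors", "failures", "skip")
    }
```

Returning None rather than raising for a missing file mirrors the point of the proposal: jobs that do not produce a report are unaffected.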


Re: fyi - vdsm check-patch for el7 has been disabled due to tests errors unattended

2016-01-06 Thread Francesco Romani
----- Original Message -----
> From: "Dan Kenigsberg" 
> To: "Eyal Edri" , ybron...@redhat.com, from...@redhat.com
> Cc: "devel" , "infra" 
> Sent: Thursday, January 7, 2016 8:01:43 AM
> Subject: Re: fyi - vdsm check-patch for el7 has been disabled due to tests 
> errors unattended
> 
> On Wed, Jan 06, 2016 at 04:00:29PM +0200, Eyal Edri wrote:
> > Seems to help if all the jobs finished successfully.
> > Can we merge it?
> 
> I see that Nir agrees, so I hope Yaniv/Francesco takes it soon.

Taken
 

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani


[oVirt Jenkins] ovirt-engine_3.6_upgrade-from-3.6_el6_merged - Build # 741 - Still Failing!

2016-01-06 Thread jenkins
Project: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/ 
Build: 
http://jenkins.ovirt.org/job/ovirt-engine_3.6_upgrade-from-3.6_el6_merged/741/
Build Number: 741
Build Status:  Still Failing
Triggered By: Triggered by Gerrit: https://gerrit.ovirt.org/51401

-
Changes Since Last Success:
-
Changes for Build #735
[Maor Lipchuk] core: Refactor call back of CloneSingleCinderDisk


Changes for Build #736
[Maor Lipchuk] core: CoCo revert tasks of clone Cinder disks.


Changes for Build #737
[Maor Lipchuk] core: Add callback for create VM from Template with Cinder.


Changes for Build #738
[Maor Lipchuk] core: Call concurrent execution callback on add vm command.


Changes for Build #739
[Maor Lipchuk] core: CoCo, prevent using removeFromHirerchy on callback.


Changes for Build #740
[Maor Lipchuk] core: Lock VM should not be done on revert tasks


Changes for Build #741
[Maor Lipchuk] core: Filter out child Cinder commands on end action.




-
Failed Tests:
-
No tests ran. 
