[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename

2018-07-05 Thread Nir Soffer
On Thu, Jul 5, 2018 at 10:55 PM Nir Soffer  wrote:

> On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer  wrote:
>
>> On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg  wrote:
>>
>>> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer  wrote:
>>> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg 
>>> wrote:
>>> >>
>>> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer 
>>> wrote:
>>> >> > Dan, the Travis build still fails when renaming the coverage file
>>> >> > even after your last patch.
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS..
>>> >> >
>>> --
>>> >> > Ran 1267 tests in 99.239s
>>> >> > OK (SKIP=63)
>>> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
>>> >> > make[1]: *** [check] Error 1
>>> >> > make[1]: Leaving directory `/vdsm/tests'
>>> >> > ERROR: InvocationError: '/usr/bin/make -C tests check'
>>> >> >
>>> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
>>> >> >
>>> >> > Do you have any idea what is wrong there?
>>> >> >
>>> >> > Why don't we have any error message from the failed command?
>>> >>
>>> >> No idea, nothing pops to mind.
>>> >> We can revert to the sillier [ -f .coverage ] condition instead of
>>> >> understanding (yeah, this feels dirty)
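For reference, the silent failure is consistent with the shell
short-circuit itself: if $NOSE_WITH_COVERAGE happens to be empty in that
job, the [ -n ... ] test exits non-zero, mv never runs, and make stops
with "Error 1" without printing anything, because the test produces no
output. A sketch of the echoed recipe line and a robust variant (the
actual Makefile rule may differ):

    # current form: exits non-zero when $NOSE_WITH_COVERAGE is empty,
    # so make fails the recipe even though nothing went wrong
    [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2

    # robust variant: rename only if the file exists; the line itself
    # always exits 0
    if [ -f .coverage ]; then mv .coverage .coverage-nose-py2; fi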
>>> >
>>> >
>>> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
>>> > failure.
>>> >
>>> > Now we have failures for the pywatch_test, and some network
>>> > tests. Can someone from network look at this?
>>> > https://travis-ci.org/nirs/vdsm/builds/400204807
>>>
>>> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows
>>>
>>>   ConfigNetworkError: (21, 'Executing commands failed:
>>> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
>>> named vdsmbr_test already exists')
>>>
>>> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
>>> why it shows here?
>>>
>>
>> Maybe one failed test leaves a dirty host for the next test?
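If leftovers are the cause, a cleanup along these lines in the test
setup/teardown would make runs independent of a dirty host (a sketch
only; the real fixtures may differ):

    # hypothetical cleanup: drop a leftover test bridge, ignoring the
    # case where it does not exist
    ovs-vsctl --if-exists del-br vdsmbr_test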
>>
>
Network tests now fail only on CentOS.


>
>>
>>> py-watch seems to be failing due to missing gdb on the travis image
>>
>>
>>> cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd
>>> None)
>>> cmdutils.py159 DEBUGFAILED:  = 'Traceback
>>> (most recent call last):\n  File "./py-watch", line 60, in \n
>>>   dump_trace(watched_proc)\n  File "./py-watch", line 32, in
>>> dump_trace\n\'thread apply all py-bt\'])\n  File
>>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in
>>> call\np = Popen(*popenargs, **kwargs)\n  File
>>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in
>>> __init__\nrestore_signals, start_new_session)\n  File
>>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in
>>> _execute_child\nraise child_exception_type(errno_num,
>>> err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n';
>>>  = 1
>>>
>>
>> Cool, easy fix.
>>
>
> Fixed by https://gerrit.ovirt.org/#/c/92846/
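(The patch itself is not quoted here; presumably it makes gdb available
in the image used for the pywatch tests. On a yum/dnf based test image
that is typically a one-line addition, for example:)

    # assumed fix: install gdb in the test image so py-watch can run
    # "gdb ... thread apply all py-bt" against the watched process
    yum install -y gdb    # or: dnf install -y gdb on Fedora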
>

Fedora 28 build is green with this change:
https://travis-ci.org/nirs/vdsm/jobs/400549561



___ summary 
  tests: commands succeeded
  storage-py27: commands succeeded
  storage-py36: commands succeeded
  lib-py27: commands succeeded
  lib-py36: commands succeeded
  network-py27: commands succeeded
  network-py36: commands succeeded
  virt-py27: commands succeeded
  virt-py36: commands succeeded
  congratulations :)



>
>>
>>
>>> Nir, could you remind me what "ERROR: InterpreterNotFound: python3.6"
>>> is and how we can avoid it? It keeps being a distraction while
>>> debugging test failures.
>>>
>>
>> We can avoid it in Travis using an env matrix.
>>
>> Currently we run "make check", which runs all the tox envs
>> (e.g. storage-py27, storage-py36) regardless of the build type. This is
>> good for manual usage when you don't know which Python version is
>> available on a developer machine; for example, if I have Python 3.7
>> installed, I may want to test with it.
>>
>> We can change this so we test only the *-py27 envs on CentOS, and both
>> *-py27 and *-py36 on Fedora.
>>
>> We can do the same in oVirt CI, but it will be harder; we don't have a
>> declarative way to configure this.
>>
>
> Fixed all builds using --enable-python3:
> https://gerrit.ovirt.org/#/c/92847/

[ovirt-devel] Adding a supported distro

2018-07-05 Thread bob . bownes
dumb q. 

I have a distro I'd like to add to the supported list (a private-labeled 
version of RHEL 7.5).

I tried grabbing an unclaimed ID, putting that plus the output of 
osinfo-query --fields=name os short-id='my_shortid' and 
derivedFrom.value=well_known_distro into osinfo-defaults.properties and 
restarting ovirt-engine, but the install still fails because '$MYNAME' is 
not supported. Am I missing something obvious?

Entry looks like this:
os.my_7x86.id.value = 34
os.my_7x86.name.value = My Linux 7.5
os.my_7x86.derivedFrom.value = rhel_6x64

ideas?
Thanks!


[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename

2018-07-05 Thread Nir Soffer
On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer  wrote:

> On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg  wrote:
>
>> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer  wrote:
>> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg 
>> wrote:
>> >>
>> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer 
>> wrote:
>> >> > Dan, the Travis build still fails when renaming the coverage file
>> >> > even after your last patch.
>> >> >
>> >> >
>> >> >
>> >> >
>> ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS..
>> >> >
>> --
>> >> > Ran 1267 tests in 99.239s
>> >> > OK (SKIP=63)
>> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
>> >> > make[1]: *** [check] Error 1
>> >> > make[1]: Leaving directory `/vdsm/tests'
>> >> > ERROR: InvocationError: '/usr/bin/make -C tests check'
>> >> >
>> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
>> >> >
>> >> > Do you have any idea what is wrong there?
>> >> >
>> >> > Why don't we have any error message from the failed command?
>> >>
>> >> No idea, nothing pops to mind.
>> >> We can revert to the sillier [ -f .coverage ] condition instead of
>> >> understanding (yeah, this feels dirty)
>> >
>> >
>> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
>> > failure.
>> >
>> > Now we have failures for the pywatch_test, and some network
>> > tests. Can someone from network look at this?
>> > https://travis-ci.org/nirs/vdsm/builds/400204807
>>
>> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows
>>
>>   ConfigNetworkError: (21, 'Executing commands failed:
>> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
>> named vdsmbr_test already exists')
>>
>> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
>> why it shows here?
>>
>
> Maybe one failed test leaves a dirty host for the next test?
>
>
>> py-watch seems to be failing due to missing gdb on the travis image
>
>
>> cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None)
>> cmdutils.py159 DEBUGFAILED:  = 'Traceback
>> (most recent call last):\n  File "./py-watch", line 60, in \n
>>   dump_trace(watched_proc)\n  File "./py-watch", line 32, in
>> dump_trace\n\'thread apply all py-bt\'])\n  File
>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in
>> call\np = Popen(*popenargs, **kwargs)\n  File
>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in
>> __init__\nrestore_signals, start_new_session)\n  File
>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in
>> _execute_child\nraise child_exception_type(errno_num,
>> err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n';
>>  = 1
>>
>
> Cool, easy fix.
>

Fixed by https://gerrit.ovirt.org/#/c/92846/


>
>
>> Nir, could you remind me what "ERROR: InterpreterNotFound: python3.6"
>> is and how we can avoid it? It keeps being a distraction while
>> debugging test failures.
>>
>
> We can avoid it in Travis using an env matrix.
>
> Currently we run "make check", which runs all the tox envs
> (e.g. storage-py27, storage-py36) regardless of the build type. This is
> good for manual usage when you don't know which Python version is
> available on a developer machine; for example, if I have Python 3.7
> installed, I may want to test with it.
>
> We can change this so we test only the *-py27 envs on CentOS, and both
> *-py27 and *-py36 on Fedora.
>
> We can do the same in oVirt CI, but it will be harder; we don't have a
> declarative way to configure this.
>

Fixed all builds using --enable-python3:
https://gerrit.ovirt.org/#/c/92847/

Nir


[ovirt-devel] Re: Error building Dockerfile.fedora.rawhide

2018-07-05 Thread David Lehman
On Wed, 2018-07-04 at 02:05 +0300, Nir Soffer wrote:
> I'm trying to rebuild our test images at:
> https://hub.docker.com/u/ovirtorg/dashboard/
> 
> The Fedora Rawhide image fails with this error:
> 
> Problem: conflicting requests
>   - nothing provides python2-blockdev >= 1.1 needed by python2-
> blivet1-1:1.20.4-3.fc28.noarch
> 
> Does anyone have a clue about this error? I guess this also affects vdsm.
> 
> David, do you know about this issue?

It is new to me, but it looks like there was a change to libblockdev to
stop building the python2-blockdev package in rawhide because of the
push to get rid of python2 in general. Probably all that's needed is a
new build of libblockdev w/ python2 enabled. I'll see what is required
to make that happen.

I should also note that the libblockdev maintainer is in the Czech
Republic, where today and tomorrow are public holidays.

I'll be in touch when I have an update.

David

> 
> Nir


[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename

2018-07-05 Thread Nir Soffer
On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg  wrote:

> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer  wrote:
> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg  wrote:
> >>
> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer  wrote:
> >> > Dan, the Travis build still fails when renaming the coverage file
> >> > even after your last patch.
> >> >
> >> >
> >> >
> >> >
> ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS..
> >> > --
> >> > Ran 1267 tests in 99.239s
> >> > OK (SKIP=63)
> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
> >> > make[1]: *** [check] Error 1
> >> > make[1]: Leaving directory `/vdsm/tests'
> >> > ERROR: InvocationError: '/usr/bin/make -C tests check'
> >> >
> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
> >> >
> >> > Do you have any idea what is wrong there?
> >> >
> >> > Why don't we have any error message from the failed command?
> >>
> >> No idea, nothing pops to mind.
> >> We can revert to the sillier [ -f .coverage ] condition instead of
> >> understanding (yeah, this feels dirty)
> >
> >
> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
> > failure.
> >
> > Now we have failures for the pywatch_test, and some network
> > tests. Can someone from network look at this?
> > https://travis-ci.org/nirs/vdsm/builds/400204807
>
> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows
>
>   ConfigNetworkError: (21, 'Executing commands failed:
> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
> named vdsmbr_test already exists')
>
> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
> why it shows here?
>

Maybe one failed test leaves a dirty host for the next test?


> py-watch seems to be failing due to missing gdb on the travis image


> cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None)
> cmdutils.py159 DEBUGFAILED:  = 'Traceback
> (most recent call last):\n  File "./py-watch", line 60, in \n
>   dump_trace(watched_proc)\n  File "./py-watch", line 32, in
> dump_trace\n\'thread apply all py-bt\'])\n  File
> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in
> call\np = Popen(*popenargs, **kwargs)\n  File
> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in
> __init__\nrestore_signals, start_new_session)\n  File
> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in
> _execute_child\nraise child_exception_type(errno_num,
> err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n';
>  = 1
>

Cool, easy fix.


> Nir, could you remind me what "ERROR: InterpreterNotFound: python3.6"
> is and how we can avoid it? It keeps being a distraction while
> debugging test failures.
>

We can avoid it in Travis using an env matrix.

Currently we run "make check", which runs all the tox envs
(e.g. storage-py27, storage-py36) regardless of the build type. This is
good for manual usage when you don't know which Python version is
available on a developer machine; for example, if I have Python 3.7
installed, I may want to test with it.

We can change this so we test only the *-py27 envs on CentOS, and both
*-py27 and *-py36 on Fedora.

We can do the same in oVirt CI, but it will be harder; we don't have a
declarative way to configure this.
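A minimal sketch of that selection (env names taken from the tox summary
above; the detection logic is illustrative, not the actual Travis
configuration, which would instead hard-code the env list per job in the
env matrix):

    # Run the py27 envs everywhere, and add the py36 envs only when a
    # python3.6 interpreter is actually available (Fedora 28, not CentOS),
    # so tox never hits "ERROR: InterpreterNotFound: python3.6".
    ENVS="tests,storage-py27,lib-py27,network-py27,virt-py27"
    if command -v python3.6 >/dev/null 2>&1; then
        ENVS="$ENVS,storage-py36,lib-py36,network-py36,virt-py36"
    fi
    tox -e "$ENVS"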

Nir


[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename

2018-07-05 Thread Dan Kenigsberg
On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer  wrote:
> On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg  wrote:
>>
>> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer  wrote:
>> > Dan, the Travis build still fails when renaming the coverage file
>> > even after your last patch.
>> >
>> >
>> >
>> > ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS..
>> > --
>> > Ran 1267 tests in 99.239s
>> > OK (SKIP=63)
>> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
>> > make[1]: *** [check] Error 1
>> > make[1]: Leaving directory `/vdsm/tests'
>> > ERROR: InvocationError: '/usr/bin/make -C tests check'
>> >
>> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
>> >
>> > Do you have any idea what is wrong there?
>> >
>> > Why don't we have any error message from the failed command?
>>
>> No idea, nothing pops to mind.
>> We can revert to the sillier [ -f .coverage ] condition instead of
>> understanding (yeah, this feels dirty)
>
>
> Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
> failure.
>
> Now we have failures for the pywatch_test, and some network
> tests. Can someone from network look at this?
> https://travis-ci.org/nirs/vdsm/builds/400204807

https://travis-ci.org/nirs/vdsm/jobs/400204808 shows

  ConfigNetworkError: (21, 'Executing commands failed:
ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
named vdsmbr_test already exists')

which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
why it shows here?

py-watch seems to be failing due to missing gdb on the travis image

cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None)
cmdutils.py159 DEBUGFAILED:  = 'Traceback
(most recent call last):\n  File "./py-watch", line 60, in \n
  dump_trace(watched_proc)\n  File "./py-watch", line 32, in
dump_trace\n\'thread apply all py-bt\'])\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in
call\np = Popen(*popenargs, **kwargs)\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in
__init__\nrestore_signals, start_new_session)\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in
_execute_child\nraise child_exception_type(errno_num,
err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n';
 = 1


Nir, could you remind me what "ERROR: InterpreterNotFound: python3.6"
is and how we can avoid it? It keeps being a distraction while
debugging test failures.





>
> Nir


[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]

2018-07-05 Thread Dafna Ron
The restart was not on the SPM, so it should not have caused this task to
remain uncleaned; I think the task was simply still running.


On Thu, Jul 5, 2018 at 10:44 AM, Tal Nisan  wrote:

> Is the task stuck and never cleared after the restart, or do we just have
> to wait for it to finish and be cleared?
>
> On Thu, Jul 5, 2018 at 12:30 PM, Dafna Ron  wrote:
>
>> Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286
>>
>> Tal, can you help fix the test?
>>
>> thanks,
>> Dafna
>>
>>
>> On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron  wrote:
>>
>>> The vdsm restart is a test in basic_sanity.py
>>> The task that is stuck is downloadImageFromStream
>>>
>>> 2018-07-04 10:12:52,659-04 WARN  [org.ovirt.engine.core.bll.sto
>>> rage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default
>>> task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks:
>>> 'AsyncTask:{commandId='0b18c13f-0fce-
>>> 4303-8f85-ae5a2b051991', 
>>> rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991',
>>> storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887',
>>> taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1',
>>> vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1', stepId=
>>> 'null', taskType='downloadImageFromStream', status='running'}
>>>
>>> The task succeeded, but only after the deactivate storage domain attempt:
>>>
>>> 2018-07-04 10:13:01,579-04 INFO  [org.ovirt.engine.core.bll.Ser
>>> ialChildCommandsExecutionCallback] 
>>> (EE-ManagedThreadFactory-engineScheduled-Thread-98)
>>> [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id:
>>> '6342fce4-96ff-4ae3-8b40-8155a
>>> 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991,
>>> 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed,
>>> status 'SUCCEEDED'
>>>
>>>
>>> We should have a solution in OST for locked objects failing jobs... I will
>>> open a Jira to follow up.
>>>
>>>
>>>
>>> On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer  wrote:
>>>
 On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron  wrote:

> The actual test has failed with error:
>
> 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi.
> resource.AbstractBackendResource] (default task-1) [] Operation
> Failed: [Cannot deactivate Storage while there are running tasks on this
> Storage.
>
> -Please wait until tasks will finish and try again.]
>
> However, there is a problem with vdsm on host-1: it restarts, which may
> cause the issue with the running tasks.
>

 Who restarted vdsm? why?


> 2018-07-04 10:08:46,603-0400 INFO  (ioprocess/5191) [IOProcessClient]
> shutdown requested (__init__:108)
> 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [storage.udev]
> Stopping multipath event listener (udev:149)
> 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [vdsm.api] FINISH
> prepareForShutdown return=None from=internal, 
> task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49
> (api:52)
> 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Stopping threads
> (vdsmd:160)
> 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Exiting
> (vdsmd:171)
> 2018-07-04 10:10:00,145-0400 INFO  (MainThread) [vds] (PID: 14034) I
> am the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1
> (3.10.0-862.2.3.el7.x86_64) (vdsmd:149)
> 2018-07-04 10:10:00,146-0400 INFO  (MainThread) [vds] VDSM will run
> with cpu affinity: frozenset([1]) (vdsmd:262)
> 2018-07-04 10:10:00,151-0400 INFO  (MainThread) [storage.HSM] START
> HSM init (hsm:366)
> 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.HSM] Creating
> data-center mount directory '/rhev/data-center/mnt' (hsm:373)
> 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.fileUtils]
> Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
> 2018-07-04 10:10:00,265-0400 INFO  (MainThread) [storage.HSM]
> Unlinking file '/rhev/data-center/8dd2fe5a-9d
> ca-42e2-8593-1de1b18b4887/44eba8db-3a9c-4fbe-ba33-a039fcd561e1'
> (hsm:523)
> 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM]
> Unlinking file '/rhev/data-center/8dd2fe5a-9d
> ca-42e2-8593-1de1b18b4887/mastersd' (hsm:523)
> 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM]
> Unlinking file '/rhev/data-center/8dd2fe5a-9d
> ca-42e2-8593-1de1b18b4887/c7980a1e-91ef-4095-82eb-37ec03da9b3f'
> (hsm:523)
> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM]
> Unlinking file '/rhev/data-center/8dd2fe5a-9d
> ca-42e2-8593-1de1b18b4887/4fc62763-d8a5-4c36-8687-91870a92ff05'
> (hsm:523)
> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM]
> Unlinking file '/rhev/data-center/8dd2fe5a-9d
> ca-42e2-8593-1de1b18b4887/02363608-01b9-4176-b7a1-e9ee235f792a'
> (hsm:523)
> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.udev]
> Registering multipath event monitor  object at 0x7f9aa4548150> (udev:182)
> 2018-07-04 10

[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]

2018-07-05 Thread Tal Nisan
Is the task stuck and never cleared after the restart, or do we just have
to wait for it to finish and be cleared?

On Thu, Jul 5, 2018 at 12:30 PM, Dafna Ron  wrote:

> Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286
>
> Tal, can you help fix the test?
>
> thanks,
> Dafna
>
>
> On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron  wrote:
>
>> The vdsm restart is a test in basic_sanity.py
>> The task that is stuck is downloadImageFromStream
>>
>> 2018-07-04 10:12:52,659-04 WARN  [org.ovirt.engine.core.bll.sto
>> rage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default
>> task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks:
>> 'AsyncTask:{commandId='0b18c13f-0fce-
>> 4303-8f85-ae5a2b051991', 
>> rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991',
>> storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887',
>> taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1',
>> vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1', stepId=
>> 'null', taskType='downloadImageFromStream', status='running'}
>>
>> The task succeeded, but only after the deactivate storage domain attempt:
>>
>> 2018-07-04 10:13:01,579-04 INFO  [org.ovirt.engine.core.bll.Ser
>> ialChildCommandsExecutionCallback] 
>> (EE-ManagedThreadFactory-engineScheduled-Thread-98)
>> [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id:
>> '6342fce4-96ff-4ae3-8b40-8155a
>> 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991,
>> 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed, status
>> 'SUCCEEDED'
>>
>>
>> We should have a solution in OST for locked objects failing jobs... I will
>> open a Jira to follow up.
>>
>>
>>
>> On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer  wrote:
>>
>>> On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron  wrote:
>>>
 The actual test has failed with error:

 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi.
 resource.AbstractBackendResource] (default task-1) [] Operation
 Failed: [Cannot deactivate Storage while there are running tasks on this
 Storage.

 -Please wait until tasks will finish and try again.]

 However, there is a problem with vdsm on host-1: it restarts, which may
 cause the issue with the running tasks.

>>>
>>> Who restarted vdsm? why?
>>>
>>>
 2018-07-04 10:08:46,603-0400 INFO  (ioprocess/5191) [IOProcessClient]
 shutdown requested (__init__:108)
 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [storage.udev] Stopping
 multipath event listener (udev:149)
 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [vdsm.api] FINISH
 prepareForShutdown return=None from=internal, 
 task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49
 (api:52)
 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Stopping threads
 (vdsmd:160)
 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Exiting
 (vdsmd:171)
 2018-07-04 10:10:00,145-0400 INFO  (MainThread) [vds] (PID: 14034) I am
 the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1
 (3.10.0-862.2.3.el7.x86_64) (vdsmd:149)
 2018-07-04 10:10:00,146-0400 INFO  (MainThread) [vds] VDSM will run
 with cpu affinity: frozenset([1]) (vdsmd:262)
 2018-07-04 10:10:00,151-0400 INFO  (MainThread) [storage.HSM] START HSM
 init (hsm:366)
 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.HSM] Creating
 data-center mount directory '/rhev/data-center/mnt' (hsm:373)
 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.fileUtils]
 Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
 2018-07-04 10:10:00,265-0400 INFO  (MainThread) [storage.HSM] Unlinking
 file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/44eb
 a8db-3a9c-4fbe-ba33-a039fcd561e1' (hsm:523)
 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM] Unlinking
 file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/mastersd'
 (hsm:523)
 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM] Unlinking
 file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/c798
 0a1e-91ef-4095-82eb-37ec03da9b3f' (hsm:523)
 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM] Unlinking
 file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/4fc6
 2763-d8a5-4c36-8687-91870a92ff05' (hsm:523)
 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM] Unlinking
 file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/0236
 3608-01b9-4176-b7a1-e9ee235f792a' (hsm:523)
 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.udev]
 Registering multipath event monitor >>> object at 0x7f9aa4548150> (udev:182)
 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.udev] Starting
 multipath event listener (udev:116)
 2018-07-04 10:10:00,298-0400 INFO  (MainThread) [storage.check]
 Starting check service (check:91)
 2018-07-04 10:10:00,303-0400 INFO  (MainThread) [storage.Dispatcher]
 Starting StorageDispatcher... (di

[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]

2018-07-05 Thread Dafna Ron
Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286

Tal, can you help fix the test?

thanks,
Dafna


On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron  wrote:

> The vdsm restart is a test in basic_sanity.py
> The task that is stuck is downloadImageFromStream
>
> 2018-07-04 10:12:52,659-04 WARN  [org.ovirt.engine.core.bll.
> storage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default
> task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks:
> 'AsyncTask:{commandId='0b18c13f-0fce-
> 4303-8f85-ae5a2b051991', rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991',
> storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887',
> taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1', 
> vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1',
> stepId=
> 'null', taskType='downloadImageFromStream', status='running'}
>
> The task succeeded, but only after the deactivate storage domain attempt:
>
> 2018-07-04 10:13:01,579-04 INFO  [org.ovirt.engine.core.bll.
> SerialChildCommandsExecutionCallback] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-98)
> [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id:
> '6342fce4-96ff-4ae3-8b40-8155a
> 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991,
> 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed, status
> 'SUCCEEDED'
>
>
> We should have a solution in OST for locked objects failing jobs... I will
> open a Jira to follow up.
>
>
>
> On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer  wrote:
>
>> On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron  wrote:
>>
>>> The actual test has failed with error:
>>>
>>> 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi.
>>> resource.AbstractBackendResource] (default task-1) [] Operation Failed:
>>> [Cannot deactivate Storage while there are running tasks on this Storage.
>>>
>>> -Please wait until tasks will finish and try again.]
>>>
>>> However, there is a problem with vdsm on host-1: it restarts, which may
>>> cause the issue with the running tasks.
>>>
>>
>> Who restarted vdsm? why?
>>
>>
>>> 2018-07-04 10:08:46,603-0400 INFO  (ioprocess/5191) [IOProcessClient]
>>> shutdown requested (__init__:108)
>>> 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [storage.udev] Stopping
>>> multipath event listener (udev:149)
>>> 2018-07-04 10:08:46,604-0400 INFO  (MainThread) [vdsm.api] FINISH
>>> prepareForShutdown return=None from=internal, 
>>> task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49
>>> (api:52)
>>> 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Stopping threads
>>> (vdsmd:160)
>>> 2018-07-04 10:08:46,605-0400 INFO  (MainThread) [vds] Exiting (vdsmd:171)
>>> 2018-07-04 10:10:00,145-0400 INFO  (MainThread) [vds] (PID: 14034) I am
>>> the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1
>>> (3.10.0-862.2.3.el7.x86_64) (vdsmd:149)
>>> 2018-07-04 10:10:00,146-0400 INFO  (MainThread) [vds] VDSM will run with
>>> cpu affinity: frozenset([1]) (vdsmd:262)
>>> 2018-07-04 10:10:00,151-0400 INFO  (MainThread) [storage.HSM] START HSM
>>> init (hsm:366)
>>> 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.HSM] Creating
>>> data-center mount directory '/rhev/data-center/mnt' (hsm:373)
>>> 2018-07-04 10:10:00,154-0400 INFO  (MainThread) [storage.fileUtils]
>>> Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197)
>>> 2018-07-04 10:10:00,265-0400 INFO  (MainThread) [storage.HSM] Unlinking
>>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/44eb
>>> a8db-3a9c-4fbe-ba33-a039fcd561e1' (hsm:523)
>>> 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM] Unlinking
>>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/mastersd'
>>> (hsm:523)
>>> 2018-07-04 10:10:00,266-0400 INFO  (MainThread) [storage.HSM] Unlinking
>>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/c798
>>> 0a1e-91ef-4095-82eb-37ec03da9b3f' (hsm:523)
>>> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM] Unlinking
>>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/4fc6
>>> 2763-d8a5-4c36-8687-91870a92ff05' (hsm:523)
>>> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.HSM] Unlinking
>>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/0236
>>> 3608-01b9-4176-b7a1-e9ee235f792a' (hsm:523)
>>> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.udev]
>>> Registering multipath event monitor >> object at 0x7f9aa4548150> (udev:182)
>>> 2018-07-04 10:10:00,267-0400 INFO  (MainThread) [storage.udev] Starting
>>> multipath event listener (udev:116)
>>> 2018-07-04 10:10:00,298-0400 INFO  (MainThread) [storage.check] Starting
>>> check service (check:91)
>>> 2018-07-04 10:10:00,303-0400 INFO  (MainThread) [storage.Dispatcher]
>>> Starting StorageDispatcher... (di
>>>
>>> On Wed, Jul 4, 2018 at 4:12 PM, Greg Sheremeta 
>>> wrote:
>>>
 """
 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot
 deactivate Storage while there are running tasks on this Storage.
 -Please wait until tasks will finish and try again.]". HTTP re