[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename
On Thu, Jul 5, 2018 at 10:55 PM Nir Soffer wrote: > On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer wrote: > >> On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg wrote: >> >>> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer wrote: >>> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg >>> wrote: >>> >> >>> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer >>> wrote:
>>> >> > Dan, Travis builds still fail when renaming the coverage file even after >>> >> > your last patch. >>> >> > >>> >> > >>> >> > >>> >> > >>> ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS.. >>> >> > >>> -- >>> >> > Ran 1267 tests in 99.239s >>> >> > OK (SKIP=63) >>> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2 >>> >> > make[1]: *** [check] Error 1 >>> >> > make[1]: Leaving directory `/vdsm/tests' >>> >> > ERROR: InvocationError: '/usr/bin/make -C tests check' >>> >> > >>> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012 >>> >> > >>> >> > Do you have any idea what is wrong there? >>> >> > >>> >> > Why don't we have any error message from the failed command?
>>> >> >>> >> No idea, nothing pops to mind. >>> >> We can revert to the sillier [ -f .coverage ] condition instead of >>> >> understanding (yeah, this feels dirty)
>>> > >>> > >>> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this >>> > failure. >>> > >>> > Now we have failures for the pywatch_test, and some network >>> > tests. Can someone from network look at this? >>> > https://travis-ci.org/nirs/vdsm/builds/400204807
>>> >>> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows >>> >>> ConfigNetworkError: (21, 'Executing commands failed: >>> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge >>> named vdsmbr_test already exists') >>> >>> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea >>> why it shows here? >>> >>
>> Maybe one failed test leaves a dirty host for the next test? >> >
network tests now fail only on CentOS. > >>
>>> py-watch seems to be failing due to missing gdb on the Travis image >> >> >>> cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd >>> None) >>> cmdutils.py159 DEBUGFAILED: = 'Traceback >>> (most recent call last):\n File "./py-watch", line 60, in \n >>> dump_trace(watched_proc)\n File "./py-watch", line 32, in >>> dump_trace\n\'thread apply all py-bt\'])\n File >>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in >>> call\np = Popen(*popenargs, **kwargs)\n File >>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in >>> __init__\nrestore_signals, start_new_session)\n File >>> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in >>> _execute_child\nraise child_exception_type(errno_num, >>> err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n'; >>> = 1 >>> >>
>> Cool, easy fix. >> > > Fixed by https://gerrit.ovirt.org/#/c/92846/ >
Fedora 28 build is green with this change: https://travis-ci.org/nirs/vdsm/jobs/400549561 ___ summary tests: commands succeeded storage-py27: commands succeeded storage-py36: commands succeeded lib-py27: commands succeeded lib-py36: commands succeeded network-py27: commands succeeded network-py36: commands succeeded virt-py27: commands succeeded virt-py36: commands succeeded congratulations :) > >> >>
>>> Nir, could you remind me what is "ERROR: InterpreterNotFound: >>> python3.6" and how can we avoid it? It keeps distracting me while >>> debugging test failures. >>> >>
>> We can avoid it in Travis using an env matrix. >> >> Currently we run "make check", which runs all the tox envs >> (e.g. storage-py27, storage-py36) regardless of the build type. This is >> good >> for manual usage when you don't know which python version is available >> on a developer machine. For example, if I have python 3.7 installed, maybe >> I would like to test it. >> >> We can change this so we test only the *-py27 envs on CentOS, and both >> *-py27 and *-py36 on Fedora. >> >> We can do the same in oVirt CI, but it will be harder: we don't have a >> declarative >> way to configure this. >> >
> Fixed all builds using --enable-python3: > https://gerrit.ovirt.org/#/c/92847/
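On the env matrix idea above, a minimal sketch of what it could look like in .travis.yml, assuming each job passes its env list down to tox; the TOX_ENVS variable name and the exact split are illustrative only, and since the real vdsm jobs run inside containers the actual change would be wired up slightly differently:

    env:
      - DISTRO=centos TOX_ENVS=tests,storage-py27,lib-py27,network-py27,virt-py27
      - DISTRO=fedora28 TOX_ENVS=tests,storage-py27,storage-py36,lib-py27,lib-py36,network-py27,network-py36,virt-py27,virt-py36
    script:
      # each job runs only the envs its image's interpreters can satisfy,
      # instead of the full "make check"
      - tox -e "$TOX_ENVS"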
[ovirt-devel] Adding a supported distro
Dumb question: I have a distro I'd like to add to the supported list (a privately labeled version of RHEL 7.5). I tried grabbing an unclaimed ID, putting that plus the output of osinfo-query --fields=name os short-id='my_shortid' and derivedFrom.value=well_known_distro into osinfo-defaults.properties and restarting ovirt-engine, but I still get "install failed because '$MYNAME' is not supported". Am I missing something obvious? The entry looks like this:
os.my_7x86.id.value = 34
os.my_7x86.name.value = My Linux 7.5
os.my_7x86.derivedFrom.value = rhel_6x64
Ideas? Thanks!
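In case it helps, a sketch of the same entry with two things worth double-checking, hedged because neither has been verified against this engine version: custom osinfo values are normally kept in a drop-in file under /etc/ovirt-engine/osinfo.conf.d/ rather than edited into osinfo-defaults.properties itself, and for a rebranded RHEL 7.5 the parent would presumably be the 7.x entry rather than rhel_6x64 (although that alone may not explain the "not supported" failure):

    # e.g. /etc/ovirt-engine/osinfo.conf.d/99-mylinux.properties
    # (path and file name are illustrative); the id must not collide
    # with any existing entry
    os.my_7x86.id.value = 34
    os.my_7x86.name.value = My Linux 7.5
    os.my_7x86.derivedFrom.value = rhel_7x64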
[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename
On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer wrote: > On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg wrote: > >> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer wrote: >> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg >> wrote: >> >> >> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer >> wrote:
>> >> > Dan, Travis builds still fail when renaming the coverage file even after >> >> > your last patch. >> >> > >> >> > >> >> > >> >> > >> ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS.. >> >> > >> -- >> >> > Ran 1267 tests in 99.239s >> >> > OK (SKIP=63) >> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2 >> >> > make[1]: *** [check] Error 1 >> >> > make[1]: Leaving directory `/vdsm/tests' >> >> > ERROR: InvocationError: '/usr/bin/make -C tests check' >> >> > >> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012 >> >> > >> >> > Do you have any idea what is wrong there? >> >> > >> >> > Why don't we have any error message from the failed command?
>> >> >> >> No idea, nothing pops to mind. >> >> We can revert to the sillier [ -f .coverage ] condition instead of >> >> understanding (yeah, this feels dirty)
>> > >> > >> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this >> > failure. >> > >> > Now we have failures for the pywatch_test, and some network >> > tests. Can someone from network look at this? >> > https://travis-ci.org/nirs/vdsm/builds/400204807
>> >> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows >> >> ConfigNetworkError: (21, 'Executing commands failed: >> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge >> named vdsmbr_test already exists') >> >> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea >> why it shows here? >> >
> Maybe one failed test leaves a dirty host for the next test? > >
>> py-watch seems to be failing due to missing gdb on the Travis image > > >> cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None) >> cmdutils.py159 DEBUGFAILED: = 'Traceback >> (most recent call last):\n File "./py-watch", line 60, in \n >> dump_trace(watched_proc)\n File "./py-watch", line 32, in >> dump_trace\n\'thread apply all py-bt\'])\n File >> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in >> call\np = Popen(*popenargs, **kwargs)\n File >> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in >> __init__\nrestore_signals, start_new_session)\n File >> "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in >> _execute_child\nraise child_exception_type(errno_num, >> err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n'; >> = 1 >> >
> Cool, easy fix. > Fixed by https://gerrit.ovirt.org/#/c/92846/ > >
>> Nir, could you remind me what is "ERROR: InterpreterNotFound: >> python3.6" and how can we avoid it? It keeps distracting me while >> debugging test failures. >> >
> We can avoid it in Travis using an env matrix. > > Currently we run "make check", which runs all the tox envs > (e.g. storage-py27, storage-py36) regardless of the build type. This is good > for manual usage when you don't know which python version is available > on a developer machine. For example, if I have python 3.7 installed, maybe > I would like to test it. > > We can change this so we test only the *-py27 envs on CentOS, and both > *-py27 and *-py36 on Fedora. > > We can do the same in oVirt CI, but it will be harder: we don't have a > declarative > way to configure this.
> Fixed all builds using --enable-python3: https://gerrit.ovirt.org/#/c/92847/
Nir
[ovirt-devel] Re: Error building Dockerfile.fedora.rawhide
On Wed, 2018-07-04 at 02:05 +0300, Nir Soffer wrote: > I'm trying to rebuild our test images at: > https://hub.docker.com/u/ovirtorg/dashboard/ > > The Fedora rawhide image fails with this error: > > Problem: conflicting requests > - nothing provides python2-blockdev >= 1.1 needed by python2- > blivet1-1:1.20.4-3.fc28.noarch > > Does anyone have a clue about this error? I guess this also affects vdsm. > > David, do you know about this issue?
It is new to me, but it looks like there was a change to libblockdev to stop building the python2-blockdev package in rawhide because of the push to get rid of python2 in general. Probably all that's needed is a new build of libblockdev with python2 enabled. I'll see what is required to make that happen.
I should also note that the libblockdev maintainer is in the Czech Republic, where today and tomorrow are public holidays. I'll be in touch when I have an update.
David
> > Nir
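For anyone who wants to confirm the state of rawhide without rebuilding the image, a couple of read-only queries should show whether the python2 binding is still there and what python2-blivet1 actually pulls in (assuming a rawhide repo is enabled on the machine running them; repo ids vary per setup):

    dnf repoquery --whatprovides 'python2-blockdev'
    dnf repoquery --requires python2-blivet1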
[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename
On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg wrote: > On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer wrote: > > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg wrote: > >> > >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer wrote:
> >> > Dan, Travis builds still fail when renaming the coverage file even after > >> > your last patch. > >> > > >> > > >> > > >> > > ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS.. > >> > -- > >> > Ran 1267 tests in 99.239s > >> > OK (SKIP=63) > >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2 > >> > make[1]: *** [check] Error 1 > >> > make[1]: Leaving directory `/vdsm/tests' > >> > ERROR: InvocationError: '/usr/bin/make -C tests check' > >> > > >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012 > >> > > >> > Do you have any idea what is wrong there? > >> > > >> > Why don't we have any error message from the failed command?
> >> > >> No idea, nothing pops to mind. > >> We can revert to the sillier [ -f .coverage ] condition instead of > >> understanding (yeah, this feels dirty)
> > > > > > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this > > failure. > > > > Now we have failures for the pywatch_test, and some network > > tests. Can someone from network look at this? > > https://travis-ci.org/nirs/vdsm/builds/400204807
> > https://travis-ci.org/nirs/vdsm/jobs/400204808 shows > > ConfigNetworkError: (21, 'Executing commands failed: > ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge > named vdsmbr_test already exists') > > which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea > why it shows here? >
Maybe one failed test leaves a dirty host for the next test?
> py-watch seems to be failing due to missing gdb on the travis image > cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None) > cmdutils.py159 DEBUGFAILED: = 'Traceback > (most recent call last):\n File "./py-watch", line 60, in \n > dump_trace(watched_proc)\n File "./py-watch", line 32, in > dump_trace\n\'thread apply all py-bt\'])\n File > "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in > call\np = Popen(*popenargs, **kwargs)\n File > "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in > __init__\nrestore_signals, start_new_session)\n File > "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in > _execute_child\nraise child_exception_type(errno_num, > err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n'; > = 1 >
Cool, easy fix.
> Nir, could you remind me what is "ERROR: InterpreterNotFound: > python3.6" and how can we avoid it? It keeps distracting me while > debugging test failures. >
We can avoid it in Travis using an env matrix.
Currently we run "make check", which runs all the tox envs (e.g. storage-py27, storage-py36) regardless of the build type. This is good for manual usage when you don't know which python version is available on a developer machine. For example, if I have python 3.7 installed, maybe I would like to test it.
We can change this so we test only the *-py27 envs on CentOS, and both *-py27 and *-py36 on Fedora.
We can do the same in oVirt CI, but it will be harder: we don't have a declarative way to configure this.
Nir
[ovirt-devel] Re: [VDSM] Travis builds still fail on .coverage rename
On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer wrote: > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg wrote: >> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer wrote:
>> > Dan, Travis builds still fail when renaming the coverage file even after >> > your last patch. >> > >> > >> > >> > ...SS.SS.SS..S.SSSS.SSS...SSS...S.S.SSSS..S.SS.. >> > -- >> > Ran 1267 tests in 99.239s >> > OK (SKIP=63) >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2 >> > make[1]: *** [check] Error 1 >> > make[1]: Leaving directory `/vdsm/tests' >> > ERROR: InvocationError: '/usr/bin/make -C tests check' >> > >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012 >> > >> > Do you have any idea what is wrong there? >> > >> > Why don't we have any error message from the failed command?
>> >> No idea, nothing pops to mind. >> We can revert to the sillier [ -f .coverage ] condition instead of >> understanding (yeah, this feels dirty)
> > > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this > failure. > > Now we have failures for the pywatch_test, and some network > tests. Can someone from network look at this? > https://travis-ci.org/nirs/vdsm/builds/400204807
https://travis-ci.org/nirs/vdsm/jobs/400204808 shows ConfigNetworkError: (21, 'Executing commands failed: ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge named vdsmbr_test already exists') which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea why it shows here?
py-watch seems to be failing due to missing gdb on the travis image cmdutils.py151 DEBUG./py-watch 0.1 sleep 10 (cwd None) cmdutils.py159 DEBUGFAILED: = 'Traceback (most recent call last):\n File "./py-watch", line 60, in \n dump_trace(watched_proc)\n File "./py-watch", line 32, in dump_trace\n\'thread apply all py-bt\'])\n File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in call\np = Popen(*popenargs, **kwargs)\n File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in __init__\nrestore_signals, start_new_session)\n File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in _execute_child\nraise child_exception_type(errno_num, err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n'; = 1
Nir, could you remind me what is "ERROR: InterpreterNotFound: python3.6" and how can we avoid it? It keeps distracting me while debugging test failures.
> > Nir
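About the missing error message: this is consistent with how make treats a failing recipe line. When NOSE_WITH_COVERAGE is unset, [ -n "$NOSE_WITH_COVERAGE" ] itself exits with status 1, the && never runs mv, and make reports a bare "Error 1" with nothing else to print, even though nothing actually went wrong. A sketch of two guard styles that cannot fail that way; this is the general pattern only, not the actual change in https://gerrit.ovirt.org/#/c/92813/:

    # Makefile recipe line (note the $$ escaping): invert the guard so an
    # unset variable short-circuits to success instead of failing the recipe
    [ -z "$$NOSE_WITH_COVERAGE" ] || mv .coverage .coverage-nose-py2

    # or test for the artifact itself, as suggested earlier in the thread
    test ! -f .coverage || mv .coverage .coverage-nose-py2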
[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]
The restart was not on the SPM, so it should not have prevented this task from being cleaned; I think the task was just still running.
On Thu, Jul 5, 2018 at 10:44 AM, Tal Nisan wrote: > The task is stuck and never cleared after the restart or we just have to > wait for it to be finished and cleared? > > On Thu, Jul 5, 2018 at 12:30 PM, Dafna Ron wrote: > >> Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286 >> >> Tal, can you help fix the test? >> >> thanks, >> Dafna >> >> >> On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron wrote: >> >>> The vdsm restart is a test in basic_sanity.py >>> The task that is stuck is downloadImageFromStream
>>> >>> 2018-07-04 10:12:52,659-04 WARN [org.ovirt.engine.core.bll.sto >>> rage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default >>> task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks: >>> 'AsyncTask:{commandId='0b18c13f-0fce- >>> 4303-8f85-ae5a2b051991', >>> rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991', >>> storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887', >>> taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1', >>> vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1', stepId= >>> 'null', taskType='downloadImageFromStream', status='running'}
>>> >>> The task succeeded but after the deactivate storage domain attempt: >>> >>> 2018-07-04 10:13:01,579-04 INFO [org.ovirt.engine.core.bll.Ser >>> ialChildCommandsExecutionCallback] >>> (EE-ManagedThreadFactory-engineScheduled-Thread-98) >>> [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id: >>> '6342fce4-96ff-4ae3-8b40-8155a >>> 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991, >>> 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed, >>> status 'SUCCEEDED'
>>> >>> >>> we should have a solution in OST for locked objects failing jobs... will >>> open a Jira to follow >>> >>> >>> >>> On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer wrote: On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron wrote: > The actual test has failed with error: > > 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi. > resource.AbstractBackendResource] (default task-1) [] Operation > Failed: [Cannot deactivate Storage while there are running tasks on this > Storage. > > -Please wait until tasks will finish and try again.] > > However, there is a problem with vdsm on host-1. it restarts which may > cause the issue with the running tasks. > Who restarted vdsm? Why? 
> 2018-07-04 10:08:46,603-0400 INFO (ioprocess/5191) [IOProcessClient] > shutdown requested (__init__:108) > 2018-07-04 10:08:46,604-0400 INFO (MainThread) [storage.udev] > Stopping multipath event listener (udev:149) > 2018-07-04 10:08:46,604-0400 INFO (MainThread) [vdsm.api] FINISH > prepareForShutdown return=None from=internal, > task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49 > (api:52) > 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Stopping threads > (vdsmd:160) > 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Exiting > (vdsmd:171) > 2018-07-04 10:10:00,145-0400 INFO (MainThread) [vds] (PID: 14034) I > am the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1 > (3.10.0-862.2.3.el7.x86_64) (vdsmd:149) > 2018-07-04 10:10:00,146-0400 INFO (MainThread) [vds] VDSM will run > with cpu affinity: frozenset([1]) (vdsmd:262) > 2018-07-04 10:10:00,151-0400 INFO (MainThread) [storage.HSM] START > HSM init (hsm:366) > 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.HSM] Creating > data-center mount directory '/rhev/data-center/mnt' (hsm:373) > 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.fileUtils] > Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197) > 2018-07-04 10:10:00,265-0400 INFO (MainThread) [storage.HSM] > Unlinking file '/rhev/data-center/8dd2fe5a-9d > ca-42e2-8593-1de1b18b4887/44eba8db-3a9c-4fbe-ba33-a039fcd561e1' > (hsm:523) > 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] > Unlinking file '/rhev/data-center/8dd2fe5a-9d > ca-42e2-8593-1de1b18b4887/mastersd' (hsm:523) > 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] > Unlinking file '/rhev/data-center/8dd2fe5a-9d > ca-42e2-8593-1de1b18b4887/c7980a1e-91ef-4095-82eb-37ec03da9b3f' > (hsm:523) > 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] > Unlinking file '/rhev/data-center/8dd2fe5a-9d > ca-42e2-8593-1de1b18b4887/4fc62763-d8a5-4c36-8687-91870a92ff05' > (hsm:523) > 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] > Unlinking file '/rhev/data-center/8dd2fe5a-9d > ca-42e2-8593-1de1b18b4887/02363608-01b9-4176-b7a1-e9ee235f792a' > (hsm:523) > 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.udev] > Registering multipath event monitor object at 0x7f9aa4548150> (udev:182) > 2018-07-04 10
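On the "solution in OST for locked objects failing jobs" point: one low-tech option is for the suite to retry the deactivate call while the engine keeps answering with the running-tasks fault, instead of failing on the first attempt. A minimal sketch, assuming the test wraps whatever helper issues the REST call; deactivate_storage_domain and TasksStillRunning are placeholders here, not real OST or SDK names:

    import time

    class TasksStillRunning(Exception):
        # Raised by the caller when the engine answers the deactivate
        # request with the "running tasks on this Storage" fault.
        pass

    def retry_until_unlocked(action, timeout=300, interval=5):
        # Keep calling `action` until it stops raising TasksStillRunning,
        # or re-raise once the timeout expires.
        deadline = time.time() + timeout
        while True:
            try:
                return action()
            except TasksStillRunning:
                if time.time() >= deadline:
                    raise
                time.sleep(interval)

    # hypothetical usage in the test:
    # retry_until_unlocked(lambda: deactivate_storage_domain(api, DC_NAME, SD_NAME))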
[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]
The task is stuck and never cleared after the restart or we just have to wait for it to be finished and cleared? On Thu, Jul 5, 2018 at 12:30 PM, Dafna Ron wrote: > Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286 > > Tal, can you help fix the test? > > thanks, > Dafna > > > On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron wrote: > >> The vdsm restart is a test in basic_sanity.py >> The task that is stuck is downloadImageFromStream >> >> 2018-07-04 10:12:52,659-04 WARN [org.ovirt.engine.core.bll.sto >> rage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default >> task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks: >> 'AsyncTask:{commandId='0b18c13f-0fce- >> 4303-8f85-ae5a2b051991', >> rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991', >> storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887', >> taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1', >> vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1', stepId= >> 'null', taskType='downloadImageFromStream', status='running'} >> >> The task succeeded but after the deactivate storage domain attempt: >> >> 2018-07-04 10:13:01,579-04 INFO [org.ovirt.engine.core.bll.Ser >> ialChildCommandsExecutionCallback] >> (EE-ManagedThreadFactory-engineScheduled-Thread-98) >> [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id: >> '6342fce4-96ff-4ae3-8b40-8155a >> 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991, >> 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed, status >> 'SUCCEEDED' >> >> >> we should have a solution in ost for locked objects failing jobs... will >> open a Jira to follow >> >> >> >> On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer wrote: >> >>> On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron wrote: >>> The actual test has failed with error: 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi. resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Cannot deactivate Storage while there are running tasks on this Storage. -Please wait until tasks will finish and try again.] However, there is a problem with vdsm on host-1. it restarts which may cause the issue with the running tasks. >>> >>> Who restarted vdsm? why? 
>>> >>> 2018-07-04 10:08:46,603-0400 INFO (ioprocess/5191) [IOProcessClient] shutdown requested (__init__:108) 2018-07-04 10:08:46,604-0400 INFO (MainThread) [storage.udev] Stopping multipath event listener (udev:149) 2018-07-04 10:08:46,604-0400 INFO (MainThread) [vdsm.api] FINISH prepareForShutdown return=None from=internal, task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49 (api:52) 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Stopping threads (vdsmd:160) 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Exiting (vdsmd:171) 2018-07-04 10:10:00,145-0400 INFO (MainThread) [vds] (PID: 14034) I am the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1 (3.10.0-862.2.3.el7.x86_64) (vdsmd:149) 2018-07-04 10:10:00,146-0400 INFO (MainThread) [vds] VDSM will run with cpu affinity: frozenset([1]) (vdsmd:262) 2018-07-04 10:10:00,151-0400 INFO (MainThread) [storage.HSM] START HSM init (hsm:366) 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.HSM] Creating data-center mount directory '/rhev/data-center/mnt' (hsm:373) 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.fileUtils] Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197) 2018-07-04 10:10:00,265-0400 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/44eb a8db-3a9c-4fbe-ba33-a039fcd561e1' (hsm:523) 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/mastersd' (hsm:523) 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/c798 0a1e-91ef-4095-82eb-37ec03da9b3f' (hsm:523) 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/4fc6 2763-d8a5-4c36-8687-91870a92ff05' (hsm:523) 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] Unlinking file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/0236 3608-01b9-4176-b7a1-e9ee235f792a' (hsm:523) 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.udev] Registering multipath event monitor >>> object at 0x7f9aa4548150> (udev:182) 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.udev] Starting multipath event listener (udev:116) 2018-07-04 10:10:00,298-0400 INFO (MainThread) [storage.check] Starting check service (check:91) 2018-07-04 10:10:00,303-0400 INFO (MainThread) [storage.Dispatcher] Starting StorageDispatcher... (di
[ovirt-devel] Re: [CQ Failure Report] [Ovirt 4.2] [4.7.18]
Jira opened: https://ovirt-jira.atlassian.net/browse/OVIRT-2286 Tal, can you help fix the test? thanks, Dafna On Wed, Jul 4, 2018 at 7:41 PM, Dafna Ron wrote: > The vdsm restart is a test in basic_sanity.py > The task that is stuck is downloadImageFromStream > > 2018-07-04 10:12:52,659-04 WARN [org.ovirt.engine.core.bll. > storage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (default > task-1) [ce1c28ba-1550-457f-b5e3-ad051488f897] There are running tasks: > 'AsyncTask:{commandId='0b18c13f-0fce- > 4303-8f85-ae5a2b051991', rootCommandId='0b18c13f-0fce-4303-8f85-ae5a2b051991', > storagePoolId='8dd2fe5a-9dca-42e2-8593-1de1b18b4887', > taskId='f2af86fb-dbbb-430c-afd9-2f25131583b1', > vdsmTaskId='45ee0fc8-830d-47d5-9c4a-6d4ed72ae6a1', > stepId= > 'null', taskType='downloadImageFromStream', status='running'} > > The task succeeded but after the deactivate storage domain attempt: > > 2018-07-04 10:13:01,579-04 INFO [org.ovirt.engine.core.bll. > SerialChildCommandsExecutionCallback] > (EE-ManagedThreadFactory-engineScheduled-Thread-98) > [1c611686] Command 'ProcessOvfUpdateForStorageDomain' id: > '6342fce4-96ff-4ae3-8b40-8155a > 5509761' child commands '[0b18c13f-0fce-4303-8f85-ae5a2b051991, > 8e16826f-93dd-442a-8a74-13d14222d45e]' executions were completed, status > 'SUCCEEDED' > > > we should have a solution in ost for locked objects failing jobs... will > open a Jira to follow > > > > On Wed, Jul 4, 2018 at 6:16 PM, Nir Soffer wrote: > >> On Wed, Jul 4, 2018 at 6:46 PM Dafna Ron wrote: >> >>> The actual test has failed with error: >>> >>> 2018-07-04 10:12:52,665-04 ERROR [org.ovirt.engine.api.restapi. >>> resource.AbstractBackendResource] (default task-1) [] Operation Failed: >>> [Cannot deactivate Storage while there are running tasks on this Storage. >>> >>> -Please wait until tasks will finish and try again.] >>> >>> However, there is a problem with vdsm on host-1. it restarts which may >>> cause the issue with the running tasks. >>> >> >> Who restarted vdsm? why? 
>> >> >>> 2018-07-04 10:08:46,603-0400 INFO (ioprocess/5191) [IOProcessClient] >>> shutdown requested (__init__:108) >>> 2018-07-04 10:08:46,604-0400 INFO (MainThread) [storage.udev] Stopping >>> multipath event listener (udev:149) >>> 2018-07-04 10:08:46,604-0400 INFO (MainThread) [vdsm.api] FINISH >>> prepareForShutdown return=None from=internal, >>> task_id=bf3b87e4-febf-4cb6-8bfa-5840fc926b49 >>> (api:52) >>> 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Stopping threads >>> (vdsmd:160) >>> 2018-07-04 10:08:46,605-0400 INFO (MainThread) [vds] Exiting (vdsmd:171) >>> 2018-07-04 10:10:00,145-0400 INFO (MainThread) [vds] (PID: 14034) I am >>> the actual vdsm 4.20.32-1.el7 lago-basic-suite-4-2-host-1 >>> (3.10.0-862.2.3.el7.x86_64) (vdsmd:149) >>> 2018-07-04 10:10:00,146-0400 INFO (MainThread) [vds] VDSM will run with >>> cpu affinity: frozenset([1]) (vdsmd:262) >>> 2018-07-04 10:10:00,151-0400 INFO (MainThread) [storage.HSM] START HSM >>> init (hsm:366) >>> 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.HSM] Creating >>> data-center mount directory '/rhev/data-center/mnt' (hsm:373) >>> 2018-07-04 10:10:00,154-0400 INFO (MainThread) [storage.fileUtils] >>> Creating directory: /rhev/data-center/mnt mode: None (fileUtils:197) >>> 2018-07-04 10:10:00,265-0400 INFO (MainThread) [storage.HSM] Unlinking >>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/44eb >>> a8db-3a9c-4fbe-ba33-a039fcd561e1' (hsm:523) >>> 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] Unlinking >>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/mastersd' >>> (hsm:523) >>> 2018-07-04 10:10:00,266-0400 INFO (MainThread) [storage.HSM] Unlinking >>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/c798 >>> 0a1e-91ef-4095-82eb-37ec03da9b3f' (hsm:523) >>> 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] Unlinking >>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/4fc6 >>> 2763-d8a5-4c36-8687-91870a92ff05' (hsm:523) >>> 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.HSM] Unlinking >>> file '/rhev/data-center/8dd2fe5a-9dca-42e2-8593-1de1b18b4887/0236 >>> 3608-01b9-4176-b7a1-e9ee235f792a' (hsm:523) >>> 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.udev] >>> Registering multipath event monitor >> object at 0x7f9aa4548150> (udev:182) >>> 2018-07-04 10:10:00,267-0400 INFO (MainThread) [storage.udev] Starting >>> multipath event listener (udev:116) >>> 2018-07-04 10:10:00,298-0400 INFO (MainThread) [storage.check] Starting >>> check service (check:91) >>> 2018-07-04 10:10:00,303-0400 INFO (MainThread) [storage.Dispatcher] >>> Starting StorageDispatcher... (di >>> >>> On Wed, Jul 4, 2018 at 4:12 PM, Greg Sheremeta >>> wrote: >>> """ Error: Fault reason is "Operation Failed". Fault detail is "[Cannot deactivate Storage while there are running tasks on this Storage. -Please wait until tasks will finish and try again.]". HTTP re