Re: [ovirt-devel] planned Jenkins restart

2018-04-10 Thread Evgheni Dereveanchin
Maintenance completed; Jenkins is back up and running.
The OS, Jenkins core and all plugins were updated:
https://ovirt-jira.atlassian.net/browse/OVIRT-1937

As always - if you see any issues please report them to Jira.

Regards,
Evgheni Dereveanchin


On Wed, Apr 11, 2018 at 4:11 AM, Evgheni Dereveanchin wrote:

> Hi everyone,
>
> I'll be performing a planned Jenkins restart within the next hour.
> No new jobs will be scheduled during this maintenance period.
> I will inform you once it is over.
>
> Regards,
> Evgheni Dereveanchin
>



-- 
Regards,
Evgheni Dereveanchin
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

[ovirt-devel] planned Jenkins restart

2018-04-10 Thread Evgheni Dereveanchin
Hi everyone,

I'll be performing a planned Jenkins restart within the next hour.
No new jobs will be scheduled during this maintenance period.
I will inform you once it is over.

Regards,
Evgheni Dereveanchin
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

2018-04-10 Thread Ravi Shankar Nori
This [1] should fix the multiple lock release issue.

[1] https://gerrit.ovirt.org/#/c/90077/
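
For illustration, a minimal sketch (Python pseudocode rather than the engine's
Java, with made-up class and method names) of the kind of "release at most
once" guard that stops a double release like the one described below; the
actual change is in [1] and may differ:

    import threading

    class MonitoringLockReleaser:
        """Release a lock at most once, however many failure paths fire."""

        def __init__(self, lock_manager, lock):
            self._lock_manager = lock_manager   # hypothetical manager object
            self._lock = lock
            self._released = False
            self._guard = threading.Lock()

        def release(self):
            with self._guard:
                if self._released:
                    return                      # later attempts become no-ops
                self._released = True
            self._lock_manager.release_lock(self._lock)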

On Tue, Apr 10, 2018 at 3:53 PM, Ravi Shankar Nori  wrote:

> Working on a patch; will post a fix
>
> Thanks
>
> Ravi
>
> On Tue, Apr 10, 2018 at 9:14 AM, Alona Kaplan  wrote:
>
>> Hi all,
>>
>> Looking at the log it seems that the new GetCapabilitiesAsync is
>> responsible for the mess.
>>
>> -
>> * 08:29:47 - engine loses connectivity to host 
>> 'lago-basic-suite-4-2-host-0'.*
>>
>>
>>
>> *- Every 3 seconds a getCapabilitiesAsync request is sent to the host 
>> (unsuccessfully).*
>>
>>  * before each "getCapabilitiesAsync" the monitoring lock is taken 
>> (VdsManager,refreshImpl)
>>
>>  * "getCapabilitiesAsync" immediately fails and throws 
>> 'VDSNetworkException: java.net.ConnectException: Connection refused'. The 
>> exception is caught by 
>> 'GetCapabilitiesAsyncVDSCommand.executeVdsBrokerCommand' which calls 
>> 'onFailure' of the callback and re-throws the exception.
>>
>>  catch (Throwable t) {
>> getParameters().getCallback().onFailure(t);
>> throw t;
>>  }
>>
>> * The 'onFailure' of the callback releases the "monitoringLock" 
>> ('postProcessRefresh()->afterRefreshTreatment()-> if (!succeeded) 
>> lockManager.releaseLock(monitoringLock);')
>>
>> * 'VdsManager,refreshImpl' catches the network exception, marks 
>> 'releaseLock = true' and *tries to release the already released lock*.
>>
>>   The following warning is printed to the log -
>>
>>   WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
>> (EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Trying to release 
>> exclusive lock which does not exist, lock key: 
>> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>>
>>
>>
>>
>> *- 08:30:51 a successful getCapabilitiesAsync is sent.*
>>
>>
>> *- 08:32:55 - The failing test starts (Setup Networks for setting ipv6).*
>>
>> * SetupNetworks takes the monitoring lock.
>>
>> *- 08:33:00 - ResponseTracker cleans the getCapabilitiesAsync requests from 
>> 4 minutes ago from its queue and prints a VDSNetworkException: Vds timeout 
>> occured.*
>>
>>   * When the first request is removed from the queue 
>> ('ResponseTracker.remove()'), the
>> *'Callback.onFailure' is invoked (for the second time) -> monitoring lock is 
>> released (the lock taken by the SetupNetworks!).*
>>
>>   * *The other requests removed from the queue also try to release the 
>> monitoring lock*, but there is nothing to release.
>>
>>   * The following warning log is printed -
>> WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
>> (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Trying to release 
>> exclusive lock which does not exist, lock key: 
>> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>>
>> - *08:33:00 - SetupNetwork fails on Timeout ~4 seconds after it is started*. 
>> Why? I'm not 100% sure, but I guess the root cause is the late processing of 
>> 'getCapabilitiesAsync', which causes the loss of the monitoring lock, plus the 
>> late and multiple processing of the failure.
>>
>>
>> Ravi, the 'getCapabilitiesAsync' failure is handled twice and there are three 
>> attempts to release the lock. Please share your opinion regarding how it 
>> should be fixed.
>>
>>
>> Thanks,
>>
>> Alona.
>>
>>
>>
>>
>>
>>
>> On Sun, Apr 8, 2018 at 1:21 PM, Dan Kenigsberg  wrote:
>>
>>> On Sun, Apr 8, 2018 at 9:21 AM, Edward Haas  wrote:
>>>


 On Sun, Apr 8, 2018 at 9:15 AM, Eyal Edri  wrote:

> Was already done by Yaniv - https://gerrit.ovirt.org/#/c/89851.
> Is it still failing?
>
> On Sun, Apr 8, 2018 at 8:59 AM, Barak Korren 
> wrote:
>
>> On 7 April 2018 at 00:30, Dan Kenigsberg  wrote:
>> > No, I am afraid that we have not managed to understand why setting
>> > an ipv6 address took the host off the grid. We shall continue
>> > researching this next week.
>> >
>> > Edy, https://gerrit.ovirt.org/#/c/88637/ is already 4 weeks old,
>> but
>> > could it possibly be related (I really doubt that)?
>> >
>>
>
 Sorry, but I do not see how this problem is related to VDSM.
 There is nothing that indicates that there is a VDSM problem.

 Has the RPC connection between Engine and VDSM failed?


>>> Further up the thread, Piotr noticed that (at least on one failure of
>>> this test) the Vdsm host lost connectivity to its storage, and the Vdsm
>>> process was restarted. However, this does not seem to happen in all cases
>>> where this test fails.
>>>
>>> ___
>>> Devel mailing list
>>> Devel@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>
>>
>
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

2018-04-10 Thread Ravi Shankar Nori
Working on a patch; will post a fix

Thanks

Ravi

On Tue, Apr 10, 2018 at 9:14 AM, Alona Kaplan  wrote:

> Hi all,
>
> Looking at the log it seems that the new GetCapabilitiesAsync is
> responsible for the mess.
>
> -
> * 08:29:47 - engine loses connectivity to host 'lago-basic-suite-4-2-host-0'.*
>
>
>
> *- Every 3 seconds a getCapabilitiesAsync request is sent to the host 
> (unsuccessfully).*
>
>  * before each "getCapabilitiesAsync" the monitoring lock is taken 
> (VdsManager,refreshImpl)
>
>  * "getCapabilitiesAsync" immediately fails and throws 
> 'VDSNetworkException: java.net.ConnectException: Connection refused'. The 
> exception is caught by 
> 'GetCapabilitiesAsyncVDSCommand.executeVdsBrokerCommand' which calls 
> 'onFailure' of the callback and re-throws the exception.
>
>  catch (Throwable t) {
> getParameters().getCallback().onFailure(t);
> throw t;
>  }
>
> * The 'onFailure' of the callback releases the "monitoringLock" 
> ('postProcessRefresh()->afterRefreshTreatment()-> if (!succeeded) 
> lockManager.releaseLock(monitoringLock);')
>
> * 'VdsManager,refreshImpl' catches the network exception, marks 
> 'releaseLock = true' and *tries to release the already released lock*.
>
>   The following warning is printed to the log -
>
>   WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Trying to release 
> exclusive lock which does not exist, lock key: 
> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>
>
>
>
> *- 08:30:51 a successful getCapabilitiesAsync is sent.*
>
>
> *- 08:32:55 - The failing test starts (Setup Networks for setting ipv6).*
>
> * SetupNetworks takes the monitoring lock.
>
> *- 08:33:00 - ResponseTracker cleans the getCapabilitiesAsync requests from 4 
> minutes ago from its queue and prints a VDSNetworkException: Vds timeout 
> occured.*
>
>   * When the first request is removed from the queue 
> ('ResponseTracker.remove()'), the
> *'Callback.onFailure' is invoked (for the second time) -> monitoring lock is 
> released (the lock taken by the SetupNetworks!).*
>
>   * *The other requests removed from the queue also try to release the 
> monitoring lock*, but there is nothing to release.
>
>   * The following warning log is printed -
> WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Trying to release 
> exclusive lock which does not exist, lock key: 
> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>
> - *08:33:00 - SetupNetwork fails on Timeout ~4 seconds after it is started*. 
> Why? I'm not 100% sure, but I guess the root cause is the late processing of 
> 'getCapabilitiesAsync', which causes the loss of the monitoring lock, plus the 
> late and multiple processing of the failure.
>
>
> Ravi, the 'getCapabilitiesAsync' failure is handled twice and there are three 
> attempts to release the lock. Please share your opinion regarding how it 
> should be fixed.
>
>
> Thanks,
>
> Alona.
>
>
>
>
>
>
> On Sun, Apr 8, 2018 at 1:21 PM, Dan Kenigsberg  wrote:
>
>> On Sun, Apr 8, 2018 at 9:21 AM, Edward Haas  wrote:
>>
>>>
>>>
>>> On Sun, Apr 8, 2018 at 9:15 AM, Eyal Edri  wrote:
>>>
 Was already done by Yaniv - https://gerrit.ovirt.org/#/c/89851.
 Is it still failing?

 On Sun, Apr 8, 2018 at 8:59 AM, Barak Korren 
 wrote:

> On 7 April 2018 at 00:30, Dan Kenigsberg  wrote:
> > No, I am afraid that we have not managed to understand why setting
> > an ipv6 address took the host off the grid. We shall continue
> > researching this next week.
> >
> > Edy, https://gerrit.ovirt.org/#/c/88637/ is already 4 weeks old, but
> > could it possibly be related (I really doubt that)?
> >
>

>>> Sorry, but I do not see how this problem is related to VDSM.
>>> There is nothing that indicates that there is a VDSM problem.
>>>
>>> Has the RPC connection between Engine and VDSM failed?
>>>
>>>
>> Further up the thread, Piotr noticed that (at least on one failure of
>> this test) the Vdsm host lost connectivity to its storage, and the Vdsm
>> process was restarted. However, this does not seem to happen in all cases
>> where this test fails.
>>
>> ___
>> Devel mailing list
>> Devel@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel
>>
>
>
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

2018-04-10 Thread Gal Ben Haim
I'm seeing the same error in [1], during 006_migrations.migrate_vm.

[1] http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1650/

On Tue, Apr 10, 2018 at 4:14 PM, Alona Kaplan  wrote:

> Hi all,
>
> Looking at the log it seems that the new GetCapabilitiesAsync is
> responsible for the mess.
>
> -
> * 08:29:47 - engine loses connectivity to host 'lago-basic-suite-4-2-host-0'.*
>
>
>
> *- Every 3 seconds a getCapabilitiesAsync request is sent to the host 
> (unsuccessfully).*
>
>  * before each "getCapabilitiesAsync" the monitoring lock is taken 
> (VdsManager,refreshImpl)
>
>  * "getCapabilitiesAsync" immediately fails and throws 
> 'VDSNetworkException: java.net.ConnectException: Connection refused'. The 
> exception is caught by 
> 'GetCapabilitiesAsyncVDSCommand.executeVdsBrokerCommand' which calls 
> 'onFailure' of the callback and re-throws the exception.
>
>  catch (Throwable t) {
> getParameters().getCallback().onFailure(t);
> throw t;
>  }
>
> * The 'onFailure' of the callback releases the "monitoringLock" 
> ('postProcessRefresh()->afterRefreshTreatment()-> if (!succeeded) 
> lockManager.releaseLock(monitoringLock);')
>
> * 'VdsManager,refreshImpl' catches the network exception, marks 
> 'releaseLock = true' and *tries to release the already released lock*.
>
>   The following warning is printed to the log -
>
>   WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Trying to release 
> exclusive lock which does not exist, lock key: 
> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>
>
>
>
> *- 08:30:51 a successful getCapabilitiesAsync is sent.*
>
>
> *- 08:32:55 - The failing test starts (Setup Networks for setting ipv6).*
>
> * SetupNetworks takes the monitoring lock.
>
> *- 08:33:00 - ResponseTracker cleans the getCapabilitiesAsync requests from 4 
> minutes ago from its queue and prints a VDSNetworkException: Vds timeout 
> occured.*
>
>   * When the first request is removed from the queue 
> ('ResponseTracker.remove()'), the
> *'Callback.onFailure' is invoked (for the second time) -> monitoring lock is 
> released (the lock taken by the SetupNetworks!).*
>
>   * *The other requests removed from the queue also try to release the 
> monitoring lock*, but there is nothing to release.
>
>   * The following warning log is printed -
> WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Trying to release 
> exclusive lock which does not exist, lock key: 
> 'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'
>
> - *08:33:00 - SetupNetwork fails on Timeout ~4 seconds after it is started*. 
> Why? I'm not 100% sure, but I guess the root cause is the late processing of 
> 'getCapabilitiesAsync', which causes the loss of the monitoring lock, plus the 
> late and multiple processing of the failure.
>
>
> Ravi, the 'getCapabilitiesAsync' failure is handled twice and there are three 
> attempts to release the lock. Please share your opinion regarding how it 
> should be fixed.
>
>
> Thanks,
>
> Alona.
>
>
>
>
>
>
> On Sun, Apr 8, 2018 at 1:21 PM, Dan Kenigsberg  wrote:
>
>> On Sun, Apr 8, 2018 at 9:21 AM, Edward Haas  wrote:
>>
>>>
>>>
>>> On Sun, Apr 8, 2018 at 9:15 AM, Eyal Edri  wrote:
>>>
 Was already done by Yaniv - https://gerrit.ovirt.org/#/c/89851.
 Is it still failing?

 On Sun, Apr 8, 2018 at 8:59 AM, Barak Korren 
 wrote:

> On 7 April 2018 at 00:30, Dan Kenigsberg  wrote:
> > No, I am afraid that we have not managed to understand why setting
> > an ipv6 address took the host off the grid. We shall continue
> > researching this next week.
> >
> > Edy, https://gerrit.ovirt.org/#/c/88637/ is already 4 weeks old, but
> > could it possibly be related (I really doubt that)?
> >
>

>>> Sorry, but I do not see how this problem is related to VDSM.
>>> There is nothing that indicates that there is a VDSM problem.
>>>
>>> Has the RPC connection between Engine and VDSM failed?
>>>
>>>
>> Further up the thread, Piotr noticed that (at least on one failure of
>> this test) the Vdsm host lost connectivity to its storage, and the Vdsm
>> process was restarted. However, this does not seem to happen in all cases
>> where this test fails.
>>
>> ___
>> Devel mailing list
>> Devel@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel
>>
>
>
> ___
> Devel mailing list
> Devel@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>



-- 
*Gal Ben Haim*
RHV DevOps
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] make check on master fails due to UnicodeDecodeError

2018-04-10 Thread Nir Soffer
On Tue, Apr 10, 2018 at 5:21 PM Shani Leviim  wrote:

> Hi,
>
> Yes, I did clean the root directory but it didn't solve the issue.
> I'm currently running the tests on Fedora 27, using Python version 2.7.14.
>
> Thanks to Dan's help, it seems that we found the root cause:
>
> I had 2 pickle files under /var/cache/vdsm/schema: vdsm-api.pickle and
> vdsm-events.pickle.
> After removing them and re-running the tests with make check, everything
> completed successfully.
>

How did you end up with a cached schema under /var/run? This directory is
owned by root.
Are you running the tests as root?

This sounds like a bug in the code using the pickled schema. The pickled
schema should not be used if the timestamp of the pickle does not match the
timestamp of the source.

Also, in make check we should not use the host schema cache, but the local
schema cache generated by running "make".
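
For illustration only (not the actual vdsmapi code; the paths and the helper
name below are hypothetical), a minimal sketch of such a timestamp check that
falls back to parsing the source schema when the pickle is stale or unreadable:

    import os
    import pickle

    import yaml  # assuming the source schema is YAML, as in vdsm-api.yml

    def load_schema(source_path, pickle_path):
        """Use the pickled schema only if it is at least as new as the source."""
        try:
            if os.path.getmtime(pickle_path) >= os.path.getmtime(source_path):
                with open(pickle_path, "rb") as f:
                    return pickle.load(f)
        except (OSError, pickle.UnpicklingError, UnicodeDecodeError):
            pass  # missing or unreadable cache: fall through and rebuild
        with open(source_path) as f:
            return yaml.safe_load(f)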

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] make check on master fails due to UnicodeDecodeError

2018-04-10 Thread Shani Leviim
Hi,

Yes, I did clean the root directory but it didn't solve the issue.
I'm currently running the tests on Fedora 27, using Python version 2.7.14.

Thanks to Dan's help, it seems that we found the root cause:

I had 2 pickle files under /var/cache/vdsm/schema: vdsm-api.pickle and
vdsm-events.pickle.
After removing them and re-running the tests with make check, everything
completed successfully.

It was probably caused by the different default encodings Python 2 and
Python 3 use when loading the pickled schema file.
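
For reference, a standalone Python 3 snippet (not VDSM code) that reproduces
the same error with a Python 2 style pickle and shows how pickle's encoding
argument sidesteps it:

    import pickle

    # Hand-crafted protocol-2 pickle, equivalent to what Python 2 writes for
    # the byte string '\x80': PROTO 2, SHORT_BINSTRING of length 1, BINPUT, STOP.
    py2_pickle = b"\x80\x02U\x01\x80q\x00."

    try:
        pickle.loads(py2_pickle)                      # default encoding='ASCII'
    except UnicodeDecodeError as e:
        print(e)  # 'ascii' codec can't decode byte 0x80 in position 0: ...

    print(pickle.loads(py2_pickle, encoding="bytes"))  # b'\x80', bytes preserved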



*Regards,*

*Shani Leviim*

On Tue, Apr 10, 2018 at 4:19 PM, Nir Soffer  wrote:

> On Tue, Apr 10, 2018 at 2:52 PM Shani Leviim  wrote:
>
>> Hi there,
>> I'm trying to run make check, and I have ~13 tests on vdsm/tests which
>> fail due to the following:
>>
>>   File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
>> __init__
>> loaded_schema = pickle.load(f)
>>   File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
>> return codecs.ascii_decode(input, self.errors)[0]
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
>> ordinal not in range(128)
>>
>> (Those lines are common to all failures)
>>
>> Here is an example:
>>
>> ==
>> ERROR: test_ok_response (vdsmapi_test.DataVerificationTests)
>> --
>> Traceback (most recent call last):
>>   File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 96, in
>> test_ok_response
>> _schema.schema().verify_retval(
>>   File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 67, in schema
>> self._schema = vdsmapi.Schema(paths, True)
>>   File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
>> __init__
>> loaded_schema = pickle.load(f)
>>   File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
>> return codecs.ascii_decode(input, self.errors)[0]
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
>> ordinal not in range(128)
>>
>> I've also tried to git clean -dxf && ./autogen.sh --system but it didn't
>> help.
>>
>
> Did you clean in the root directory?
>
> cd vdsm-checkout-dir
> git clean -dxf
> ./autogen.sh --system
> make
> make check
>
> Also, on which system do you run the tests? Fedora 27? CentOS? RHEL?
>
> Nir
>
>
>
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] make check on master fails due to UnicodeDecodeError

2018-04-10 Thread Nir Soffer
On Tue, Apr 10, 2018 at 2:52 PM Shani Leviim  wrote:

> Hi there,
> I'm trying to run make check, and I have ~13 tests on vdsm/tests which
> fail due to the following:
>
>   File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
> __init__
> loaded_schema = pickle.load(f)
>   File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
> return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
> ordinal not in range(128)
>
> (Those lines are common to all failures)
>
> Here is an example:
>
> ==
> ERROR: test_ok_response (vdsmapi_test.DataVerificationTests)
> --
> Traceback (most recent call last):
>   File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 96, in
> test_ok_response
> _schema.schema().verify_retval(
>   File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 67, in schema
> self._schema = vdsmapi.Schema(paths, True)
>   File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
> __init__
> loaded_schema = pickle.load(f)
>   File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
> return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
> ordinal not in range(128)
>
> I've also tried to git clean -dxf && ./autogen.sh --system but it didn't
> help.
>

Did you clean in the root directory?

cd vdsm-checkout-dir
git clean -dxf
./autogen.sh --system
make
make check

Also, on which system do you run the tests? Fedora 27? CentOS? RHEL?

Nir
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

Re: [ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

2018-04-10 Thread Alona Kaplan
Hi all,

Looking at the log it seems that the new GetCapabilitiesAsync is
responsible for the mess.

-
* 08:29:47 - engine loses connectivity to host 'lago-basic-suite-4-2-host-0'.*



*- Every 3 seconds a getCapabilitiesAsync request is sent to the
host (unsuccessfully).*

 * before each "getCapabilitiesAsync" the monitoring lock is taken
(VdsManager,refreshImpl)

 * "getCapabilitiesAsync" immediately fails and throws
'VDSNetworkException: java.net.ConnectException: Connection refused'.
The exception is caught by
'GetCapabilitiesAsyncVDSCommand.executeVdsBrokerCommand' which calls
'onFailure' of the callback and re-throws the exception.

 catch (Throwable t) {
getParameters().getCallback().onFailure(t);
throw t;
 }

* The 'onFailure' of the callback releases the "monitoringLock"
('postProcessRefresh()->afterRefreshTreatment()-> if (!succeeded)
lockManager.releaseLock(monitoringLock);')

* 'VdsManager,refreshImpl' catches the network exception, marks
'releaseLock = true' and *tries to release the already released lock*.

  The following warning is printed to the log -

  WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Trying to
release exclusive lock which does not exist, lock key:
'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'




*- 08:30:51 a successful getCapabilitiesAsync is sent.*


*- 08:32:55 - The failing test starts (Setup Networks for setting ipv6).*

* SetupNetworks takes the monitoring lock.

*- 08:33:00 - ResponseTracker cleans the getCapabilitiesAsync requests
from 4 minutes ago from its queue and prints a VDSNetworkException:
Vds timeout occured.*

  * When the first request is removed from the queue
('ResponseTracker.remove()'), the
*'Callback.onFailure' is invoked (for the second time) -> monitoring
lock is released (the lock taken by the SetupNetworks!).*

  * *The other requests removed from the queue also try to release
the monitoring lock*, but there is nothing to release.

  * The following warning log is printed -
WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager]
(EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Trying to
release exclusive lock which does not exist, lock key:
'ecf53d69-eb68-4b11-8df2-c4aa4e19bd93VDS_INIT'

- *08:33:00 - SetupNetwork fails on Timeout ~4 seconds after it is
started*. Why? I'm not 100% sure, but I guess the root cause is the late
processing of 'getCapabilitiesAsync', which causes the loss of the
monitoring lock, plus the late and multiple processing of the failure.


Ravi, the 'getCapabilitiesAsync' failure is handled twice and there are
three attempts to release the lock. Please share your opinion regarding
how it should be fixed.


Thanks,

Alona.






On Sun, Apr 8, 2018 at 1:21 PM, Dan Kenigsberg  wrote:

> On Sun, Apr 8, 2018 at 9:21 AM, Edward Haas  wrote:
>
>>
>>
>> On Sun, Apr 8, 2018 at 9:15 AM, Eyal Edri  wrote:
>>
>>> Was already done by Yaniv - https://gerrit.ovirt.org/#/c/89851.
>>> Is it still failing?
>>>
>>> On Sun, Apr 8, 2018 at 8:59 AM, Barak Korren  wrote:
>>>
 On 7 April 2018 at 00:30, Dan Kenigsberg  wrote:
 > No, I am afraid that we have not managed to understand why setting
 > an ipv6 address took the host off the grid. We shall continue
 > researching this next week.
 >
 > Edy, https://gerrit.ovirt.org/#/c/88637/ is already 4 weeks old, but
 > could it possibly be related (I really doubt that)?
 >

>>>
>> Sorry, but I do not see how this problem is related to VDSM.
>> There is nothing that indicates that there is a VDSM problem.
>>
>> Has the RPC connection between Engine and VDSM failed?
>>
>>
> Further up the thread, Piotr noticed that (at least on one failure of this
> test) the Vdsm host lost connectivity to its storage, and the Vdsm process
> was restarted. However, this does not seem to happen in all cases where
> this test fails.
>
> ___
> Devel mailing list
> Devel@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

[ovirt-devel] dynamic ownership changes

2018-04-10 Thread Martin Polednik

Hey,

I've created a patch [0] that finally activates libvirt's
dynamic_ownership for VDSM without negatively affecting the
functionality of our storage code.

That of course comes with quite a bit of code removal, mostly in the
area of host devices, hwrng and anything that touches devices; a bunch
of test changes; and one XML generation caveat (storage is handled by
VDSM, therefore disk relabelling needs to be disabled on the VDSM
level).
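
To illustrate that caveat with a sketch (not the actual patch; the helper
name and path are made up, and the per-source <seclabel> element is from
libvirt's domain XML format), a VDSM-owned disk can opt out of relabelling
by attaching such a <seclabel> while the domain XML is generated:

    import xml.etree.ElementTree as ET

    def disable_disk_relabel(disk_source):
        """Tell libvirt not to relabel this image; VDSM manages its ownership."""
        seclabel = ET.SubElement(disk_source, "seclabel")
        seclabel.set("model", "dac")
        seclabel.set("relabel", "no")

    disk = ET.fromstring(
        "<disk type='file' device='disk'>"
        "  <source file='/rhev/data-center/.../disk.img'/>"
        "</disk>"
    )
    disable_disk_relabel(disk.find("source"))
    print(ET.tostring(disk).decode())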

Because of the scope of the patch, I welcome storage/virt/network
people to review the code and consider the implications this change has
on current and future features.

[0] https://gerrit.ovirt.org/#/c/89830/

mpolednik
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel


[ovirt-devel] make check on master fails due to UnicodeDecodeError

2018-04-10 Thread Shani Leviim
Hi there,
I'm trying to run make check, and I have ~13 tests on vdsm/tests which
fail due to the following:

  File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
__init__
loaded_schema = pickle.load(f)
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)

(Those lines are common to all failures)

Here is an example:

==
ERROR: test_ok_response (vdsmapi_test.DataVerificationTests)
--
Traceback (most recent call last):
  File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 96, in
test_ok_response
_schema.schema().verify_retval(
  File "/home/sleviim/git/vdsm/tests/vdsmapi_test.py", line 67, in schema
self._schema = vdsmapi.Schema(paths, True)
  File "/home/sleviim/git/vdsm/lib/vdsm/api/vdsmapi.py", line 212, in
__init__
loaded_schema = pickle.load(f)
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)

I've also tried to git clean -dxf && ./autogen.sh --system but it didn't
help.

Can you please assist?
Thanks!



*Regards,*

*Shani Leviim*
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel