[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-26 Thread Michal Židek

On 04/22/2016 09:04 AM, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal



From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)


Houston,
we have a problem.

You patch does nto work either :-(


Hmmm... looks like my magic numbers went out of mana.



http://sssd-ci.duckdns.org/logs/job/42/59/summary.html
rhel6
make-intgcheck
ldap_test.py:496: test_add_remove_group_rfc2307_bis FAILED

http://sssd-ci.duckdns.org/logs/job/42/60/summary.html
rhel6
make-intgcheck
ldap_test.py:466: test_add_remove_user FAILED

http://sssd-ci.duckdns.org/logs/job/42/61/summary.html
rhel6
make-intgcheck
ldap_test.py:481: test_add_remove_group_rfc2307 FAILED

LS


___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-22 Thread Lukas Slebodnik
On (02/12/15 17:10), Michal Židek wrote:
>Hi!
>
>I saw some integration tests failures recently,
>and I think there is a race condition between the
>enumeration refresh timeout and the sleeps
>after some operations that wait for this timeout.
>SSSD fails to populate changes from LDAP in time
>and some asserts can fail because of this.
>
>So far I saw 4 tests to fail like this, which
>is already quite a lot.
>
>The attached patch modifies the timeout values
>and hopefully removes the issue.
>
>Michal

>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
>From: =?UTF-8?q?Michal=20=C5=BDidek?= 
>Date: Wed, 2 Dec 2015 16:44:48 +0100
>Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
>
>There is a race condation between ldap
>enumeration refresh timeout and the sleeps
>that wait for the ldap changes to populate
>to SSSD if the timeout and the sleeps have
>the same value.
>---
> src/tests/intg/ldap_test.py | 30 +-
> 1 file changed, 17 insertions(+), 13 deletions(-)
>
Houston,
we have a problem.

You patch does nto work either :-(

http://sssd-ci.duckdns.org/logs/job/42/59/summary.html
rhel6
make-intgcheck
ldap_test.py:496: test_add_remove_group_rfc2307_bis FAILED

http://sssd-ci.duckdns.org/logs/job/42/60/summary.html
rhel6
make-intgcheck
ldap_test.py:466: test_add_remove_user FAILED

http://sssd-ci.duckdns.org/logs/job/42/61/summary.html
rhel6
make-intgcheck
ldap_test.py:481: test_add_remove_group_rfc2307 FAILED

LS
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-20 Thread Michal Židek

On 04/18/2016 12:39 PM, Michal Židek wrote:

On 04/18/2016 12:22 PM, Lukas Slebodnik wrote:

On (18/04/16 11:14), Michal Židek wrote:

On 04/18/2016 10:39 AM, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


I think I found alternative solution for intermittent
failures of ADD REMOVE test with enumeration.


Not sure if this is safer than my patch. I made the
timeouts bigger on purpose, so that we avoid
problems with machines under heavy load.

You decided to go the opposite direction by
making one of the timeouts shorter.

That being said, maybe your patch is better,
because if it does not fail even under heavy
load it will make the tests shorter.

Once the CI is up we can try 20 test
runs with your patch and if they do not
fail I will give you an ACK.

If they fail, we can still fallback to my patch.


No,
We will need to find another solution.
Increasing timeout is not acceptable from long term perspective.
We need to have faster tests and not slower.

LS


Then lets hope your patch will work :D


Your patch did not fix the issue.
It still fails cca 1 out of 5 times because
in the enumeration tests.

Maybe the timeout magic needs more mana
or there is some other issue which we do not see.

Michal
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-18 Thread Michal Židek

On 04/18/2016 12:22 PM, Lukas Slebodnik wrote:

On (18/04/16 11:14), Michal Židek wrote:

On 04/18/2016 10:39 AM, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


I think I found alternative solution for intermittent
failures of ADD REMOVE test with enumeration.


Not sure if this is safer than my patch. I made the
timeouts bigger on purpose, so that we avoid
problems with machines under heavy load.

You decided to go the opposite direction by
making one of the timeouts shorter.

That being said, maybe your patch is better,
because if it does not fail even under heavy
load it will make the tests shorter.

Once the CI is up we can try 20 test
runs with your patch and if they do not
fail I will give you an ACK.

If they fail, we can still fallback to my patch.


No,
We will need to find another solution.
Increasing timeout is not acceptable from long term perspective.
We need to have faster tests and not slower.

LS


Then lets hope your patch will work :D
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-18 Thread Lukas Slebodnik
On (18/04/16 11:14), Michal Židek wrote:
>On 04/18/2016 10:39 AM, Lukas Slebodnik wrote:
>>On (02/12/15 17:10), Michal Židek wrote:
>>>Hi!
>>>
>>>I saw some integration tests failures recently,
>>>and I think there is a race condition between the
>>>enumeration refresh timeout and the sleeps
>>>after some operations that wait for this timeout.
>>>SSSD fails to populate changes from LDAP in time
>>>and some asserts can fail because of this.
>>>
>>>So far I saw 4 tests to fail like this, which
>>>is already quite a lot.
>>>
>>>The attached patch modifies the timeout values
>>>and hopefully removes the issue.
>>>
>>>Michal
>>
>>I think I found alternative solution for intermittent
>>failures of ADD REMOVE test with enumeration.
>
>Not sure if this is safer than my patch. I made the
>timeouts bigger on purpose, so that we avoid
>problems with machines under heavy load.
>
>You decided to go the opposite direction by
>making one of the timeouts shorter.
>
>That being said, maybe your patch is better,
>because if it does not fail even under heavy
>load it will make the tests shorter.
>
>Once the CI is up we can try 20 test
>runs with your patch and if they do not
>fail I will give you an ACK.
>
>If they fail, we can still fallback to my patch.
>
No,
We will need to find another solution.
Increasing timeout is not acceptable from long term perspective.
We need to have faster tests and not slower.

LS
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-18 Thread Michal Židek

On 04/18/2016 10:39 AM, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


I think I found alternative solution for intermittent
failures of ADD REMOVE test with enumeration.


Not sure if this is safer than my patch. I made the
timeouts bigger on purpose, so that we avoid
problems with machines under heavy load.

You decided to go the opposite direction by
making one of the timeouts shorter.

That being said, maybe your patch is better,
because if it does not fail even under heavy
load it will make the tests shorter.

Once the CI is up we can try 20 test
runs with your patch and if they do not
fail I will give you an ACK.

If they fail, we can still fallback to my patch.



Occurence of failure was increased recently therefore
please review attached patch.

I know that Jakub wanted/tried to rewrite test without enumeration.
But I was mentioned in this thread that testing on enumeration is valid
case.


We would have to test the enumeration sooner or later anyway.



LS





___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2016-04-18 Thread Lukas Slebodnik
On (02/12/15 17:10), Michal Židek wrote:
>Hi!
>
>I saw some integration tests failures recently,
>and I think there is a race condition between the
>enumeration refresh timeout and the sleeps
>after some operations that wait for this timeout.
>SSSD fails to populate changes from LDAP in time
>and some asserts can fail because of this.
>
>So far I saw 4 tests to fail like this, which
>is already quite a lot.
>
>The attached patch modifies the timeout values
>and hopefully removes the issue.
>
>Michal

I think I found alternative solution for intermittent
failures of ADD REMOVE test with enumeration.

Occurence of failure was increased recently therefore
please review attached patch.

I know that Jakub wanted/tried to rewrite test without enumeration.
But I was mentioned in this thread that testing on enumeration is valid
case.

LS
>From 7450ff8918e3763612ad256b8dc6152c438003d8 Mon Sep 17 00:00:00 2001
From: Lukas Slebodnik 
Date: Fri, 15 Apr 2016 22:46:21 +0200
Subject: [PATCH 1/3] intg: Reduce refresh timeout of enumeration task

It caused some intermittent failures if entry_cache_timeout
has the same value as ldap_enumeration_refresh_timeout.
---
 src/tests/intg/ldap_test.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 
757ee20a2884d1ba18804fb88b883e9bd23797e8..9774c290529b5eadf3a72c43c0ccc1b4643869f6
 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -139,10 +139,10 @@ def format_interactive_conf(ldap_conn, schema):
 entry_negative_timeout  = 0
 
 [domain/LDAP]
-ldap_enumeration_refresh_timeout= {0}
+ldap_enumeration_refresh_timeout= {1}
 ldap_purge_cache_timeout= 1
 entry_cache_timeout = {0}
-""").format(INTERACTIVE_TIMEOUT)
+""").format(INTERACTIVE_TIMEOUT, INTERACTIVE_TIMEOUT/2)
 
 
 def create_conf_file(contents):
-- 
2.7.3

___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD] Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-11 Thread Lukas Slebodnik
On (04/12/15 16:58), Lukas Slebodnik wrote:
>On (04/12/15 16:02), Michal Židek wrote:
>>On 12/04/2015 03:07 PM, Lukas Slebodnik wrote:
>>>On (04/12/15 14:35), Michal Židek wrote:
On 12/04/2015 02:32 PM, Jakub Hrozek wrote:
>On Fri, Dec 04, 2015 at 02:29:49PM +0100, Michal Židek wrote:
>>On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:
>>>On (04/12/15 12:11), Jakub Hrozek wrote:
On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:
>On (03/12/15 20:22), Jakub Hrozek wrote:
>>On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
>>>On (02/12/15 17:10), Michal Židek wrote:
Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal
>>>
From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 
2001
From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py 
b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent
from util import *

LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+time.sleep(INTERACTIVE_TIMEOUT + 4)
>>>Why does it need to be INTERACTIVE_TIMEOUT + 4
>>>
>>>Could it be INTERACTIVE_TIMEOUT + 3 or + 5
>>>
>>
>>Regardless of the value we choose, can we move this patch forward? I 
>>see
>>the related failure quite often in SSSD.
>Adding timeout without real explanation is not a solution.
>
>The main problem is with empiric values.
>If they are very high then test are slow.
>And there still can be slow/fast machine where lower values caused 
>troubles.
>
>The ideal solution would be to get rid of enumeration
>in ldap tests.

Enumeration is a codepath that is different from non-enumeration, so it
should be tested. Not as priority, not as the only ldap tests, but it's
a valid case, so it should be there.

>If we want to test enumeration than it should be in separate
>test.

Maybe, but we do have enumeration tests and we should fix them.

>>>Adding sleep is not a fix. It's just a workaround
>>>because all sleep timeout are just an empiric values.
>>>and we should fix test and not adding workaround/hacks.
>>>
>>>If we cannot fix the test and don't want te rewrite test without 
>>>enumeration
>>>then we should remove/revert problematic tests.
>>>
>>>LS
>>
>>I will send a new patch with an explanation (sort of),
>>but it still will be a guess. I am not sure what the
>>real safe value should be, only that the sleep's
>>after operation should be longer than the ldap
>>refresh and enum cache timeouts (and that the
>>current values do not cope well wit the CI load).
>
>Would it be more acceptable then to define the ldap refresh and enum
>cache timeouts as variables in the test and sleep for (enum_timeout +
>cache_timeout + 1) ?
>
>At least that would be more readable than a magic constant..

Will do. All will be derived from INTERACTIVE_TIMEOUT
so that we need to change just one value if needed in the
future.

>>>Will it be reliable?
>>>
>>>Will it work on slow(arm) machines?
>>>
>>>I plan to run integration test in "%check" phase
>>>of rpm build. And koji/fedora has rpm machines.
>>
>>We can mark tests that may fail on slow machines
>>due to timeouts as "unsafe" and skip 

[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Lukas Slebodnik
On (03/12/15 20:22), Jakub Hrozek wrote:
>On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
>> On (02/12/15 17:10), Michal Židek wrote:
>> >Hi!
>> >
>> >I saw some integration tests failures recently,
>> >and I think there is a race condition between the
>> >enumeration refresh timeout and the sleeps
>> >after some operations that wait for this timeout.
>> >SSSD fails to populate changes from LDAP in time
>> >and some asserts can fail because of this.
>> >
>> >So far I saw 4 tests to fail like this, which
>> >is already quite a lot.
>> >
>> >The attached patch modifies the timeout values
>> >and hopefully removes the issue.
>> >
>> >Michal
>> 
>> >From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
>> >From: =?UTF-8?q?Michal=20=C5=BDidek?= 
>> >Date: Wed, 2 Dec 2015 16:44:48 +0100
>> >Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
>> >
>> >There is a race condation between ldap
>> >enumeration refresh timeout and the sleeps
>> >that wait for the ldap changes to populate
>> >to SSSD if the timeout and the sleeps have
>> >the same value.
>> >---
>> > src/tests/intg/ldap_test.py | 30 +-
>> > 1 file changed, 17 insertions(+), 13 deletions(-)
>> >
>> >diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
>> >index 757ee20..8ec8dbe 100644
>> >--- a/src/tests/intg/ldap_test.py
>> >+++ b/src/tests/intg/ldap_test.py
>> >@@ -33,7 +33,11 @@ import ldap_ent
>> > from util import *
>> > 
>> > LDAP_BASE_DN = "dc=example,dc=com"
>> >-INTERACTIVE_TIMEOUT = 4
>> >+INTERACTIVE_TIMEOUT = 2
>> >+
>> >+
>> >+def wait_for_ldap_enum_refresh():
>> >+time.sleep(INTERACTIVE_TIMEOUT + 4)
>> Why does it need to be INTERACTIVE_TIMEOUT + 4
>> 
>> Could it be INTERACTIVE_TIMEOUT + 3 or + 5
>> 
>
>Regardless of the value we choose, can we move this patch forward? I see
>the related failure quite often in SSSD.
Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests. If we want to test enumeration than it should be in separate
test. I'm not sure we would be able to get rid of "sleep()"
in enumeration test but all such values should pre properly documented why it
is big enough ...

LS
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Jakub Hrozek
On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:
> On (03/12/15 20:22), Jakub Hrozek wrote:
> >On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
> >> On (02/12/15 17:10), Michal Židek wrote:
> >> >Hi!
> >> >
> >> >I saw some integration tests failures recently,
> >> >and I think there is a race condition between the
> >> >enumeration refresh timeout and the sleeps
> >> >after some operations that wait for this timeout.
> >> >SSSD fails to populate changes from LDAP in time
> >> >and some asserts can fail because of this.
> >> >
> >> >So far I saw 4 tests to fail like this, which
> >> >is already quite a lot.
> >> >
> >> >The attached patch modifies the timeout values
> >> >and hopefully removes the issue.
> >> >
> >> >Michal
> >> 
> >> >From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
> >> >From: =?UTF-8?q?Michal=20=C5=BDidek?= 
> >> >Date: Wed, 2 Dec 2015 16:44:48 +0100
> >> >Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
> >> >
> >> >There is a race condation between ldap
> >> >enumeration refresh timeout and the sleeps
> >> >that wait for the ldap changes to populate
> >> >to SSSD if the timeout and the sleeps have
> >> >the same value.
> >> >---
> >> > src/tests/intg/ldap_test.py | 30 +-
> >> > 1 file changed, 17 insertions(+), 13 deletions(-)
> >> >
> >> >diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
> >> >index 757ee20..8ec8dbe 100644
> >> >--- a/src/tests/intg/ldap_test.py
> >> >+++ b/src/tests/intg/ldap_test.py
> >> >@@ -33,7 +33,11 @@ import ldap_ent
> >> > from util import *
> >> > 
> >> > LDAP_BASE_DN = "dc=example,dc=com"
> >> >-INTERACTIVE_TIMEOUT = 4
> >> >+INTERACTIVE_TIMEOUT = 2
> >> >+
> >> >+
> >> >+def wait_for_ldap_enum_refresh():
> >> >+time.sleep(INTERACTIVE_TIMEOUT + 4)
> >> Why does it need to be INTERACTIVE_TIMEOUT + 4
> >> 
> >> Could it be INTERACTIVE_TIMEOUT + 3 or + 5
> >> 
> >
> >Regardless of the value we choose, can we move this patch forward? I see
> >the related failure quite often in SSSD.
> Adding timeout without real explanation is not a solution.
> 
> The main problem is with empiric values.
> If they are very high then test are slow.
> And there still can be slow/fast machine where lower values caused troubles.
> 
> The ideal solution would be to get rid of enumeration
> in ldap tests.

Enumeration is a codepath that is different from non-enumeration, so it
should be tested. Not as priority, not as the only ldap tests, but it's
a valid case, so it should be there.

> If we want to test enumeration than it should be in separate
> test.

Maybe, but we do have enumeration tests and we should fix them.

> I'm not sure we would be able to get rid of "sleep()"
> in enumeration test but all such values should pre properly documented why it
> is big enough ...
> 
> LS
> ___
> sssd-devel mailing list
> sssd-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Lukas Slebodnik
On (04/12/15 12:11), Jakub Hrozek wrote:
>On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:
>> On (03/12/15 20:22), Jakub Hrozek wrote:
>> >On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
>> >> On (02/12/15 17:10), Michal Židek wrote:
>> >> >Hi!
>> >> >
>> >> >I saw some integration tests failures recently,
>> >> >and I think there is a race condition between the
>> >> >enumeration refresh timeout and the sleeps
>> >> >after some operations that wait for this timeout.
>> >> >SSSD fails to populate changes from LDAP in time
>> >> >and some asserts can fail because of this.
>> >> >
>> >> >So far I saw 4 tests to fail like this, which
>> >> >is already quite a lot.
>> >> >
>> >> >The attached patch modifies the timeout values
>> >> >and hopefully removes the issue.
>> >> >
>> >> >Michal
>> >> 
>> >> >From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
>> >> >From: =?UTF-8?q?Michal=20=C5=BDidek?= 
>> >> >Date: Wed, 2 Dec 2015 16:44:48 +0100
>> >> >Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
>> >> >
>> >> >There is a race condation between ldap
>> >> >enumeration refresh timeout and the sleeps
>> >> >that wait for the ldap changes to populate
>> >> >to SSSD if the timeout and the sleeps have
>> >> >the same value.
>> >> >---
>> >> > src/tests/intg/ldap_test.py | 30 +-
>> >> > 1 file changed, 17 insertions(+), 13 deletions(-)
>> >> >
>> >> >diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
>> >> >index 757ee20..8ec8dbe 100644
>> >> >--- a/src/tests/intg/ldap_test.py
>> >> >+++ b/src/tests/intg/ldap_test.py
>> >> >@@ -33,7 +33,11 @@ import ldap_ent
>> >> > from util import *
>> >> > 
>> >> > LDAP_BASE_DN = "dc=example,dc=com"
>> >> >-INTERACTIVE_TIMEOUT = 4
>> >> >+INTERACTIVE_TIMEOUT = 2
>> >> >+
>> >> >+
>> >> >+def wait_for_ldap_enum_refresh():
>> >> >+time.sleep(INTERACTIVE_TIMEOUT + 4)
>> >> Why does it need to be INTERACTIVE_TIMEOUT + 4
>> >> 
>> >> Could it be INTERACTIVE_TIMEOUT + 3 or + 5
>> >> 
>> >
>> >Regardless of the value we choose, can we move this patch forward? I see
>> >the related failure quite often in SSSD.
>> Adding timeout without real explanation is not a solution.
>> 
>> The main problem is with empiric values.
>> If they are very high then test are slow.
>> And there still can be slow/fast machine where lower values caused troubles.
>> 
>> The ideal solution would be to get rid of enumeration
>> in ldap tests.
>
>Enumeration is a codepath that is different from non-enumeration, so it
>should be tested. Not as priority, not as the only ldap tests, but it's
>a valid case, so it should be there.
>
>> If we want to test enumeration than it should be in separate
>> test.
>
>Maybe, but we do have enumeration tests and we should fix them.
>
Adding sleep is not a fix. It's just a workaround
because all sleep timeout are just an empiric values.
and we should fix test and not adding workaround/hacks.

If we cannot fix the test and don't want te rewrite test without enumeration
then we should remove/revert problematic tests.

LS
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Michal Židek

On 12/04/2015 03:07 PM, Lukas Slebodnik wrote:

On (04/12/15 14:35), Michal Židek wrote:

On 12/04/2015 02:32 PM, Jakub Hrozek wrote:

On Fri, Dec 04, 2015 at 02:29:49PM +0100, Michal Židek wrote:

On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:

On (04/12/15 12:11), Jakub Hrozek wrote:

On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:

On (03/12/15 20:22), Jakub Hrozek wrote:

On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001

From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent

>from util import *


LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+time.sleep(INTERACTIVE_TIMEOUT + 4)

Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5



Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.

Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests.


Enumeration is a codepath that is different from non-enumeration, so it
should be tested. Not as priority, not as the only ldap tests, but it's
a valid case, so it should be there.


If we want to test enumeration than it should be in separate
test.


Maybe, but we do have enumeration tests and we should fix them.


Adding sleep is not a fix. It's just a workaround
because all sleep timeout are just an empiric values.
and we should fix test and not adding workaround/hacks.

If we cannot fix the test and don't want te rewrite test without enumeration
then we should remove/revert problematic tests.

LS


I will send a new patch with an explanation (sort of),
but it still will be a guess. I am not sure what the
real safe value should be, only that the sleep's
after operation should be longer than the ldap
refresh and enum cache timeouts (and that the
current values do not cope well wit the CI load).


Would it be more acceptable then to define the ldap refresh and enum
cache timeouts as variables in the test and sleep for (enum_timeout +
cache_timeout + 1) ?

At least that would be more readable than a magic constant..


Will do. All will be derived from INTERACTIVE_TIMEOUT
so that we need to change just one value if needed in the
future.


Will it be reliable?

Will it work on slow(arm) machines?

I plan to run integration test in "%check" phase
of rpm build. And koji/fedora has rpm machines.


We can mark tests that may fail on slow machines
due to timeouts as "unsafe" and skip them in the
rpm build. The simplest way to do it would be to
use -k "not test_unsafe_" option in INTGCHECK_PYTEST_ARGS
and the unsafe tests would have test_unsafe_ prefix.
Would that work for you? It is something that we
can start using immediately.


I already have a POC patch.

LS

___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Lukas Slebodnik
On (04/12/15 16:02), Michal Židek wrote:
>On 12/04/2015 03:07 PM, Lukas Slebodnik wrote:
>>On (04/12/15 14:35), Michal Židek wrote:
>>>On 12/04/2015 02:32 PM, Jakub Hrozek wrote:
On Fri, Dec 04, 2015 at 02:29:49PM +0100, Michal Židek wrote:
>On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:
>>On (04/12/15 12:11), Jakub Hrozek wrote:
>>>On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:
On (03/12/15 20:22), Jakub Hrozek wrote:
>On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
>>On (02/12/15 17:10), Michal Židek wrote:
>>>Hi!
>>>
>>>I saw some integration tests failures recently,
>>>and I think there is a race condition between the
>>>enumeration refresh timeout and the sleeps
>>>after some operations that wait for this timeout.
>>>SSSD fails to populate changes from LDAP in time
>>>and some asserts can fail because of this.
>>>
>>>So far I saw 4 tests to fail like this, which
>>>is already quite a lot.
>>>
>>>The attached patch modifies the timeout values
>>>and hopefully removes the issue.
>>>
>>>Michal
>>
>>>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 
>>>2001
>>>From: =?UTF-8?q?Michal=20=C5=BDidek?= 
>>>Date: Wed, 2 Dec 2015 16:44:48 +0100
>>>Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
>>>
>>>There is a race condation between ldap
>>>enumeration refresh timeout and the sleeps
>>>that wait for the ldap changes to populate
>>>to SSSD if the timeout and the sleeps have
>>>the same value.
>>>---
>>>src/tests/intg/ldap_test.py | 30 +-
>>>1 file changed, 17 insertions(+), 13 deletions(-)
>>>
>>>diff --git a/src/tests/intg/ldap_test.py 
>>>b/src/tests/intg/ldap_test.py
>>>index 757ee20..8ec8dbe 100644
>>>--- a/src/tests/intg/ldap_test.py
>>>+++ b/src/tests/intg/ldap_test.py
>>>@@ -33,7 +33,11 @@ import ldap_ent
>>>from util import *
>>>
>>>LDAP_BASE_DN = "dc=example,dc=com"
>>>-INTERACTIVE_TIMEOUT = 4
>>>+INTERACTIVE_TIMEOUT = 2
>>>+
>>>+
>>>+def wait_for_ldap_enum_refresh():
>>>+time.sleep(INTERACTIVE_TIMEOUT + 4)
>>Why does it need to be INTERACTIVE_TIMEOUT + 4
>>
>>Could it be INTERACTIVE_TIMEOUT + 3 or + 5
>>
>
>Regardless of the value we choose, can we move this patch forward? I 
>see
>the related failure quite often in SSSD.
Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused 
troubles.

The ideal solution would be to get rid of enumeration
in ldap tests.
>>>
>>>Enumeration is a codepath that is different from non-enumeration, so it
>>>should be tested. Not as priority, not as the only ldap tests, but it's
>>>a valid case, so it should be there.
>>>
If we want to test enumeration than it should be in separate
test.
>>>
>>>Maybe, but we do have enumeration tests and we should fix them.
>>>
>>Adding sleep is not a fix. It's just a workaround
>>because all sleep timeout are just an empiric values.
>>and we should fix test and not adding workaround/hacks.
>>
>>If we cannot fix the test and don't want te rewrite test without 
>>enumeration
>>then we should remove/revert problematic tests.
>>
>>LS
>
>I will send a new patch with an explanation (sort of),
>but it still will be a guess. I am not sure what the
>real safe value should be, only that the sleep's
>after operation should be longer than the ldap
>refresh and enum cache timeouts (and that the
>current values do not cope well wit the CI load).

Would it be more acceptable then to define the ldap refresh and enum
cache timeouts as variables in the test and sleep for (enum_timeout +
cache_timeout + 1) ?

At least that would be more readable than a magic constant..
>>>
>>>Will do. All will be derived from INTERACTIVE_TIMEOUT
>>>so that we need to change just one value if needed in the
>>>future.
>>>
>>Will it be reliable?
>>
>>Will it work on slow(arm) machines?
>>
>>I plan to run integration test in "%check" phase
>>of rpm build. And koji/fedora has rpm machines.
>
>We can mark tests that may fail on slow machines
>due to timeouts as "unsafe" and skip them in the
>rpm build.
It's not about rpmbuild, it's a general problem.
And marking tests as unsafe does not solve anything.
We need a reliable way how to find out 

[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Michal Židek

On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:

On (04/12/15 12:11), Jakub Hrozek wrote:

On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:

On (03/12/15 20:22), Jakub Hrozek wrote:

On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001

From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent
from util import *

LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+time.sleep(INTERACTIVE_TIMEOUT + 4)

Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5



Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.

Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests.


Enumeration is a codepath that is different from non-enumeration, so it
should be tested. Not as priority, not as the only ldap tests, but it's
a valid case, so it should be there.


If we want to test enumeration than it should be in separate
test.


Maybe, but we do have enumeration tests and we should fix them.


Adding sleep is not a fix. It's just a workaround
because all sleep timeout are just an empiric values.
and we should fix test and not adding workaround/hacks.

If we cannot fix the test and don't want te rewrite test without enumeration
then we should remove/revert problematic tests.

LS


I will send a new patch with an explanation (sort of),
but it still will be a guess. I am not sure what the
real safe value should be, only that the sleep's
after operation should be longer than the ldap
refresh and enum cache timeouts (and that the
current values do not cope well wit the CI load).

Lukas is right that I pulled the values out of
place where no knowledge resides so let us make
a compromise. I will push this patch to the
CI a lot of times (let's say 40-50) over the
weekend and see if it fails.

I am also not sure if lowering the INTERACTIVE_TIMEOUT
was a good idea, I did it in order to make the
execution a little faster and it is a timeout that
needs to be shorter than the sleeps that wait
for the ldap change to propagate to sysdb (and I did not
want the sleeps to be too long). With
the weekend "stress testing" we can hopefully avoid
additional adjustments with new patches.

Michal
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Nikolai Kondrashov

On 12/04/2015 12:42 PM, Lukas Slebodnik wrote:

On (03/12/15 20:22), Jakub Hrozek wrote:

On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001

From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent
from util import *

LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+time.sleep(INTERACTIVE_TIMEOUT + 4)

Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5



Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.

Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests. If we want to test enumeration than it should be in separate
test. I'm not sure we would be able to get rid of "sleep()"
in enumeration test but all such values should pre properly documented why it
is big enough ...


Yeah, these timeouts are messy.

The problem is not the empiric timeout values themselves but rather the
guesswork of when certain cache state changes are supposed to happen in
relation to them. If we can reason about event deadlines then we can use them.

If not, and the code is too complicated, can we perhaps introduce some
explicit synchronization with sssd caching mechanism, where sssd behavior will
drive the tests? E.g. the tests would actually wait for sssd to do something
with the cache and after sssd reports it is done, the tests will verify the
time and the result?

We would still get to test that sssd does something we need in the expected
timeframe, but we can make the tests faster - i.e. as fast as sssd can perform
and be configured to.

Perhaps add something to the sss_cache tool?

Nick
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Michal Židek

On 12/04/2015 02:32 PM, Jakub Hrozek wrote:

On Fri, Dec 04, 2015 at 02:29:49PM +0100, Michal Židek wrote:

On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:

On (04/12/15 12:11), Jakub Hrozek wrote:

On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:

On (03/12/15 20:22), Jakub Hrozek wrote:

On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:

On (02/12/15 17:10), Michal Židek wrote:

Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal


>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001

From: =?UTF-8?q?Michal=20=C5=BDidek?= 
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +-
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent

>from util import *


LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+time.sleep(INTERACTIVE_TIMEOUT + 4)

Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5



Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.

Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests.


Enumeration is a codepath that is different from non-enumeration, so it
should be tested. Not as priority, not as the only ldap tests, but it's
a valid case, so it should be there.


If we want to test enumeration than it should be in separate
test.


Maybe, but we do have enumeration tests and we should fix them.


Adding sleep is not a fix. It's just a workaround
because all sleep timeout are just an empiric values.
and we should fix test and not adding workaround/hacks.

If we cannot fix the test and don't want te rewrite test without enumeration
then we should remove/revert problematic tests.

LS


I will send a new patch with an explanation (sort of),
but it still will be a guess. I am not sure what the
real safe value should be, only that the sleep's
after operation should be longer than the ldap
refresh and enum cache timeouts (and that the
current values do not cope well wit the CI load).


Would it be more acceptable then to define the ldap refresh and enum
cache timeouts as variables in the test and sleep for (enum_timeout +
cache_timeout + 1) ?

At least that would be more readable than a magic constant..


Will do. All will be derived from INTERACTIVE_TIMEOUT
so that we need to change just one value if needed in the
future.





Lukas is right that I pulled the values out of
place where no knowledge resides so let us make
a compromise. I will push this patch to the
CI a lot of times (let's say 40-50) over the
weekend and see if it fails.

I am also not sure if lowering the INTERACTIVE_TIMEOUT
was a good idea, I did it in order to make the
execution a little faster and it is a timeout that
needs to be shorter than the sleeps that wait
for the ldap change to propagate to sysdb (and I did not
want the sleeps to be too long). With
the weekend "stress testing" we can hopefully avoid
additional adjustments with new patches.

Michal
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-04 Thread Jakub Hrozek
On Fri, Dec 04, 2015 at 02:29:49PM +0100, Michal Židek wrote:
> On 12/04/2015 12:29 PM, Lukas Slebodnik wrote:
> >On (04/12/15 12:11), Jakub Hrozek wrote:
> >>On Fri, Dec 04, 2015 at 11:42:00AM +0100, Lukas Slebodnik wrote:
> >>>On (03/12/15 20:22), Jakub Hrozek wrote:
> On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
> >On (02/12/15 17:10), Michal Židek wrote:
> >>Hi!
> >>
> >>I saw some integration tests failures recently,
> >>and I think there is a race condition between the
> >>enumeration refresh timeout and the sleeps
> >>after some operations that wait for this timeout.
> >>SSSD fails to populate changes from LDAP in time
> >>and some asserts can fail because of this.
> >>
> >>So far I saw 4 tests to fail like this, which
> >>is already quite a lot.
> >>
> >>The attached patch modifies the timeout values
> >>and hopefully removes the issue.
> >>
> >>Michal
> >
> >>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
> >>From: =?UTF-8?q?Michal=20=C5=BDidek?= 
> >>Date: Wed, 2 Dec 2015 16:44:48 +0100
> >>Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
> >>
> >>There is a race condation between ldap
> >>enumeration refresh timeout and the sleeps
> >>that wait for the ldap changes to populate
> >>to SSSD if the timeout and the sleeps have
> >>the same value.
> >>---
> >>src/tests/intg/ldap_test.py | 30 +-
> >>1 file changed, 17 insertions(+), 13 deletions(-)
> >>
> >>diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
> >>index 757ee20..8ec8dbe 100644
> >>--- a/src/tests/intg/ldap_test.py
> >>+++ b/src/tests/intg/ldap_test.py
> >>@@ -33,7 +33,11 @@ import ldap_ent
> >>from util import *
> >>
> >>LDAP_BASE_DN = "dc=example,dc=com"
> >>-INTERACTIVE_TIMEOUT = 4
> >>+INTERACTIVE_TIMEOUT = 2
> >>+
> >>+
> >>+def wait_for_ldap_enum_refresh():
> >>+time.sleep(INTERACTIVE_TIMEOUT + 4)
> >Why does it need to be INTERACTIVE_TIMEOUT + 4
> >
> >Could it be INTERACTIVE_TIMEOUT + 3 or + 5
> >
> 
> Regardless of the value we choose, can we move this patch forward? I see
> the related failure quite often in SSSD.
> >>>Adding timeout without real explanation is not a solution.
> >>>
> >>>The main problem is with empiric values.
> >>>If they are very high then test are slow.
> >>>And there still can be slow/fast machine where lower values caused 
> >>>troubles.
> >>>
> >>>The ideal solution would be to get rid of enumeration
> >>>in ldap tests.
> >>
> >>Enumeration is a codepath that is different from non-enumeration, so it
> >>should be tested. Not as priority, not as the only ldap tests, but it's
> >>a valid case, so it should be there.
> >>
> >>>If we want to test enumeration than it should be in separate
> >>>test.
> >>
> >>Maybe, but we do have enumeration tests and we should fix them.
> >>
> >Adding sleep is not a fix. It's just a workaround
> >because all sleep timeout are just an empiric values.
> >and we should fix test and not adding workaround/hacks.
> >
> >If we cannot fix the test and don't want te rewrite test without enumeration
> >then we should remove/revert problematic tests.
> >
> >LS
> 
> I will send a new patch with an explanation (sort of),
> but it still will be a guess. I am not sure what the
> real safe value should be, only that the sleep's
> after operation should be longer than the ldap
> refresh and enum cache timeouts (and that the
> current values do not cope well wit the CI load).

Would it be more acceptable then to define the ldap refresh and enum
cache timeouts as variables in the test and sleep for (enum_timeout +
cache_timeout + 1) ?

At least that would be more readable than a magic constant..

> 
> Lukas is right that I pulled the values out of
> place where no knowledge resides so let us make
> a compromise. I will push this patch to the
> CI a lot of times (let's say 40-50) over the
> weekend and see if it fails.
> 
> I am also not sure if lowering the INTERACTIVE_TIMEOUT
> was a good idea, I did it in order to make the
> execution a little faster and it is a timeout that
> needs to be shorter than the sleeps that wait
> for the ldap change to propagate to sysdb (and I did not
> want the sleeps to be too long). With
> the weekend "stress testing" we can hopefully avoid
> additional adjustments with new patches.
> 
> Michal
> ___
> sssd-devel mailing list
> sssd-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-03 Thread Jakub Hrozek
On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
> On (02/12/15 17:10), Michal Židek wrote:
> >Hi!
> >
> >I saw some integration tests failures recently,
> >and I think there is a race condition between the
> >enumeration refresh timeout and the sleeps
> >after some operations that wait for this timeout.
> >SSSD fails to populate changes from LDAP in time
> >and some asserts can fail because of this.
> >
> >So far I saw 4 tests to fail like this, which
> >is already quite a lot.
> >
> >The attached patch modifies the timeout values
> >and hopefully removes the issue.
> >
> >Michal
> 
> >From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
> >From: =?UTF-8?q?Michal=20=C5=BDidek?= 
> >Date: Wed, 2 Dec 2015 16:44:48 +0100
> >Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
> >
> >There is a race condation between ldap
> >enumeration refresh timeout and the sleeps
> >that wait for the ldap changes to populate
> >to SSSD if the timeout and the sleeps have
> >the same value.
> >---
> > src/tests/intg/ldap_test.py | 30 +-
> > 1 file changed, 17 insertions(+), 13 deletions(-)
> >
> >diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
> >index 757ee20..8ec8dbe 100644
> >--- a/src/tests/intg/ldap_test.py
> >+++ b/src/tests/intg/ldap_test.py
> >@@ -33,7 +33,11 @@ import ldap_ent
> > from util import *
> > 
> > LDAP_BASE_DN = "dc=example,dc=com"
> >-INTERACTIVE_TIMEOUT = 4
> >+INTERACTIVE_TIMEOUT = 2
> >+
> >+
> >+def wait_for_ldap_enum_refresh():
> >+time.sleep(INTERACTIVE_TIMEOUT + 4)
> Why does it need to be INTERACTIVE_TIMEOUT + 4
> 
> Could it be INTERACTIVE_TIMEOUT + 3 or + 5
> 

Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org


[SSSD]Re: [PATCH] ldap_test.py: Modify enum cache timeouts

2015-12-02 Thread Lukas Slebodnik
On (02/12/15 17:10), Michal Židek wrote:
>Hi!
>
>I saw some integration tests failures recently,
>and I think there is a race condition between the
>enumeration refresh timeout and the sleeps
>after some operations that wait for this timeout.
>SSSD fails to populate changes from LDAP in time
>and some asserts can fail because of this.
>
>So far I saw 4 tests to fail like this, which
>is already quite a lot.
>
>The attached patch modifies the timeout values
>and hopefully removes the issue.
>
>Michal

>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
>From: =?UTF-8?q?Michal=20=C5=BDidek?= 
>Date: Wed, 2 Dec 2015 16:44:48 +0100
>Subject: [PATCH] ldap_test.py: Modify enum cache timeouts
>
>There is a race condation between ldap
>enumeration refresh timeout and the sleeps
>that wait for the ldap changes to populate
>to SSSD if the timeout and the sleeps have
>the same value.
>---
> src/tests/intg/ldap_test.py | 30 +-
> 1 file changed, 17 insertions(+), 13 deletions(-)
>
>diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
>index 757ee20..8ec8dbe 100644
>--- a/src/tests/intg/ldap_test.py
>+++ b/src/tests/intg/ldap_test.py
>@@ -33,7 +33,11 @@ import ldap_ent
> from util import *
> 
> LDAP_BASE_DN = "dc=example,dc=com"
>-INTERACTIVE_TIMEOUT = 4
>+INTERACTIVE_TIMEOUT = 2
>+
>+
>+def wait_for_ldap_enum_refresh():
>+time.sleep(INTERACTIVE_TIMEOUT + 4)
Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5

LS
___
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org