In addition to PR precheckin jobs, I've also run a full regression against
the changes to make Cache.close() synchronous. There are failures, but I
have no idea whether they are known or "ok" failures or real problems. So I'm
not sure what to do next with this change unless someone else wants to review
the Hydra failures. This is the problem with having a bunch of non-open tests
that we can't really discuss on the dev list. Let me know what you guys want to
do!

On Tue, Apr 21, 2020 at 2:27 PM Kirk Lund <kl...@apache.org> wrote:

> PR #4963 https://github.com/apache/geode/pull/4963 is ready for review.
> It has passed precheckin once. After self-review, I reverted a couple of small
> changes that weren't needed, so it needs to go through precheckin again.
>
> On Fri, Apr 17, 2020 at 9:42 AM Kirk Lund <kl...@apache.org> wrote:
>
>> The Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
>> Cache.close() calls GeodeMemcachedService.close(), which in turn calls
>> Cache.close() again. It looks like the code base has lots of Cache.close()
>> calls -- any of them could theoretically cause the same issue. I hate to add
>> a ThreadLocal<Boolean> isClosingThread (or something like it) just to allow
>> reentrant calls to Cache.close().
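>>
>> To be concrete, here's a minimal sketch of the kind of guard I mean (the
>> class and method names are hypothetical, not actual Geode code):
>>
>> // Hypothetical sketch only -- not the real GemFireCacheImpl close() logic.
>> class ReentrantCloseGuard {
>>   private final ThreadLocal<Boolean> isClosingThread =
>>       ThreadLocal.withInitial(() -> Boolean.FALSE);
>>
>>   /** Runs doClose once; ignores reentrant calls made from the closing thread. */
>>   void close(Runnable doClose) {
>>     if (isClosingThread.get()) {
>>       return; // e.g. GeodeMemcachedService.close() calling back into Cache.close()
>>     }
>>     isClosingThread.set(Boolean.TRUE);
>>     try {
>>       doClose.run(); // close cache services, regions, etc.
>>     } finally {
>>       isClosingThread.remove();
>>     }
>>   }
>> }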
>>
>> Mark let the IntegrationTest job run for 7+ hours, which shows the hang is
>> in the Memcached IntegrationJUnitTest. (Thanks Mark!)
>>
>> On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund <kl...@apache.org> wrote:
>>
>>> It timed out while running OldFreeListOffHeapRegionJUnitTest, but I think
>>> the tests before it were responsible for the timeout being exceeded. I
>>> looked through all of the previously run tests and how long each one took,
>>> but without some sort of database of how long each test normally takes, it's
>>> impossible to know which test or tests took longer in any given PR.
>>>
>>> The IntegrationTest job that exceeded the timeout:
>>> https://concourse.apachegeode-ci.info/builds/147866
>>>
>>> The Test Summary for the above IntegrationTest job with Duration for
>>> each test:
>>> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/
>>>
>>> Unless we want to start tracking each test class/method and its Duration
>>> in a database, I don't see how we could look for trends or changes to
>>> identify test(s) that suddenly start taking longer. All of the tests take
>>> less than 3 minutes each, so unless one suddenly spikes to 10 minutes or
>>> more, there's really no way to find the test(s).
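>>>
>>> (Tracking would mean something along these lines: pull the per-suite
>>> durations out of the JUnit XML that Gradle already writes and store them
>>> run over run. Rough sketch only -- the results directory and class name
>>> are assumptions on my part.)
>>>
>>> import java.nio.file.Files;
>>> import java.nio.file.Path;
>>> import java.nio.file.Paths;
>>> import java.util.HashMap;
>>> import java.util.Map;
>>> import java.util.stream.Collectors;
>>> import java.util.stream.Stream;
>>> import javax.xml.parsers.DocumentBuilderFactory;
>>> import org.w3c.dom.Element;
>>>
>>> public class SlowestSuites {
>>>   public static void main(String[] args) throws Exception {
>>>     // args[0] is a Gradle test-results dir, e.g. build/test-results/integrationTest
>>>     Map<String, Double> secondsBySuite = new HashMap<>();
>>>     try (Stream<Path> paths = Files.walk(Paths.get(args[0]))) {
>>>       for (Path p : paths
>>>           .filter(f -> f.getFileName().toString().startsWith("TEST-"))
>>>           .collect(Collectors.toList())) {
>>>         Element suite = DocumentBuilderFactory.newInstance().newDocumentBuilder()
>>>             .parse(p.toFile()).getDocumentElement();
>>>         secondsBySuite.put(suite.getAttribute("name"),
>>>             Double.parseDouble(suite.getAttribute("time")));
>>>       }
>>>     }
>>>     // Print the 20 slowest suites so spikes stand out run over run.
>>>     secondsBySuite.entrySet().stream()
>>>         .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
>>>         .limit(20)
>>>         .forEach(e -> System.out.printf("%8.1fs  %s%n", e.getValue(), e.getKey()));
>>>   }
>>> }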
>>>
>>> On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols <onich...@pivotal.io>
>>> wrote:
>>>
>>>> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one
>>>> <
>>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
>>>> that came in at just under 45 minutes and still succeeded. It would be nice
>>>> to know which test is occasionally taking longer and why…
>>>>
>>>> Here’s an example of a previous timeout increase (Note that both the
>>>> job timeout and the callstack timeout should be increased by the same
>>>> amount): https://github.com/apache/geode/pull/4231
>>>>
>>>> > On Apr 16, 2020, at 10:47 AM, Kirk Lund <kl...@apache.org> wrote:
>>>> >
>>>> > Unfortunately, IntegrationTest exceeds the timeout every time I trigger
>>>> > it. The cause does not appear to be a specific test or hang. I think
>>>> > IntegrationTest has already been running very close to the timeout and
>>>> > is exceeding it fairly often even without my changes in #4963.
>>>> >
>>>> > Should we increase the timeout for IntegrationTest? (Anyone know how to
>>>> > increase it?)
>>>>
>>>>
