Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-29 Thread Kirk Lund
Done and merged to develop. Thanks to Donal for helping me out.

On Tue, Apr 28, 2020 at 3:21 PM Kirk Lund  wrote:

> In addition to PR precheckin jobs, I've also run a full regression against
> the changes to make Cache.close() synchronous. There are failures, but I
> have no idea if they are normal or "ok" failures or not. So I'm not sure
> what to do next with this change unless someone else wants to review the
> Hydra failures. This is the problem with having a bunch of non-open tests
> that we can't really discuss on dev list. Let me know what you guys want to
> do!
>
> On Tue, Apr 21, 2020 at 2:27 PM Kirk Lund  wrote:
>
>> PR #4963 https://github.com/apache/geode/pull/4963 is ready for review.
>> It has passed precheckin once. After self-review, I reverted a couple small
>> changes that weren't needed so it needs to go through precheckin again.
>>
>> On Fri, Apr 17, 2020 at 9:42 AM Kirk Lund  wrote:
>>
>>> Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
>>> Cache.close() calls GeodeMemcachedService.close() which again calls
>>> Cache.close(). Looks like the code base has lots of Cache.close() calls
>>> -- all of them could theoretically cause issues. I hate to add 
>>> ThreadLocal
>>> isClosingThread or something like it just to allow reentrant calls to
>>> Cache.close().
>>>
>>> Mark let the IntegrationTest job run for 7+ hours which shows the hang
>>> in the Memcached IntegrationJUnitTest. (Thanks Mark!)
>>>
>>> On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund  wrote:
>>>
 It timed out while running OldFreeListOffHeapRegionJUnitTest but I
 think the tests before it were responsible for the timeout being exceeded.
 I looked through all of the previously run tests and how long each but
 without having some sort of database with how long each test takes, it's
 impossible to know which test or tests take longer in any given PR.

 The IntegrationTest job that exceeded the timeout:
 https://concourse.apachegeode-ci.info/builds/147866

 The Test Summary for the above IntegrationTest job with Duration for
 each test:
 http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/

 Unless we want to start tracking each test class/method and its
 Duration in a database, I don't see how we could look for trends or changes
 to identify test(s) that suddenly start taking longer. All of the tests
 take less than 3 minutes each, so unless one suddenly spikes to 10 minutes
 or more, there's really no way to find the test(s).

 On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols 
 wrote:

> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see
> one <
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
> that came in just under 45 minutes but did succeed.  It would be nice to
> know what test is occasionally taking longer and why…
>
> Here’s an example of a previous timeout increase (Note that both the
> job timeout and the callstack timeout should be increased by the same
> amount): https://github.com/apache/geode/pull/4231
>
> > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
> >
> > Unfortunately, IntegrationTest exceeds timeout every time I trigger
> it. The
> > cause does not appear to be a specific test or hang. I
> > think IntegrationTest has already been running very close to the
> timeout
> > and is exceeding it fairly often even without my changes in #4963.
> >
> > Should we increase the timeout for IntegrationTest? (Anyone know how
> to
> > increase it?)
>
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-28 Thread Kirk Lund
In addition to PR precheckin jobs, I've also run a full regression against
the changes to make Cache.close() synchronous. There are failures, but I
have no idea if they are normal or "ok" failures or not. So I'm not sure
what to do next with this change unless someone else wants to review the
Hydra failures. This is the problem with having a bunch of non-open tests
that we can't really discuss on dev list. Let me know what you guys want to
do!

On Tue, Apr 21, 2020 at 2:27 PM Kirk Lund  wrote:

> PR #4963 https://github.com/apache/geode/pull/4963 is ready for review.
> It has passed precheckin once. After self-review, I reverted a couple small
> changes that weren't needed so it needs to go through precheckin again.
>
> On Fri, Apr 17, 2020 at 9:42 AM Kirk Lund  wrote:
>
>> Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
>> Cache.close() calls GeodeMemcachedService.close() which again calls
>> Cache.close(). Looks like the code base has lots of Cache.close() calls
>> -- all of them could theoretically cause issues. I hate to add 
>> ThreadLocal
>> isClosingThread or something like it just to allow reentrant calls to
>> Cache.close().
>>
>> Mark let the IntegrationTest job run for 7+ hours which shows the hang in
>> the Memcached IntegrationJUnitTest. (Thanks Mark!)
>>
>> On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund  wrote:
>>
>>> It timed out while running OldFreeListOffHeapRegionJUnitTest but I think
>>> the tests before it were responsible for the timeout being exceeded. I
>>> looked through all of the previously run tests and how long each but
>>> without having some sort of database with how long each test takes, it's
>>> impossible to know which test or tests take longer in any given PR.
>>>
>>> The IntegrationTest job that exceeded the timeout:
>>> https://concourse.apachegeode-ci.info/builds/147866
>>>
>>> The Test Summary for the above IntegrationTest job with Duration for
>>> each test:
>>> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/
>>>
>>> Unless we want to start tracking each test class/method and its Duration
>>> in a database, I don't see how we could look for trends or changes to
>>> identify test(s) that suddenly start taking longer. All of the tests take
>>> less than 3 minutes each, so unless one suddenly spikes to 10 minutes or
>>> more, there's really no way to find the test(s).
>>>
>>> On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols 
>>> wrote:
>>>
 Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one
 <
 https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
 that came in just under 45 minutes but did succeed.  It would be nice to
 know what test is occasionally taking longer and why…

 Here’s an example of a previous timeout increase (Note that both the
 job timeout and the callstack timeout should be increased by the same
 amount): https://github.com/apache/geode/pull/4231

 > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
 >
 > Unfortunately, IntegrationTest exceeds timeout every time I trigger
 it. The
 > cause does not appear to be a specific test or hang. I
 > think IntegrationTest has already been running very close to the
 timeout
 > and is exceeding it fairly often even without my changes in #4963.
 >
 > Should we increase the timeout for IntegrationTest? (Anyone know how
 to
 > increase it?)




Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-21 Thread Kirk Lund
PR #4963 https://github.com/apache/geode/pull/4963 is ready for review. It
has passed precheckin once. After self-review, I reverted a couple small
changes that weren't needed so it needs to go through precheckin again.

On Fri, Apr 17, 2020 at 9:42 AM Kirk Lund  wrote:

> Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
> Cache.close() calls GeodeMemcachedService.close() which again calls
> Cache.close(). Looks like the code base has lots of Cache.close() calls
> -- all of them could theoretically cause issues. I hate to add 
> ThreadLocal
> isClosingThread or something like it just to allow reentrant calls to
> Cache.close().
>
> Mark let the IntegrationTest job run for 7+ hours which shows the hang in
> the Memcached IntegrationJUnitTest. (Thanks Mark!)
>
> On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund  wrote:
>
>> It timed out while running OldFreeListOffHeapRegionJUnitTest but I think
>> the tests before it were responsible for the timeout being exceeded. I
>> looked through all of the previously run tests and how long each but
>> without having some sort of database with how long each test takes, it's
>> impossible to know which test or tests take longer in any given PR.
>>
>> The IntegrationTest job that exceeded the timeout:
>> https://concourse.apachegeode-ci.info/builds/147866
>>
>> The Test Summary for the above IntegrationTest job with Duration for each
>> test:
>> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/
>>
>> Unless we want to start tracking each test class/method and its Duration
>> in a database, I don't see how we could look for trends or changes to
>> identify test(s) that suddenly start taking longer. All of the tests take
>> less than 3 minutes each, so unless one suddenly spikes to 10 minutes or
>> more, there's really no way to find the test(s).
>>
>> On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols 
>> wrote:
>>
>>> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one <
>>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
>>> that came in just under 45 minutes but did succeed.  It would be nice to
>>> know what test is occasionally taking longer and why…
>>>
>>> Here’s an example of a previous timeout increase (Note that both the job
>>> timeout and the callstack timeout should be increased by the same amount):
>>> https://github.com/apache/geode/pull/4231
>>>
>>> > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
>>> >
>>> > Unfortunately, IntegrationTest exceeds timeout every time I trigger
>>> it. The
>>> > cause does not appear to be a specific test or hang. I
>>> > think IntegrationTest has already been running very close to the
>>> timeout
>>> > and is exceeding it fairly often even without my changes in #4963.
>>> >
>>> > Should we increase the timeout for IntegrationTest? (Anyone know how to
>>> > increase it?)
>>>
>>>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-17 Thread Kirk Lund
Memcached IntegrationJUnitTest hangs the PR IntegrationTest job because
Cache.close() calls GeodeMemcachedService.close() which again calls
Cache.close(). Looks like the code base has lots of Cache.close() calls --
all of them could theoretically cause issues. I hate to add
ThreadLocal
isClosingThread or something like it just to allow reentrant calls to
Cache.close().

Mark let the IntegrationTest job run for 7+ hours which shows the hang in
the Memcached IntegrationJUnitTest. (Thanks Mark!)

On Thu, Apr 16, 2020 at 1:38 PM Kirk Lund  wrote:

> It timed out while running OldFreeListOffHeapRegionJUnitTest but I think
> the tests before it were responsible for the timeout being exceeded. I
> looked through all of the previously run tests and how long each but
> without having some sort of database with how long each test takes, it's
> impossible to know which test or tests take longer in any given PR.
>
> The IntegrationTest job that exceeded the timeout:
> https://concourse.apachegeode-ci.info/builds/147866
>
> The Test Summary for the above IntegrationTest job with Duration for each
> test:
> http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/
>
> Unless we want to start tracking each test class/method and its Duration
> in a database, I don't see how we could look for trends or changes to
> identify test(s) that suddenly start taking longer. All of the tests take
> less than 3 minutes each, so unless one suddenly spikes to 10 minutes or
> more, there's really no way to find the test(s).
>
> On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols  wrote:
>
>> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one <
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
>> that came in just under 45 minutes but did succeed.  It would be nice to
>> know what test is occasionally taking longer and why…
>>
>> Here’s an example of a previous timeout increase (Note that both the job
>> timeout and the callstack timeout should be increased by the same amount):
>> https://github.com/apache/geode/pull/4231
>>
>> > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
>> >
>> > Unfortunately, IntegrationTest exceeds timeout every time I trigger it.
>> The
>> > cause does not appear to be a specific test or hang. I
>> > think IntegrationTest has already been running very close to the timeout
>> > and is exceeding it fairly often even without my changes in #4963.
>> >
>> > Should we increase the timeout for IntegrationTest? (Anyone know how to
>> > increase it?)
>>
>>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-16 Thread Kirk Lund
It timed out while running OldFreeListOffHeapRegionJUnitTest but I think
the tests before it were responsible for the timeout being exceeded. I
looked through all of the previously run tests and how long each but
without having some sort of database with how long each test takes, it's
impossible to know which test or tests take longer in any given PR.

The IntegrationTest job that exceeded the timeout:
https://concourse.apachegeode-ci.info/builds/147866

The Test Summary for the above IntegrationTest job with Duration for each
test:
http://files.apachegeode-ci.info/builds/apache-develop-pr/geode-pr-4963/test-results/integrationTest/1587061092/

Unless we want to start tracking each test class/method and its Duration in
a database, I don't see how we could look for trends or changes to identify
test(s) that suddenly start taking longer. All of the tests take less than
3 minutes each, so unless one suddenly spikes to 10 minutes or more,
there's really no way to find the test(s).

On Thu, Apr 16, 2020 at 12:52 PM Owen Nichols  wrote:

> Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one <
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/IntegrationTestOpenJDK11/builds/7202>
> that came in just under 45 minutes but did succeed.  It would be nice to
> know what test is occasionally taking longer and why…
>
> Here’s an example of a previous timeout increase (Note that both the job
> timeout and the callstack timeout should be increased by the same amount):
> https://github.com/apache/geode/pull/4231
>
> > On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
> >
> > Unfortunately, IntegrationTest exceeds timeout every time I trigger it.
> The
> > cause does not appear to be a specific test or hang. I
> > think IntegrationTest has already been running very close to the timeout
> > and is exceeding it fairly often even without my changes in #4963.
> >
> > Should we increase the timeout for IntegrationTest? (Anyone know how to
> > increase it?)
>
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-16 Thread Owen Nichols
Kirk, most IntegrationTest jobs run in 25-30 minutes, but I did see one 

 that came in just under 45 minutes but did succeed.  It would be nice to know 
what test is occasionally taking longer and why…

Here’s an example of a previous timeout increase (Note that both the job 
timeout and the callstack timeout should be increased by the same amount): 
https://github.com/apache/geode/pull/4231

> On Apr 16, 2020, at 10:47 AM, Kirk Lund  wrote:
> 
> Unfortunately, IntegrationTest exceeds timeout every time I trigger it. The
> cause does not appear to be a specific test or hang. I
> think IntegrationTest has already been running very close to the timeout
> and is exceeding it fairly often even without my changes in #4963.
> 
> Should we increase the timeout for IntegrationTest? (Anyone know how to
> increase it?)



Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-16 Thread Robert Houghton
 >
> >>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fx%2Fthread.html%2Ff385a6dd51209e2706c68c9821412a6f22ebef3e809636060ac0bf55%40%253Cdev.geode.apache.org%253E&data=02%7C01%7Chansonm%40vmware.com%7C7a43463ab53c416234d908d7e0c4cc6b%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637225008165230328&sdata=GD77kAubDDWfP93zjYsNP61VMN4%2FKBAHcq95GwjeMBc%3D&reserved=0
> >> > >
> >> > > On Tue, Apr 14, 2020 at 3:23 PM John Blum  wrote:
> >> > >
> >> > >> My first thought is cache close (i.e. RegionService.close()  should
> >> be
> >> > >> synchronous (by default), perhaps, with non-blocking options or
> >> options
> >> > to
> >> > >> wait for a set timeout as Jake suggested.
> >> > >>
> >> > >> This is a problem for *Integration Tests* (that start a peer cache
> >> > >> instance, in-process or standalone) in general and not simply just
> >> > >> "distributed" tests!  This is the reason I built support for this
> in
> >> > >> *Spring
> >> > >> Test for Apache Geode* (STDG) since subsequent tests requiring a
> peer
> >> > cache
> >> > >> instance (or CacheServer) may conflict with each other, especially
> >> > given 1)
> >> > >> the cache instance is a *Singleton* and 2) it is ideal to not have
> to
> >> > >> restart the JVM between, even for *Integration Tests*, however,
> turns
> >> > out
> >> > >> to be a really common practice. *#ugh*  However, without proper
> >> > >> coordination this can be a real problem, hence STDG's support.
> Even
> >> > when
> >> > >> forking JVMs, you still have to be careful to wait in certain
> cases,
> >> as
> >> > you
> >> > >> could run into other conflicts (e.g. BindExceptions if not varying
> >> port
> >> > >> numbers and such).  For all these reasons and more, it is important
> >> that
> >> > >> the cache has fully shutdown and released all its resources.
> >> > >>
> >> > >> Also, while we are on this topic, I think it would be useful to
> have
> >> a
> >> > >> dedicated interface for the cache instance lifecycle.  It's
> >> unfortunate
> >> > the
> >> > >> CacheListener interface is already taken for Region events. *#sigh*
> >> > >>
> >> > >> The interface might be something like...
> >> > >>
> >> > >> interface CacheLifecycleListener {
> >> > >>
> >> > >>  default void isStarting(CacheEvent event) {}
> >> > >>
> >> > >>  default void onStart(CacheEvent event) {}
> >> > >>
> >> > >>  default void isClosing(CacheEvent event) {}
> >> > >>
> >> > >>  default void onClose(CacheEvent event) {}
> >> > >>
> >> > >>  ...
> >> > >>
> >> > >> }
> >> > >>
> >> > >> -j
> >> > >>
> >> > >> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh 
> >> wrote:
> >> > >>
> >> > >>> The isClosed flag and method might be used currently more as an
> >> > >> isClosing.
> >> > >>> The GemFireCacheImpl.isClosed() method is actually returning
> >> isClosing.
> >> > >>> Whatever change to isClosed that will be made, will have to
> properly
> >> > >> handle
> >> > >>> cases where it's meant to be treated as isClosing().
> >> > >>>
> >> > >>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
> >> > wrote:
> >> > >>>
> >> > >>>> Hi Jake,
> >> > >>>>
> >> > >>>> For Option 6: We could fix isClosed as well. That is a great
> >> > >> suggestion.
> >> > >>>> Currently, it returns almost immediately.
> >> > >>>> I like your options though
> >> > >>>>
> >> > >>>> Any other thoughts?
> >> > >>>>
> >> > >>>> Any preferences? It think any of the current options seem better
> >> than
> >> > >> the
> >> > >>>> current situation as long as we fix isClosed.
> >> > >>>>
> >> > >>>> Thanks,
> >> > >>>> Mark
> >> > >>>> 
> >> > >>>> From: Jacob Barrett 
> >> > >>>> Sent: Tuesday, April 14, 2020 2:30 PM
> >> > >>>> To: dev@geode.apache.org 
> >> > >>>> Subject: Re: [Discuss] Cache.close synchronous is not
> synchronous,
> >> but
> >> > >>>> code still expects it to be
> >> > >>>>
> >> > >>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) -
> Closes
> >> and
> >> > >>>> waits until it is really closed.
> >> > >>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback
> >> after
> >> > >>>> cache is really close.
> >> > >>>> Option 6: cache.close(); while (!cache.isClosed());
> >> > >>>>
> >> > >>>>
> >> > >>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson 
> >> wrote:
> >> > >>>>>
> >> > >>>>> Hi All,
> >> > >>>>>
> >> > >>>>> I know that we have discussed this once before, but I think it
> >> bears
> >> > >>>> repeating. We have test code that assumes cache.close is
> >> synchronous.
> >> > >> It
> >> > >>> is
> >> > >>>> not. Not even close. I would like discuss some possible changes.
> >> > >>>>>
> >> > >>>>> Option 1. Call it what it is.  Deprecate Cache.close and create
> a
> >> new
> >> > >>>> method called asyncClose to replace it. Simple and descriptive.
> >> > >>>>> Option 2. Fix cache close so it is synchronous. Some might say
> >> that
> >> > >> we
> >> > >>>> are going to break behavior, but I doubt any user relies on the
> >> fact
> >> > >> that
> >> > >>>> it is asynchronous. That would be dangerous in and of itself.
> >> > >> Obviously,
> >> > >>> we
> >> > >>>> don’t want to change behavior, but there have been a number of
> >> > >>> distributed
> >> > >>>> tests that have failed for this. If internal to the code devs
> don’t
> >> > get
> >> > >>> it
> >> > >>>> right, where does that leave users.
> >> > >>>>> Option 3. Status quo.
> >> > >>>>>
> >> > >>>>> What do you think? Are there other options I am missing?
> >> > >>>>>
> >> > >>>>> Thanks,
> >> > >>>>> Mark
> >> > >>>>>
> >> > >>>>
> >> > >>>>
> >> > >>>
> >> > >>
> >> > >>
> >> > >> --
> >> > >> -John
> >> > >> Spring Data Team
> >> > >>
> >> >
> >> >
> >>
> >> --
> >> -John
> >> Spring Data Team
> >>
> >
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-16 Thread Kirk Lund
>> > >> instance, in-process or standalone) in general and not simply just
>> > >> "distributed" tests!  This is the reason I built support for this in
>> > >> *Spring
>> > >> Test for Apache Geode* (STDG) since subsequent tests requiring a peer
>> > cache
>> > >> instance (or CacheServer) may conflict with each other, especially
>> > given 1)
>> > >> the cache instance is a *Singleton* and 2) it is ideal to not have to
>> > >> restart the JVM between, even for *Integration Tests*, however, turns
>> > out
>> > >> to be a really common practice. *#ugh*  However, without proper
>> > >> coordination this can be a real problem, hence STDG's support.  Even
>> > when
>> > >> forking JVMs, you still have to be careful to wait in certain cases,
>> as
>> > you
>> > >> could run into other conflicts (e.g. BindExceptions if not varying
>> port
>> > >> numbers and such).  For all these reasons and more, it is important
>> that
>> > >> the cache has fully shutdown and released all its resources.
>> > >>
>> > >> Also, while we are on this topic, I think it would be useful to have
>> a
>> > >> dedicated interface for the cache instance lifecycle.  It's
>> unfortunate
>> > the
>> > >> CacheListener interface is already taken for Region events. *#sigh*
>> > >>
>> > >> The interface might be something like...
>> > >>
>> > >> interface CacheLifecycleListener {
>> > >>
>> > >>  default void isStarting(CacheEvent event) {}
>> > >>
>> > >>  default void onStart(CacheEvent event) {}
>> > >>
>> > >>  default void isClosing(CacheEvent event) {}
>> > >>
>> > >>  default void onClose(CacheEvent event) {}
>> > >>
>> > >>  ...
>> > >>
>> > >> }
>> > >>
>> > >> -j
>> > >>
>> > >> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh 
>> wrote:
>> > >>
>> > >>> The isClosed flag and method might be used currently more as an
>> > >> isClosing.
>> > >>> The GemFireCacheImpl.isClosed() method is actually returning
>> isClosing.
>> > >>> Whatever change to isClosed that will be made, will have to properly
>> > >> handle
>> > >>> cases where it's meant to be treated as isClosing().
>> > >>>
>> > >>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
>> > wrote:
>> > >>>
>> > >>>> Hi Jake,
>> > >>>>
>> > >>>> For Option 6: We could fix isClosed as well. That is a great
>> > >> suggestion.
>> > >>>> Currently, it returns almost immediately.
>> > >>>> I like your options though
>> > >>>>
>> > >>>> Any other thoughts?
>> > >>>>
>> > >>>> Any preferences? It think any of the current options seem better
>> than
>> > >> the
>> > >>>> current situation as long as we fix isClosed.
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Mark
>> > >>>> 
>> > >>>> From: Jacob Barrett 
>> > >>>> Sent: Tuesday, April 14, 2020 2:30 PM
>> > >>>> To: dev@geode.apache.org 
>> > >>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous,
>> but
>> > >>>> code still expects it to be
>> > >>>>
>> > >>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes
>> and
>> > >>>> waits until it is really closed.
>> > >>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback
>> after
>> > >>>> cache is really close.
>> > >>>> Option 6: cache.close(); while (!cache.isClosed());
>> > >>>>
>> > >>>>
>> > >>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson 
>> wrote:
>> > >>>>>
>> > >>>>> Hi All,
>> > >>>>>
>> > >>>>> I know that we have discussed this once before, but I think it
>> bears
>> > >>>> repeating. We have test code that assumes cache.close is
>> synchronous.
>> > >> It
>> > >>> is
>> > >>>> not. Not even close. I would like discuss some possible changes.
>> > >>>>>
>> > >>>>> Option 1. Call it what it is.  Deprecate Cache.close and create a
>> new
>> > >>>> method called asyncClose to replace it. Simple and descriptive.
>> > >>>>> Option 2. Fix cache close so it is synchronous. Some might say
>> that
>> > >> we
>> > >>>> are going to break behavior, but I doubt any user relies on the
>> fact
>> > >> that
>> > >>>> it is asynchronous. That would be dangerous in and of itself.
>> > >> Obviously,
>> > >>> we
>> > >>>> don’t want to change behavior, but there have been a number of
>> > >>> distributed
>> > >>>> tests that have failed for this. If internal to the code devs don’t
>> > get
>> > >>> it
>> > >>>> right, where does that leave users.
>> > >>>>> Option 3. Status quo.
>> > >>>>>
>> > >>>>> What do you think? Are there other options I am missing?
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Mark
>> > >>>>>
>> > >>>>
>> > >>>>
>> > >>>
>> > >>
>> > >>
>> > >> --
>> > >> -John
>> > >> Spring Data Team
>> > >>
>> >
>> >
>>
>> --
>> -John
>> Spring Data Team
>>
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-16 Thread Bruce Schuchardt
fault void isClosing(CacheEvent event) {}
> >>
> >>  default void onClose(CacheEvent event) {}
> >>
> >>  ...
> >>
> >> }
> >>
> >> -j
> >>
> >> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh  wrote:
> >>
> >>> The isClosed flag and method might be used currently more as an
> >> isClosing.
> >>> The GemFireCacheImpl.isClosed() method is actually returning 
isClosing.
> >>> Whatever change to isClosed that will be made, will have to properly
> >> handle
> >>> cases where it's meant to be treated as isClosing().
> >>>
> >>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
    > wrote:
    > >>>
    > >>>> Hi Jake,
> >>>>
> >>>> For Option 6: We could fix isClosed as well. That is a great
> >> suggestion.
> >>>> Currently, it returns almost immediately.
> >>>> I like your options though
> >>>>
> >>>> Any other thoughts?
> >>>>
> >>>> Any preferences? It think any of the current options seem better than
> >> the
> >>>> current situation as long as we fix isClosed.
> >>>>
> >>>> Thanks,
> >>>> Mark
> >>>> 
> >>>> From: Jacob Barrett 
> >>>> Sent: Tuesday, April 14, 2020 2:30 PM
> >>>> To: dev@geode.apache.org 
> >>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous, 
but
> >>>> code still expects it to be
> >>>>
> >>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes 
and
> >>>> waits until it is really closed.
> >>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> >>>> cache is really close.
> >>>> Option 6: cache.close(); while (!cache.isClosed());
> >>>>
> >>>>
> >>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> >>>>>
> >>>>> Hi All,
> >>>>>
> >>>>> I know that we have discussed this once before, but I think it bears
> >>>> repeating. We have test code that assumes cache.close is synchronous.
> >> It
> >>> is
> >>>> not. Not even close. I would like discuss some possible changes.
> >>>>>
> >>>>> Option 1. Call it what it is.  Deprecate Cache.close and create a 
new
> >>>> method called asyncClose to replace it. Simple and descriptive.
> >>>>> Option 2. Fix cache close so it is synchronous. Some might say that
> >> we
> >>>> are going to break behavior, but I doubt any user relies on the fact
> >> that
> >>>> it is asynchronous. That would be dangerous in and of itself.
> >> Obviously,
> >>> we
> >>>> don’t want to change behavior, but there have been a number of
> >>> distributed
> >>>> tests that have failed for this. If internal to the code devs don’t
> get
> >>> it
> >>>> right, where does that leave users.
> >>>>> Option 3. Status quo.
> >>>>>
> >>>>> What do you think? Are there other options I am missing?
> >>>>>
> >>>>> Thanks,
> >>>>> Mark
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >> --
> >> -John
> >> Spring Data Team
> >>
>
>

-- 
-John
Spring Data Team





Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-15 Thread Anthony Baker
r this in
>>>>> *Spring
>>>>> Test for Apache Geode* (STDG) since subsequent tests requiring a peer
>>> cache
>>>>> instance (or CacheServer) may conflict with each other, especially
>>> given 1)
>>>>> the cache instance is a *Singleton* and 2) it is ideal to not have to
>>>>> restart the JVM between, even for *Integration Tests*, however, turns
>>> out
>>>>> to be a really common practice. *#ugh*  However, without proper
>>>>> coordination this can be a real problem, hence STDG's support.  Even
>>> when
>>>>> forking JVMs, you still have to be careful to wait in certain cases,
>> as
>>> you
>>>>> could run into other conflicts (e.g. BindExceptions if not varying
>> port
>>>>> numbers and such).  For all these reasons and more, it is important
>> that
>>>>> the cache has fully shutdown and released all its resources.
>>>>> 
>>>>> Also, while we are on this topic, I think it would be useful to have a
>>>>> dedicated interface for the cache instance lifecycle.  It's
>> unfortunate
>>> the
>>>>> CacheListener interface is already taken for Region events. *#sigh*
>>>>> 
>>>>> The interface might be something like...
>>>>> 
>>>>> interface CacheLifecycleListener {
>>>>> 
>>>>> default void isStarting(CacheEvent event) {}
>>>>> 
>>>>> default void onStart(CacheEvent event) {}
>>>>> 
>>>>> default void isClosing(CacheEvent event) {}
>>>>> 
>>>>> default void onClose(CacheEvent event) {}
>>>>> 
>>>>> ...
>>>>> 
>>>>> }
>>>>> 
>>>>> -j
>>>>> 
>>>>> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh 
>> wrote:
>>>>> 
>>>>>> The isClosed flag and method might be used currently more as an
>>>>> isClosing.
>>>>>> The GemFireCacheImpl.isClosed() method is actually returning
>> isClosing.
>>>>>> Whatever change to isClosed that will be made, will have to properly
>>>>> handle
>>>>>> cases where it's meant to be treated as isClosing().
>>>>>> 
>>>>>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
>>> wrote:
>>>>>> 
>>>>>>> Hi Jake,
>>>>>>> 
>>>>>>> For Option 6: We could fix isClosed as well. That is a great
>>>>> suggestion.
>>>>>>> Currently, it returns almost immediately.
>>>>>>> I like your options though
>>>>>>> 
>>>>>>> Any other thoughts?
>>>>>>> 
>>>>>>> Any preferences? It think any of the current options seem better
>> than
>>>>> the
>>>>>>> current situation as long as we fix isClosed.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Mark
>>>>>>> 
>>>>>>> From: Jacob Barrett 
>>>>>>> Sent: Tuesday, April 14, 2020 2:30 PM
>>>>>>> To: dev@geode.apache.org 
>>>>>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous,
>> but
>>>>>>> code still expects it to be
>>>>>>> 
>>>>>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes
>> and
>>>>>>> waits until it is really closed.
>>>>>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback
>> after
>>>>>>> cache is really close.
>>>>>>> Option 6: cache.close(); while (!cache.isClosed());
>>>>>>> 
>>>>>>> 
>>>>>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson 
>> wrote:
>>>>>>>> 
>>>>>>>> Hi All,
>>>>>>>> 
>>>>>>>> I know that we have discussed this once before, but I think it
>> bears
>>>>>>> repeating. We have test code that assumes cache.close is
>> synchronous.
>>>>> It
>>>>>> is
>>>>>>> not. Not even close. I would like discuss some possible changes.
>>>>>>>> 
>>>>>>>> Option 1. Call it what it is.  Deprecate Cache.close and create a
>> new
>>>>>>> method called asyncClose to replace it. Simple and descriptive.
>>>>>>>> Option 2. Fix cache close so it is synchronous. Some might say that
>>>>> we
>>>>>>> are going to break behavior, but I doubt any user relies on the fact
>>>>> that
>>>>>>> it is asynchronous. That would be dangerous in and of itself.
>>>>> Obviously,
>>>>>> we
>>>>>>> don’t want to change behavior, but there have been a number of
>>>>>> distributed
>>>>>>> tests that have failed for this. If internal to the code devs don’t
>>> get
>>>>>> it
>>>>>>> right, where does that leave users.
>>>>>>>> Option 3. Status quo.
>>>>>>>> 
>>>>>>>> What do you think? Are there other options I am missing?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Mark
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> -John
>>>>> Spring Data Team
>>>>> 
>>> 
>>> 
>> 
>> --
>> -John
>> Spring Data Team
>> 



Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Kirk Lund
t; > >> dedicated interface for the cache instance lifecycle.  It's
> unfortunate
> > the
> > >> CacheListener interface is already taken for Region events. *#sigh*
> > >>
> > >> The interface might be something like...
> > >>
> > >> interface CacheLifecycleListener {
> > >>
> > >>  default void isStarting(CacheEvent event) {}
> > >>
> > >>  default void onStart(CacheEvent event) {}
> > >>
> > >>  default void isClosing(CacheEvent event) {}
> > >>
> > >>  default void onClose(CacheEvent event) {}
> > >>
> > >>  ...
> > >>
> > >> }
> > >>
> > >> -j
> > >>
> > >> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh 
> wrote:
> > >>
> > >>> The isClosed flag and method might be used currently more as an
> > >> isClosing.
> > >>> The GemFireCacheImpl.isClosed() method is actually returning
> isClosing.
> > >>> Whatever change to isClosed that will be made, will have to properly
> > >> handle
> > >>> cases where it's meant to be treated as isClosing().
> > >>>
> > >>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
> > wrote:
> > >>>
> > >>>> Hi Jake,
> > >>>>
> > >>>> For Option 6: We could fix isClosed as well. That is a great
> > >> suggestion.
> > >>>> Currently, it returns almost immediately.
> > >>>> I like your options though
> > >>>>
> > >>>> Any other thoughts?
> > >>>>
> > >>>> Any preferences? It think any of the current options seem better
> than
> > >> the
> > >>>> current situation as long as we fix isClosed.
> > >>>>
> > >>>> Thanks,
> > >>>> Mark
> > >>>> 
> > >>>> From: Jacob Barrett 
> > >>>> Sent: Tuesday, April 14, 2020 2:30 PM
> > >>>> To: dev@geode.apache.org 
> > >>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous,
> but
> > >>>> code still expects it to be
> > >>>>
> > >>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes
> and
> > >>>> waits until it is really closed.
> > >>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback
> after
> > >>>> cache is really close.
> > >>>> Option 6: cache.close(); while (!cache.isClosed());
> > >>>>
> > >>>>
> > >>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson 
> wrote:
> > >>>>>
> > >>>>> Hi All,
> > >>>>>
> > >>>>> I know that we have discussed this once before, but I think it
> bears
> > >>>> repeating. We have test code that assumes cache.close is
> synchronous.
> > >> It
> > >>> is
> > >>>> not. Not even close. I would like discuss some possible changes.
> > >>>>>
> > >>>>> Option 1. Call it what it is.  Deprecate Cache.close and create a
> new
> > >>>> method called asyncClose to replace it. Simple and descriptive.
> > >>>>> Option 2. Fix cache close so it is synchronous. Some might say that
> > >> we
> > >>>> are going to break behavior, but I doubt any user relies on the fact
> > >> that
> > >>>> it is asynchronous. That would be dangerous in and of itself.
> > >> Obviously,
> > >>> we
> > >>>> don’t want to change behavior, but there have been a number of
> > >>> distributed
> > >>>> tests that have failed for this. If internal to the code devs don’t
> > get
> > >>> it
> > >>>> right, where does that leave users.
> > >>>>> Option 3. Status quo.
> > >>>>>
> > >>>>> What do you think? Are there other options I am missing?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Mark
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >> --
> > >> -John
> > >> Spring Data Team
> > >>
> >
> >
>
> --
> -John
> Spring Data Team
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Jacob Barrett


> On Apr 14, 2020, at 3:09 PM, Mark Hanson  wrote:
> 
> Any other thoughts?

Don’t encourage me!

Option 7: remove all singletons!

> Any preferences? It think any of the current options seem better than the 
> current situation as long as we fix isClosed.

Option 7, 2 or 4 in that order.



Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread John Blum
 as isClosing().
> >>>
> >>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson 
> wrote:
> >>>
> >>>> Hi Jake,
> >>>>
> >>>> For Option 6: We could fix isClosed as well. That is a great
> >> suggestion.
> >>>> Currently, it returns almost immediately.
> >>>> I like your options though
> >>>>
> >>>> Any other thoughts?
> >>>>
> >>>> Any preferences? It think any of the current options seem better than
> >> the
> >>>> current situation as long as we fix isClosed.
> >>>>
> >>>> Thanks,
> >>>> Mark
> >>>> 
> >>>> From: Jacob Barrett 
> >>>> Sent: Tuesday, April 14, 2020 2:30 PM
> >>>> To: dev@geode.apache.org 
> >>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
> >>>> code still expects it to be
> >>>>
> >>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
> >>>> waits until it is really closed.
> >>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> >>>> cache is really close.
> >>>> Option 6: cache.close(); while (!cache.isClosed());
> >>>>
> >>>>
> >>>>> On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> >>>>>
> >>>>> Hi All,
> >>>>>
> >>>>> I know that we have discussed this once before, but I think it bears
> >>>> repeating. We have test code that assumes cache.close is synchronous.
> >> It
> >>> is
> >>>> not. Not even close. I would like discuss some possible changes.
> >>>>>
> >>>>> Option 1. Call it what it is.  Deprecate Cache.close and create a new
> >>>> method called asyncClose to replace it. Simple and descriptive.
> >>>>> Option 2. Fix cache close so it is synchronous. Some might say that
> >> we
> >>>> are going to break behavior, but I doubt any user relies on the fact
> >> that
> >>>> it is asynchronous. That would be dangerous in and of itself.
> >> Obviously,
> >>> we
> >>>> don’t want to change behavior, but there have been a number of
> >>> distributed
> >>>> tests that have failed for this. If internal to the code devs don’t
> get
> >>> it
> >>>> right, where does that leave users.
> >>>>> Option 3. Status quo.
> >>>>>
> >>>>> What do you think? Are there other options I am missing?
> >>>>>
> >>>>> Thanks,
> >>>>> Mark
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >> --
> >> -John
> >> Spring Data Team
> >>
>
>

-- 
-John
Spring Data Team


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Mark Hanson
I believe it is because of disk persistence among other things. Kirk would know 
for sure. He fixed the issue and his PR got shutdown.
I totally support just fixing the bug.

Let’s give Kirk a chance to chime in.

Thanks,
Mark

> On Apr 14, 2020, at 3:39 PM, Dan Smith  wrote:
> 
> IMO if it's not currently synchronous, that's just a bug that needs to be
> fixed. If folks can provide details on what actually was asynchronous or
> the particular test failures they saw, that would be helpful.
> 
> Previously, when this came up it looks like Kirk found that close would not
> wait for a different call to close() issued by a different thread [1]. Is
> that still the bug we are running into? One that thread, it seems like we
> also had a consensus we should just fix bugs with Cache.close:
> 
> -Dan
> 
> 1.
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.apache.org%2Fx%2Fthread.html%2Ff385a6dd51209e2706c68c9821412a6f22ebef3e809636060ac0bf55%40%253Cdev.geode.apache.org%253E&data=02%7C01%7Chansonm%40vmware.com%7C7a43463ab53c416234d908d7e0c4cc6b%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637225008165230328&sdata=GD77kAubDDWfP93zjYsNP61VMN4%2FKBAHcq95GwjeMBc%3D&reserved=0
> 
> On Tue, Apr 14, 2020 at 3:23 PM John Blum  wrote:
> 
>> My first thought is cache close (i.e. RegionService.close()  should be
>> synchronous (by default), perhaps, with non-blocking options or options to
>> wait for a set timeout as Jake suggested.
>> 
>> This is a problem for *Integration Tests* (that start a peer cache
>> instance, in-process or standalone) in general and not simply just
>> "distributed" tests!  This is the reason I built support for this in
>> *Spring
>> Test for Apache Geode* (STDG) since subsequent tests requiring a peer cache
>> instance (or CacheServer) may conflict with each other, especially given 1)
>> the cache instance is a *Singleton* and 2) it is ideal to not have to
>> restart the JVM between, even for *Integration Tests*, however, turns out
>> to be a really common practice. *#ugh*  However, without proper
>> coordination this can be a real problem, hence STDG's support.  Even when
>> forking JVMs, you still have to be careful to wait in certain cases, as you
>> could run into other conflicts (e.g. BindExceptions if not varying port
>> numbers and such).  For all these reasons and more, it is important that
>> the cache has fully shutdown and released all its resources.
>> 
>> Also, while we are on this topic, I think it would be useful to have a
>> dedicated interface for the cache instance lifecycle.  It's unfortunate the
>> CacheListener interface is already taken for Region events. *#sigh*
>> 
>> The interface might be something like...
>> 
>> interface CacheLifecycleListener {
>> 
>>  default void isStarting(CacheEvent event) {}
>> 
>>  default void onStart(CacheEvent event) {}
>> 
>>  default void isClosing(CacheEvent event) {}
>> 
>>  default void onClose(CacheEvent event) {}
>> 
>>  ...
>> 
>> }
>> 
>> -j
>> 
>> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh  wrote:
>> 
>>> The isClosed flag and method might be used currently more as an
>> isClosing.
>>> The GemFireCacheImpl.isClosed() method is actually returning isClosing.
>>> Whatever change to isClosed that will be made, will have to properly
>> handle
>>> cases where it's meant to be treated as isClosing().
>>> 
>>> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson  wrote:
>>> 
>>>> Hi Jake,
>>>> 
>>>> For Option 6: We could fix isClosed as well. That is a great
>> suggestion.
>>>> Currently, it returns almost immediately.
>>>> I like your options though
>>>> 
>>>> Any other thoughts?
>>>> 
>>>> Any preferences? It think any of the current options seem better than
>> the
>>>> current situation as long as we fix isClosed.
>>>> 
>>>> Thanks,
>>>> Mark
>>>> 
>>>> From: Jacob Barrett 
>>>> Sent: Tuesday, April 14, 2020 2:30 PM
>>>> To: dev@geode.apache.org 
>>>> Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
>>>> code still expects it to be
>>>> 
>>>> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
>>>> waits until it is really closed.
>>>> Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
>>>> cache is re

Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Dan Smith
IMO if it's not currently synchronous, that's just a bug that needs to be
fixed. If folks can provide details on what actually was asynchronous or
the particular test failures they saw, that would be helpful.

Previously, when this came up it looks like Kirk found that close would not
wait for a different call to close() issued by a different thread [1]. Is
that still the bug we are running into? One that thread, it seems like we
also had a consensus we should just fix bugs with Cache.close:

-Dan

1.
https://lists.apache.org/x/thread.html/f385a6dd51209e2706c68c9821412a6f22ebef3e809636060ac0bf55@%3Cdev.geode.apache.org%3E

On Tue, Apr 14, 2020 at 3:23 PM John Blum  wrote:

> My first thought is cache close (i.e. RegionService.close()  should be
> synchronous (by default), perhaps, with non-blocking options or options to
> wait for a set timeout as Jake suggested.
>
> This is a problem for *Integration Tests* (that start a peer cache
> instance, in-process or standalone) in general and not simply just
> "distributed" tests!  This is the reason I built support for this in
> *Spring
> Test for Apache Geode* (STDG) since subsequent tests requiring a peer cache
> instance (or CacheServer) may conflict with each other, especially given 1)
> the cache instance is a *Singleton* and 2) it is ideal to not have to
> restart the JVM between, even for *Integration Tests*, however, turns out
> to be a really common practice. *#ugh*  However, without proper
> coordination this can be a real problem, hence STDG's support.  Even when
> forking JVMs, you still have to be careful to wait in certain cases, as you
> could run into other conflicts (e.g. BindExceptions if not varying port
> numbers and such).  For all these reasons and more, it is important that
> the cache has fully shutdown and released all its resources.
>
> Also, while we are on this topic, I think it would be useful to have a
> dedicated interface for the cache instance lifecycle.  It's unfortunate the
> CacheListener interface is already taken for Region events. *#sigh*
>
> The interface might be something like...
>
> interface CacheLifecycleListener {
>
>   default void isStarting(CacheEvent event) {}
>
>   default void onStart(CacheEvent event) {}
>
>   default void isClosing(CacheEvent event) {}
>
>   default void onClose(CacheEvent event) {}
>
>   ...
>
> }
>
> -j
>
> On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh  wrote:
>
> > The isClosed flag and method might be used currently more as an
> isClosing.
> > The GemFireCacheImpl.isClosed() method is actually returning isClosing.
> > Whatever change to isClosed that will be made, will have to properly
> handle
> > cases where it's meant to be treated as isClosing().
> >
> > On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson  wrote:
> >
> > > Hi Jake,
> > >
> > > For Option 6: We could fix isClosed as well. That is a great
> suggestion.
> > > Currently, it returns almost immediately.
> > > I like your options though
> > >
> > > Any other thoughts?
> > >
> > > Any preferences? It think any of the current options seem better than
> the
> > > current situation as long as we fix isClosed.
> > >
> > > Thanks,
> > > Mark
> > > 
> > > From: Jacob Barrett 
> > > Sent: Tuesday, April 14, 2020 2:30 PM
> > > To: dev@geode.apache.org 
> > > Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
> > > code still expects it to be
> > >
> > > Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
> > > waits until it is really closed.
> > > Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> > > cache is really close.
> > > Option 6: cache.close(); while (!cache.isClosed());
> > >
> > >
> > > > On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> > > >
> > > > Hi All,
> > > >
> > > > I know that we have discussed this once before, but I think it bears
> > > repeating. We have test code that assumes cache.close is synchronous.
> It
> > is
> > > not. Not even close. I would like discuss some possible changes.
> > > >
> > > > Option 1. Call it what it is.  Deprecate Cache.close and create a new
> > > method called asyncClose to replace it. Simple and descriptive.
> > > > Option 2. Fix cache close so it is synchronous. Some might say that
> we
> > > are going to break behavior, but I doubt any user relies on the fact
> that
> > > it is asynchronous. That would be dangerous in and of itself.
> Obviously,
> > we
> > > don’t want to change behavior, but there have been a number of
> > distributed
> > > tests that have failed for this. If internal to the code devs don’t get
> > it
> > > right, where does that leave users.
> > > > Option 3. Status quo.
> > > >
> > > > What do you think? Are there other options I am missing?
> > > >
> > > > Thanks,
> > > > Mark
> > > >
> > >
> > >
> >
>
>
> --
> -John
> Spring Data Team
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread John Blum
My first thought is cache close (i.e. RegionService.close()  should be
synchronous (by default), perhaps, with non-blocking options or options to
wait for a set timeout as Jake suggested.

This is a problem for *Integration Tests* (that start a peer cache
instance, in-process or standalone) in general and not simply just
"distributed" tests!  This is the reason I built support for this in *Spring
Test for Apache Geode* (STDG) since subsequent tests requiring a peer cache
instance (or CacheServer) may conflict with each other, especially given 1)
the cache instance is a *Singleton* and 2) it is ideal to not have to
restart the JVM between, even for *Integration Tests*, however, turns out
to be a really common practice. *#ugh*  However, without proper
coordination this can be a real problem, hence STDG's support.  Even when
forking JVMs, you still have to be careful to wait in certain cases, as you
could run into other conflicts (e.g. BindExceptions if not varying port
numbers and such).  For all these reasons and more, it is important that
the cache has fully shutdown and released all its resources.

Also, while we are on this topic, I think it would be useful to have a
dedicated interface for the cache instance lifecycle.  It's unfortunate the
CacheListener interface is already taken for Region events. *#sigh*

The interface might be something like...

interface CacheLifecycleListener {

  default void isStarting(CacheEvent event) {}

  default void onStart(CacheEvent event) {}

  default void isClosing(CacheEvent event) {}

  default void onClose(CacheEvent event) {}

  ...

}

-j

On Tue, Apr 14, 2020 at 3:21 PM Jason Huynh  wrote:

> The isClosed flag and method might be used currently more as an isClosing.
> The GemFireCacheImpl.isClosed() method is actually returning isClosing.
> Whatever change to isClosed that will be made, will have to properly handle
> cases where it's meant to be treated as isClosing().
>
> On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson  wrote:
>
> > Hi Jake,
> >
> > For Option 6: We could fix isClosed as well. That is a great suggestion.
> > Currently, it returns almost immediately.
> > I like your options though
> >
> > Any other thoughts?
> >
> > Any preferences? It think any of the current options seem better than the
> > current situation as long as we fix isClosed.
> >
> > Thanks,
> > Mark
> > ________
> > From: Jacob Barrett 
> > Sent: Tuesday, April 14, 2020 2:30 PM
> > To: dev@geode.apache.org 
> > Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
> > code still expects it to be
> >
> > Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
> > waits until it is really closed.
> > Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> > cache is really close.
> > Option 6: cache.close(); while (!cache.isClosed());
> >
> >
> > > On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> > >
> > > Hi All,
> > >
> > > I know that we have discussed this once before, but I think it bears
> > repeating. We have test code that assumes cache.close is synchronous. It
> is
> > not. Not even close. I would like discuss some possible changes.
> > >
> > > Option 1. Call it what it is.  Deprecate Cache.close and create a new
> > method called asyncClose to replace it. Simple and descriptive.
> > > Option 2. Fix cache close so it is synchronous. Some might say that we
> > are going to break behavior, but I doubt any user relies on the fact that
> > it is asynchronous. That would be dangerous in and of itself. Obviously,
> we
> > don’t want to change behavior, but there have been a number of
> distributed
> > tests that have failed for this. If internal to the code devs don’t get
> it
> > right, where does that leave users.
> > > Option 3. Status quo.
> > >
> > > What do you think? Are there other options I am missing?
> > >
> > > Thanks,
> > > Mark
> > >
> >
> >
>


-- 
-John
Spring Data Team


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Jason Huynh
The isClosed flag and method might be used currently more as an isClosing.
The GemFireCacheImpl.isClosed() method is actually returning isClosing.
Whatever change to isClosed that will be made, will have to properly handle
cases where it's meant to be treated as isClosing().

On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson  wrote:

> Hi Jake,
>
> For Option 6: We could fix isClosed as well. That is a great suggestion.
> Currently, it returns almost immediately.
> I like your options though
>
> Any other thoughts?
>
> Any preferences? It think any of the current options seem better than the
> current situation as long as we fix isClosed.
>
> Thanks,
> Mark
> 
> From: Jacob Barrett 
> Sent: Tuesday, April 14, 2020 2:30 PM
> To: dev@geode.apache.org 
> Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
> code still expects it to be
>
> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
> waits until it is really closed.
> Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> cache is really close.
> Option 6: cache.close(); while (!cache.isClosed());
>
>
> > On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > I know that we have discussed this once before, but I think it bears
> repeating. We have test code that assumes cache.close is synchronous. It is
> not. Not even close. I would like discuss some possible changes.
> >
> > Option 1. Call it what it is.  Deprecate Cache.close and create a new
> method called asyncClose to replace it. Simple and descriptive.
> > Option 2. Fix cache close so it is synchronous. Some might say that we
> are going to break behavior, but I doubt any user relies on the fact that
> it is asynchronous. That would be dangerous in and of itself. Obviously, we
> don’t want to change behavior, but there have been a number of distributed
> tests that have failed for this. If internal to the code devs don’t get it
> right, where does that leave users.
> > Option 3. Status quo.
> >
> > What do you think? Are there other options I am missing?
> >
> > Thanks,
> > Mark
> >
>
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Dan Smith
Can you spell out what parts of Cache.close are asynchronous? As far as I
can tell it shuts down threadpools, etc. synchronously.

-Dan

On Tue, Apr 14, 2020 at 3:09 PM Mark Hanson  wrote:

> Hi Jake,
>
> For Option 6: We could fix isClosed as well. That is a great suggestion.
> Currently, it returns almost immediately.
> I like your options though
>
> Any other thoughts?
>
> Any preferences? It think any of the current options seem better than the
> current situation as long as we fix isClosed.
>
> Thanks,
> Mark
> 
> From: Jacob Barrett 
> Sent: Tuesday, April 14, 2020 2:30 PM
> To: dev@geode.apache.org 
> Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but
> code still expects it to be
>
> Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and
> waits until it is really closed.
> Option 5: Cache.close(Runnable closedCalleback) - Runs callback after
> cache is really close.
> Option 6: cache.close(); while (!cache.isClosed());
>
>
> > On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > I know that we have discussed this once before, but I think it bears
> repeating. We have test code that assumes cache.close is synchronous. It is
> not. Not even close. I would like discuss some possible changes.
> >
> > Option 1. Call it what it is.  Deprecate Cache.close and create a new
> method called asyncClose to replace it. Simple and descriptive.
> > Option 2. Fix cache close so it is synchronous. Some might say that we
> are going to break behavior, but I doubt any user relies on the fact that
> it is asynchronous. That would be dangerous in and of itself. Obviously, we
> don’t want to change behavior, but there have been a number of distributed
> tests that have failed for this. If internal to the code devs don’t get it
> right, where does that leave users.
> > Option 3. Status quo.
> >
> > What do you think? Are there other options I am missing?
> >
> > Thanks,
> > Mark
> >
>
>


Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Mark Hanson
Hi Jake,

For Option 6: We could fix isClosed as well. That is a great suggestion. 
Currently, it returns almost immediately.
I like your options though

Any other thoughts?

Any preferences? It think any of the current options seem better than the 
current situation as long as we fix isClosed.

Thanks,
Mark

From: Jacob Barrett 
Sent: Tuesday, April 14, 2020 2:30 PM
To: dev@geode.apache.org 
Subject: Re: [Discuss] Cache.close synchronous is not synchronous, but code 
still expects it to be

Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and waits 
until it is really closed.
Option 5: Cache.close(Runnable closedCalleback) - Runs callback after cache is 
really close.
Option 6: cache.close(); while (!cache.isClosed());


> On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
>
> Hi All,
>
> I know that we have discussed this once before, but I think it bears 
> repeating. We have test code that assumes cache.close is synchronous. It is 
> not. Not even close. I would like discuss some possible changes.
>
> Option 1. Call it what it is.  Deprecate Cache.close and create a new method 
> called asyncClose to replace it. Simple and descriptive.
> Option 2. Fix cache close so it is synchronous. Some might say that we are 
> going to break behavior, but I doubt any user relies on the fact that it is 
> asynchronous. That would be dangerous in and of itself. Obviously, we don’t 
> want to change behavior, but there have been a number of distributed tests 
> that have failed for this. If internal to the code devs don’t get it right, 
> where does that leave users.
> Option 3. Status quo.
>
> What do you think? Are there other options I am missing?
>
> Thanks,
> Mark
>



Re: [Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Jacob Barrett
Option 4: Cache.closeAndWait(long timeout, TimeUnit unit) - Closes and waits 
until it is really closed.
Option 5: Cache.close(Runnable closedCalleback) - Runs callback after cache is 
really close.
Option 6: cache.close(); while (!cache.isClosed());


> On Apr 14, 2020, at 2:11 PM, Mark Hanson  wrote:
> 
> Hi All,
> 
> I know that we have discussed this once before, but I think it bears 
> repeating. We have test code that assumes cache.close is synchronous. It is 
> not. Not even close. I would like discuss some possible changes.
> 
> Option 1. Call it what it is.  Deprecate Cache.close and create a new method 
> called asyncClose to replace it. Simple and descriptive. 
> Option 2. Fix cache close so it is synchronous. Some might say that we are 
> going to break behavior, but I doubt any user relies on the fact that it is 
> asynchronous. That would be dangerous in and of itself. Obviously, we don’t 
> want to change behavior, but there have been a number of distributed tests 
> that have failed for this. If internal to the code devs don’t get it right, 
> where does that leave users.
> Option 3. Status quo.
> 
> What do you think? Are there other options I am missing?
> 
> Thanks,
> Mark
> 



[Discuss] Cache.close synchronous is not synchronous, but code still expects it to be....

2020-04-14 Thread Mark Hanson
Hi All,

I know that we have discussed this once before, but I think it bears repeating. 
We have test code that assumes cache.close is synchronous. It is not. Not even 
close. I would like discuss some possible changes.

Option 1. Call it what it is.  Deprecate Cache.close and create a new method 
called asyncClose to replace it. Simple and descriptive. 
Option 2. Fix cache close so it is synchronous. Some might say that we are 
going to break behavior, but I doubt any user relies on the fact that it is 
asynchronous. That would be dangerous in and of itself. Obviously, we don’t 
want to change behavior, but there have been a number of distributed tests that 
have failed for this. If internal to the code devs don’t get it right, where 
does that leave users.
Option 3. Status quo.

What do you think? Are there other options I am missing?

Thanks,
Mark