My initial idea was that the LRU limits on the cache will serve as a 
transparent cleanup mechanism. I.e. the user would set the max objects per 
cache in a such a way, that even a completely full cache should not crash the 
app. E.g. here is an Ehcache config with a global LRU for all cache groups:

<defaultCache maxEntriesLocalHeap="1000" eternal="false" 
memoryStoreEvictionPolicy="LRU"> 
   ... 
</defaultCache>

It works well for use cases with very active, constantly churning the cache. In 
a less active app the churn rate won't be as high, and you'd be essentially 
overpaying for memory most of the time (though the app should still never run 
out of memory if it is configured with proper limits). So there seems to be two 
issues with the current implementation:

* It requires developer awareness of cache sizing
* The app will eventually be running at _max_ memory instead of _average_

From your description, it looks like you don't really care about preserving 
both the current and the new implementation side by side. You just need a 
solution that works just like the old one, but is free of the limitations 
above? 


Here is one quick idea on how we might handle that. We have a 
"QueryCache.clear()" method. This can become a way to explicitly clear the 
cache at the end of the context life. The method is currently deprecated, as we 
couldn't do a fast clean of only a subset of keys belonging to given context 
across all cache groups. But it doesn't have to be fast, and we can use Cayenne 
event threads to do it on the background without blocking the app. So maybe we 
undeprecate it (or create a new one, like "clearNonBlocking()" or something) 
and change the implementation to do a full scan of all cache keys, removing 
those matching the prefix for the ObjectContext (e.g. JCache has 
Cache.removeAll(Set<K> keys) method).

WDYT?

BTW, while we are having this discussion, may I request to move 597376ae5 
commit to a pull request or a dedicated branch and off of master? 

Andrus



> On Jul 5, 2019, at 6:45 PM, John Huss <johnth...@gmail.com> wrote:
> 
> Query results that are stored in the local cache are only visible/available
> to the context that fetched them, so once the context has gone out of scope
> these query results are no longer useful in any way - they simply can't be
> accessed. Therefore it would be logical for those objects to be evicted
> from the cache. But under the current behavior these objects will remain in
> the cache until some expiration time configured for that cache group
> expires. And crucially, not only the cached objects will remain, but the
> entire context of the ObjectContext that fetched them will be retained in
> memory. Configuring the expiration time for that cache group is cumbersome
> and often the expiration time needs to be different for different
> tasks/queries causing an explosion in the number of cache groups. Even when
> an expiration has been set, during the period of time between when the
> ObjectContext goes out of scope until the expiration fires this memory is
> unavailable, which could be a problem.
> 
> My first run at this problem added a finalizer to DataContext to clean up
> the dead objects in the cache when the ObjectContext was garbage collected.
> That worked ok, but I dislike finalizers, and even with this, under heavy
> load there were leaks occurring that I couldn't resolve. After more thought
> it made more sense to simply tie the lifetimes of the context and cache
> together so that they will expire at the same time.
> 
> Why do I want to use the local cache?
> It's more simple to add a call to query.localCache() than to manually
> create single-purpose custom caches all over the place, like:
> 
> List<T> fetchResults;
> 
> public List<T> fetchThings() {
>  if (resultResults != null) {
>    fetchResults = context.select(query);
>  }
>  return  fetchResults;
> }
> 
> While this works and is easily understandable, using the built-in caching
> behavior is much nicer. When the results of a fetch need to be accessed
> multiple times using a cache can improve performance significantly.
> 
> The current caching behavior can function as a hidden bomb. In one
> situation a developer added a .localCache to a simple query that fetched a
> single row essentially to define a lightweight un-modeled, read-only
> relationship. But that query was called in multiple places, once of which
> was a background process that fetches huge amounts of data. The query
> wasn't using an explicit cache group, and the default cache group was not
> configured with any expiration time, so before long the app ran out of
> memory because it was retaining the results of these huge ObjectContext
> forever. While you can blame the user for not configuring the cache group,
> this still seems like less than ideal behavior.
> 
> What do I want?
> To be able to use the existing caching API .localCache and .sharedCache
> without having to worry that my application will run out of memory if a
> developer adds a .localCache call that doesn't get caught in code review.
> 
> I agree that the current cache story is too complicated and I'd like to
> help improve that if possible. I need some more time to think about my
> ideas for that and how I would approach it.
> 
> Thanks.
> 
> 
> On Thu, Jul 4, 2019 at 10:18 AM Andrus Adamchik <and...@objectstyle.org>
> wrote:
> 
>> 
>>> I think that would be too limiting. I would prefer to be able to use
>> caches with both behavoirs/lifecycles in the same context.
>> 
>> Could you please elaborate on use cases?
>> 
>>> The choice of the scope or lifetime of the cache should be at the query
>> level rather than the context level.
>> 
>> But your current change does not do it at the query level. Both caches are
>> the properties of the context, not query.
>> 
>> I am not (yet) worried about the unit tests. Those will become important
>> once we figure out a design. But I'd like this feature to be conceptually
>> clear to the users.
>> 
>> When I am teaching Cayenne classes or just having an informal conversation
>> about it, I always feel that our cache "story" is too complicated. I can
>> show various knobs, but I often have hard time explaining when to use them
>> and in what combination to solve a given problem. We already have LOCAL vs
>> SHARED cache. Having another type of LOCAL cache is going to cause
>> someone's head to explode :) So trying to figure out where the new feature
>> fits in, and how to avoid further complicating the cache model.
>> 
>> So maybe let's take a look at the use cases as I mentioned above to see
>> why per-query switching of the physical caches is important, and then
>> brainstorm on how to best accomplish it.
>> 
>>> I'm out of the office until Monday, so I can't work on it until then, but
>>> we can keep discussing beforehand.
>> 
>> Appreciate that!
>> 
>> Andrus
>> 
>> 
>>> On Jul 4, 2019, at 5:50 PM, John Huss <johnth...@gmail.com> wrote:
>>> 
>>> On Thu, Jul 4, 2019 at 2:45 AM Andrus Adamchik <and...@objectstyle.org>
>>> wrote:
>>> 
>>>>> My intention was to separate the local and shared query caches while
>> not
>>>>> eliminating either, so that is the reason both are used. So if you have
>>>>> explicitly set a custom localQueryCache then calling getQueryCache()
>> will
>>>>> allow you to still access the *shared* query cache as well.
>>>> 
>>>> But this is called by the framework, not the user. And from what I see,
>>>> some of Cayenne code calls the former, and other - the later.
>>>> 
>>>> I general I am thinking we should make the type of per-context
>> QueryCache
>>>> a property of the entire runtime (it is either a segment of the main
>> cache,
>>>> or a separate per-context object ... either one or the other for all
>>>> contexts). Is this too limiting?
>>>> 
>>> 
>>> I think that would be too limiting. I would prefer to be able to use
>> caches
>>> with both behavoirs/lifecycles in the same context. The choice of the
>> scope
>>> or lifetime of the cache should be at the query level rather than the
>>> context level.
>>> 
>>> I realize now this commit didn't have unit tests to target the new
>>> behaviors when the localQueryCache is set. I have some tests for it in my
>>> application, but those didn't make it into Cayenne. And I need to write
>>> some more to check the behavior of the shared query cache when the
>>> localQueryCache is set.
>>> 
>>> I'm out of the office until Monday, so I can't work on it until then, but
>>> we can keep discussing beforehand. Thanks.
>>> 
>>>> On Jul 3, 2019, at 5:34 PM, John Huss <johnth...@gmail.com> wrote:
>>>>> 
>>>>> On Wed, Jul 3, 2019 at 5:33 AM Andrus Adamchik <and...@objectstyle.org
>>> 
>>>>> wrote:
>>>>> 
>>>>>> Hi John,
>>>>>> 
>>>>>> I think I understand where you are going with this feature. As we
>>>>>> discussed before, sometimes keeping a context cache as a transparent
>>>> region
>>>>>> of the main cache is undesirable, e.g. for memory management reasons.
>>>> But I
>>>>>> have some questions about the implementation committed to "master":
>>>>>> 
>>>>>> 1. The commit seems incomplete - sometimes we use the new
>>>>>> "getLocalQueryCache()", sometimes - the old "getQueryCache()".
>>>>>> 
>>>>> 
>>>>> My intention was to separate the local and shared query caches while
>> not
>>>>> eliminating either, so that is the reason both are used. So if you have
>>>>> explicitly set a custom localQueryCache then calling getQueryCache()
>> will
>>>>> allow you to still access the *shared* query cache as well.
>>>>> 
>>>>> 
>>>>> 
>>>>>> 2. Having both "queryCache" and "localQueryCache" as ivars of
>>>> BaseContext
>>>>>> may be confusing. What do you think of moving cache type selection
>>>> logic in
>>>>>> ObjectContextFactory?
>>>>>> 
>>>>> 
>>>>> Do you mean exposing this in the API of the interface? Currently I am
>>>>> configuring this with a DataContextFactory subclass like this:
>>>>> 
>>>>> *public* *static* *class* Factory *extends* DataContextFactory {
>>>>> 
>>>>> @Override
>>>>> 
>>>>> *protected* DataContext newInstance(DataChannel parent, ObjectStore
>>>>> objectStore) {
>>>>> 
>>>>> *return* *new* IcsDataContext(parent, objectStore);
>>>>> 
>>>>> }
>>>>> 
>>>>> @Override
>>>>> 
>>>>> *protected* ObjectContext createdFromDataDomain(DataDomain parent) {
>>>>> 
>>>>> DataContext result = (DataContext)
>> *super*.createdFromDataDomain(parent);
>>>>> 
>>>>> result.setLocalQueryCache(*new* MapQueryCache(Integer.*MAX_VALUE*)); //
>>>> use
>>>>> separate unbounded cache whose lifetime is bound to the ObjectContext
>>>> itself
>>>>> 
>>>>> *return* result;
>>>>> 
>>>>> }
>>>>> 
>>>>> @Override
>>>>> 
>>>>> *protected* ObjectContext createFromDataContext(DataContext parent) {
>>>>> 
>>>>> DataContext result = (DataContext)
>> *super*.createFromDataContext(parent);
>>>>> 
>>>>> result.setLocalQueryCache(*new* MapQueryCache(Integer.*MAX_VALUE*));
>>>>> 
>>>>> *return* result;
>>>>> 
>>>>> }
>>>>> 
>>>>> @Override
>>>>> 
>>>>> *protected* ObjectContext createFromGenericChannel(DataChannel parent)
>> {
>>>>> 
>>>>> DataContext result = (DataContext)
>>>> *super*.createFromGenericChannel(parent);
>>>>> 
>>>>> result.setLocalQueryCache(*new* MapQueryCache(Integer.*MAX_VALUE*));
>>>>> 
>>>>> *return* result;
>>>>> 
>>>>> }
>>>>> 
>>>>> }
>>>>> 
>>>>> I'm open to other ways of configuring this if you have ideas, perhaps
>>>> like
>>>>> using DI. Thanks for your feedback.
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Andrus
>>>>>> 
>>>>>> 
>>>>>>> On Jul 1, 2019, at 5:53 PM, johnth...@apache.org wrote:
>>>>>>> 
>>>>>>> This is an automated email from the ASF dual-hosted git repository.
>>>>>>> 
>>>>>>> johnthuss pushed a commit to branch master
>>>>>>> in repository https://gitbox.apache.org/repos/asf/cayenne.git
>>>>>>> 
>>>>>>> 
>>>>>>> The following commit(s) were added to refs/heads/master by this push:
>>>>>>>  new 597376a  CAY-2589 Allow optionally using a local query cache
>>>>>> that is separate from the shared query cache.
>>>>>>> 597376a is described below
>>>>>>> 
>>>>>>> commit 597376ae558b9a0c5ddc4390da188b9530204e3d
>>>>>>> Author: John Huss <johnth...@apache.org>
>>>>>>> AuthorDate: Mon Jun 3 16:57:20 2019 -0500
>>>>>>> 
>>>>>>> CAY-2589 Allow optionally using a local query cache that is separate
>>>>>> from the shared query cache.
>>>>>>> 
>>>>>>> The local query cache can be customized by creating a custom
>>>>>> DataContextFactory.
>>>>>>> 
>>>>>>> A separate cache can prevent memory leaks from occurring if you used
>>>>>> non-expiring cache
>>>>>>> groups (like the default cache group perhaps) along with the local
>>>>>> cache, which wasn't
>>>>>>> intuitive if you expected the lifetime of the cache to match the
>>>>>> lifetime of the ObjectContext.
>>>>>>> ---
>>>>>>> RELEASE-NOTES.txt                                  |  1 +
>>>>>>> .../main/java/org/apache/cayenne/BaseContext.java  | 27
>>>>>> +++++++++++++++++++++-
>>>>>>> .../cayenne/util/ObjectContextQueryAction.java     |  9 +++++++-
>>>>>>> 3 files changed, 35 insertions(+), 2 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/RELEASE-NOTES.txt b/RELEASE-NOTES.txt
>>>>>>> index 6f45986..3aa2fa8 100644
>>>>>>> --- a/RELEASE-NOTES.txt
>>>>>>> +++ b/RELEASE-NOTES.txt
>>>>>>> @@ -36,6 +36,7 @@ CAY-2569 Custom 'Naming Strategy' in Cayenne
>> Modeler
>>>>>>> CAY-2570 Use MySQL adapter for latest versions of MariaDB
>>>>>>> CAY-2579 Review and possibly relax usage of readonly flag of
>>>>>> ObjRelationship
>>>>>>> CAY-2585 Rename scalarQuery and params methods in SQLSelect
>>>>>>> +CAY-2589 - Allow optionally using a local query cache that is
>> separate
>>>>>> from the shared query cache.
>>>>>>> 
>>>>>>> Bug Fixes:
>>>>>>> 
>>>>>>> diff --git
>>>>>> a/cayenne-server/src/main/java/org/apache/cayenne/BaseContext.java
>>>>>> b/cayenne-server/src/main/java/org/apache/cayenne/BaseContext.java
>>>>>>> index 981db73..ffae953 100644
>>>>>>> ---
>> a/cayenne-server/src/main/java/org/apache/cayenne/BaseContext.java
>>>>>>> +++
>> b/cayenne-server/src/main/java/org/apache/cayenne/BaseContext.java
>>>>>>> @@ -98,6 +98,7 @@ public abstract class BaseContext implements
>>>>>> ObjectContext {
>>>>>>>    // registry
>>>>>>>    protected transient DataChannel channel;
>>>>>>>    protected transient QueryCache queryCache;
>>>>>>> +     protected transient QueryCache localQueryCache;
>>>>>>>    protected transient EntityResolver entityResolver;
>>>>>>> 
>>>>>>>    protected boolean validatingObjectsOnCommit = true;
>>>>>>> @@ -469,17 +470,41 @@ public abstract class BaseContext implements
>>>>>> ObjectContext {
>>>>>>>    @Override
>>>>>>>    public abstract Collection<?> uncommittedObjects();
>>>>>>> 
>>>>>>> +     /**
>>>>>>> +      * Used for storing cached query results available to all
>>>>>> ObjectContexts.
>>>>>>> +      */
>>>>>>>    public QueryCache getQueryCache() {
>>>>>>>            attachToRuntimeIfNeeded();
>>>>>>>            return queryCache;
>>>>>>>    }
>>>>>>> 
>>>>>>>    /**
>>>>>>> -      * Sets a QueryCache to be used for storing cached query
>> results.
>>>>>>> +      * Sets a QueryCache to be used for storing cached query
>> results
>>>>>> available to all ObjectContexts.
>>>>>>>     */
>>>>>>>    public void setQueryCache(QueryCache queryCache) {
>>>>>>>            this.queryCache = queryCache;
>>>>>>>    }
>>>>>>> +
>>>>>>> +     /**
>>>>>>> +      * Used for storing cached query results available only to this
>>>>>> ObjectContext.
>>>>>>> +      * By default the local query cache and the shared query cache
>>>>>> will use the same underlying storage.
>>>>>>> +      *
>>>>>>> +      * @since 4.2
>>>>>>> +      */
>>>>>>> +     public QueryCache getLocalQueryCache() {
>>>>>>> +             attachToRuntimeIfNeeded();
>>>>>>> +             return localQueryCache != null ? localQueryCache :
>>>>>> getQueryCache();
>>>>>>> +     }
>>>>>>> +
>>>>>>> +     /**
>>>>>>> +      * Sets a QueryCache to be used for storing cached query
>> results
>>>>>> available only to this ObjectContext.
>>>>>>> +      * By default the local query cache and the shared query cache
>>>>>> will use the same underlying storage.
>>>>>>> +      *
>>>>>>> +      * @since 4.2
>>>>>>> +      */
>>>>>>> +     public void setLocalQueryCache(QueryCache queryCache) {
>>>>>>> +             this.localQueryCache = queryCache;
>>>>>>> +     }
>>>>>>> 
>>>>>>>    /**
>>>>>>>     * Returns EventManager associated with the ObjectStore.
>>>>>>> diff --git
>>>>>> 
>>>> 
>> a/cayenne-server/src/main/java/org/apache/cayenne/util/ObjectContextQueryAction.java
>>>>>> 
>>>> 
>> b/cayenne-server/src/main/java/org/apache/cayenne/util/ObjectContextQueryAction.java
>>>>>>> index 75dbdfa..acef54e 100644
>>>>>>> ---
>>>>>> 
>>>> 
>> a/cayenne-server/src/main/java/org/apache/cayenne/util/ObjectContextQueryAction.java
>>>>>>> +++
>>>>>> 
>>>> 
>> b/cayenne-server/src/main/java/org/apache/cayenne/util/ObjectContextQueryAction.java
>>>>>>> @@ -360,7 +360,7 @@ public abstract class ObjectContextQueryAction {
>>>>>>>          return !DONE;
>>>>>>>      }
>>>>>>> 
>>>>>>> -        QueryCache queryCache = getQueryCache();
>>>>>>> +        QueryCache queryCache = getLocalQueryCache();
>>>>>>>      QueryCacheEntryFactory factory = getCacheObjectFactory();
>>>>>>> 
>>>>>>>      if (cache) {
>>>>>>> @@ -385,6 +385,13 @@ public abstract class ObjectContextQueryAction {
>>>>>>>  protected QueryCache getQueryCache() {
>>>>>>>      return ((BaseContext) actingContext).getQueryCache();
>>>>>>>  }
>>>>>>> +
>>>>>>> +    /**
>>>>>>> +     * @since 4.2
>>>>>>> +     */
>>>>>>> +    protected QueryCache getLocalQueryCache() {
>>>>>>> +        return ((BaseContext) actingContext).getLocalQueryCache();
>>>>>>> +    }
>>>>>>> 
>>>>>>>  /**
>>>>>>>   * @since 3.0
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

Reply via email to