Asynchronously Initialize Cache Before Reading

Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A) Thu, 20 Feb 2025 14:40:38 -0800

Hi All,

Apologies for the noise. I'm still getting my bearings on how to contribute 
(I realized too late that GitHub issues were sourced from the Jira project).


I want to bring to your attention a small proposed feature as described below:

The current CachedModeledFrameworkImpl doesn't manage cache initialization for 
you. A good example of the workaround code you can expect to see in the wild is 
in the CachedModeledFramework tests themselves: they block on semaphores to 
hold the reading thread back from operations that depend on the cache. As far 
as I can tell, exactly which operations depend on the cache is not really 
guaranteed, so this is arguably cumbersome. Either way, this is fine in a lot 
of cases.
However, I propose an additional InitializedCachedModeledFramework 
implementation which asynchronously waits for the cache initialization trigger 
and only then proceeds to read from the cache. I implemented something similar 
for my own use case, where I couldn't rely on, e.g., readThrough to handle the 
uninitialized case, because readThrough cannot distinguish a znode that is 
missing because it truly is absent from ZooKeeper from one that is missing 
because the cache hasn't initialized. In my case the znode wouldn't always 
exist in ZooKeeper, so using readThrough would result in a lot of wasted calls 
to ZooKeeper, greatly reducing the benefit of the cache in the first place.
To reiterate, InitializedCachedModeledFramework has a few benefits over the 
existing implementation:

1) No more possibility of a misleading NoNodeException when reading from 
CachedModeledFramework before the cache has warmed. (I say misleading because 
the node may exist, just not in the cache.)

2) No more temptation to add blocking semaphores in front of this non-blocking 
interface.

3) IMO, generally less confusion about how to properly use this otherwise 
great(!) feature.
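
To make the idea concrete, here is a minimal, JDK-only sketch of gating reads on 
an initialization signal. It is not the code from the pull request and it does 
not use any Curator API: InitGatedCache, put, markInitialized and read are 
hypothetical names chosen purely for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a cache-backed reader; NOT a Curator class.
class InitGatedCache<T> {
    private final Map<String, T> entries = new ConcurrentHashMap<>();

    // Completed once by whatever signals that the initial cache load is done.
    private final CompletableFuture<Void> initialized = new CompletableFuture<>();

    // Assumed to be called by a cache listener for each node it loads or updates.
    void put(String path, T model) {
        entries.put(path, model);
    }

    // Assumed to be called once when the cache reports that it has initialized.
    void markInitialized() {
        initialized.complete(null);
    }

    // Non-blocking read: the returned stage completes only after initialization,
    // so an empty Optional means "absent from the warmed cache" rather than
    // "the cache has not loaded yet".
    CompletionStage<Optional<T>> read(String path) {
        return initialized.thenApply(ignored -> Optional.ofNullable(entries.get(path)));
    }
}
```

A caller chains on the returned stage instead of parking a thread on a 
semaphore; reads issued before the cache has warmed simply complete later.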

A first pass at an implementation is here: 
https://github.com/apache/curator/pull/1247/files

I wanted to gauge whether the community thinks this is a good idea and, 
assuming it is, whether a new CachedModeledFramework (CMF) implementation is 
the right way to go about it.

I'd appreciate any feedback either way.

Thanks,
Luke
