I'm using CacheBasedDataset to filter a subset out of a distributed cache
that holds all training data for Linear Regression. By default this seems
to use the AffinityFunction of the upstream cache to create a new temporary
cache for every preprocessing trainer and on every dataset update. That
causes a lot of additional traffic when it happens on multiple nodes.

So I was looking to create local caches for the filtered datasets.


On 27.09.22 18:30, Stephen Darlington wrote:
What are you trying to do? The general solution is to create a long-lived cache 
and have a run-number or similar as part of the key.
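
A minimal sketch of that idea, in plain Java so it runs without a cluster. A
ConcurrentHashMap stands in for the single long-lived cache, and the names
RunKey, runId, rowId, put, countForRun and clearRun are hypothetical, not
part of any Ignite API. The point is only that each run's entries are
separated by the key, so no cache has to be created or destroyed per run.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class RunKeyedCacheSketch {
    /** Hypothetical composite key: the run number scopes an entry to one training run. */
    record RunKey(long runId, UUID rowId) {}

    // Stand-in for one long-lived cache that is created once and shared by all runs.
    static final Map<RunKey, double[]> CACHE = new ConcurrentHashMap<>();

    /** Stores one filtered row under the given run. */
    static void put(long runId, UUID rowId, double[] features) {
        CACHE.put(new RunKey(runId, rowId), features);
    }

    /** Counts the entries that belong to a single run. */
    static long countForRun(long runId) {
        return CACHE.keySet().stream().filter(k -> k.runId() == runId).count();
    }

    /** Drops a finished run's entries; the cache itself stays in place. */
    static void clearRun(long runId) {
        CACHE.keySet().removeIf(k -> k.runId() == runId);
    }

    public static void main(String[] args) {
        put(1L, UUID.randomUUID(), new double[] {1.0, 2.0});
        put(1L, UUID.randomUUID(), new double[] {3.0, 4.0});
        put(2L, UUID.randomUUID(), new double[] {5.0, 6.0});

        System.out.println("run 1 entries: " + countForRun(1L));
        clearRun(1L);
        System.out.println("run 1 after clear: " + countForRun(1L));
        System.out.println("run 2 entries: " + countForRun(2L));
    }
}
```

With a real distributed cache, scanning all keys to find or remove one run
would be expensive, so how entries are looked up and cleaned up per run would
need more thought than this stand-in suggests.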

On 27 Sep 2022, at 15:36, Thomas Kramer <don.tequ...@gmx.de> wrote:

I understand that creating a new cache dynamically requires a cluster-wide
lock and a partition map exchange event to create the cache on all nodes.
This is unnecessary traffic when only working with local caches.

For local-only caches I assume this wouldn't happen. But CacheMode.LOCAL
is deprecated.

Is there a way to create a local cache without triggering unnecessary
map exchange events?

Would this work, or does it still take a short global lock on all nodes
rather than only on the local node?

         CacheConfiguration<UUID, BinaryObject> cfg = new CacheConfiguration<>();
         cfg.setCacheMode(CacheMode.REPLICATED);
         cfg.setAffinity(new LocalAffinityFunction());
