I'm using CacheBasedDataset to filter a subset from a distributed cache
of all training data for linear regression. By default, this seems to use
the AffinityFunction from the upstream cache to create a new temporary
cache for every preprocessing trainer and on every dataset update. This
causes a lot of additional traffic when it happens on multiple nodes,
so I was looking to create local caches for the filtered datasets.
On 27.09.22 18:30, Stephen Darlington wrote:
What are you trying to do? The general solution is to create a long-lived cache
and have a run-number or similar as part of the key.
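A minimal sketch of the "run number as part of the key" idea, in plain Java.
The names RunKey and evictRun are hypothetical, and a HashMap stands in for
the long-lived cache so the example runs without a cluster. With Ignite, the
same key class would presumably be used against one IgniteCache that is
created once, instead of a new temporary cache per run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.UUID;

public class RunScopedKeys {
    /** Hypothetical composite key: a run number plus the original row id. */
    public static final class RunKey {
        public final long runId;
        public final UUID rowId;

        public RunKey(long runId, UUID rowId) {
            this.runId = runId;
            this.rowId = Objects.requireNonNull(rowId);
        }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RunKey)) return false;
            RunKey other = (RunKey) o;
            return runId == other.runId && rowId.equals(other.rowId);
        }

        @Override public int hashCode() {
            return Objects.hash(runId, rowId);
        }
    }

    /** Removes every entry of the given run and returns how many were removed. */
    public static <V> int evictRun(Map<RunKey, V> cache, long runId) {
        int before = cache.size();
        cache.keySet().removeIf(k -> k.runId == runId);
        return before - cache.size();
    }

    public static void main(String[] args) {
        // The map is a stand-in (assumption) for a single long-lived cache.
        Map<RunKey, String> cache = new HashMap<>();
        UUID row = UUID.randomUUID();

        // The same source row can be stored once per run without collisions.
        cache.put(new RunKey(1L, row), "filtered row for run 1");
        cache.put(new RunKey(2L, row), "filtered row for run 2");

        // Cleaning up a finished run touches only that run's entries.
        int removed = evictRun(cache, 1L);
        System.out.println("removed=" + removed + ", remaining=" + cache.size());
    }
}
```

The point of the design is that the cache itself is created only once, so
starting or finishing a run becomes ordinary put/remove traffic rather than
cache creation and destruction.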
On 27 Sep 2022, at 15:36, Thomas Kramer <don.tequ...@gmx.de> wrote:
As I understand it, creating a new cache dynamically requires a cluster-wide
lock and a partition map exchange event to create the cache on all nodes.
This is unnecessary traffic when only working with local caches.
For local-only caches I assume this wouldn't happen. But CacheMode.LOCAL
is deprecated.
Is there a way to create a local cache without triggering unnecessary
map exchange events?
Would the following work, or does it still take a short global lock on all
nodes rather than only on the local node?
CacheConfiguration<UUID, BinaryObject> cfg = new CacheConfiguration<>();
cfg.setCacheMode(CacheMode.REPLICATED);
cfg.setAffinity(new LocalAffinityFunction());