> For Compute.affinityRun, I am not sure how to work with it for my scenario
affinityRun and affinityCall have partition-based overloads (taking int
partId).
Partition-based compute is the reliable way to process all data in the
cluster,
even in the face of topology changes/rebalance (as opposed to localEntries
or local queries).
The whole thing can look like this:
1. From the initiator node, start processing all partitions in parallel
for (int part = 0; part < ignite.affinity(cacheName).partitions(); part++)
    var fut = ignite.compute().affinityCallAsync(cacheNames, part, new
MyJob(part));
2. Inside MyJob, find tenant data with SQL
var entries = cache.query(new SqlFieldsQuery().setPartitions(part)...);
3. Still inside MyJob, process the data in any way, return results from the
job
return process(entries);
4. Aggregate job results on the initiator
Here, Ignite guarantees that steps 2 and 3:
- Operate on local data (the job runs on the node where the partition is
located)
- Run with the partition locked, so the data won't be moved in case of
topology changes
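Putting the steps above together, a minimal sketch could look like the
following. Cache name "users", the tenantId column, and the counting logic
inside MyJob are assumptions for illustration; adjust them to your model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.lang.IgniteFuture;
import org.apache.ignite.resources.IgniteInstanceResource;

public class PartitionScan {
    /** Processes one partition on the node that owns it. */
    static class MyJob implements IgniteCallable<Integer> {
        @IgniteInstanceResource
        private Ignite ignite; // injected on the node where the job runs

        private final int part;

        MyJob(int part) { this.part = part; }

        @Override public Integer call() {
            IgniteCache<?, ?> cache = ignite.cache("users");

            // Step 2: local SQL restricted to this partition.
            SqlFieldsQuery qry = new SqlFieldsQuery(
                    "select _key, _val from Users where tenantId = ?")
                .setArgs("tenant-1")
                .setPartitions(part)
                .setLocal(true);

            // Step 3: process the rows and return a result.
            int count = 0;
            for (List<?> row : cache.query(qry))
                count++; // real processing would go here
            return count;
        }
    }

    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        int parts = ignite.affinity("users").partitions();
        List<IgniteFuture<Integer>> futs = new ArrayList<>(parts);

        // Step 1: one collocated job per partition, all in parallel.
        for (int part = 0; part < parts; part++)
            futs.add(ignite.compute().affinityCallAsync(
                Collections.singletonList("users"), part, new MyJob(part)));

        // Step 4: aggregate per-partition results on the initiator.
        int total = 0;
        for (IgniteFuture<Integer> fut : futs)
            total += fut.get();

        System.out.println("Processed entries: " + total);
    }
}
```

Note that launching all partitions at once can be heavy on a large
cluster; you may want to throttle the number of in-flight jobs.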
On Sat, Apr 24, 2021 at 3:12 AM William.L <[email protected]> wrote:
> Thanks for the pointers stephendarlington, ptupitsyn.
>
> Looks like I can run a mapper that does a local SQL query to get the set of
> keys for the tenant (that resides on the local server node), and then do
> Compute.affinityRun or Cache.invokeAll.
>
> For Cache.invokeAll, it takes a dictionary of keys to EntryProcessor so
> that
> is easy to understand.
>
> For Compute.affinityRun, I am not sure how to work with it for my scenario:
> * It takes an affinity key to find the partition's server to run the
> IgniteRunnable, but I don't see an interface to pass in the specific keys.
> Am
> I expected to pass the key set as part of the IgniteRunnable object?
> * Suppose the cache uses user_id as the affinity key; then it is possible
> that
> two user_ids will map to the same partition. How do I avoid duplicate
> processing/scanning?
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>