Thank you, that makes sense. My use case is to replicate Spark DataFrame functionality in pure C# if possible, without using Spark.
Specifically, we have CSV files we wish to load into the cache, and then compute functions that act on those rows, adding columns as they go, so the cache will be heavy on both reads and writes.

To improve the initial cache population from file (which can be millions of rows), I distribute a job to the cluster so that each node reads a piece of the file, giving some degree of upload parallelization. I am using affinity keys so that the calculations only have to process the data on the node they run on, which works fine.

But then I thought performance would probably improve on the cache population step if I just used LOCAL caches. It's the same end result: calculations working off only the data they hold on their node. I can perhaps live with the downsides of a local cache, which I assume include no fault tolerance and no load balancing, if the speed improvement makes it worthwhile.

So, basically, to get my desired functionality I have two options: either use affinity keys with affinity compute, OR use local caches with broadcast compute.

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
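For anyone following along, the two options above might be sketched roughly like this with Ignite.NET (Apache.Ignite.Core). This is only an illustrative sketch, not my actual code: the cache names, `RowKey` class, and `ProcessRowsAction` are hypothetical placeholders; only the Ignite API calls themselves are real.

```csharp
using System;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Affinity;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Compute;

// Hypothetical key with an affinity field, so all rows sharing a
// PartitionId are co-located on the same node.
public class RowKey
{
    public long RowId { get; set; }

    [AffinityKeyMapped]   // co-locate entries by this field
    public int PartitionId { get; set; }
}

// Hypothetical compute action; in practice it would process only
// the rows held locally (e.g. via a local scan query).
[Serializable]
public class ProcessRowsAction : IComputeAction
{
    public void Invoke()
    {
        // ... work on locally held rows, add computed columns ...
    }
}

public static class Program
{
    public static void Main()
    {
        using (IIgnite ignite = Ignition.Start())
        {
            // Option 1: PARTITIONED cache + affinity compute.
            var partitioned = ignite.GetOrCreateCache<RowKey, string>(
                new CacheConfiguration("rows") { CacheMode = CacheMode.Partitioned });

            // Runs on whichever node owns the data for affinity key 42.
            ignite.GetCompute().AffinityRun("rows", 42, new ProcessRowsAction());

            // Option 2: LOCAL caches + broadcast compute.
            var local = ignite.GetOrCreateCache<long, string>(
                new CacheConfiguration("localRows") { CacheMode = CacheMode.Local });

            // Runs once on every node; each node only sees its own local cache.
            ignite.GetCompute().Broadcast(new ProcessRowsAction());
        }
    }
}
```

Either way the compute lands next to the data; the difference is that option 1 lets Ignite manage rebalancing and backups, while option 2 gives up those guarantees for a cheaper write path.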