All right, Let's assume a simple scenario. When the IgniteCache.loadCache is invoked, we check whether the cache is not local, and if so, then we'll initiate the new loading logic.
First, we take a "streamer" node, it could be done by utilizing LoadBalancingSpi, or it may be configured statically, for the reason that the streamer node is running on the same host as the persistence storage provider. After that we start the loading task on the streamer node which creates IgniteDataStreamer and loads the cache with CacheStore.loadCache. Every call to IgniteBiInClosure.apply simply invokes IgniteDataStreamer.addData. This implementation will completely relieve overhead on the persistence storage provider. Network overhead is also decreased in the case of partitioned caches. For two nodes we get 1-1/2 amount of data transferred by the network (1 part well be transferred from the persistence storage to the streamer, and then 1/2 from the streamer node to the another node). For three nodes it will be 1-2/3 and so on, up to the two times amount of data on the big clusters. I'd like to propose some additional optimization at this place. If we have the streamer node on the same machine as the persistence storage provider, then we completely relieve the network overhead as well. It could be a some special daemon node for the cache loading assigned in the cache configuration, or an ordinary sever node as well. Certainly this calculations have been done in assumption that we have even partitioned cache with only primary nodes (without backups). In the case of one backup (the most frequent case I think), we get 2 amount of data transferred by the network on two nodes, 2-1/3 on three, 2-1/2 on four, and so on up to the three times amount of data on the big clusters. Hence it's still better than the current implementation. In the worst case with a fully replicated cache we take N+1 amount of data transferred by the network (where N is the number of nodes in the cluster). But it's not a problem in small clusters, and a little overhead in big clusters. And we still gain the persistence storage provider optimization. Now let's take more complex scenario. To achieve some level of parallelism, we could split our cluster on several groups. It could be a parameter of the IgniteCache.loadCache method or a cache configuration option. The number of groups could be a fixed value, or it could be calculated dynamically by the maximum number of nodes in the group. After splitting the whole cluster on groups we will take the streamer node in the each group and submit the task for loading the cache similar to the single streamer scenario, except as the only keys will be passed to the IgniteDataStreamer.addData method those correspond to the cluster group where is the streamer node running. In this case we get equal level of overhead as the parallelism, but not so surplus as how many nodes in whole the cluster. 2016-11-11 15:37 GMT+03:00 Alexey Kuznetsov <akuznet...@apache.org>: > Alexandr, > > Could you describe your proposal in more details? > Especially in case with several nodes. > > On Fri, Nov 11, 2016 at 6:34 PM, Alexandr Kuramshin <ein.nsk...@gmail.com> > wrote: > > > Hi, > > > > You know CacheStore API that is commonly used for read/write-through > > relationship of the in-memory data with the persistence storage. > > > > There is also IgniteCache.loadCache method for hot-loading the cache on > > startup. Invocation of this method causes execution of > CacheStore.loadCache > > on the all nodes storing the cache partitions. Because of none keys are > > passed to the CacheStore.loadCache methods, the underlying implementation > > is forced to read all the data from the persistence storage, but only > part > > of the data will be stored on each node. > > > > So, the current implementation have two general drawbacks: > > > > 1. Persistence storage is forced to perform as many identical queries as > > many nodes on the cluster. Each query may involve much additional > > computation on the persistence storage server. > > > > 2. Network is forced to transfer much more data, so obviously the big > > disadvantage on large systems. > > > > The partition-aware data loading approach, described in > > https://apacheignite.readme.io/docs/data-loading#section- > > partition-aware-data-loading > > , is not a choice. It requires persistence of the volatile data depended > on > > affinity function implementation and settings. > > > > I propose using something like IgniteDataStreamer inside > > IgniteCache.loadCache implementation. > > > > > > -- > > Thanks, > > Alexandr Kuramshin > > > > > > -- > Alexey Kuznetsov > -- Thanks, Alexandr Kuramshin