Hi,

I'm trying to understand the configuration parameters for IGFS. My use case is to run IGFS with a secondary file system, so that it acts as a cache for a Hadoop file system without my having to modify any existing application (only the input and output paths change, as they will now use the igfs scheme). In the javadoc for FileSystemConfiguration I see:

int getPerNodeBatchSize()
    Gets number of file blocks buffered on local node before sending batch to remote node.
int getPerNodeParallelBatchCount()
    Gets number of batches that can be concurrently sent to remote node.
int getPrefetchBlocks()
    Get number of pre-fetched blocks if specific file's chunk is requested.

What is the remote node here? I understand this has nothing to do with other Ignite nodes holding backup copies, as that would be set in the cache configuration.

I have also taken a look at http://apache-ignite-users.70518.x6.nabble.com/IGFS-Data-cache-size-td2875.html but that post seems to refer to a deprecated field, FileSystemConfiguration.maxSpaceSize, that I haven't been able to find either in the javadoc or in https://github.com/apache/ignite/blob/2.3.0/modules/core/src/main/java/org/apache/ignite/configuration/FileSystemConfiguration.java.

Other questions that I have regarding Ignite configuration in the context of this use case:

- When I use ATOMIC for the atomicityMode of metaCacheConfiguration I get a launch exception: "Failed to start grid: IGFS metadata cache should be transactional: igfs". So I understand TRANSACTIONAL is required for metaCacheConfiguration, but I get no error when using ATOMIC for dataCacheConfiguration. Is there any reason to use TRANSACTIONAL for dataCacheConfiguration? I understand ATOMIC gives better performance if you don't use the transaction features.
- Do the readThrough, writeThrough and writeBehind fields of the CacheConfiguration used for dataCacheConfiguration and metaCacheConfiguration have any effect? Or is IGFS setting them according to the IgfsMode configured in the defaultMode field of FileSystemConfiguration?
- Similarly, does setExpiryPolicyFactory in dataCacheConfiguration and metaCacheConfiguration have any effect?
I'd be interested in using DUAL_ASYNC as the defaultMode, and I thought that maybe the ExpiryPolicy could give an upper bound on the time it takes for a record to be written to the secondary file system, because by then it would have expired from the cache. That way I could safely tear down the IGFS cluster after that time without any data loss. Is there some way of achieving that? Otherwise I think DUAL_ASYNC could only be used in a long-lived cluster, because I understand there is no functionality to flush the IGFS caches into the secondary file system.
- Similarly, does the eviction policy configured for dataCacheConfiguration and metaCacheConfiguration have any effect? In any case, I understand that IGFS can never fail due to running out of space in the caches, because it will evict the required entries, saving them to the secondary file system if needed in order to avoid data loss.

It would be nice if someone could point me to a webinar or documentation specific to IGFS. I have already watched https://www.youtube.com/watch?v=pshM_gy7Wig and I think it is a good introduction, but I would like to get more details. I have also read the book "High-Performance In-Memory Computing With Apache Ignite".

Thanks a lot for all your help.

Best Regards,
Juan