Hi,

According to
https://apacheignite-fs.readme.io/docs/file-system#section-file-system-uri,
the URI for connecting to an Ignite file system from a Hadoop-compatible
application follows the pattern igfs://myIgfs@myHost:12345/. Let's assume I
have an Ignite cluster with 5 nodes running on 5 different hosts (data_host1,
data_host2, data_host3, data_host4, data_host5) on port 12345, and a
separate set of 5 hosts (compute_host1, compute_host2, compute_host3,
compute_host4, compute_host5) running an application that
connects to that IGFS cluster, for example a Spark program running in YARN
containers on the compute hosts. If I use the URI
igfs://myIgfs@data_host1:12345
in the Spark driver, and use that same URI for all the Spark tasks running on
the compute hosts, will all the compute hosts connect to the same host,
data_host1? If so, I would need to implement some kind of load balancing
manually in the application to avoid hot spots in Ignite. But maybe Ignite
treats the host in the URI as a kind of seed node that is only used to
discover the cluster members, and implements load balancing internally. Can
you clarify what the actual behaviour is? Any pointer to the documentation
describing this would be much appreciated.
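
To make the question concrete, this is roughly the kind of manual load
balancing I have in mind. It is only a sketch under my own assumptions: the
helper name igfs_uri_for and the idea of choosing a host by Spark partition id
are hypothetical, not anything taken from the Ignite documentation, and I do
not know whether Ignite already makes this unnecessary.

```python
# Hypothetical helper (not part of Ignite or Spark): spread tasks across the
# IGFS endpoints by picking a host round-robin from the partition id.
DATA_HOSTS = ["data_host1", "data_host2", "data_host3", "data_host4", "data_host5"]


def igfs_uri_for(partition_id, igfs_name="myIgfs", port=12345, hosts=DATA_HOSTS):
    """Build an IGFS URI of the form igfs://name@host:port/ for one task."""
    host = hosts[partition_id % len(hosts)]
    return "igfs://{}@{}:{}/".format(igfs_name, host, port)
```

Each Spark task would then call igfs_uri_for with its own partition id instead
of reusing the single URI from the driver.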

Thanks a lot.

Juan Rodriguez Hortala
