Yury Babak created IGNITE-8795:
----------------------------------

             Summary: Add ability to start and maintain TensorFlow cluster on top of Apache Ignite
                 Key: IGNITE-8795
                 URL: https://issues.apache.org/jira/browse/IGNITE-8795
             Project: Ignite
          Issue Type: New Feature
          Components: ml
            Reporter: Yury Babak
            Assignee: Anton Dmitriev
             Fix For: 2.6

As described in the design document (https://docs.google.com/document/d/1jROIahK1rc7bSgOvhJhfpMqIGvht_IE8zn5NAt6x8ks/edit?usp=sharing), Distributed TensorFlow is based on the TensorFlow cluster concept. A TensorFlow cluster is a set of TensorFlow processes started across the cluster and available through gRPC interfaces. It is assumed that these processes contain heavy operations that require data to be stored locally on the nodes where the processes are running. Apache Ignite allows data to be moved from one node to another as a result of node failure or rebalancing. As a result, the TensorFlow cluster, as well as the TensorFlow Cache, should be changed dynamically (follow-the-data strategy).
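
To illustrate the follow-the-data idea, here is a minimal sketch in plain Python. It is not taken from the design document and is not the proposed implementation. It assumes one TensorFlow worker process per data partition, placed on the host that currently owns that partition. The names build_cluster_spec and moved_partitions, the host names, and the base port 2222 are all hypothetical. The resulting dictionary has the job-name-to-address-list shape that tf.train.ClusterSpec accepts.

```python
from typing import Dict, List

# Hypothetical input: partition number -> host of the node that owns it.
PartitionOwners = Dict[int, str]


def build_cluster_spec(owners: PartitionOwners, base_port: int = 2222) -> Dict[str, List[str]]:
    """Place one worker per partition on the host that holds that partition.

    The port is offset by the partition number (an assumption) so that
    several workers on the same host do not collide.
    """
    workers = [f"{owners[part]}:{base_port + part}" for part in sorted(owners)]
    return {"worker": workers}


def moved_partitions(old: PartitionOwners, new: PartitionOwners) -> List[int]:
    """Return partitions whose owner changed, e.g. after failure or rebalancing."""
    return sorted(part for part in new if old.get(part) != new[part])


# Partition 1 moves from node-a to node-c (illustrative host names).
before = {0: "node-a", 1: "node-a", 2: "node-b"}
after = {0: "node-a", 1: "node-c", 2: "node-b"}

spec_before = build_cluster_spec(before)
spec_after = build_cluster_spec(after)
moved = moved_partitions(before, after)
```

In this sketch a non-empty moved list is the signal that the cluster specification is stale and the affected worker has to be restarted on the new owner. How the actual feature detects and reacts to such changes is left to the design document.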