[ 
https://issues.apache.org/jira/browse/IGNITE-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Ozerov updated IGNITE-8670:
------------------------------------
    Fix Version/s: 2.7

> Umbrella: TensorFlow integration
> --------------------------------
>
>                 Key: IGNITE-8670
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8670
>             Project: Ignite
>          Issue Type: New Feature
>          Components: ml
>            Reporter: Yury Babak
>            Assignee: Yury Babak
>            Priority: Major
>             Fix For: 2.7
>
>
>  
> *What is the goal?*
> TensorFlow on Apache Ignite should consist of three major components: 
> _Ignite Dataset_, which feeds training data from Apache Ignite; _IGFS 
> Plugin_, which allows using the Apache Ignite File System for 
> checkpointing and communication with TensorBoard; and _Distributed 
> Training_, which makes it possible to run model training directly inside an 
> Apache Ignite cluster to minimize data transfers and provide so-called Zero ETL.
>  
> *Ignite Dataset*
> Ignite Dataset is an integration between Apache Ignite and TensorFlow 
> that allows using Apache Ignite as a data source for neural network 
> training, inference and all other computations supported by TensorFlow. 
> Ignite Dataset has many advantages, to name just a few: TensorFlow 
> gets fast access to a distributed database that can contain training data 
> and data for inference; objects fed by Ignite Dataset can have any structure, 
> so all preprocessing can be done in the TensorFlow pipeline; SSL, Windows and 
> distributed training are also supported. 
> For now Ignite Dataset is a part of TensorFlow, so you don’t need to install 
> any third-party packages and you can use it out of the box. The integration 
> is based on [tf.data|https://www.tensorflow.org/api_docs/python/tf/data] on 
> the TensorFlow side and the [Binary Client 
> Protocol|https://apacheignite.readme.io/v2.6/docs/binary-client-protocol] 
> on the Apache Ignite side.
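
The Binary Client Protocol referenced above is a plain TCP protocol, so its opening handshake can be sketched with the standard library alone. This is a minimal sketch, assuming the version 1.0.0 handshake layout given in the linked protocol documentation (little-endian payload length, handshake code 1, three version shorts, client code 2). `build_handshake` is a hypothetical helper name, not part of Ignite or TensorFlow, and it is not necessarily how Ignite Dataset implements the exchange.

```python
import struct

HANDSHAKE_CODE = 1  # assumed from the protocol docs: handshake operation code
CLIENT_CODE = 2     # assumed from the protocol docs: thin-client code


def build_handshake(major=1, minor=0, patch=0):
    """Return the bytes of a handshake request for the given protocol version."""
    # Payload: one byte (handshake code), three little-endian shorts
    # (version major, minor, patch), one byte (client code).
    payload = struct.pack("<bhhhb", HANDSHAKE_CODE, major, minor, patch, CLIENT_CODE)
    # The payload is prefixed with its own length as a little-endian int.
    return struct.pack("<i", len(payload)) + payload


if __name__ == "__main__":
    print(build_handshake().hex())
```

Sending these bytes to a node's thin-client port (10800 by default) should yield a length-prefixed response whose first payload byte signals success or failure.
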
>  
> *IGFS Plugin*
> In addition to database functionality, Apache Ignite provides a distributed 
> file system called [IGFS|https://ignite.apache.org/features/igfs.html]. IGFS 
> delivers functionality similar to Hadoop HDFS, but only in memory. The IGFS 
> Plugin for TensorFlow allows using IGFS for checkpointing (for reliability 
> and fault tolerance) and for communication with TensorBoard (even when 
> TensorBoard runs in a different process or on a different machine).
> For now IGFS Plugin is a part of TensorFlow, so you don’t need to install any 
> third-party packages and you can use it out of the box. The integration is 
> based on a [custom filesystem 
> plugin|https://www.tensorflow.org/extend/add_filesys] on the TensorFlow side 
> and the [IGFS Native API|https://ignite.apache.org/features/igfs.html] on 
> the Apache Ignite side.
>  
> *Distributed Training*
> Distributed training allows utilizing the computational resources of the 
> whole cluster and thus speeds up training of deep learning models. TensorFlow 
> is a machine learning framework that [natively 
> supports|https://www.tensorflow.org/deploy/distributed] distributed neural 
> network training, inference and other computations.
> Distributed Training in TensorFlow on Apache Ignite is based on the 
> [standalone client 
> mode|https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/distribute#standalone-client-mode]
>  of distributed multi-worker training. Standalone client mode assumes that we 
> have a cluster of workers with running TensorFlow servers and a 
> client that actually contains the model code. When the client calls 
> tf.estimator.train_and_evaluate, TensorFlow uses the specified distribution 
> strategy to distribute computations across the workers so that the most 
> computationally intensive part runs on the workers.
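
In standalone client mode the client has to know the addresses of the worker servers. The following is a minimal sketch, assuming hypothetical host names and port 2222, of building that cluster description as a plain dictionary mapping a job name to "host:port" addresses, which is the shape tf.train.ClusterSpec accepts. `build_cluster_spec` is an illustrative helper; how TensorFlow on Apache Ignite actually discovers the workers is not stated in the description.

```python
import json


def build_cluster_spec(hosts, port=2222, job_name="worker"):
    """Map a job name to the "host:port" addresses of its TensorFlow servers."""
    if not hosts:
        raise ValueError("at least one worker host is required")
    return {job_name: ["{}:{}".format(host, port) for host in hosts]}


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    spec = build_cluster_spec(["ignite-node-1", "ignite-node-2"])
    print(json.dumps(spec, indent=2))
```

A dictionary of this shape could then be handed to the client-side run configuration, while each worker only needs to start a TensorFlow server on its own address.
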



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
