Re: [DISCUSSION] Spinoff ANN package

Behroz Sikander Wed, 05 Aug 2015 14:14:12 -0700

+1
I would also like to participate :)

On Wed, Aug 5, 2015 at 5:52 AM, Edward J. Yoon <[email protected]>
wrote:


> Guys,
>
> I plan to submit a 'DNN platform on top of Apache Hama' proposal as
> below. I know the Hama community is somewhat small, but the main reason
> for the spinoff is that this domain-specific project is not a good fit
> for the Apache Hama community. Recruiting volunteers is also a hard
> problem. I expect this will become a very nice use case of Apache Hama.
>
> If you have any suggestions or other opinions, please let me know.
> Also, if you want to participate in this project, please feel free to
> add your name here.
>
> Thanks!
>
> --
> == Abstract ==
>
> Horn [hɔ:n] (a tentative name; "Horn" means "spirit" in Korean) is a
> neuron-centric programming API and execution framework for large-scale
> deep learning, built on top of Apache Hama.
>
> == Proposal ==
>
> The goal of Horn is to provide a neuron-centric programming API that
> allows users to easily define the characteristics and structure of an
> artificial neural network model, together with an execution framework
> that leverages the heterogeneous resources of Hama and Hadoop YARN
> clusters.
>
> == Background ==
>
> The initial ANN code was developed within the Apache Hama project by a
> committer, Yexi Jiang (Facebook), in 2013. The motivation behind this
> work is to build a framework that provides more intuitive programming
> APIs, like Google's MapReduce or Pregel, and supports applications
> that need large models with huge memory consumption in a distributed
> way.
>
> == Rationale ==
>
> While many open source deep learning packages still support only data
> parallelism or only model parallelism, we aim to support both, along
> with a fault-tolerant system design. The basic idea is to use a remote
> parameter server to parallelize model creation and distribute training
> across machines, and to use the BSP framework of Apache Hama to perform
> asynchronous mini-batches. Within a single BSP job, each task group
> works asynchronously, using region barrier synchronization instead of
> global barrier synchronization, and trains a large-scale neural network
> model on its assigned data sets in the BSP paradigm. This architecture
> is inspired by Google's DistBelief (Jeff Dean et al., 2012).
>
> == Initial Goals ==
>
> Some current goals include:
>
>  * build a new community
>  * provide more intuitive programming APIs
>  * support both data and model parallelism
>  * run natively on both Hama and Hadoop2
>  * support GPUs and InfiniBand
>
> == Current Status ==
>
> === Meritocracy ===
>
> The core developers understand what it means to have a process based
> on meritocracy. We will make a continuous effort to build an
> environment that supports this, encouraging community members to
> contribute.
>
> === Community ===
>
> A small community has formed within the Apache Hama project and at some
> companies, such as an instant messenger service company and a mobile
> manufacturing company. Many people are also interested in the
> large-scale deep learning platform itself. By bringing Horn into
> Apache, we believe that the community will grow even bigger.
>
> === Core Developers ===
>
> Edward J. Yoon, Thomas Jungblut, and Dongjin Lee
>
> == Known Risks ==
>
> === Orphaned Products ===
>
> Apache Hama is already a core open source component at Samsung
> Electronics, and Horn will also be used by Samsung Electronics, so
> there is no direct risk of this project being orphaned.
>
> === Inexperience with Open Source ===
>
> Some of the developers are very new to open source, while the others
> have experience using and/or working on Apache open source projects.
>
> === Homogeneous Developers ===
>
> The initial committers are from different organizations, such as
> Microsoft, Samsung Electronics, and Line Plus.
>
> === Reliance on Salaried Developers ===
>
> Other developers will also start working on the project in their spare
> time.
>
> === Relationships with Other Apache Products ===
>
>  * Horn is based on Apache Hama
>  * Apache ZooKeeper is used as the distributed locking service
>  * Horn runs natively on Apache Hadoop and Mesos
>  * Horn may overlap somewhat with the Singa podling
>
> === An Excessive Fascination with the Apache Brand ===
>
> Horn itself will hopefully benefit from Apache, both in terms of
> attracting a community and establishing a solid group of developers,
> and through its relationship with Apache Hama, a general-purpose BSP
> computing engine. These are the main reasons for us to send this
> proposal.
>
> == Documentation ==
>
> The initial plan for Horn can be found at
> http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
>
> == Initial Source ==
>
> The initial source code has been released as part of the Apache Hama
> project, developed under the Apache Software Foundation. The source
> code is currently hosted at
>
> https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
>
> == Cryptography ==
>
> Not applicable.
>
> == Required Resources ==
>
> Mailing Lists
>
>  * horn-private
>  * horn-dev
>
> Subversion Directory
>
>  * Git is the preferred source control system: git://git.apache.org/horn
>
> Issue Tracking
>
>  * a JIRA issue tracker, HORN
>
> == Initial Committers and Affiliations ==
>
>  * Thomas Jungblut (tjungblut at apache dot org)
>  * Edward J. Yoon (edwardyoon at apache dot org)
>  * Dongjin Lee (dongjin.lee.kr at gmail dot com)
>  * Minho Kim (minwise.kim at samsung dot com)
>  * TODO
>
> == Affiliations ==
>
>  * Thomas Jungblut (Microsoft)
>  * Edward J. Yoon (Samsung Electronics)
>  * Dongjin Lee (LINE Plus)
>  * Minho Kim (Samsung Electronics)
>  * TODO
>
> == Sponsors ==
>
> Champion
>
>  * Edward J. Yoon <edwardyoon at apache dot org>
>
> Nominated Mentors
>
>  * TODO
>
> Sponsoring Entity
>
> The Apache Incubator
>
> --
> Best Regards, Edward J. Yoon
>
