Thanks for proposing this, Wangda. My +1 as well.
It is amazing to see the progress made in Submarine last year; the community
is growing fast and is quite collaborative. I can see the reasons to release it
faster on its own cycle. At the same time, the Ozone way works very well.

—
Weiwei
On Feb 1, 2019, 10:49 AM +0800, Xun Liu <neliu...@163.com>, wrote:
> +1
>
> Hello everyone,
>
> I am Xun Liu, the head of the machine learning team at Netease Research 
> Institute. I quite agree with Wangda.
>
> Our team is very grateful to the community for the Submarine machine
> learning engine.
> We are heavy users of Submarine.
> Because Submarine fits into our big data team's Hadoop technology stack,
> it avoids the need to invest more manpower in learning other
> container scheduling systems.
> Most importantly, we can use a common YARN cluster to run machine
> learning,
> which makes the utilization of server resources more efficient and preserves
> much of the human and material investment we made in previous years.
>
> Our team has finished testing and deploying Submarine and will
> provide the service to our e-commerce department (http://www.kaola.com/)
> shortly.
>
> We also plan to provide the Submarine engine in our existing YARN cluster in
> the next six months,
> because many of our product departments need machine learning
> services,
> for example:
> 1) Game department (http://game.163.com/) needs AI battle training,
> 2) News department (http://www.163.com) needs news recommendation,
> 3) Mailbox department (http://www.163.com) requires anti-spam and
> illegal-content detection,
> 4) Music department (https://music.163.com/) requires music recommendation,
> 5) Education department (http://www.youdao.com) requires voice recognition,
> 6) Massive Open Online Courses (https://open.163.com/) requires multilingual 
> translation and so on.
>
> If Submarine can be released independently like Ozone, it will help us
> quickly get the latest features and improvements, and it will be a great
> help to our team and users.
>
> Thanks, Hadoop community!
>
>
> > On Feb 1, 2019, at 2:53 AM, Wangda Tan <wheele...@gmail.com> wrote:
> >
> > Hi devs,
> >
> > Since we started the Submarine effort last year, we have received a lot of
> > feedback. Several companies (such as Netease, China Mobile, etc.) are
> > trying to deploy Submarine to their Hadoop clusters alongside big data
> > workloads. LinkedIn is also very interested in contributing a Submarine TonY (
> > https://github.com/linkedin/TonY) runtime to allow users to use the same
> > interface.
> >
> > From what I can see, there are several issues with keeping Submarine under
> > the yarn-applications directory and on the same release cycle as Hadoop:
> >
> > 1) We started the 3.2.0 release in Sep 2018, but the release was done in Jan
> > 2019. Because of unpredictable blockers and security issues, it got
> > delayed a lot. We need to iterate on Submarine fast at this point.
> >
> > 2) We also see a lot of demand to use Submarine on older Hadoop
> > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x any
> > time soon, but their need to run deep learning is urgent. We
> > should decouple Submarine from the Hadoop version.
> >
> > And why do we want to keep it within Hadoop? First, Submarine includes some
> > innovative parts, such as user-experience enhancements for YARN
> > services/containerization support, which we can add back to Hadoop later
> > to address common requirements. In addition to that, there is a big overlap
> > in the community developing and using it.
> >
> > There are several proposals we went through during the Ozone merge-to-trunk
> > discussion:
> > https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
> >
> > I propose to adopt the Ozone model: the same master branch, a different
> > release cycle, and a different release branch. It is a great example of
> > the agile releases we can do (2 Ozone releases after Oct 2018) with less
> > overhead to set up CI, projects, etc.
> >
> > *Links:*
> > - JIRA: https://issues.apache.org/jira/browse/YARN-8135
> > - Design doc
> > <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
> > - User doc
> > <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html>
> > (3.2.0 release)
> > - Blogposts, {Submarine} : Running deep learning workloads on Apache Hadoop
> > <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/>,
> > (Chinese Translation: Link <https://www.jishuwen.com/d/2Vpu>)
> > - Talks: Strata Data Conf NY
> > <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
> >
> > Thoughts?
> >
> > Thanks,
> > Wangda Tan
