Re: [DISCUSS] Making submarine to different release model like Ozone

Xiaoyu Yao Fri, 01 Feb 2019 11:03:10 -0800

+1, thanks for bringing this up, Wangda. This will help expand the Hadoop
ecosystem by supporting new AI/ML workloads.


Thanks,
Xiaoyu
On 2/1/19, 10:58 AM, "Dinesh Chitlangia" <dchitlan...@hortonworks.com> wrote:

    +1 This is a fantastic recommendation given the increasing interest in ML
    across the globe.
    
    Thanks,
    Dinesh
    
    
    
    On 2/1/19, 1:54 PM, "Ajay Kumar" <ajay.ku...@hortonworks.com> wrote:
    
        +1, thanks for driving this. With the rise of use cases running ML
        alongside traditional applications, this will be of great help.
        
        Thanks,
        Ajay   
        
        On 2/1/19, 10:49 AM, "Suma Shivaprasad" <sumasai.shivapra...@gmail.com> wrote:
        
            +1. Thanks for bringing this up Wangda.
            
            Makes sense to have Submarine follow its own release cadence given the
            good momentum/adoption so far. Also, making it run with older versions
            of Hadoop would drive higher adoption.
            
            Suma
            
            On Fri, Feb 1, 2019 at 9:40 AM Eric Yang <ey...@hortonworks.com> wrote:
            
            > Submarine is an application built for the YARN framework, but it does not
            > have a strong dependency on YARN development.  For this kind of project, it
            > would be best to go through the Apache Incubator to create a new community.
            > Apache Commons is the only project other than Incubator that has
            > independent release cycles.  The collection is large, and the project goal
            > is ambitious.  No one really knows which components in Apache Commons work
            > with each other.  Hadoop is a much more focused project, a distributed
            > computing framework and not an incubation sandbox.  To stay aligned with
            > Hadoop goals, we want to prevent the Hadoop project from being overloaded
            > while allowing good ideas to be carried forward in the Apache Incubator.
            > Putting on my Apache Member hat, my vote is -1 on allowing more independent
            > subproject release cycles in the Hadoop project that do not align with
            > Hadoop project goals.
            >
            > The Apache Incubator process is highly recommended for Submarine:
            > https://incubator.apache.org/policy/process.html
            > This would allow Submarine to target older versions of Hadoop, just as
            > Spark works with multiple versions of Hadoop.
            >
            > Regards,
            > Eric
            >
            > On 1/31/19, 10:51 PM, "Weiwei Yang" <abvclo...@gmail.com> wrote:
            >
            >     Thanks for proposing this, Wangda. My +1 as well.
            >     It is amazing to see the progress made in Submarine last year; the
            >     community is growing fast and is quite collaborative. I can see the
            >     reasons to release it faster on its own cycle. At the same time, the
            >     Ozone way works very well.
            >
            >     —
            >     Weiwei
            >     On Feb 1, 2019, 10:49 AM +0800, Xun Liu <neliu...@163.com>, wrote:
            >     > +1
            >     >
            >     > Hello everyone,
            >     >
            >     > I am Xun Liu, the head of the machine learning team at 
Netease
            > Research Institute. I quite agree with Wangda.
            >     >
            >     > Our team is very grateful to the community for the Submarine machine
            >     > learning engine.
            >     > We are heavy users of Submarine.
            >     > Because Submarine fits the direction of our big data team's Hadoop
            >     > technology stack, it avoids the need to invest more manpower in
            >     > learning other container scheduling systems.
            >     > The important thing is that we can use a common YARN cluster to run
            >     > machine learning, which makes the utilization of server resources more
            >     > efficient and preserves much of the human and material investment of
            >     > our previous years.
            >     >
            >     > Our team has finished testing and deploying Submarine and will
            >     > provide the service to our e-commerce department
            >     > (http://www.kaola.com/) shortly.
            >     >
            >     > We also plan to provide the Submarine engine in our existing YARN
            >     > cluster in the next six months, because many of our product
            >     > departments need machine learning services, for example:
            >     > 1) The Game department (http://game.163.com/) needs AI battle training,
            >     > 2) The News department (http://www.163.com) needs news recommendation,
            >     > 3) The Mailbox department (http://www.163.com) requires anti-spam and
            >     >    illegal content detection,
            >     > 4) The Music department (https://music.163.com/) requires music
            >     >    recommendation,
            >     > 5) The Education department (http://www.youdao.com) requires voice
            >     >    recognition,
            >     > 6) Massive Open Online Courses (https://open.163.com/) requires
            >     >    multilingual translation, and so on.
            >     >
            >     > If Submarine can be released independently like Ozone, it will help
            >     > us get the latest features and improvements quickly, and it will be
            >     > a great help to our team and users.
            >     >
            >     > Thanks, Hadoop community!
            >     >
            >     >
            >     > > On Feb 1, 2019, at 2:53 AM, Wangda Tan <wheele...@gmail.com> wrote:
            >     > >
            >     > > Hi devs,
            >     > >
            >     > > Since we started the Submarine-related effort last year, we have
            >     > > received a lot of feedback. Several companies (such as Netease,
            >     > > China Mobile, etc.) are trying to deploy Submarine to their Hadoop
            >     > > clusters along with big data workloads. LinkedIn also has strong
            >     > > interest in contributing a Submarine TonY
            >     > > (https://github.com/linkedin/TonY) runtime to allow users to use
            >     > > the same interface.
            >     > >
            >     > > From what I can see, there are several issues with keeping
            >     > > Submarine under the yarn-applications directory and on the same
            >     > > release cycle as Hadoop:
            >     > >
            >     > > 1) We started the 3.2.0 release in Sep 2018, but the release was
            >     > > not completed until Jan 2019. Because of unpredictable blockers and
            >     > > security issues, it was delayed a lot. We need to iterate on
            >     > > Submarine fast at this point.
            >     > >
            >     > > 2) We also see a lot of demand to use Submarine on older Hadoop
            >     > > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x
            >     > > any time soon, but their need to run deep learning is urgent. We
            >     > > should decouple Submarine from the Hadoop version.
            >     > >
            >     > > And why do we want to keep it within Hadoop? First, Submarine
            >     > > includes some innovative parts, such as user experience
            >     > > enhancements for YARN services/containerization support, which we
            >     > > can add back to Hadoop later to address common requirements. In
            >     > > addition, there is a big overlap between the communities
            >     > > developing and using it.
            >     > >
            >     > > There are several proposals we went through during the Ozone
            >     > > merge-to-trunk discussion:
            >     > >
            > 
https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
            >     > >
            >     > > I propose adopting the Ozone model: the same master branch, a
            >     > > different release cycle, and a different release branch. It is a
            >     > > great example of the agile releases we can do (2 Ozone releases
            >     > > after Oct 2018) with less overhead to set up CI, projects, etc.
            >     > >
            >     > > *Links:*
            >     > > - JIRA: https://issues.apache.org/jira/browse/YARN-8135
            >     > > - Design doc
            >     > >   <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
            >     > > - User doc (3.2.0 release)
            >     > >   <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html>
            >     > > - Blog post: {Submarine} : Running deep learning workloads on Apache Hadoop
            >     > >   <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/>
            >     > >   (Chinese translation: <https://www.jishuwen.com/d/2Vpu>)
            >     > > - Talk: Strata Data Conf NY
            >     > >   <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
            >     > >
            >     > > Thoughts?
            >     > >
            >     > > Thanks,
            >     > > Wangda Tan
            >     >
            >     >
            >     >
            >     > 
---------------------------------------------------------------------
            >     > To unsubscribe, e-mail: 
hdfs-dev-unsubscr...@hadoop.apache.org
            >     > For additional commands, e-mail: 
hdfs-dev-h...@hadoop.apache.org
            >     >
            >
            >
            >
            
        
        
        
        
    
    
    
    


---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
