Re: Any thoughts making Submarine a separate Apache project?

Wangda Tan Mon, 29 Jul 2019 08:16:08 -0700

Thanks Vinod, the proposal to make it a TLP is definitely a great suggestion.
I will draft a proposal and keep the thread posted.


Best,
Wangda

On Mon, Jul 29, 2019 at 3:46 PM Vinod Kumar Vavilapalli <[email protected]>
wrote:

> Looks like there's a meaningful push behind this.
>
> Given the desire is to fork off Apache Hadoop, you'd want to make sure
> this enthusiasm turns into building a real, independent and, more
> importantly, sustainable community.
>
> Given that there were two official releases off the Apache Hadoop project,
> I doubt if you'd need to go through the incubator process. Instead you can
> directly propose a new TLP to the ASF board. The last few times this
> happened were with ORC, and long before that with Hive, HBase, etc. Can
> somebody who has cycles and has been on the ASF lists for a while look
> into the process here?
>
> For the Apache Hadoop community, this will be treated simply as a
> code change and so needs a committer +1? You can be more gentle by formally
> holding a vote once a process doc is written down.
>
> Back to the sustainable community point, as part of drafting this
> proposal, you'd definitely want to make sure all of the Apache Hadoop
> PMC/Committers can exercise their will to join this new project as
> PMC/Committers respectively without any additional constraints.
>
> Thanks
> +Vinod
>
> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <[email protected]> wrote:
> >
> > Thanks everybody for sharing your thoughts. I saw positive feedback from
> > 20+ contributors!
> >
> > So I think we should move it forward. Any suggestions about what we
> > should do?
> >
> > Best,
> > Wangda
> >
> > On Mon, Jul 22, 2019 at 5:36 PM neo <[email protected]> wrote:
> >
> >> +1, This is neo from TiDB & TiKV community.
> >> Thanks Xun for bringing this up.
> >>
> >> For TiKV, our open source distributed KV storage system and a CNCF
> >> project, Hadoop Submarine's machine learning engine helps us optimize
> >> data storage and solve some problems with data hotspots and data
> >> shuffling.
> >>
> >> We are also ready to use the Hadoop Submarine machine learning engine
> >> to improve the performance of TiDB, our open source distributed
> >> relational database.
> >>
> >> I think if Submarine can be independent, it will develop faster and
> >> better.
> >> Thanks to the Hadoop community for developing Submarine!
> >>
> >> Best Regards,
> >> neo
> >> www.pingcap.com / https://github.com/pingcap/tidb /
> >> https://github.com/tikv
> >>
> >> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <[email protected]> wrote:
> >>
> >>> @adam.antal
> >>>
> >>> The Submarine development team has completed the following
> >>> preparations:
> >>> 1. Established a temporary test repository on GitHub.
> >>> 2. Changed the package name of Hadoop Submarine from
> >>> org.hadoop.submarine to org.submarine.
> >>> 3. Merged the LinkedIn/TonY code into the Hadoop Submarine module.
> >>> 4. Integrated the GitHub repository with the Travis CI system, and all
> >>> test cases have been run.
> >>> 5. Several Hadoop Submarine users completed the system test using the
> >>> code in this repository.
> >>>
> >>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <[email protected]> wrote:
> >>>
> >>>> Hi
> >>>>
> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our
> >>>> field is electrical engineering. Our teaching teams and students use
> >>>> Hadoop Submarine for big data analysis and automation control of
> >>>> electrical equipment.
> >>>>
> >>>> Many thanks to the Hadoop community for providing us with machine
> >>>> learning tools like Submarine.
> >>>>
> >>>> I hope Hadoop Submarine keeps getting better and better.
> >>>>
> >>>>
> >>>> ==============================
> >>>> 赵欣
> >>>> 东南大学电气工程学院
> >>>>
> >>>> -----------------------------------------------------
> >>>>
> >>>> Zhao XIN
> >>>>
> >>>> School of Electrical Engineering
> >>>>
> >>>> ==============================
> >>>> 2019-07-18
> >>>>
> >>>>
> >>>> *From:* Xun Liu <[email protected]>
> >>>> *Date:* 2019-07-18 09:46
> >>>> *To:* xinzhao <[email protected]>
> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
> >>>> project?
> >>>>
> >>>>
> >>>> ---------- Forwarded message ---------
> >>>> From: [email protected] <[email protected]>
> >>>> Date: Wed, Jul 17, 2019 at 3:17 PM
> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
> >>>> project?
> >>>> To: Szilard Nemeth <[email protected]>, runlin zhang <
> >>>> [email protected]>
> >>>> Cc: Xun Liu <[email protected]>, common-dev <
> >>>> [email protected]>,
> >>>> yarn-dev <[email protected]>, hdfs-dev <
> >>>> [email protected]>, mapreduce-dev <
> >>>> [email protected]>, submarine-dev <
> >>>> [email protected]>
> >>>>
> >>>>
> >>>> +1, good idea, we are very much looking forward to it.
> >>>>
> >>>> ------------------------------
> >>>> [email protected]
> >>>>
> >>>>
> >>>> *From:* Szilard Nemeth <[email protected]>
> >>>> *Date:* 2019-07-17 14:55
> >>>> *To:* runlin zhang <[email protected]>
> >>>> *CC:* Xun Liu <[email protected]>; Hadoop Common
> >>>> <[email protected]>; yarn-dev <[email protected]>;
> >>>> Hdfs-dev <[email protected]>; mapreduce-dev
> >>>> <[email protected]>; submarine-dev
> >>>> <[email protected]>
> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache project?
> >>>> +1, this is a great idea.
> >>>> As the Hadoop repository has already grown huge and contains many
> >>>> projects, I think in general it's a good idea to separate projects in
> >>>> the early phase.
> >>>>
> >>>>
> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <[email protected]> wrote:
> >>>>
> >>>>> +1, that will be great!
> >>>>>
> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> This is Xun Liu, a contributor to the Submarine project, which runs
> >>>>>> deep learning workloads together with big data workloads on Hadoop
> >>>>>> clusters.
> >>>>>>
> >>>>>> A number of integrations of Submarine with other projects are
> >>>>>> finished or in progress, such as Apache Zeppelin, TonY, and Azkaban.
> >>>>>> The next step for Submarine is to integrate with more projects such
> >>>>>> as Apache Arrow, Redis, MLflow, etc., to handle end-to-end machine
> >>>>>> learning use cases such as model serving, notebook management, and
> >>>>>> advanced training optimizations (auto parameter tuning, memory cache
> >>>>>> optimizations for large training datasets, etc.), and to run on
> >>>>>> other platforms such as Kubernetes or natively in the cloud.
> >>>>>> LinkedIn also wants to donate the TonY project to Apache so we can
> >>>>>> put Submarine and TonY together in the same codebase (page 30 of
> >>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
> >>>>>> ).
> >>>>>> This expands the scope of the original Submarine project in exciting
> >>>>>> new ways. Toward that end, would it make sense to create a separate
> >>>>>> Submarine project at Apache? This could speed up adoption of
> >>>>>> Submarine and allow it to grow into a full-blown machine learning
> >>>>>> platform.
> >>>>>>
> >>>>>> There will be lots of technical details to work out, but any
> >>>>>> initial thoughts on this?
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Xun Liu
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: [email protected]
> >>>>> For additional commands, e-mail: [email protected]
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
