Re: Any thoughts making Submarine a separate Apache project?

Vinod Kumar Vavilapalli Mon, 29 Jul 2019 06:46:33 -0700

Looks like there's a meaningful push behind this.

Given that the desire is to fork off from Apache Hadoop, you'd want to make 
sure this enthusiasm turns into building a real, independent and, more 
importantly, sustainable community.


Given that there were two official releases off the Apache Hadoop project, I 
doubt you'd need to go through the incubator process. Instead you can 
directly propose a new TLP to the ASF board. The last few times this happened 
were with ORC, and long before that with Hive, HBase etc. Can somebody who has 
cycles and has been on the ASF lists for a while look into the process here?

For the Apache Hadoop community, this will be treated simply as a code change 
and so needs a committer +1? You can be more gentle by formally holding a vote 
once a process doc is written down.

Back to the sustainable community point, as part of drafting this proposal, 
you'd definitely want to make sure all of the Apache Hadoop PMC/Committers can 
exercise their will to join this new project as PMC/Committers respectively 
without any additional constraints.

Thanks
+Vinod

> On Jul 25, 2019, at 1:31 PM, Wangda Tan <[email protected]> wrote:
> 
> Thanks everybody for sharing your thoughts. I saw positive feedback from
> 20+ contributors!
> 
> So I think we should move it forward. Any suggestions about what we should
> do?
> 
> Best,
> Wangda
> 
> On Mon, Jul 22, 2019 at 5:36 PM neo <[email protected]> wrote:
> 
>> +1, This is neo from TiDB & TiKV community.
>> Thanks Xun for bringing this up.
>> 
>> In our CNCF project TiKV, an open source distributed KV storage system,
>> Hadoop Submarine's machine learning engine helps us optimize data storage
>> and solve some problems with data hotspots and data shuffling.
>> 
>> We are also preparing to use the Hadoop Submarine machine learning engine
>> to improve the performance of TiDB, our open source distributed relational
>> database.
>> 
>> I think if submarine can be independent, it will develop faster and better.
>> Thanks to the hadoop community for developing submarine!
>> 
>> Best Regards,
>> neo
>> www.pingcap.com / https://github.com/pingcap/tidb /
>> https://github.com/tikv
>> 
>> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <[email protected]> wrote:
>> 
>>> @adam.antal
>>> 
>>> The Submarine development team has completed the following preparations:
>>> 1. Established a temporary test repository on GitHub.
>>> 2. Changed the package name of Hadoop Submarine from org.hadoop.submarine
>>>    to org.submarine.
>>> 3. Merged the LinkedIn/TonY code into the Hadoop Submarine module.
>>> 4. Connected the GitHub repository to Travis CI; all test cases have been
>>>    run there.
>>> 5. Several Hadoop Submarine users completed system tests using the code
>>>    in this repository.
>>> 
>>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <[email protected]> wrote:
>>> 
>>>> Hi
>>>> 
>>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our
>>>> major is electrical engineering. Our teaching teams and students use
>>>> Hadoop Submarine for big data analysis and automation control of
>>>> electrical equipment.
>>>> 
>>>> Many thanks to the hadoop community for providing us with machine
>>> learning
>>>> tools like submarine.
>>>> 
>>>> I hope Hadoop Submarine keeps getting better and better.
>>>> 
>>>> 
>>>> ==============================
>>>> 赵欣
>>>> 东南大学电气工程学院
>>>> 
>>>> -----------------------------------------------------
>>>> 
>>>> Zhao XIN
>>>> 
>>>> School of Electrical Engineering
>>>> 
>>>> ==============================
>>>> 2019-07-18
>>>> 
>>>> 
>>>> *From:* Xun Liu <[email protected]>
>>>> *Date:* 2019-07-18 09:46
>>>> *To:* xinzhao <[email protected]>
>>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
>>>> project?
>>>> 
>>>> 
>>>> ---------- Forwarded message ---------
>>>> From: [email protected] <[email protected]>
>>>> Date: Wed, Jul 17, 2019 at 3:17 PM
>>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
>> project?
>>>> To: Szilard Nemeth <[email protected]>, runlin zhang <
>>>> [email protected]>
>>>> Cc: Xun Liu <[email protected]>, common-dev <
>>> [email protected]>,
>>>> yarn-dev <[email protected]>, hdfs-dev <
>>>> [email protected]>, mapreduce-dev <
>>>> [email protected]>, submarine-dev <
>>>> [email protected]>
>>>> 
>>>> 
>>>> +1, good idea, we are very much looking forward to it.
>>>> 
>>>> ------------------------------
>>>> [email protected]
>>>> 
>>>> 
>>>> *From:* Szilard Nemeth <[email protected]>
>>>> *Date:* 2019-07-17 14:55
>>>> *To:* runlin zhang <[email protected]>
>>>> *CC:* Xun Liu <[email protected]>; Hadoop Common
>>>> <[email protected]>; yarn-dev <[email protected]>;
>>>> Hdfs-dev <[email protected]>; mapreduce-dev
>>>> <[email protected]>; submarine-dev
>>>> <[email protected]>
>>>> *Subject:* Re: Any thoughts making Submarine a separate Apache project?
>>>> +1, this is a great idea.
>>>> As the Hadoop repository has already grown huge and contains many
>>>> projects, I think in general it's a good idea to separate projects in
>>>> the early phase.
>>>> 
>>>> 
>>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <[email protected]> wrote:
>>>> 
>>>>> +1, that will be great!
>>>>> 
>>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <[email protected]> wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> This is Xun Liu contributing to the Submarine project for deep
>>> learning
>>>>>> workloads running with big data workloads together on Hadoop
>>> clusters.
>>>>>> 
>>>>>> A number of integrations of Submarine with other projects are
>>>>>> finished or in progress, such as Apache Zeppelin, TonY, and Azkaban. The
>>> next
>>>>> step
>>>>>> of Submarine is going to integrate with more projects like Apache
>>>> Arrow,
>>>>>> Redis, MLflow, etc. & be able to handle end-to-end machine learning
>>> use
>>>>>> cases like model serving, notebook management, advanced training
>>>>>> optimizations (like auto parameter tuning, memory cache
>> optimizations
>>>> for
>>>>>> large datasets for training, etc.), and make it run on other
>>> platforms
>>>>> like
>>>>>> Kubernetes or natively on Cloud. LinkedIn also wants to donate TonY
>>>>> project
>>>>>> to Apache so we can put Submarine and TonY together to the same
>>>> codebase
>>>>>> (Page #30:
>>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
>>>>>> ).
>>>>>> 
>>>>>> This expands the scope of the original Submarine project in
>> exciting
>>>> new
>>>>>> ways. Toward that end, would it make sense to create a separate
>>>>>> Submarine project at Apache? This could speed up adoption of Submarine
>>>>>> and allow Submarine to grow into a full-blown machine learning platform.
>>>>>> 
>>>>>> There will be lots of technical details to work out, but any
>> initial
>>>>>> thoughts on this?
>>>>>> 
>>>>>> Best Regards,
>>>>>> Xun Liu
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: [email protected]
>>>>> For additional commands, e-mail: [email protected]
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 


