Hi Makoto,

I cannot speak for the Hive PMC, only as a data tool user and occasional
contributor. I think the idea is very much a good one. Incubation takes a
lot of work because it's all about establishing a vibrant developer and
user community for the project. "Community before code," as they say.

I would also encourage you to consider joining forces with DataFu, rather
than "competing". I think there's a real appetite for a holistic toolbox of
patterns and implementations that can span these projects. From my
understanding, there's nothing about DataFu that's unique to Pig; they just
need the work done to abstract away the Pig bits and implement the Hive

Is there anything about Hivemall that's unique to Hive, that wouldn't be
applicable to Pig as well?

+Casey, as I believe he has some interest in seeing DataFu reach a wider
audience as well.

Good on you.

On Friday, November 21, 2014, Makoto Yui <yuin...@gmail.com> wrote:

> Hi all,
> I am the principal developer of Hivemall, a scalable machine learning
> library for Apache Hive.
>   https://github.com/myui/hivemall
> When I presented a talk at the last Hadoop Summit in San Jose [1],
> several audience members asked me about the possibility of changing the
> software license of Hivemall to Apache License v2, and the sustainability
> of the project was their major concern.
> Since then, I have been thinking of proposing Hivemall as an Apache
> Incubator project. The position of Hivemall for Hive would become similar
> to that of DataFu (an Apache Incubator project) for Apache Pig.
> I believe that adding machine learning functionality over Apache Hive
> could extend the application range of Apache Hive and Hivemall could help
> existing Hive users in their learning-scale data analytics projects.
> I have received approval from my employer (AIST) to change the license of
> Hivemall to Apache License version 2 and to donate the code to the Apache
> Foundation. And now, I am willing to propose Hivemall as an Apache
> incubator project, together with Hivemall contributors in NTT corp.
> I consider the current Hivemall codebase a bit too large to be
> included in Hive contrib, and thus it is better for it to be a separate
> incubator project. I would like to propose that Hivemall graduate as a
> subproject of Apache Hive.
> Is this strategy possible from the Hive PMC point of view?
> http://incubator.apache.org/guides/graduation.html#subproject-or-top-level
> Before formulating a proposal, I would like to hear Hive developers'
> opinions (e.g., possibilities, +1/-1, and missing pieces for incubation)
> on incubating Hivemall.
> BTW, I found this JIRA issue mentioning Hivemall.
> https://issues.apache.org/jira/browse/HIVE-7940
> Is there a possibility of cooperating with them in proposing Hivemall as
> an Apache Incubator project? According to the incubation guides, I need a
> mentor/champion for incubating.
> http://incubator.apache.org/guides/proposal.html#formulating
> Your help toward the incubation will be much appreciated.
> Thanks,
> Makoto
> [1] http://www.slideshare.net/myui/hivemall-hadoop-summit-2014-san-jose
> --
> *******************************************
> Makoto YUI
> Information Technology Research Institute, AIST.
> http://staff.aist.go.jp/m.yui/
> *******************************************

