We recognize the value of a non-Java-centric API for doing math/algo work, similar in spirit to what R has done. So...

0xdata is
- looking at how the h2o API/programming model fits with the existing Mahout Java API
- doing some initial exploratory porting to the existing Java API
- watching with interest how the API work moves, especially if a consensus arrives around The Right Way to program these kinds of ML and Big Data algorithms
- Looking to support ALL efforts around open-source ML systems, in part because we don't know which solution is best
- In particular H2O is a killer-fast backend for doing distributed computations, but is not the easiest thing to use. We are working to improve & extend that usability, while keeping our speed.
- Modifying H2O's internal API to support the Mahout Java API, and an R API, and a potential Scala-based API (perhaps from Spark/DataBricks and perhaps from Dmitriy's work) - goes directly to our goal of making H2O more usable, and supporting a higher goal of making ML on Big Data more usable ('cause a faster backend means it's faster on the same-size data, or possible on bigger data).

Cliff


On 5/6/2014 11:27 AM, Saikat Kanjilal wrote:
The paragraph(s) don't necessarily clearly identify whether the non-committers
are currently only working on 0xdata or Spark or both (which is actually the
case). Ideally, a statement around non-committers doing work in both areas would
be great, with the obvious open-source addition that outside contributions are
encouraged.

From: [email protected]
Date: Tue, 6 May 2014 18:23:18 +0200
Subject: consensus statement?
To: [email protected]

I have been involved in side conversations to try to build a bit of unity
among our community and would like to propose this as a statement of what
we are doing:


Apache Mahout is moving immediately to a faster execution model. The first
of these is Spark. Outside contributions are always encouraged.


As a bit of commentary, it is clear that what the committers are working on
is Spark and it is clear that Spark will be the first new platform for
Mahout.  It is also clear that there are non-committers (the 0xdata crew
for one) who are working with the community to extend Mahout beyond just
Spark.  As a statement of where the community is *right* now, however, I
don't think we need to say much more than that we encourage contributions.

Sound fair?  Correct?
                                        
