I want to express my opinions on the vision, too. I tried to capture these
points from various discussions on the dev list, and hope that most of them
support the shared sense of excitement the new Mahout arouses.

To me, the fundamental benefit of the shift that Mahout is undergoing is a
better separation of the distributed execution engine, distributed data
structures, matrix computations, and algorithms layers, which will allow
the users/devs of Mahout with different roles to focus on the relevant
parts of the framework:

   1. A machine learning scientist, independent from the underlying
   distributed execution engine, can utilize the matrix language and the
   decompositions to implement new algorithms (which implies that the current
   distributed Mahout algorithms are to be rewritten in the matrix language)
   2. A math-scala module contributor, for the benefit of higher level
   algorithms, can add new functions or improve existing ones (the set of
   decompositions is an example) with optimization plans (such as if two
   matrices are partitioned in the same way, ...), where the concrete
   implementations of those optimizations are delegated to the distributed
   execution engine layer
   3. A distributed execution engine author can add machine learning
   capabilities to her platform with i) concrete Matrix and Matrix I/O
   implementations, ii) partitioning, checkpointing, and broadcasting
   behaviors, and iii) BLAS
   4. A Mahout user with access to a cluster operated by a
   Mahout-supporting distributed execution engine can run machine learning
   algorithms implemented on top of the matrix language
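
To make the four roles concrete, here is a minimal sketch of the layering.
It is written in Python for brevity and is not Mahout's actual API: the names
Engine, LocalEngine, Matrix and gram are all hypothetical, and the "engine"
is a toy in-memory one standing in for a distributed backend.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Role 3: what a (hypothetical) execution engine author supplies."""

    @abstractmethod
    def matmul(self, a, b):
        """Multiply two matrices given as lists of rows."""

    @abstractmethod
    def transpose(self, a):
        """Transpose a matrix given as a list of rows."""


class LocalEngine(Engine):
    """Toy in-memory engine; a real one would partition and distribute."""

    def matmul(self, a, b):
        cols = list(zip(*b))
        return [[sum(x * y for x, y in zip(row, col)) for col in cols]
                for row in a]

    def transpose(self, a):
        return [list(col) for col in zip(*a)]


class Matrix:
    """Role 2: the matrix language; every operator delegates to the engine."""

    def __init__(self, rows, engine):
        self.rows = rows
        self.engine = engine

    @property
    def T(self):
        return Matrix(self.engine.transpose(self.rows), self.engine)

    def __matmul__(self, other):
        return Matrix(self.engine.matmul(self.rows, other.rows), self.engine)


def gram(a):
    """Role 1: an algorithm written only in the matrix language (A' A)."""
    return a.T @ a


# Role 4: a user picks an engine and runs the algorithm on it.
a = Matrix([[1, 2], [3, 4]], LocalEngine())
result = gram(a).rows
```

In this sketch gram never touches the engine directly, so swapping
LocalEngine for another Engine implementation would leave the algorithm
unchanged.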

Best

Gokhan


On Tue, May 20, 2014 at 8:30 PM, Dmitriy Lyubimov <[email protected]> wrote:

> inline
>
>
> On Tue, May 20, 2014 at 12:42 AM, Sebastian Schelter <[email protected]>
> wrote:
>
> >
> >>
> > Let's take the text from our homepage as starting point. What should we
> > add/remove/modify?
> >
> > ----------------------------------------------------------------------------
> > The Mahout community decided to move its codebase onto modern data
> > processing systems that offer a richer programming model and more
> > efficient execution than Hadoop MapReduce. Mahout will therefore reject
> > new MapReduce algorithm implementations from now on. We will however keep
> > our widely used MapReduce algorithms in the codebase and maintain them.
> >
> > We are building our future implementations on top of a
>
> Scala
>
> > DSL for linear algebraic operations which has been developed over the
> > last months. Programs written in this DSL are automatically optimized and
> > executed in parallel for Apache Spark.
>
> More platforms to be added in the future.
>
> >
> > Furthermore, there is an experimental contribution undergoing which aims
> > to integrate the H2O platform into Mahout.
> > ----------------------------------------------------------------------------
> >
>
