Big +1, very nicely captures what I also think

--sebastian

On 21.05.2014 at 14:27, "Gokhan Capan" <[email protected]> wrote:
> I want to express my opinions for the vision, too. I tried to capture those
> words from various discussions in the dev-list, and hope that most of them
> support the common sense of excitement the new Mahout arouses.
>
> To me, the fundamental benefit of the shift that Mahout is undergoing is a
> better separation of the distributed execution engine, distributed data
> structures, matrix computations, and algorithms layers, which will allow
> the users/devs of Mahout with different roles to focus on the relevant parts
> of the framework:
>
>    1. A machine learning scientist, independent from the underlying
>    distributed execution engine, can utilize the matrix language and the
>    decompositions to implement new algorithms (which implies that the
>    current distributed Mahout algorithms are to be rewritten in the matrix
>    language)
>    2. A math-scala module contributor, for the benefit of higher level
>    algorithms, can add new, or improve existing functions (the set of
>    decompositions is an example) with optimization plans (such as if two
>    matrices are partitioned in the same way, ...), where the concrete
>    implementations of those optimizations are delegated to the distributed
>    execution engine layer
>    3. A distributed execution engine author can add machine learning
>    capabilities to her platform with i) concrete Matrix and Matrix I/O
>    implementation, ii) partitioning, checkpointing, broadcasting behaviors,
>    iii) BLAS
>    4. A Mahout user with access to a cluster operated by a
>    Mahout-supporting distributed execution engine can run machine learning
>    algorithms implemented on top of the matrix language
>
> Best
>
> Gokhan
>
>
> On Tue, May 20, 2014 at 8:30 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > inline
> >
> >
> > On Tue, May 20, 2014 at 12:42 AM, Sebastian Schelter <[email protected]> wrote:
> >
> > >
> > > Let's take the text from our homepage as a starting point. What should we
> > > add/remove/modify?
> > >
> > > ----------------------------------------------------------------------------
> > > The Mahout community decided to move its codebase onto modern data
> > > processing systems that offer a richer programming model and more efficient
> > > execution than Hadoop MapReduce. Mahout will therefore reject new MapReduce
> > > algorithm implementations from now on. We will however keep our widely used
> > > MapReduce algorithms in the codebase and maintain them.
> > >
> > > We are building our future implementations on top of a
> >
> > Scala
> >
> > > DSL for linear algebraic operations which has been developed over the last
> > > months. Programs written in this DSL are automatically optimized and
> > > executed in parallel for Apache Spark.
> >
> > More platforms to be added in the future.
> >
> > >
> > > Furthermore, there is an experimental contribution under way which aims
> > > to integrate the H2O platform into Mahout.
> > > ----------------------------------------------------------------------------
> > >
