The Apache Mahout PMC is pleased to announce the release of Mahout 0.11.1.
Mahout's goal is to create an environment for quickly creating machine learning applications that scale and run on the highest-performance parallel computation engines available. Mahout comprises an interactive environment and library that supports generalized scalable linear algebra and includes many modern machine learning algorithms.

We call the Mahout Math environment “Samsara (संसारा)” for its symbol of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its base are general linear algebra and statistical operations along with the data structures to support them. It is written in Scala with Mahout-specific extensions, and runs on Spark and H2O.

To get started with Apache Mahout 0.11.1, download the release artifacts and signatures from http://www.apache.org/dist/mahout/0.11.1/.

Many thanks to the contributors and committers who were part of this release. Please see below for the Release Highlights.

RELEASE HIGHLIGHTS

This is a minor release over Mahout 0.11.0 meant to expand Mahout's compatibility with Spark versions, introduce some new features, and fix some bugs. Mahout 0.11.1 includes all new features and bug fixes released in Mahout versions 0.11.0 and earlier.

Mahout 0.11.1 new features compared to Mahout 0.11.0:

1. Spark 1.4+ support.
2. 4x performance improvement in dot product over dense vectors (https://issues.apache.org/jira/browse/MAHOUT-1781).
3. %*% optimization based on matrix flavors.

Note: Mahout 0.11.1 artifacts are binary compatible with Spark 1.4.x and Spark 1.5+.

STATS

A total of 34 separate JIRA issues are addressed in this release [1], with 10 bug fixes.

GETTING STARTED

Download the release artifacts and signatures at http://www.apache.org/dist/mahout/0.11.1/

The examples directory contains several working examples of the core functionality available in Mahout. These can be run via scripts in the examples/bin directory. Most examples do not need a Hadoop cluster in order to run.

FUTURE PLANS

Integration with Apache Flink is in the works and is targeted for Mahout release 0.12.0, in collaboration with TU Berlin and Data Artisans, to add Flink as the third execution engine for Mahout. This would be in addition to the existing Apache Spark and H2O engines. To see progress on this branch, look here: https://github.com/apache/mahout/commits/flink-binding.

KNOWN ISSUES

- In the binary zip or tar distribution, the example data for mahout/examples/bin/run-item-sim is missing. To run it, get the CSV files from GitHub <https://github.com/apache/mahout/tree/mahout-0.10.x/examples/src/main/resources> [3].
- OOM errors are observed on Mac OS with Java 7 when trying to run the Mahout Spark Shell; it works fine with Java 8.

[1] https://issues.apache.org/jira/browse/MAHOUT-1787?jql=project%20%3D%20MAHOUT%20AND%20status%20in%20%28Resolved%2C%20closed%29%20AND%20%28fixVersion%20%3D%200.11.1%20%29
[2] http://mahout.apache.org/developers/how-to-contribute.html
[3] https://github.com/apache/mahout/tree/master/examples/src/main/resources