The Apache Mahout PMC is pleased to announce the release of Mahout 0.14.1. Mahout's goal is to create an environment for quickly creating machine-learning applications that scale and run on the highest-performance parallel computation engines available. Mahout comprises an interactive environment and library that support generalized scalable linear algebra and include many modern machine-learning algorithms. This release ships some major changes from 0.14.0, most in support of refactoring the build system.

To get started with Apache Mahout 0.14.1, download the release artifacts and signatures from https://downloads.apache.org/mahout/14.1/.

Many thanks to the contributors and committers who were part of this release.

RELEASE HIGHLIGHTS

The theme of the 0.14.1 release is a major refactor for simplicity of usage and maintenance. The POM structure and components have moved, so please ask on the mailing lists for help if anything is not where you expect it.

STATS

A total of 17 separate JIRA issues are addressed in this release [1].

GETTING STARTED

Download the release artifacts and signatures at https://mahout.apache.org/general/downloads.html.

The examples directory contains several working examples of the core functionality available in Mahout. These can be run via scripts in the examples/bin directory. Most examples do not need a Hadoop cluster in order to run.

FUTURE PLANS

0.14.2

As the project moves towards a 0.14.2 release, we are working on the following:

* Further native integration for increased speedups
* JCuda backing for in-core matrices and CUDA solvers
* Enumeration across multiple GPUs per JVM instance on a given instance
* GPU/OpenMP acceleration for linear solvers
* Runtime probing of available hardware, with caching of the correct/optimal solver
* Python bindings for the DSL

CONTRIBUTING

If you are interested in contributing, please see our How to Contribute [2] page or contact us via email at d...@mahout.apache.org.

CREDITS

As with every release, we wish to thank all of the users and contributors to Mahout. Please see the JIRA Release Notes [1] for individual credits.

Big thanks to Chris Dutz for his effort on the refactoring and cleanup in this release.

KNOWN ISSUES

* The classify-wikipedia.sh example has an outdated link to the data files. A workaround is to change the download section of the script to:
  `curl https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles10.xml-p002336425p003046511.bz2 -o ${WORK_DIR}/wikixml/enwiki-latest-pages-articles.xml.bz2`
* Currently, GPU acceleration for supported operations is limited to a single JVM instance
* Occasional segfaults occur with certain GPU models and computations
* On older GPUs, some tests fail when building ViennaCL due to card limitations
* Currently, automatic probing of a system’s hardware happens at each supported operation, adding some overhead
* Currently, the example in the main README errors out due to a packaging error; we will be fixing this in the next point release

[1] https://issues.apache.org/jira/browse/MAHOUT-2068?jql=project%20%3D%20MAHOUT%20AND%20issuetype%20in%20(standardIssueTypes()%2C%20subTaskIssueTypes())%20AND%20status%20%3D%20Resolved%20AND%20fixVersion%20in%20(0.14.1%2C%200.14.0)
[2] https://mahout.apache.org/developers/how-to-contribute