Hi, the draft looks good overall. I have some minor comments inline:

On 22.12.2013 03:28, Suneel Marthi wrote:
> Hi All,
>
> Please see below the first draft of Release notes for Mahout 0.9. Please
> feel free to add/edit sections as you see fit.
> (This is a draft only).
>
> Regards,
> Suneel
>
> ---------------------------------
>
> The Apache Mahout PMC is pleased to announce the release of Mahout 0.9.
> Mahout's goal is to build scalable machine learning libraries focused
> primarily in the areas of collaborative filtering (recommenders),
> clustering and classification (known collectively as the "3Cs"), as well
> as the necessary infrastructure to support those implementations
> including, but not limited to, math packages for statistics, linear
> algebra and others as well as Java primitive collections, local and
> distributed vector and matrix classes and a variety of integrative code
> to work with popular packages like Apache Hadoop, Apache Lucene, Apache
> HBase, Apache Cassandra and much more. The 0.9 release is mainly a
> cleanup release in preparation for an upcoming 1.0 release targeted for
> the first half of 2014, but there are a few significant new features,
> which are highlighted below.
>
> To get started with Apache Mahout 0.9, download the release artifacts and
> signatures at http://www.apache.org/dyn/closer.cgi/mahout or visit the
> central Maven repository.
>
> In addition to the release highlights and artifacts, please pay attention
> to the section labelled FUTURE PLANS below for more information about
> upcoming releases of Mahout.
>
> As with any release, we wish to thank all of the users and contributors
> to Mahout. Please see the CHANGELOG [1] and JIRA Release Notes [2] for
> individual credits, as there are too many to list here.
>
> GETTING STARTED
>
> In the release package, the examples directory contains several working
> examples of the core functionality available in Mahout.
> These can be run via scripts in the examples/bin directory and will
> prompt you for more information to help you try things out. Most examples
> do not need a Hadoop cluster in order to run.
>
> RELEASE HIGHLIGHTS
>
> The highlights of the Apache Mahout 0.9 release include, but are not
> limited to, the list below. For further information, see the included
> CHANGELOG file.
>
> - Scala DSL Bindings for Mahout Math Linear Algebra (MAHOUT-1297). See
>   http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html
> - New Multilayer Perceptron Classifier (MAHOUT-1265)
> - Recommenders as a Search (MAHOUT-1288). See
>   https://github.com/pferrel/solr-recommender
> - MAHOUT-1364: Upgrade Mahout to be Lucene 4.6.0 compliant
> - MAHOUT-1361: Online Algorithm for computing accurate Quantiles using
>   1-dimensional Clustering. See
>   https://github.com/tdunning/t-digest/blob/master/docs/theory/t-digest-paper/histo.pdf
>   for the details.
> - Removed Deprecated algorithms.
> - The usual bug fixes. See JIRA [?] for more information on the 0.9
>   release.
>
> A total of 91 separate JIRA issues were addressed in this release.
>
> The following algorithms that were marked deprecated in 0.8 have been
> removed in 0.9:
>
> - From Clustering:
>     Dirichlet - replaced by Collapsible Variational Bayes (CVB)

I think we switched our LDA implementation to use CVB and removed Dirichlet
clustering. Those are two different things, right?

>
>     Meanshift
>     MinHash - removed due to poor performance and lack of usage
>     EigenCuts -
>
> - From Classification (both are sequential implementations)
>     Winnow - lack of actual usage
>     Perceptron - lack of actual usage
>
> - Frequent Pattern Mining
>
> - Collaborative Filtering
>     All recommenders in org.apache.mahout.cf.taste.impl.recommender.knn
>     SlopeOne implementations in org.apache.mahout.cf.taste.hadoop.slopeone
>       and org.apache.mahout.cf.taste.impl.recommender.slopeone
>     Distributed pseudo recommender in org.apache.mahout.cf.taste.hadoop.pseudo
>     TreeClusteringRecommender in org.apache.mahout.cf.taste.impl.recommender

We should be careful, because the package name knn could make people think
we removed our item-based recommenders (this already caused confusion on
Twitter). I think it would be sufficient to say we removed a couple of
rarely used recommenders, in particular SlopeOne.

>
> - Mahout Math
>     Lanczos in favour of SSVD

IIRC, we agreed not to remove Lanczos, although it was initially
deprecated. We should undeprecate it.

>     Hadoop entropy stuff in org.apache.mahout.math.stats.entropy
>
> If you are interested in supporting one or more of these algorithms,
> please make it known on dev@mahout.apache.org and via JIRA issues that fix
> and/or improve them. Please also provide supporting evidence as to their
> effectiveness for you in production.
>
> CONTRIBUTING
>
> Mahout is always looking for contributions focused on the 3Cs. If you are
> interested in contributing, please see our contribution page,
> https://cwiki.apache.org/MAHOUT/how-to-contribute.html, on the Mahout wiki
> or contact us via email at dev@mahout.apache.org.
>
> FUTURE PLANS
>
> 1.0 Plans
> ------------
>
> - New Downpour SGD classifier
> - Support for Finite State Transducers (FST) as a Dictionary Type.
> - Support for Hadoop 2.x
> - Port Mahout's recommenders to Spark (??)
> - Support for Java 7
> - Better API interfaces for Clustering
> - (what else???)
>
> As the project moves towards a 1.0 release, the community will be focused
> on key algorithms that are proven to scale in production and have seen
> widespread adoption.
>
> Our plans as a community are to focus 1.0 on the support of algorithms and
> features listed above. The algorithms packaged in 1.0 will be supported
> for at least two minor versions after the 1.0 release. In the case of
> removal after 1.0, we will deprecate the functionality in the 1.(x+1)
> minor release and remove it in the 1.(x+2) release. For instance, if
> feature X is to be removed after the 1.2 release, it will be deprecated in
> 1.3 and removed in 1.4.
>
> [1] http://svn.apache.org/viewvc/mahout/trunk/CHANGELOG?revision=1552746&view=markup
> [2] https://issues.apache.org/jira/browse/MAHOUT-1376?jql=project%20%3D%20MAHOUT%20AND%20fixVersion%20%3D%20%220.9%22