Re: Release

Andrew Palumbo Tue, 17 Mar 2015 08:54:55 -0700


On 03/15/2015 01:42 PM, Pat Ferrel wrote:

Lots of discussion off the record about doing a release but shouldn’t we plan 
this?


What has to be in a release of Mahout 0.10?

Seems like we could release as-is but it would be nice to have some of the 
already completed work that isn’t committed yet:
* mrlegacy refactored out of scala; is it possible to get this in, Dmitriy?

One question is how to package, with which version of Spark. There is a bug in 
Spark 1.2.1 and I think in 1.2 (this is the big distro build) that requires any 
class that uses the JavaSerializer to set a specific SparkConf key/value to 
point to the guava jar on all workers. This only affects IndexedDatasets since 
they use Guava’s BiMap. Rumor has it that 1.3 fixes this but I haven’t tried it 
yet.

So we are currently stuck on 1.1.1 but could document a workaround for using 
1.2 for a user who wants to build Mahout from scratch. A user source build on 
1.3 may not require a workaround. We seem to be good on hadoop 2.x, which in 
itself is a good reason to release since 0.9 was not.

What else needs to be done:
* rename module math-scala to core?
* create the distribution build. Currently this does not publish the scaladocs 
and does not create artifacts for H2O or and Scala.

Same problem for javadocs (other than mrlegacy). Is this a question for INFRA? We have MAHOUT-1562<https://issues.apache.org/jira/browse/MAHOUT-1562> and MAHOUT-1585<https://issues.apache.org/jira/browse/MAHOUT-1585> open for these. Were javadocs for all modules ever hosted? There were links for them which are now dead, so I removed them from the site. I'm wondering because, even once we get the scaladocs published in the build, will we have the same problem of them not being hosted?

* is H2O really in a form to publish?

In terms of Scala bindings for the DRM and DSL linear algebra operations, solvers, etc., H2O should be good to go with the exception of one bug (MAHOUT-1638<https://issues.apache.org/jira/browse/MAHOUT-1638>). It passes (almost) all math-scala tests. We have no other algorithms (outside of math-scala solvers, decompositions, etc.) for H2O. I'm not sure if it's being used or how much real-world testing it's had; it does serve at the very least as a proof of concept for the Engine Neutral DSL.


Docs
* IMO we should name the Mahout Spark-Scala DSL and shell. More unique names 
are easier to find in searches. Maybe Suneel can polish off his Sanskrit and 
suggest something.
* we should be ready to do some work here to restructure the CMS since it is 
very 0.9 centric with Scala stuff almost an afterthought.

Agreed. What about categorizing the documentation on the site under tabs like "Mahout-DSL", "Mahout Spark-Environment" and "Mahout Map-Reduce"?
