Yeah, we need a real name that brings no baggage. R-like, based on Scala, big 
data linear algebra, yada yada. Can’t say that in a descriptive phrase, so why 
not a name like Mahout-xyz? Of course, with a more catchy, search-friendly xyz.

But AP’s structure seems pretty good.

I’m nervous about releasing H2O with no one supporting it. Is anyone signing up 
for that?


On Mar 17, 2015, at 8:59 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

I don't like the term DSL.

It is an algebraic optimizer, folks. Calling it a DSL brings in wrong and too
trivial ideas about it.
On Mar 17, 2015 8:27 AM, "Andrew Palumbo" <ap....@outlook.com> wrote:

> 
> On 03/15/2015 01:42 PM, Pat Ferrel wrote:
> 
>> Lots of discussion off the record about doing a release but shouldn’t we
>> plan this?
>> 
>> What has to be in a release of Mahout 0.10?
>> 
>> Seems like we could release as-is but it would be nice to have some of
>> the already completed work that isn’t committed yet:
>> * mrlegacy refactored out of Scala; is it possible to get this in, Dmitriy?
>> 
>> One question is how to package, with which version of Spark. There is a
>> bug in Spark 1.2.1 and I think in 1.2 (this is the big distro build) that
>> requires any class that uses the JavaSerializer to set a specific SparkConf
>> key/value to point to the guava jar on all workers. This only affects
>> IndexedDatasets since they use Guava’s BiMap. Rumor has it that 1.3 fixes
>> this but I haven’t tried it yet.
>> 
>> So we are currently stuck on 1.1.1 but could document a workaround for
>> using 1.2 for a user who wants to build Mahout from scratch. A user
>> source build on 1.3 may not require a workaround. We seem to be good on
>> Hadoop 2.x, which in itself is a good reason to release since 0.9 was not.
>> 
>> What else needs to be done:
>> * rename module math-scala to core?
>> * create the distribution build. Currently this does not publish the
>> scaladocs and does not create artifacts for H2O or Scala.
>> 
> 
> Same problem for javadocs (other than mrlegacy). Is this a question for
> INFRA? We have MAHOUT-1562
> <https://issues.apache.org/jira/browse/MAHOUT-1562> and MAHOUT-1585
> <https://issues.apache.org/jira/browse/MAHOUT-1585>
> open for these. Were javadocs for all modules ever hosted? There were
> links for them which are now dead, so I removed them from the site. I'm
> wondering because even once we get the scaladocs published in the build,
> will we have the same problem of them not being hosted?
> 
>> * is H2O really in a form to publish?
>> 
> 
> In terms of Scala bindings for the DRM and DSL linear algebra operations,
> solvers, etc., H2O should be good to go with the exception of one bug
> (MAHOUT-1638 <https://issues.apache.org/jira/browse/MAHOUT-1638>). It
> passes (almost) all math-scala tests. We have no other algorithms (outside
> of math-scala solvers, decompositions, etc.) for H2O. I'm not sure if it's
> being used or how much real-world testing it's had; it does serve at the
> very least as a proof of concept for the Engine Neutral DSL.
> 
> 
>> Docs
>> * IMO we should name the Mahout Spark-Scala DSL and shell. More unique
>> names are easier to find in searches. Maybe Suneel can polish off his
>> Sanskrit and suggest something.
>> * we should be ready to do some work here to restructure the CMS since it
>> is very 0.9 centric with Scala stuff almost an afterthought.
>> 
> 
> Agreed. What about categorizing the documentation on the site under tabs
> like "Mahout-DSL", "Mahout Spark-Environment", and "Mahout Map-Reduce"?
> 
> 
> 
