The release has to be out by mid-April, in time for the Bigtop folks to package
Mahout into their distro.


> On Mar 17, 2015, at 6:41 PM, Andrew Musselman <andrew.mussel...@gmail.com> 
> wrote:
> 
> I'm seeing so much stuff here and down the line that I think a 0.9.1 release
> is in order: tidy things up in parallel with the more complex questions.
> I can commit to working on cleanup and minor bugs over the next two months and
> plan a release in May, for instance.
> 
> On Tue, Mar 17, 2015 at 11:38 PM, Andrew Musselman <
> andrew.mussel...@gmail.com> wrote:
> 
>> Agree DSL is a bad name; I like distributed algebra or algebraic optimizer.
>> 
>>> On Tue, Mar 17, 2015 at 5:14 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>>> 
>>> Yeah, we need a real name that brings no baggage. R-like, based on Scala,
>>> big data linear algebra, yada yada. Can’t say that in a descriptive phrase,
>>> so why not a name like Mahout-xyz? Of course with a catchier,
>>> search-friendly xyz.
>>> 
>>> But AP’s structure seems pretty good
>>> 
>>> I’m nervous about releasing H2O with no one supporting it. Is anyone signing
>>> up for that?
>>> 
>>> 
>>> On Mar 17, 2015, at 8:59 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>>> 
>>> I don't like the term DSL.
>>> 
>>> It is an algebraic optimizer, folks. Calling it a DSL brings in wrong and too
>>> trivial ideas about it.
>>>> On Mar 17, 2015 8:27 AM, "Andrew Palumbo" <ap....@outlook.com> wrote:
>>>> 
>>>> 
>>>>> On 03/15/2015 01:42 PM, Pat Ferrel wrote:
>>>>> 
>>>>> Lots of discussion off the record about doing a release but shouldn’t we
>>>>> plan this?
>>>>> 
>>>>> What has to be in a release of Mahout 0.10?
>>>>> 
>>>>> Seems like we could release as-is but it would be nice to have some of
>>>>> the already completed work that isn’t committed yet:
>>>>> * mrlegacy refactored out of Scala. Is it possible to get this in, Dmitriy?
>>>>> 
>>>>> One question is how to package, and with which version of Spark. There is a
>>>>> bug in Spark 1.2.1, and I think in 1.2 (this is the big distro build), that
>>>>> requires any class that uses the JavaSerializer to set a specific SparkConf
>>>>> key/value to point to the guava jar on all workers. This only affects
>>>>> IndexedDatasets since they use Guava’s BiMap. Rumor has it that 1.3 fixes
>>>>> this but I haven’t tried it yet.
>>>>> 
>>>>> So we are currently stuck on 1.1.1 but could document how to work around
>>>>> this to use 1.2 for a user who wants to build Mahout from scratch. A user
>>>>> source build on 1.3 may not require a workaround. We seem to be good on
>>>>> Hadoop 2.x, which in itself is a good reason to release since 0.9 was not.
>>>>> 
>>>>> What else needs to be done:
>>>>> * rename module math-scala to core?
>>>>> * create the distribution build. Currently this does not publish the
>>>>> scaladocs and does not create artifacts for H2O or Scala.
>>>> 
>>>> Same problem for javadocs (other than mrlegacy).  Is this a question for
>>>> INFRA?  We have MAHOUT-1562
>>>> <https://issues.apache.org/jira/browse/MAHOUT-1562> and MAHOUT-1585
>>>> <https://issues.apache.org/jira/browse/MAHOUT-1585> open for these.  Were
>>>> javadocs for all modules ever hosted? There were links for them which are
>>>> now dead, so I removed them from the site.  I'm wondering whether, even once
>>>> we get the scaladocs published in the build, we will have the same problem
>>>> of them not being hosted.
>>>> 
>>>> * is H2O really in a form to publish?
>>>> 
>>>> In terms of Scala bindings for the DRM and DSL linear algebra operations,
>>>> solvers, etc., H2O should be good to go with the exception of one bug
>>>> (MAHOUT-1638 <https://issues.apache.org/jira/browse/MAHOUT-1638>).  It
>>>> passes (almost) all math-scala tests.  We have no other algorithms (outside
>>>> of math-scala solvers, decompositions, etc.) for H2O. I'm not sure if it's
>>>> being used or how much real-world testing it has had; it does serve at the
>>>> very least as a proof of concept for the Engine Neutral DSL.
>>>> 
>>>> 
>>>>> Docs
>>>>> * IMO we should name the Mahout Spark-Scala DSL and shell. More unique
>>>>> names are easier to find in searches. Maybe Suneel can polish up his
>>>>> Sanskrit and suggest something.
>>>>> * we should be ready to do some work here to restructure the CMS since it
>>>>> is very 0.9 centric with Scala stuff almost an afterthought.
>>>> 
>>>> Agreed.  What about categorizing the documentation on the site under tabs
>>>> like "Mahout-DSL", "Mahout Spark-Environment" and "Mahout Map-Reduce"?
>> 
