I've taken a stab at adding a subset of the functionality used by MLTable 
operators to the integration API section of the blog, on top of the R CRUD 
functionality I listed earlier. Please review and let me know your thoughts. 
I'll be tackling the dplyr functionality next and adding that in. The blog is 
linked below; again, please see the integration API section for details:

http://mlefforts.blogspot.com/2014/04/introduction-this-proposal-will.html

I look forward to hearing comments, either on the list or on the JIRA ticket itself:
https://issues.apache.org/jira/browse/MAHOUT-1490
Thanks in advance.

> Date: Wed, 30 Apr 2014 17:13:52 +0200
> From: [email protected]
> To: [email protected]; [email protected]
> Subject: Re: Helping out on spark efforts
> 
> I think getting the design right for MAHOUT-1490 is tough. Dmitriy 
> suggested to update the design example to Scala code and try to work in 
> things that fit from dplyr from R and MLTable. I'd love to see such a 
> design doc.
> 
> --sebastian
> 
> On 04/30/2014 05:02 PM, Ted Dunning wrote:
> > +1 for foundations first.
> >
> > There are bunches of algorithms just behind that.  K-means.  SGD+Adagrad
> > regression.  Autoencoders.  K-sparse encoding.  Lots of stuff.
> >
> >
> >
> > On Wed, Apr 30, 2014 at 4:52 PM, Sebastian Schelter <[email protected]> wrote:
> >
> >> I think you should concentrate on MAHOUT-1490, that is a highly important
> >> task that will be the foundation for a lot of stuff to be built on top.
> >> Let's focus on getting this thing right and then move on to other things.
> >>
> >> --sebastian
> >>
> >>
> >> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
> >>
> >>> Sebastian/Dmitriy, in looking through the current list of issues I didn't
> >>> see other algorithms in Mahout that are being discussed for porting to
> >>> Spark. I was wondering if there's any interest/need in porting or writing
> >>> things like LR/KMeans/SVM to use Spark; I'd like to help out in this area
> >>> while working on 1490. Also, are we planning to port the distributed
> >>> versions of Taste to use Spark as well at some point?
> >>> Thanks in advance.
> >>>
> >>>
> >>
> >
> 
                                          
