I've taken a stab at adding a subset of the functionality used by the MLTable operators to the integration API section of the blog, on top of the R CRUD functionality I listed earlier. Please review and let me know your thoughts. I will be tackling the dplyr functionality next and adding that in. The blog is linked below; again, please see the integration API section for details:
http://mlefforts.blogspot.com/2014/04/introduction-this-proposal-will.html

I look forward to hearing comments, either on the list or on the JIRA ticket itself: https://issues.apache.org/jira/browse/MAHOUT-1490

Thanks in advance.

> Date: Wed, 30 Apr 2014 17:13:52 +0200
> From: [email protected]
> To: [email protected]; [email protected]
> Subject: Re: Helping out on spark efforts
>
> I think getting the design right for MAHOUT-1490 is tough. Dmitriy
> suggested to update the design example to Scala code and try to work in
> things that fit from dply from R and MLTable. I'd love to see such a
> design doc.
>
> --sebastian
>
> On 04/30/2014 05:02 PM, Ted Dunning wrote:
> > +1 for foundations first.
> >
> > There are bunches of algorithms just behind that. K-means. SGD+Adagrad
> > regression. Autoencoders. K-sparse encoding. Lots of stuff.
> >
> >
> >
> > On Wed, Apr 30, 2014 at 4:52 PM, Sebastian Schelter <[email protected]> wrote:
> >
> >> I think you should concentrate on MAHOUT-1490, that is a highly important
> >> task that will be the foundation for a lot of stuff to be built on top.
> >> Let's focus on getting this thing right and then move on to other things.
> >>
> >> --sebastian
> >>
> >>
> >> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
> >>
> >>> Sebastien/Dmitry,In looking through the current list of issues I didnt
> >>> see other algorithms in mahout that are talked about being ported to
> >>> spark,
> >>> I was wondering if there's any interest/need in porting or writing things
> >>> like LR/KMeans/SVM to use spark, I'd like to help out in this area while
> >>> working on 1490. Also are we planning to port the distributed versions of
> >>> taste to use spark as well at some point.
> >>> Thanks in advance.
> >>>
> >>>
> >>
> >
>
