Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-09 Thread Andrew Musselman
> --andy
> ____________
> From: holden.ka...@gmail.com on behalf of Holden Karau
> Sent: Friday, July 7, 2017 8:22:12 PM
> To: dev@mahout.apache.org
> Subject: Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-09 Thread Andrew Palumbo
rau Sent: Friday, July 7, 2017 8:22:12 PM To: dev@mahout.apache.org Subject: Re: Making it easier to use Mahout algorithms with Apache Spark pipelines The version creep is certainly an issue; normally it's solved by having a 2.X directory for things that are only supported in 2.X and only includi

Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Holden Karau
The version creep is certainly an issue; normally it's solved by having a 2.X directory for things that are only supported in 2.X and only including that in the 2.X build. That being said, the pipeline stuff has been around since 1.3 (albeit as an alpha component), so we could probably make it work fo

Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Trevor Grant
+1 on this. There's precedent for Spark interoperability in the various drmWrap functions. We've discussed pipelines in the past, including rolling our own vs. utilizing the underlying engine. Interoperating with other pipelines (Spark) doesn't preclude that. The goal of the pipeline discussion, iirc, was t

Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Holden Karau
Hi y'all, Trevor and I had been talking a bit and one of the things I'm interested in doing is trying to make it easier for the different ML libraries to be used in Spark. Spark ML has this unified pipeline interface (which is certainly far from perfect), but I was thinking I'd take a crack at try