+1 on this. There's precedent for Spark interoperability with the various drmWrap functions.

We've discussed pipelines in the past, including whether to roll our own or utilize the underlying engine. Interoperating with other pipelines (Spark) doesn't preclude that. The goal of the pipeline discussion, IIRC, was to eventually get to automated hyper-parameter tuning. Again, I don't see a conflict; maybe it's something we could work in at some point?

In addition to all of this, I think convenience methods and interfaces for more advanced Spark operations will make the Mahout learning curve less steep, and hopefully drive adoption.

The only concern I can think of is version creep, which opens a whole other discussion on 'how long will we support Spark 1.6' (I'm not proposing to stop anytime soon), but as I understand it, a lot of the advanced pipeline stuff came about in 2.x. I think this can be easily handled: the Spark Interpreter in Apache Zeppelin is rife with multi-version support examples (1.2 - 2.1).

Also, I don't see this affecting anything outside of the Spark bindings, so engine neutrality should be maintained (with Spark getting some favorable treatment, but at this point... we've pushed Flink to its own profile and we keep h2o around because it's not causing any trouble).

On Fri, Jul 7, 2017 at 4:32 PM, Holden Karau <hol...@pigscanfly.ca> wrote:

> Hi y'all,
>
> Trevor and I had been talking a bit and one of the things I'm interested in
> doing is trying to make it easier for the different ML libraries to be used
> in Spark. Spark ML has this unified pipeline interface (which is certainly
> far from perfect), but I was thinking I'd take a crack at trying to expose
> some of Mahout's algorithms so that they could be used/configured with
> Spark ML's pipeline interface.
>
> I'd like to take a stab at doing that inside the mahout project, but if
> it's something people feel would be better to live outside I'm happy to do
> that as well.
>
> Cheers,
>
> Holden
>
> For reference:
>
> https://spark.apache.org/docs/latest/ml-pipeline.html
>
> --
> Twitter: https://twitter.com/holdenkarau