Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Pat Ferrel
IIRC these all fit sbt’s conventions? On Jul 7, 2017, at 2:05 PM, Trevor Grant wrote: So to tie all of this together- org.apache.mahout:mahout-spark_2.10:0.13.1_spark_1_6 org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_0 org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_1 org.apache.mahout
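A minimal consumer-side sketch of what Pat means by sbt conventions, using the coordinates proposed in this thread (a proposal at this point, not published artifacts): sbt's %% operator supplies the Scala binary suffix, while the Spark variant rides in the version qualifier.

    // build.sbt fragment (hypothetical coordinates from the proposal above)
    libraryDependencies += "org.apache.mahout" %% "mahout-spark" % "0.13.1_spark_2_1"
    // with scalaVersion := "2.11.8" this resolves to
    // org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2_1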

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Andrew Musselman
Welcome Holden; how's that release going Trev :) On Fri, Jul 7, 2017 at 4:10 PM, Holden Karau wrote: > Thanks! :)

Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Holden Karau
The version creep is certainly an issue; normally it's solved by having a 2.X directory for things that are only supported in 2.X and only including that in the 2.X build. That being said, the pipeline stuff has been around since 1.3 (albeit as an alpha component) so we could probably make it work fo
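A minimal sketch of the "2.X directory" idea Holden describes, shown as an sbt fragment for brevity (Mahout's own build is Maven-based, where a profile-activated extra source directory would play the same role); the spark.version property and the spark-2.x directory name are assumptions for illustration.

    // Compile an extra source tree only when building against Spark 2.x,
    // so 2.x-only pipeline code never enters the 1.x artifacts.
    val sparkVersion = sys.props.getOrElse("spark.version", "2.1.1")

    unmanagedSourceDirectories in Compile ++= {
      if (sparkVersion.startsWith("2.")) Seq(baseDirectory.value / "src" / "main" / "spark-2.x")
      else Seq.empty
    }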

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Holden Karau
Thanks! :) On Fri, Jul 7, 2017 at 4:04 PM, Andrew Palumbo wrote: > Welcome!

RE: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Andrew Palumbo
Welcome!

Re: Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Trevor Grant
+1 on this. There's precedent for Spark interoperability with the various drmWrap functions. We've discussed pipelines in the past and rolling our own vs. utilizing the underlying engine. Interoperating with other pipelines (Spark) doesn't preclude that. The goal of the pipeline discussion, IIRC, was t
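For context, a rough sketch of the drmWrap interop Trevor refers to: wrapping a Spark RDD of (key, Mahout Vector) rows as a Mahout DRM so engine-level pipelines and Mahout math can share data. Treat the exact signature as an assumption; it lives in the Spark bindings package object and may differ slightly across Mahout versions.

    import org.apache.mahout.math.{Vector => MahoutVector}
    import org.apache.mahout.math.drm.CheckpointedDrm
    import org.apache.mahout.sparkbindings._
    import org.apache.spark.rdd.RDD

    // Hand a Spark-produced RDD to Mahout's distributed row matrix abstraction.
    def wrapAsDrm(rows: RDD[(Int, MahoutVector)]): CheckpointedDrm[Int] =
      drmWrap(rows)

    // Mahout's R-like DSL (import org.apache.mahout.math.drm.RLikeDrmOps._) can then
    // operate on it, e.g. a gram matrix: wrapAsDrm(rows).t %*% wrapAsDrm(rows)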

Making it easier to use Mahout algorithms with Apache Spark pipelines

2017-07-07 Thread Holden Karau
Hi y'all, Trevor and I had been talking a bit and one of the things I'm interested in doing is trying to make it easier for the different ML libraries to be used in Spark. Spark ML has this unified pipeline interface (which is certainly far from perfect), but I was thinking I'd take a crack at try
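To make the idea concrete, a hypothetical skeleton of the kind of wrapper Holden is suggesting: a Spark ML Transformer (Spark 2.x API) whose transform body would delegate to a Mahout model. The class name and the pass-through body are placeholders, not an existing Mahout API.

    import org.apache.spark.ml.Transformer
    import org.apache.spark.ml.param.ParamMap
    import org.apache.spark.ml.util.Identifiable
    import org.apache.spark.sql.{DataFrame, Dataset}
    import org.apache.spark.sql.types.StructType

    class MahoutModelWrapper(override val uid: String) extends Transformer {
      def this() = this(Identifiable.randomUID("mahoutModel"))

      // Real code would convert the Dataset to a DRM, apply the fitted Mahout
      // model, and convert the result back to a DataFrame; pass-through for now.
      override def transform(dataset: Dataset[_]): DataFrame = dataset.toDF()

      override def transformSchema(schema: StructType): StructType = schema

      override def copy(extra: ParamMap): MahoutModelWrapper = defaultCopy(extra)
    }

Such a wrapper would drop straight into an org.apache.spark.ml.Pipeline alongside native Spark stages.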

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Trevor Grant
So to tie all of this together- org.apache.mahout:mahout-spark_2.10:0.13.1_spark_1_6 org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_0 org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_1 org.apache.mahout:mahout-spark_2.11:0.13.1_spark_1_6 org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2_0

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Holden Karau
Trevor looped me in on this since I hadn't had a chance to subscribe to the list yet (on now :)). Artifacts from cross spark-version building aren't super standardized (and there are two sort-of very different types of cross-building). For folks who just need to build for the 1.X and 2.X and branc
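One hedged sketch of the first flavor Holden mentions (building the same sources against a Spark version chosen at build time), as an sbt fragment; the property name and version-qualifier scheme are assumptions borrowed from the naming proposals earlier in this thread, not how Mahout actually publishes.

    // Pick the Spark series with e.g. sbt -Dspark.version=1.6.3, cross-build Scala,
    // and bake the Spark series into the published version string.
    val sparkVersion = sys.props.getOrElse("spark.version", "2.1.1")
    val sparkSeries  = sparkVersion.split('.').take(2).mkString("_")   // "1_6", "2_0", "2_1"

    name := "mahout-spark"
    version := s"0.13.1_spark_$sparkSeries"
    crossScalaVersions := Seq("2.10.6", "2.11.8")
    libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"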

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Andrew Palumbo
another option for artifact names (using jars for example here): mahout-spark-2.11_2.10-0.13.1.jar mahout-spark-2.11_2.11-0.13.1.jar mahout-math-scala-2.11_2.10-0.13.1.jar i.e. <module>-<spark version>_<scala version>-<mahout version>.jar not exactly pretty.. I somewhat prefer Trevor's idea of the DL4J convention. F

Re: [DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Dmitriy Lyubimov
It would seem the 2nd option is preferable if doable. Any option that has the most desirable combinations prebuilt is preferable, I guess. Spark itself also releases tons of Hadoop-profile binary variations, so I don't have to build one myself. On Fri, Jul 7, 2017 at 8:57 AM, Trevor Grant wrote: > Hey a

[DISCUSS] Naming convention for multiple spark/scala combos

2017-07-07 Thread Trevor Grant
Hey all, Working on releasing 0.13.1 with multiple spark/scala combos. AFAIK, there is no 'standard' for multiple Spark versions (but I may be wrong, I don't claim expertise here). One approach is to simply release binaries only for: Spark-1.6 + Scala 2.10 Spark-2.1 + Scala 2.11 OR We could do li