+1 if so (sbt naming, re: Pat's comment). Also +1 on Zeppelin integration being non-trivial.
Sent from my Verizon Wireless 4G LTE smartphone

-------- Original message --------
From: Pat Ferrel <p...@occamsmachete.com>
Date: 07/07/2017 10:35 PM (GMT-08:00)
To: dev@mahout.apache.org
Cc: Holden Karau <holden.ka...@gmail.com>, u...@mahout.apache.org, Dmitriy Lyubimov <dlie...@gmail.com>, Andrew Palumbo <apalu...@apache.org>
Subject: Re: [DISCUSS] Naming convention for multiple spark/scala combos

IIRC these all fit sbt's conventions?

On Jul 7, 2017, at 2:05 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:

So to tie all of this together:

org.apache.mahout:mahout-spark_2.10:0.13.1_spark_1_6
org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_0
org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2_1
org.apache.mahout:mahout-spark_2.11:0.13.1_spark_1_6
org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2_0
org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2_1

(Will jars compiled with 2.1 dependencies run on 2.0? I assume not, but I don't know.)
(Afaik, Mahout compiled for Spark 1.6.x tends to work with Spark 1.6.y; anecdotal.)

A non-trivial motivation here is that we would like all of these available to tighten up the Apache Zeppelin integration, where the user could have a number of different spark/scala combos going on and we want it to 'just work' out of the box (which means a wide array of binaries available, to Dmitriy's point).

I'm +1 on this, and as RM will begin cutting a provisional RC, just to try to figure out how all of this will work (it's my first time as release master, and this is a new thing we're doing). 72-hour lazy consensus. (It will probably take me 72 hours to figure out anyway ;) ) If there are no objections, expect an RC on Monday evening.

tg

On Fri, Jul 7, 2017 at 3:24 PM, Holden Karau <holden.ka...@gmail.com> wrote:

> Trevor looped me in on this since I hadn't had a chance to subscribe to
> the list yet (on now :)).
>
> Artifacts from cross Spark-version building aren't super standardized (and
> there are two sort of very different types of cross-building).
> For folks who just need to build for the 1.X and 2.X branches, appending
> _spark1 & _spark2 to the version string is indeed pretty common, and the
> DL4J folks do something pretty similar, as Trevor pointed out.
>
> The folks over at hammerlab have made some sbt-specific tooling to make
> this easier to do on the publishing side (see
> https://github.com/hammerlab/sbt-parent ).
>
> It is true some people build Scala 2.10 artifacts for the Spark 1.X series
> and Scala 2.11 artifacts for the Spark 2.X series only, and use that to
> differentiate. (I don't personally like this approach since it is super
> opaque, and someone could upgrade their Scala version and then accidentally
> be using a different version of Spark, which would likely not go very well.)
>
> For folks who need to hook into internals and cross-build against
> different minor versions there is much less of a consistent pattern;
> personally, spark-testing-base is released as:
>
> [artifactname]_[scalaversion]:[sparkversion]_[artifact releaseversion]
>
> But this really only makes sense when you have to cross-build for lots of
> different Spark versions (which should be avoidable for Mahout).
>
> Since you are likely not depending on the internals of different point
> releases, I'd think the _spark1 / _spark2 is probably the right way (or
> _spark_1 / _spark_2 is fine too).
>
> On Fri, Jul 7, 2017 at 11:43 AM, Trevor Grant <trevor.d.gr...@gmail.com>
> wrote:
>
>> ---------- Forwarded message ----------
>> From: Andrew Palumbo <ap....@outlook.com>
>> Date: Fri, Jul 7, 2017 at 12:28 PM
>> Subject: Re: [DISCUSS] Naming convention for multiple spark/scala combos
>> To: "dev@mahout.apache.org" <dev@mahout.apache.org>
>>
>> Another option for artifact names (using jars for example here):
>>
>> mahout-spark-2.11_2.10-0.13.1.jar
>> mahout-spark-2.11_2.11-0.13.1.jar
>> mahout-math-scala-2.11_2.10-0.13.1.jar
>>
>> i.e. <module>-<spark version>_<scala version>-<mahout version>.jar
>>
>> Not exactly pretty..
>> I somewhat prefer Trevor's idea of the DL4J convention.
>>
>> ________________________________
>> From: Trevor Grant <trevor.d.gr...@gmail.com>
>> Sent: Friday, July 7, 2017 11:57:53 AM
>> To: Mahout Dev List; u...@mahout.apache.org
>> Subject: [DISCUSS] Naming convention for multiple spark/scala combos
>>
>> Hey all,
>>
>> Working on releasing 0.13.1 with multiple spark/scala combos.
>>
>> Afaik, there is no 'standard' for multiple Spark versions (but I may be
>> wrong; I don't claim expertise here).
>>
>> One approach is to simply only release binaries for:
>> Spark 1.6 + Scala 2.10
>> Spark 2.1 + Scala 2.11
>>
>> OR
>>
>> We could do like DL4J:
>>
>> org.apache.mahout:mahout-spark_2.10:0.13.1_spark_1
>> org.apache.mahout:mahout-spark_2.11:0.13.1_spark_1
>>
>> org.apache.mahout:mahout-spark_2.10:0.13.1_spark_2
>> org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2
>>
>> OR
>>
>> some other option I don't know of.
>
> --
> Cell : 425-233-8271
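[Editor's note: the full DL4J-style matrix discussed in this thread can be spelled out as a small sketch. The group id, module name, 0.13.1 release string, Scala versions, and Spark suffixes below all come from Trevor's proposal; the `ArtifactNaming` object and `coordinates` method are hypothetical names used purely for illustration, not part of any Mahout build.]

```scala
// Sketch of the proposed naming convention: Scala version in the artifact
// id (sbt's usual _2.xx suffix), Spark series folded into the version.
object ArtifactNaming {
  val groupId = "org.apache.mahout"
  val scalaVersions = Seq("2.10", "2.11")
  val sparkSuffixes = Seq("spark_1_6", "spark_2_0", "spark_2_1")

  // Cross product of Scala versions and Spark suffixes, e.g.
  // org.apache.mahout:mahout-spark_2.11:0.13.1_spark_2_1
  def coordinates(module: String, release: String): Seq[String] =
    for {
      scala <- scalaVersions
      spark <- sparkSuffixes
    } yield s"$groupId:${module}_$scala:${release}_$spark"

  def main(args: Array[String]): Unit =
    coordinates("mahout-spark", "0.13.1").foreach(println)
}
```

Keeping the Scala version as the `_2.xx` artifact-id suffix (rather than in the version string) matters because that is the part sbt's `%%` operator appends automatically when resolving dependencies.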