With all due respect Patrick - this approach is asking for trouble. Proactively ;)

Cos

On Tue, Feb 25, 2014 at 04:09PM, Patrick Wendell wrote:
> What I mean is this. AFAIK the shade plug-in is primarily designed
> for creating uber jars which contain Spark and all dependencies. But
> since Spark is something people depend on in Maven, what I actually
> want is to create the normal old Spark jar [1], but then include
> shaded versions of some of our dependencies inside of it. Not sure if
> that's even possible.
>
> The way we do shading now is we manually publish shaded versions of
> some dependencies to Maven Central as their own artifacts.
>
> http://search.maven.org/remotecontent?filepath=org/apache/spark/spark-core_2.10/0.9.0-incubating/spark-core_2.10-0.9.0-incubating.jar
>
> On Tue, Feb 25, 2014 at 4:04 PM, Evan Chan <e...@ooyala.com> wrote:
> > Patrick -- not sure I understand your request. Do you mean:
> > - somehow creating a shaded jar (e.g. with the Maven shade plugin)
> > - then including it in the Spark jar (which would then be an assembly)?
> >
> > On Tue, Feb 25, 2014 at 4:01 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >> Evan - this is a good thing to bring up. Wrt the shade plug-in -
> >> right now we don't actually use it for bytecode shading - we simply
> >> use it for creating the uber jar with excludes (which sbt supports
> >> just fine via assembly).
> >>
> >> I was wondering actually, do you know if it's possible to add shaded
> >> artifacts to the *Spark jar* using this plug-in (e.g. not an uber
> >> jar)? That's something I could see being really handy in the future.
> >>
> >> - Patrick
> >>
> >> On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan <e...@ooyala.com> wrote:
> >>> The problem is that the plugins are not equivalent. There is AFAIK no
> >>> equivalent to the Maven shade plugin for SBT.
> >>> There is an SBT plugin which can apparently read POM XML files
> >>> (sbt-pom-reader). However, it can't possibly handle plugins, which
> >>> is still problematic.
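For what it's worth, later versions of sbt-assembly (0.14+, which postdate this thread) did gain bytecode shading via `ShadeRule`, covering part of what the maven-shade-plugin's relocations do. A minimal build.sbt sketch - the package names and relocation target here are illustrative only:

```scala
// build.sbt sketch -- assumes sbt-assembly 0.14+ (which added shading support).
// The dependency being relocated and the target package are illustrative.
assemblyShadeRules in assembly := Seq(
  // Rewrite a bundled dependency's classes into a private namespace so they
  // cannot conflict with a different version on a downstream user's classpath.
  ShadeRule.rename("com.google.protobuf.**" -> "org.spark_project.protobuf.@1").inAll
)
```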
> >>>
> >>> On Tue, Feb 25, 2014 at 3:31 PM, yao <yaosheng...@gmail.com> wrote:
> >>>> I would prefer to keep both of them; it would be better even if that
> >>>> means pom.xml will be generated using sbt. Some companies, like my
> >>>> current one, have their own build infrastructure built on top of
> >>>> Maven. It is not easy to support sbt for these potential Spark
> >>>> clients. But I do agree to keep only one if there is a promising way
> >>>> to generate the correct configuration from the other.
> >>>>
> >>>> -Shengzhe
> >>>>
> >>>> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan <e...@ooyala.com> wrote:
> >>>>> The correct way to exclude dependencies in SBT is actually to declare
> >>>>> a dependency as "provided". I'm not familiar with Maven or its
> >>>>> dependencySet, but provided will mark the entire dependency tree as
> >>>>> excluded. It is also possible to exclude jar by jar, but this is
> >>>>> pretty error-prone and messy.
> >>>>>
> >>>>> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <ko...@tresata.com> wrote:
> >>>>> > Yes, in sbt-assembly you can exclude jars (although I never had a
> >>>>> > need for this) and files in jars.
> >>>>> >
> >>>>> > For example, I frequently remove log4j.properties, because for
> >>>>> > whatever reason Hadoop decided to include it, making it very
> >>>>> > difficult to use our own logging config.
> >>>>> >
> >>>>> > On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>> >> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
> >>>>> >> > Kos - thanks for chiming in. Could you be more specific about
> >>>>> >> > what is available in Maven and not in sbt for these issues? I
> >>>>> >> > took a look at the bigtop code relating to Spark.
> >>>>> >> > As far as I could tell, [1] was the main point of integration
> >>>>> >> > with the build system (maybe there are other integration points)?
> >>>>> >> >
> >>>>> >> > > - In order to integrate Spark well into the existing Hadoop
> >>>>> >> > > stack, it was necessary to have a way to avoid transitive
> >>>>> >> > > dependency duplications and possible conflicts.
> >>>>> >> > >
> >>>>> >> > > E.g. Maven assembly allows us to avoid adding _all_ Hadoop
> >>>>> >> > > libs and later merely declare the Spark package's dependency
> >>>>> >> > > on standard Bigtop Hadoop packages. And yes - Bigtop packaging
> >>>>> >> > > means the naming and layout would be standard across all
> >>>>> >> > > commercial Hadoop distributions that are worth mentioning:
> >>>>> >> > > ASF Bigtop convenience binary packages, and Cloudera or
> >>>>> >> > > Hortonworks packages. Hence, the downstream user doesn't need
> >>>>> >> > > to spend any effort to make sure that Spark "clicks in"
> >>>>> >> > > properly.
> >>>>> >> >
> >>>>> >> > The sbt build also allows you to plug in a Hadoop version,
> >>>>> >> > similar to the Maven build.
> >>>>> >>
> >>>>> >> I am actually talking about the ability to exclude a set of
> >>>>> >> dependencies from an assembly, similarly to what's happening in the
> >>>>> >> dependencySet sections of assembly/src/main/assembly/assembly.xml.
> >>>>> >> If there is comparable functionality in sbt, that would help quite
> >>>>> >> a bit, apparently.
> >>>>> >>
> >>>>> >> Cos
> >>>>> >>
> >>>>> >> > > - Maven provides a relatively easy way to deal with the
> >>>>> >> > > jar-hell problem, although the original Maven build was just
> >>>>> >> > > shading everything into a huge lump of class files.
> >>>>> >> > > Oftentimes this ended up with classes slamming on top of each
> >>>>> >> > > other from different transitive dependencies.
> >>>>> >> >
> >>>>> >> > AFAIK we are only using the shade plug-in to deal with conflict
> >>>>> >> > resolution in the assembly jar. These are dealt with in sbt via
> >>>>> >> > the sbt-assembly plug-in in an identical way. Is there a
> >>>>> >> > difference?
> >>>>> >>
> >>>>> >> I am bringing up the shader because it is an awful hack, which
> >>>>> >> can't be used in a real controlled deployment.
> >>>>> >>
> >>>>> >> Cos
> >>>>> >>
> >>>>> >> > [1]
> >>>>> >> > https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
> >>>>>
> >>>>> --
> >>>>> Evan Chan
> >>>>> Staff Engineer
> >>>>> e...@ooyala.com
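To make the sbt side of this thread concrete - the "provided" scope, per-jar excludes, and discarding of bundled files such as log4j.properties can all be expressed in a build.sbt along the following lines. This is a sketch only (sbt 0.13-era syntax; the artifact names, versions, and jar filter are illustrative, not Spark's actual build):

```scala
// build.sbt sketch -- assumes the sbt-assembly plugin is enabled.
// All artifact names and versions below are illustrative.

// "provided" keeps Hadoop and its entire transitive tree out of the assembly,
// roughly comparable to a Maven assembly dependencySet exclude.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"

// Per-jar exclusion (error-prone, as noted above): filter specific jars
// out of the assembly classpath by file name.
excludedJars in assembly := {
  (fullClasspath in assembly).value.filter(_.data.getName.startsWith("servlet-api"))
}

// Discard a file that a dependency bundled, e.g. Hadoop's log4j.properties,
// falling back to the previous strategy for everything else.
mergeStrategy in assembly := {
  case "log4j.properties" => MergeStrategy.discard
  case x =>
    val oldStrategy = (mergeStrategy in assembly).value
    oldStrategy(x)
}
```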