With all due respect, Patrick - this approach is asking for trouble.
Proactively ;)

Cos

On Tue, Feb 25, 2014 at 04:09PM, Patrick Wendell wrote:
> What I mean is this. AFAIK the shade plug-in is primarily designed
> for creating uber jars which contain spark and all dependencies. But
> since Spark is something people depend on in Maven, what I actually
> want is to create the normal old Spark jar [1], but then include
> shaded versions of some of our dependencies inside of it. Not sure if
> that's even possible.
> 
> The way we do shading now is we manually publish shaded versions of
> some dependencies to maven central as their own artifacts.
> 
> http://search.maven.org/remotecontent?filepath=org/apache/spark/spark-core_2.10/0.9.0-incubating/spark-core_2.10-0.9.0-incubating.jar
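For what it's worth, the shade plugin does appear to support that combination - replacing the normal jar while pulling in (and relocating) only selected dependencies. A rough, untested sketch; the protobuf coordinates are just an illustration, not Spark's actual configuration:

```xml
<!-- Hypothetical sketch: shade only selected dependencies into the main
     artifact; untested, coordinates are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <!-- replace the normal jar instead of attaching an extra uber jar -->
        <shadedArtifactAttached>false</shadedArtifactAttached>
        <artifactSet>
          <!-- only these artifacts are bundled; everything else stays a
               normal Maven dependency -->
          <includes>
            <include>com.google.protobuf:protobuf-java</include>
          </includes>
        </artifactSet>
        <relocations>
          <!-- rewrite the bundled bytecode so it cannot conflict with a
               user's own copy of the dependency -->
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.spark-project.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```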
> 
> On Tue, Feb 25, 2014 at 4:04 PM, Evan Chan <e...@ooyala.com> wrote:
> > Patrick -- not sure I understand your request. Do you mean
> > - somehow creating a shaded jar (e.g. with the maven shade plugin)
> > - then including it in the spark jar (which would then be an assembly)?
> >
> > On Tue, Feb 25, 2014 at 4:01 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >> Evan - this is a good thing to bring up. Wrt the shader plug-in -
> >> right now we don't actually use it for bytecode shading - we simply
> >> use it for creating the uber jar with excludes (which sbt supports
> >> just fine via assembly).
> >>
> >> I was wondering actually, do you know if it's possible to add shaded
> >> artifacts to the *spark jar* using this plug-in (e.g. not an uber
> >> jar)? That's something I could see being really handy in the future.
> >>
> >> - Patrick
> >>
> >> On Tue, Feb 25, 2014 at 3:39 PM, Evan Chan <e...@ooyala.com> wrote:
> >>> The problem is that the plugins are not equivalent.  There is AFAIK no
> >>> equivalent to the maven shade plugin for SBT.
> >>> There is an SBT plugin which can apparently read POM XML files
> >>> (sbt-pom-reader).   However, it can't possibly handle plugins, which
> >>> is still problematic.
> >>>
> >>> On Tue, Feb 25, 2014 at 3:31 PM, yao <yaosheng...@gmail.com> wrote:
> >>>> I would prefer to keep both of them; it would be fine even if that means
> >>>> pom.xml is generated via sbt. Some companies, like my current one, have
> >>>> their own build infrastructure built on top of maven, and it is not easy
> >>>> to support sbt for these potential spark clients. But I do agree to keep
> >>>> only one if there is a reliable way to generate a correct configuration
> >>>> from the other.
> >>>>
> >>>> -Shengzhe
> >>>>
> >>>>
> >>>> On Tue, Feb 25, 2014 at 3:20 PM, Evan Chan <e...@ooyala.com> wrote:
> >>>>
> >>>>> The correct way to exclude dependencies in SBT is actually to declare
> >>>>> a dependency as "provided".  I'm not familiar with Maven or its
> >>>>> dependencySet, but provided will mark the entire dependency tree as
> >>>>> excluded.  It is also possible to exclude jar by jar, but this is
> >>>>> pretty error-prone and messy.
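In build.sbt terms that is a one-liner; the artifact and version here are illustrative:

```scala
// "provided": on the compile classpath, but excluded from the assembly jar
// along with its entire transitive dependency tree (version is illustrative).
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"
```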
> >>>>>
> >>>>> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <ko...@tresata.com> wrote:
> >>>>> > yes in sbt assembly you can exclude jars (although i never had a need
> >>>>> > for this) and files in jars.
> >>>>> >
> >>>>> > for example i frequently remove log4j.properties, because for whatever
> >>>>> > reason hadoop decided to include it, making it very difficult to use
> >>>>> > our own logging config.
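With sbt-assembly, that kind of per-file exclusion is typically a merge-strategy rule; a sketch (setting-key spelling varies across plugin versions):

```scala
// Sketch: discard any log4j.properties coming in from dependency jars,
// delegating every other path to the plugin's default strategy.
assemblyMergeStrategy in assembly := {
  case "log4j.properties" => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```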
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>> >
> >>>>> >> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
> >>>>> >> > Kos - thanks for chiming in. Could you be more specific about what is
> >>>>> >> > available in maven and not in sbt for these issues? I took a look at
> >>>>> >> > the bigtop code relating to Spark. As far as I could tell [1] was the
> >>>>> >> > main point of integration with the build system (maybe there are other
> >>>>> >> > integration points)?
> >>>>> >> >
> >>>>> >> > >   - in order to integrate Spark well into the existing Hadoop stack
> >>>>> >> > >     it was necessary to have a way to avoid transitive dependency
> >>>>> >> > >     duplications and possible conflicts.
> >>>>> >> > >
> >>>>> >> > >     E.g. Maven assembly allows us to avoid adding _all_ Hadoop libs
> >>>>> >> > >     and later merely declare Spark package dependency on standard
> >>>>> >> > >     Bigtop Hadoop packages. And yes - Bigtop packaging means the
> >>>>> >> > >     naming and layout would be standard across all commercial Hadoop
> >>>>> >> > >     distributions that are worth mentioning: ASF Bigtop convenience
> >>>>> >> > >     binary packages, and Cloudera or Hortonworks packages. Hence, the
> >>>>> >> > >     downstream user doesn't need to spend any effort to make sure
> >>>>> >> > >     that Spark "clicks-in" properly.
> >>>>> >> >
> >>>>> >> > The sbt build also allows you to plug in a Hadoop version, similar to
> >>>>> >> > the maven build.
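On the sbt side, plugging in a Hadoop version usually amounts to a one-line setting; a sketch where the property name and default are illustrative, not Spark's actual build code:

```scala
// Sketch: select the Hadoop version at build time, e.g. via
// sbt -Dhadoop.version=2.2.0 (property name and default are illustrative).
val hadoopVersion = sys.props.getOrElse("hadoop.version", "1.0.4")

libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion
```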
> >>>>> >>
> >>>>> >> I am actually talking about the ability to exclude a set of dependencies
> >>>>> >> from an assembly, similar to what happens in the dependencySet sections of
> >>>>> >>     assembly/src/main/assembly/assembly.xml
> >>>>> >> If there is comparable functionality in sbt, that would help quite a bit,
> >>>>> >> apparently.
> >>>>> >>
> >>>>> >> Cos
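sbt-assembly does expose something comparable through its excludedJars key; a rough sketch, where the name filter is illustrative:

```scala
// Sketch: drop whole jars from the assembly by filtering the classpath,
// roughly analogous to a dependencySet <excludes> section in assembly.xml.
// (The startsWith filter here is illustrative.)
excludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  cp filter { jar => jar.data.getName.startsWith("hadoop-") }
}
```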
> >>>>> >>
> >>>>> >> > >   - Maven provides a relatively easy way to deal with the jar-hell
> >>>>> >> > >     problem, although the original maven build was just Shader'ing
> >>>>> >> > >     everything into a huge lump of class files, oftentimes ending up
> >>>>> >> > >     with classes slamming on top of each other from different
> >>>>> >> > >     transitive dependencies.
> >>>>> >> >
> >>>>> >> > AFAIK we are only using the shade plug-in to deal with conflict
> >>>>> >> > resolution in the assembly jar. These are dealt with in sbt via the
> >>>>> >> > sbt assembly plug-in in an identical way. Is there a difference?
> >>>>> >>
> >>>>> >> I am bringing up the Shader because it is an awful hack which can't
> >>>>> >> be used in a real, controlled deployment.
> >>>>> >>
> >>>>> >> Cos
> >>>>> >>
> >>>>> >> > [1]
> >>>>> >> > https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
> >>>>> >>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> --
> >>>>> Evan Chan
> >>>>> Staff Engineer
> >>>>> e...@ooyala.com  |
> >>>>>
> >>>
> >>>
> >>>
> >
> >
> >
