Re: [DISCUSS] Necessity of Maven and SBT Build in Spark

Konstantin Boudnik Wed, 12 Mar 2014 16:28:14 -0700

I think Kevin's point is somewhat different: there's no question that Sbt can
be integrated into the Maven ecosystem - mostly the repositories and artifact
management, of course.
However, Sbt is a niche build tool and is unlikely to be widely supported by
engineering teams or IT organizations. Sbt isn't used for large or medium
scale enterprise software projects, unless they are Scala-heavy ones. But I am
not aware of any. And I am not trying to hurt the Scala camp, really.


Integrating something like an Sbt-based build into the Hadoop stack would be a
pain in the rear. And I am talking about the last point from the experience of
getting Spark into Bigtop and effectively into the mainstream of Hadoop's
ecosystem. Now any commercial distro vendor can have Spark as a part of their
product offering without even lifting a finger.

So, in other words - and I am of course speaking for myself here - if Maven is
fenced out of the Spark project it will create serious difficulties in the
downstream integration, unless there's a champion in the community who will
help with it on an ongoing basis.

Cos

On Tue, Mar 11, 2014 at 11:16PM, Koert Kuipers wrote:
> we have a maven corporate repository inhouse and of course we also use
> maven central. sbt can handle retrieving from and publishing to maven
> repositories just fine. we have maven, ant/ivy and sbt projects depending
> on each other's artifacts. not sure i see the issue there.
> 
> 
> On Tue, Mar 11, 2014 at 5:34 PM, Kevin Markey <kevin.mar...@oracle.com> wrote:
> 
> > Pardon my late entry into the fray here, but we've just struggled through
> > some library conflicts that could have been avoided and whose story sheds
> > some light on this question.
> >
> > We have been integrating Spark with a number of other components. We
> > discovered several conflicts, most easily eliminated.  But the ASM
> > conflicts were not quite so easy to handle because of ASM's API changes
> > between 3.x and 4.x (most usually seen first in ClassVisitor which was an
> > interface and now is an abstract class).
> >
> > The spark-core_2.10 has a transitive dependency on 4.0.  Hive, Hadoop,
> > various Java EE servlets, and other libraries have transitive dependencies
> > on 3.2 or earlier.  In one of the applications we are developing, there are
> > 10 libraries with ASM dependencies.  Five are well-behaved, having shaded
> > ASM.  Another five are poorly behaved, not shading it.  The ASM FAQ
> > specifically recommends shading ASM in any tool or framework which contains
> > it: http://asm.ow2.org/doc/faq.html#Q15.
> >
> > ASM has been shaded in the SBT build since June 2013.  However, it was not
> > properly shaded in the Maven build until last week.  As a result, libraries
> > such as spark-core_2.10 pushed to Maven Central haven't reflected the SBT
> > build.  This is documented in Jira SPARK-782:
> > https://spark-project.atlassian.net/browse/SPARK-782
> >
> > We cannot use SBT for our overall project.  Maven is our standard. Hence,
> > we are dependent on Maven Central and libraries mirrored by our corporate
> > repository.
> >
> > In this context, if both builds are maintained, then they need to have the
> > same functionality.
> >
> > If only one build must be retained, it should be Maven because Maven and
> > other tools that use Maven Central are more likely to be used for large
> > project integrations.  Also for this reason, the Maven build should be
> > given more priority than at present.  It seems a bit odd, if a Maven
> > project can be automatically generated from SBT, that it would take 1 year
> > for ASM shading in Maven to catch up with SBT.
> >
> > Thanks
> > Kevin Markey
> >
> >
> >>> SBT appears to have syntax for both, just like Maven. Surely these
> >>> have the same meanings in SBT, and excluding artifacts is accomplished
> >>> with exclude and excludeAll, as seen in the Spark build?
> >>>
> >>> The assembly and shader stuff in Maven is more about controlling
> >>> exactly how it's put together into an artifact, at the level of files
> >>> even, to stick a license file in or exclude some data file cruft or
> >>> rename dependencies.
> >>>
> >>> Exclusions and shading are necessary evils to be used as sparingly as
> >>> possible. Dependency graphs get nuts fast here, and Spark is already
> >>> quite big. (Hence my recent PR to start touching it up -- more coming
> >>> for sure.)
> >>>
> >>>
> >
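
For reference, the exclude and excludeAll syntax mentioned in the quoted text
looks roughly like this in an sbt build definition. This is a minimal sketch,
not Spark's actual build: the dependency coordinates and versions are
assumptions chosen for illustration (asm is the organization used by ASM 3.x
artifacts, org.ow2.asm the one used by 4.x).

```scala
// build.sbt -- illustrative sketch only; coordinates and versions are assumptions.

// Rule matching every artifact published under the old ASM 3.x organization.
val excludeOldAsm = ExclusionRule(organization = "asm")

// excludeAll takes one or more rules and drops matching transitive dependencies.
libraryDependencies += ("org.apache.hadoop" % "hadoop-client" % "1.0.4").excludeAll(excludeOldAsm)

// exclude(organization, name) drops a single transitive artifact.
libraryDependencies += ("org.apache.spark" %% "spark-core" % "0.9.0-incubating").exclude("asm", "asm")
```

Note that an exclusion only removes a jar from the resolved classpath. It does
not rename packages the way shading does, so on its own it does not let ASM 3.x
and 4.x coexist in one application.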
