You might want to look at another great plugin: “sbt-pack” (https://github.com/xerial/sbt-pack). It collects all the dependency JARs and creates launch scripts for *nix (including Mac OS) and Windows.
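For reference, a minimal sbt-pack setup looks roughly like this (a sketch only: the plugin version and the main-class mapping are illustrative, so check the project README for the current coordinates and setting names):

    // project/plugins.sbt
    addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.6.1")

    // build.sbt
    packSettings

    // map a launch-script name to its fully qualified main class
    packMain := Map("myapp" -> "com.example.Main")

Running `sbt pack` then produces target/pack/ with a lib/ directory holding the dependency JARs and bin/ launch scripts for each entry in packMain.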
HTH,
Pierre

On 02 Jun 2014, at 17:29, Andrei <faithlessfri...@gmail.com> wrote:

> Thanks! This is even closer to what I am looking for. I'm on a trip now, so I'm going to give it a try when I come back.
>
> On Mon, Jun 2, 2014 at 5:12 AM, Ngoc Dao <ngocdaoth...@gmail.com> wrote:
> Alternative solution:
> https://github.com/xitrum-framework/xitrum-package
>
> It collects all the dependency .jar files of your Scala program into a directory. It doesn't merge the .jar files together; they are left as is.
>
> On Sat, May 31, 2014 at 3:42 AM, Andrei <faithlessfri...@gmail.com> wrote:
> > Thanks, Stephen. I have eventually decided to go with assembly, but to leave the Spark and Hadoop jars out and instead use `spark-submit` to provide these dependencies automatically. This way no resource conflicts arise, and mergeStrategy needs no modification. To record this stable setup and share it with the community, I've created a project [1] with a minimal working config. It is an SBT project with the assembly plugin, Spark 1.0, and Cloudera's Hadoop client. I hope it helps somebody get set up with Spark more quickly.
> >
> > Though I'm fine with this setup for final builds, I'm still looking for a more interactive dev setup, something that doesn't require a full rebuild.
> >
> > [1]: https://github.com/faithlessfriend/sample-spark-project
> >
> > Thanks and have a good weekend,
> > Andrei
> >
> > On Thu, May 29, 2014 at 8:27 PM, Stephen Boesch <java...@gmail.com> wrote:
> >>
> >> The MergeStrategy combined with sbt assembly did work for me. It is not painless: it took some trial and error, and the assembly may take multiple minutes to build.
> >>
> >> You will likely want to filter some additional classes out of the generated jar file. Here is a Stack Overflow answer explaining how, with (IMHO) the best answer's snippet included here (in this case the OP understandably did not want to include javax.servlet.Servlet):
> >>
> >> http://stackoverflow.com/questions/7819066/sbt-exclude-class-from-jar
> >>
> >> mappings in (Compile, packageBin) ~= { (ms: Seq[(File, String)]) =>
> >>   ms filter { case (file, toPath) => toPath != "javax/servlet/Servlet.class" }
> >> }
> >>
> >> There is a setting to not include the project files in the assembly, but I do not recall it at the moment.
> >>
> >> 2014-05-29 10:13 GMT-07:00 Andrei <faithlessfri...@gmail.com>:
> >>
> >>> Thanks, Jordi, your gist looks pretty much like what I currently have in my project (with a few exceptions that I'm going to borrow).
> >>>
> >>> I like the idea of using "sbt package", since it doesn't require third-party plugins and, most importantly, doesn't create a mess of classes and resources. But in that case I'll have to handle the jar list manually via the Spark context. Is there a way to automate this process? E.g., when I was a Clojure guy, I could run "lein deps" (lein is a build tool similar to sbt) to download all dependencies and then just enumerate them from my app. Maybe you have heard of something like that for Spark/SBT?
> >>>
> >>> Thanks,
> >>> Andrei
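As for the question above about automating the jar list: one possible approach, sketched here assuming sbt 0.13 and the Spark 1.x API (the `listJars` task name and the target/jars.txt path are invented for illustration), is to have sbt dump the runtime dependency classpath to a file and feed it to the SparkContext:

    // build.sbt: write the runtime dependency jar paths to a file
    val listJars = taskKey[Unit]("Writes runtime dependency jar paths to target/jars.txt")

    listJars := {
      val jars = (dependencyClasspath in Runtime).value
        .map(_.data)                          // Attributed[File] -> File
        .filter(_.getName.endsWith(".jar"))
      IO.write(target.value / "jars.txt", jars.map(_.getAbsolutePath).mkString("\n"))
    }

    // in the application: ship those jars to the executors
    import org.apache.spark.{SparkConf, SparkContext}

    val jars = scala.io.Source.fromFile("target/jars.txt").getLines().toSeq
    val conf = new SparkConf().setAppName("myapp").setJars(jars)
    val sc   = new SparkContext(conf)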
> >>> On Thu, May 29, 2014 at 3:48 PM, jaranda <jordi.ara...@bsc.es> wrote:
> >>>
> >>>> Hi Andrei,
> >>>>
> >>>> I think the preferred way to deploy Spark jobs is to use the sbt package task instead of the sbt assembly plugin. In any case, as you note, the mergeStrategy in combination with some dependency exclusions should fix your problems. Have a look at this gist <https://gist.github.com/JordiAranda/bdbad58d128c14277a05> for further details (I just followed some of the recommendations in the sbt assembly plugin documentation).
> >>>>
> >>>> So far I haven't found a proper way to combine my development and deployment phases, though I must say my experience with Spark is pretty limited (it really depends on your deployment requirements as well). In this case, I think someone else could give you further insights.
> >>>>
> >>>> Best,
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://apache-spark-user-list.1001560.n3.nabble.com/Is-uberjar-a-recommended-way-of-running-Spark-Scala-applications-tp6518p6520.html
> >>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
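For completeness, the setup Andrei settled on above (assembly without the Spark and Hadoop jars, with spark-submit providing them at runtime) boils down to marking those dependencies as "provided" so that sbt-assembly leaves them out of the uberjar. A sketch, with all versions, class names, and paths illustrative:

    // build.sbt (sbt-assembly added in project/plugins.sbt as usual)
    libraryDependencies ++= Seq(
      "org.apache.spark"  %% "spark-core"    % "1.0.0" % "provided",
      "org.apache.hadoop" %  "hadoop-client" % "2.3.0" % "provided"
    )

    // then build and submit; the cluster supplies the Spark/Hadoop classes:
    //   sbt assembly
    //   spark-submit --class com.example.Main target/scala-2.10/myapp-assembly-0.1.jar

Because the conflicting Spark and Hadoop artifacts never enter the uberjar, little or no mergeStrategy tuning is needed, and the resulting jar stays much smaller.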