Daniel,
Thanks for sharing this. It is very helpful.

The reason I want to use spark-submit is that it provides more flexibility. For 
example, with spark-submit I don't need to hard-code the master info in the 
code, and I can change the configuration without having to modify and recompile 
the code.
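
For example, something along these lines (just a minimal sketch) would let 
spark-submit supply the master and any other settings instead of baking them 
into the code:

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal sketch: no setMaster here; spark-submit's --master and --conf
    // options (or spark-defaults.conf) supply the cluster configuration.
    val conf = new SparkConf().setAppName("My Play App")
    val sc = new SparkContext(conf)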

Do you mind sharing the sbt build file for your Play app? I tried to build an 
uber jar using sbt-assembly. The jar gets built, but when I run it, it throws 
all sorts of exceptions. I have seen some blog posts saying that Spark and Play 
depend on different versions of the Akka library, so I included Akka explicitly 
in my build.scala file, but I still cannot get rid of the Akka-related 
exceptions. I suspect that the settings in the build.scala file for my Play 
project are incorrect.
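
Is something along these lines the right direction? (A simplified, 
build.sbt-style sketch; the versions and the exact sbt-assembly key names are 
just illustrative and depend on the plugin version.)

    // Illustrative sketch only -- not my actual build file
    libraryDependencies ++= Seq(
      // "provided" so the uber jar does not bundle Spark; spark-submit supplies it
      "org.apache.spark" %% "spark-core" % "1.1.0" % "provided",
      "com.typesafe.play" %% "play" % "2.3.4"
    )

    assemblyMergeStrategy in assembly := {
      // Akka (and Play) ship reference.conf files that must be concatenated,
      // not overwritten, or the assembled jar fails at runtime with config errors
      case "reference.conf"              => MergeStrategy.concat
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case _                             => MergeStrategy.first
    }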

Mohammed

From: Daniel Siegmann [mailto:daniel.siegm...@velos.io]
Sent: Thursday, October 16, 2014 7:15 AM
To: Mohammed Guller
Cc: user@spark.apache.org
Subject: Re: Play framework

We execute Spark jobs from a Play application but we don't use spark-submit. I 
don't know if you really want to use spark-submit, but if not, you can just 
create a SparkContext programmatically in your app.

In development I typically run Spark locally. Creating the Spark context is 
pretty trivial:

val conf = new SparkConf().setMaster("local[*]").setAppName("My Awesome App")
// call conf.set for any other configuration you want
val sc = new SparkContext(conf)

Keep in mind that you cannot have multiple local contexts (you can create them, 
but you'll get odd errors), so if you are running things in parallel within 
your app (even unit tests) you need to share a single context. If you are 
running sequentially, you can create a new local context each time, but make 
sure to call SparkContext.stop() when you're done.
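
A minimal sketch of that sequential pattern (the names here are just 
illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // One local context per job, always stopped afterwards so the next
    // local context can be created cleanly.
    def runJob[T](appName: String)(body: SparkContext => T): T = {
      val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName(appName))
      try body(sc)
      finally sc.stop()
    }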
Running against a cluster is a bit more complicated because you need to add all 
your dependency jars. I'm not sure how to get this to work with play run. I 
stick to building the app with play dist and then running against the packaged 
application, because it very conveniently provides all the dependencies in a 
lib folder. Here is some code to load all the paths you need from the dist:

    def libs: Seq[String] = {
        val libDir = play.api.Play.application.getFile("lib")

        logger.info(s"SparkContext will be initialized with libraries from directory $libDir")

        if (libDir.exists) {
            // collect the absolute paths of all dependency jars in the dist's lib folder
            libDir.listFiles().map(_.getCanonicalFile.getAbsolutePath).filter(_.endsWith(".jar"))
        } else {
            throw new IllegalStateException(s"lib dir is missing: $libDir")
        }
    }
Creating the context is similar to above, but with this extra line:

conf.setJars(libs)
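
Putting it together, creating the context for a cluster run might look roughly 
like this (the master URL is just a placeholder for your own cluster):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: substitute your own master URL, and build `libs` from the
    // play dist's lib folder as shown above.
    val conf = new SparkConf()
      .setMaster("spark://your-master-host:7077")  // placeholder master URL
      .setAppName("My Awesome App")
      .setJars(libs)                               // ship the dependency jars to the executors
    val sc = new SparkContext(conf)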
I hope this helps. I should note that I don't use play run very much, at least 
not when I'm actually executing Spark jobs, so I'm not sure whether this 
integrates properly with that. I have unit tests which execute on Spark, and I 
have executed the dist package both locally and on a cluster. To make working 
with the dist locally easier, I wrote myself a little shell script to unzip and 
run the dist.


On Wed, Oct 15, 2014 at 10:51 PM, Mohammed Guller 
<moham...@glassbeam.com> wrote:
Hi –

Has anybody figured out how to integrate a Play application with Spark and run 
it on a Spark cluster using the spark-submit script? I have seen some blogs 
about creating a simple Play app and running it locally on a dev machine with 
the sbt run command. However, those steps don't work with spark-submit.

If you have figured out how to build and run a Play app with spark-submit, I 
would appreciate it if you could share the steps and the sbt settings for your 
Play app.

Thanks,
Mohammed




--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegm...@velos.io W: www.velos.io
