The remaining dependencies (the Spark libraries) are available to the context from
the Spark home. I have installed Spark so that all the slaves have the same
Spark home. The code looks like this:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setSparkHome("/home/dev/spark")
  .setMaster("spark://99.99.99.999:7077")
  .setAppName("xxx")
  .setJars(Seq("/home/dev/play/target/scala-2.10/xxx_2.10-1.0.jar"))
val sc = new SparkContext(conf)
If you have more dependencies, you can keep adding them to setJars.
Raju
________________________________
From: Mohammed Guller <[email protected]>
Sent: Thursday, October 16, 2014 4:00 PM
To: US Office Admin; Surendranauth Hiraman
Cc: Daniel Siegmann; [email protected]
Subject: RE: Play framework
Thanks, Suren and Raju.
Raju – if I remember correctly, the Play package command just creates a jar for
your app. That jar file will not include the other dependencies, so it is not
really a full jar as you mentioned below. How are you passing all the other
dependency jars to Spark? Can you share that piece of code? Also, is there any
specific reason why you are not using play dist instead?
Mohammed
From: US Office Admin [mailto:[email protected]]
Sent: Thursday, October 16, 2014 11:41 AM
To: Surendranauth Hiraman; Mohammed Guller
Cc: Daniel Siegmann; [email protected]
Subject: Re: Play framework
We integrated Spark into Play and use Spark SQL extensively on an EC2 Spark
cluster, on Hadoop HDFS 1.2.1 and Tachyon 0.4.
Step 1: Create a Play Scala application as usual.
Step 2: In build.sbt, put all your Spark dependencies (a build.sbt sketch follows
these steps). What works for us is Play 2.2.3, Scala 2.10.4, and Spark 1.1. We
have Akka 2.2.3. This is straightforward.
Step 3: As Daniel mentioned, create the Spark context within Play. The rest of
the application is as usual.
Step 4: Create a full jar using play package and include that jar in the list of
jars passed to the Spark context.
Step 5: play run as usual.
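For reference, the dependency section of build.sbt might look like the sketch
below; the module list and versions are only illustrative (they follow the
versions mentioned above), so adjust them to your project:

// Minimal sketch of build.sbt settings; versions and modules are illustrative.
name := "xxx"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-sql" % "1.1.0",
  "com.typesafe.akka" %% "akka-actor" % "2.2.3"
)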
It works very well, and the convenience is that we have an all-Scala application
throughout.
Regards
Raju
________________________________
From: Surendranauth Hiraman <[email protected]>
Sent: Thursday, October 16, 2014 12:42 PM
To: Mohammed Guller
Cc: Daniel Siegmann; [email protected]
Subject: Re: Play framework
Mohammed,
Jumping in for Daniel, we actually address the configuration issue by pulling
values from environment variables or command-line options. Maybe that can
handle at least some of your needs.
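For example, the master URL can be read from the environment rather than
hard-coded; this is just a sketch, and the SPARK_MASTER variable name is
illustrative:

import org.apache.spark.{SparkConf, SparkContext}

// Pull the master URL from an environment variable (name is illustrative),
// falling back to local mode for development.
val master = sys.env.getOrElse("SPARK_MASTER", "local[*]")
val conf = new SparkConf().setMaster(master).setAppName("my-play-app")
val sc = new SparkContext(conf)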
For the akka issue, here is the akka version we include in build.sbt:
"com.typesafe.akka" %% "akka-actor" % "2.2.1"
-Suren
On Thu, Oct 16, 2014 at 12:23 PM, Mohammed Guller
<[email protected]> wrote:
Daniel,
Thanks for sharing this. It is very helpful.
The reason I want to use spark-submit is that it provides more flexibility. For
example, with spark-submit, I don't need to hard-code the master info in the
code. I can easily change the config without having to change and recompile the
code.
Do you mind sharing the sbt build file for your Play app? I tried to build an
uber jar using sbt-assembly. It gets built, but when I run it, it throws all
sorts of exceptions. I have seen some blog posts saying that Spark and Play use
different versions of the Akka library, so I included Akka in my build.scala
file, but still cannot get rid of the Akka-related exceptions. I suspect that
the settings in the build.scala file for my Play project are incorrect.
Mohammed
From: Daniel Siegmann [mailto:[email protected]]
Sent: Thursday, October 16, 2014 7:15 AM
To: Mohammed Guller
Cc: [email protected]
Subject: Re: Play framework
We execute Spark jobs from a Play application but we don't use spark-submit. I
don't know if you really want to use spark-submit, but if not you can just
create a SparkContext programmatically in your app.
In development I typically run Spark locally. Creating the Spark context is
pretty trivial:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setMaster("local[*]").setAppName("My Awesome App")
// call conf.set for any other configuration you want
val sc = new SparkContext(conf)
It is important to keep in mind you cannot have multiple local contexts (you
can create them but you'll get odd errors), so if you are running things in
parallel within your app (even unit tests) you'd need to share a context in
this case. If you are running sequentially you can create a new local context
each time, but you must make sure to call SparkContext.stop() when you're done.
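One way to guarantee that stop() is always called when running sequentially is a
small loan-pattern helper; this is only a sketch, and the helper name is
illustrative:

import org.apache.spark.{SparkConf, SparkContext}

// Create a fresh local context, run the given body, and always stop the context
// afterwards, so the next sequential run can create its own context cleanly.
def withLocalSpark[T](appName: String)(body: SparkContext => T): T = {
  val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName(appName))
  try body(sc)
  finally sc.stop()
}

// Usage: val total = withLocalSpark("my-test") { sc => sc.parallelize(1 to 10).sum() }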
Running against a cluster is a bit more complicated because you need to add all
your dependency jars. I'm not sure how to get this to work with play run. I
stick to building the app with play dist and then running against the packaged
application, because it very conveniently provides all the dependencies in a
lib folder. Here is some code to load all the paths you need from the dist:
def libs: Seq[String] = {
  val libDir = play.api.Play.application.getFile("lib")
  logger.info(s"SparkContext will be initialized with libraries from directory $libDir")
  if (libDir.exists) {
    libDir.listFiles().map(_.getCanonicalFile.getAbsolutePath).filter(_.endsWith(".jar"))
  } else {
    throw new IllegalStateException(s"lib dir is missing: $libDir")
  }
}
Creating the context is similar to above, but with this extra line:
conf.setJars(libs)
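Put together, creating the context for a cluster might look like the sketch
below; the master URL is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}

// Build the cluster context using the jars collected by the libs helper above.
val conf = new SparkConf()
  .setMaster("spark://master-host:7077") // placeholder master URL
  .setAppName("My Awesome App")
  .setJars(libs)
val sc = new SparkContext(conf)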
I hope this helps. I should note that I don't use play run very much, at least
not for when I'm actually executing Spark jobs. So I'm not sure if this
integrates properly with that. I have unit tests which execute on Spark and
have executed the dist package both locally and on a cluster. To make working
with the dist locally easier, I wrote myself a little shell script to unzip and
run the dist.
On Wed, Oct 15, 2014 at 10:51 PM, Mohammed Guller
<[email protected]> wrote:
Hi –
Has anybody figured out how to integrate a Play application with Spark and run
it on a Spark cluster using the spark-submit script? I have seen some blogs
about creating a simple Play app and running it locally on a dev machine with
the sbt run command. However, those steps don't work with spark-submit.
If you have figured out how to build and run a Play app with spark-submit, I
would appreciate it if you could share the steps and the sbt settings for your
Play app.
Thanks,
Mohammed
--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: [email protected] W: www.velos.io
--
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io
W: www.velos.io