Running spark-submit from a remote machine using a YARN application

2014-12-11 Thread ryaminal
We are trying to submit a Spark application from a Tomcat application running
our business logic. The Tomcat app lives in a separate, non-Hadoop cluster.
We first did this by using the spark-yarn package to call Client#runApp()
directly, but found that the API we were using is being made private in
future Spark releases.
 
Now our solution is to make a very simple YARN application whose launch
command is

  spark-submit --master yarn-cluster s3n://application/jar.jar

This seemed simple and elegant, but it has some weird issues: we get
NoClassDefFoundErrors. When we ssh to the box and run the same spark-submit
command by hand, it works, but launching it through YARN leads to the
NoClassDefFoundErrors mentioned.
 
Also, comparing the environment and Java properties between the working and
broken runs, we find that they have different Java classpaths. So weird...
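(We compared the two by dumping the classpath and environment on both sides;
a trivial diagnostic along these lines, nothing Spark-specific:)

  // Print the effective classpath and environment so the working and
  // broken runs can be diffed.
  public class EnvDump {
    public static void main(String[] args) {
      System.out.println(System.getProperty("java.class.path"));
      for (java.util.Map.Entry<String, String> e : System.getenv().entrySet()) {
        System.out.println(e.getKey() + "=" + e.getValue());
      }
    }
  }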
 
Has anyone had this problem or know a solution? Our code for creating the
YARN application is very simple; a rough sketch of the idea follows (not our
exact code, but the same shape, using the Hadoop 2.x YarnClient API):
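
  import java.util.Collections;
  import org.apache.hadoop.yarn.api.ApplicationConstants;
  import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.YarnClient;
  import org.apache.hadoop.yarn.client.api.YarnClientApplication;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;
  import org.apache.hadoop.yarn.util.Records;

  public class SparkSubmitWrapper {
    public static void main(String[] args) throws Exception {
      YarnClient yarn = YarnClient.createYarnClient();
      yarn.init(new YarnConfiguration());
      yarn.start();

      YarnClientApplication app = yarn.createApplication();
      ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
      ctx.setApplicationName("spark-submit-wrapper");

      // The AM container does nothing but shell out to spark-submit.
      ContainerLaunchContext am = Records.newRecord(ContainerLaunchContext.class);
      am.setCommands(Collections.singletonList(
          "spark-submit --master yarn-cluster s3n://application/jar.jar"
          + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
          + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
      ctx.setAMContainerSpec(am);

      Resource cap = Records.newRecord(Resource.class);
      cap.setMemory(512);       // memory and cores here are illustrative
      cap.setVirtualCores(1);
      ctx.setResource(cap);

      yarn.submitApplication(ctx);
    }
  }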
 
Thanks!





Re: Calling spark from a java web application.

2014-12-01 Thread ryaminal
If you are able to use YARN in your Hadoop cluster, then the following
technique is pretty straightforward:
http://blog.sequenceiq.com/blog/2014/08/22/spark-submit-in-java/

We use this in our system and it's super easy to execute from our Tomcat
application.
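
The gist of the technique in that post is roughly the following sketch (the
app name, jar path, and class name are placeholders; note it relies on
org.apache.spark.deploy.yarn.Client, which Spark is making private in future
releases, as mentioned elsewhere on this list):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.spark.SparkConf;
  import org.apache.spark.deploy.yarn.Client;
  import org.apache.spark.deploy.yarn.ClientArguments;

  public class SubmitFromWebApp {
    public static void main(String[] args) {
      String[] clientArgs = new String[] {
          "--name", "from-tomcat",               // placeholder app name
          "--jar", "hdfs:///apps/analytics.jar", // placeholder app jar
          "--class", "com.example.Main"          // placeholder main class
      };
      System.setProperty("SPARK_YARN_MODE", "true");
      SparkConf sparkConf = new SparkConf();
      ClientArguments cArgs = new ClientArguments(clientArgs, sparkConf);
      // Blocks until the YARN application finishes.
      new Client(cArgs, new Configuration(), sparkConf).run();
    }
  }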






Multiple Applications (Spark Contexts) Concurrently Fail With Broadcast Error

2014-11-07 Thread ryaminal
We are unable to run more than one application at a time using Spark 1.0.0 on
CDH5. We submit two applications using two different SparkContexts on the
same Spark Master. The Spark Master was started using the following command
and parameters and is running in standalone mode:

  /usr/java/jdk1.7.0_55-cloudera/bin/java \
    -XX:MaxPermSize=128m \
    -Djava.net.preferIPv4Stack=true \
    -Dspark.akka.logLifecycleEvents=true \
    -Xms8589934592 -Xmx8589934592 \
    org.apache.spark.deploy.master.Master --ip ip-10-186-155-45.ec2.internal
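
For context, each application builds its own SparkContext against this
master, roughly like the following (the app name and spark.cores.max cap are
illustrative, not our exact settings; without such a cap, standalone
scheduling gives the first application every available core):

  import org.apache.spark.SparkConf;
  import org.apache.spark.api.java.JavaSparkContext;

  public class AppOne {
    public static void main(String[] args) {
      SparkConf conf = new SparkConf()
          .setMaster("spark://ip-10-186-155-45.ec2.internal:7077")
          .setAppName("app-one")         // the second app differs only in name
          .set("spark.cores.max", "4");  // cap cores so both apps can run
      JavaSparkContext sc = new JavaSparkContext(conf);
      // ... job logic ...
      sc.stop();
    }
  }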

When submitting this application by itself, it finishes and all of the data
comes out happy. The problem occurs when we try to run a second application
while the first is still processing: we get an error stating that the spark
contexts were shut down prematurely.

The errors can be viewed in the following pastebins. All IP addresses have
been changed to 1.1.1.1 for security reasons. Note that at the top of the
logs we have printed out the Spark config for reference.

  Working logs: http://pastebin.com/CnitnMhy
  Broken logs: http://pastebin.com/VGs87bBZ

We have also included the worker logs. For the second app, we see 7
additional directories in the work/app/ directory: `0/ 1/ 2/ 3/ 4/ 5/ 6/`.
There are two different groups of errors: the first three directories form
one group and the other four form the second.

  Worker log for broken app, group 1: http://pastebin.com/7VwZ1Gwu
  Worker log for broken app, group 2: http://pastebin.com/shs4d8T4
  Worker log for working app: available upon request
The two different errors, at the last line of each group, are:

  Received LaunchTask command but executor was null

  Slave registration failed: Duplicate executor ID: 4

tl;dr: We are unable to run more than one application at a time on the same
Spark master using different spark contexts. The only errors we see are
broadcast errors.




Re: application as a service

2014-08-17 Thread ryaminal
You can also look into using Ooyala's job server at
https://github.com/ooyala/spark-jobserver

This already has a Spray REST server built in that lets you do what has
already been explained above. Sounds like it should solve your problem.
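
For a taste of the API, submitting a job is just an HTTP POST; a minimal
sketch, assuming the default localhost:8090 endpoint and a placeholder job
class (see the spark-jobserver README for the actual routes and config
syntax):

  import java.io.OutputStream;
  import java.net.HttpURLConnection;
  import java.net.URL;

  public class JobServerClient {
    public static void main(String[] args) throws Exception {
      // POST /jobs with appName/classPath query params; the request body is
      // the job's input config. App name and class are placeholders.
      URL url = new URL("http://localhost:8090/jobs"
          + "?appName=myapp&classPath=com.example.MyJob&sync=true");
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      conn.setRequestMethod("POST");
      conn.setDoOutput(true);
      try (OutputStream os = conn.getOutputStream()) {
        os.write("input.string = hello".getBytes("UTF-8"));
      }
      System.out.println("HTTP " + conn.getResponseCode());
    }
  }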

Enjoy!


