Spark-submit hangs indefinitely after job completion.

2016-05-24 Thread Pradeep Nayak
I have posted the same question on Stack Overflow: http://stackoverflow.com/questions/37421852/spark-submit-continues-to-hang-after-job-completion I am trying to test Spark 1.6 with HDFS in AWS. I am using the wordcount Python example available in the examples folder. I submit the job with spark

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
Thanks, It worked !!! On Tue, May 24, 2016 at 1:14 AM, Marcelo Vanzin wrote: > On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani > wrote: > > I am passing hive-site.xml through --files option. > > You need hive-site.xml in Spark's classpath

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Marcelo Vanzin
On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani wrote: > I am passing hive-site.xml through --files option. You need hive-site.xml in Spark's classpath too. Easiest way is to copy / symlink hive-site.xml in your Spark's conf directory. -- Marcelo
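A minimal sketch of that fix, assuming the Hive client config lives in /etc/hive/conf and SPARK_HOME points at the Spark install (both paths illustrative): ln -s /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/hive-site.xml -- with the file in Spark's conf directory it lands on the driver's classpath, which --files alone does not guarantee in yarn-cluster mode.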

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
Thanks Doug, I already have all four configs you mentioned in my hive-site.xml. Do I need to create a hive-site.xml in the Spark conf directory (it is not there by default in 1.6.1)? Please suggest. On Mon, May 23, 2016 at 9:53 PM, Doug Balog wrote: > I have a

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Doug Balog
I have a custom hive-site.xml for Spark in Spark's conf directory. These properties are the minimal ones that you need for Spark, I believe. hive.metastore.kerberos.principal = copy from your hive-site.xml, i.e. "hive/_h...@foo.com" hive.metastore.uris = copy from your hive-site.xml, i.e.

Re: Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Ted Yu
Can you describe the kerberos issues in more detail ? Which release of YARN are you using ? Cheers On Mon, May 23, 2016 at 4:41 AM, Chandraprakash Bhagtani < cpbhagt...@gmail.com> wrote: > Hi, > > My Spark job is failing with kerberos issues while creating hive context > in yarn-cluster mode.

Spark job is failing with kerberos error while creating hive context in yarn-cluster mode (through spark-submit)

2016-05-23 Thread Chandraprakash Bhagtani
Hi, My Spark job is failing with Kerberos issues while creating the hive context in yarn-cluster mode. However, it runs fine in yarn-client mode. My Spark version is 1.6.1. I am passing hive-site.xml through the --files option. I tried searching online and found that the same issue is fixed with the

Re: How to use the spark submit script / capability

2016-05-15 Thread John Trengrove
Assuming you are referring to running SparkSubmit.main programmatically; otherwise read this [1]. I can't find any scaladocs for org.apache.spark.deploy.* but Oozie's [2] example of using SparkSubmit is pretty comprehensive. [1] http://spark.apache.org/docs/latest/submitting-applications.html [2]

Re: How to use the spark submit script / capability

2016-05-15 Thread Marcelo Vanzin
> Jiahongchao <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ihavenoemail%40163.com> added a comment - 28/Dec/15 03:51 > > Where is the official document?

Re: How to use the spark submit script / capability

2016-05-15 Thread Stephen Boesch
2016-05-15 12:04 GMT-07:00 Marcelo Vanzin <van...@cloudera.com>: > I don't understand your question. The PR you mention is not about > spark-submit. > > If you want help with spark-submit, check the Spark docs or "spark-submit > -h". > > If you want help with the library added in th

Re: How to use the spark submit script / capability

2016-05-15 Thread Marcelo Vanzin
I don't understand your question. The PR you mention is not about spark-submit. If you want help with spark-submit, check the Spark docs or "spark-submit -h". If you want help with the library added in the PR, check Spark's API documentation. On Sun, May 15, 2016 at 9:33 AM, Step

How to use the spark submit script / capability

2016-05-15 Thread Stephen Boesch
There is a committed PR from Marcelo Vanzin addressing that capability: https://github.com/apache/spark/pull/3916/files Is there any documentation on how to use this? The PR itself has two comments asking for the docs that were not answered.
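For reference, that PR added the org.apache.spark.launcher library. A minimal sketch of using it (app jar, class name, and master are purely illustrative; treat this as an unofficial example, not the missing docs):

  import org.apache.spark.launcher.SparkLauncher

  object LaunchExample {
    def main(args: Array[String]): Unit = {
      // Build the equivalent of a spark-submit command line programmatically
      val process = new SparkLauncher()
        .setAppResource("/path/to/app.jar")   // illustrative path
        .setMainClass("com.example.Main")     // illustrative class
        .setMaster("local[*]")
        .launch()
      val exitCode = process.waitFor()        // block until the launched job's JVM exits
      println("spark-submit exited with code " + exitCode)
    }
  }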

Re: High virtual memory consumption on spark-submit client.

2016-05-13 Thread jone
No, I have set master to yarn-cluster. When SparkPi is running, the result of free -t is as follows: [running]mqq@10.205.3.29:/data/home/hive/conf$ free -t total used free shared buffers cached Mem: 32740732 32105684 635048 0 683332

Re: High virtual memory consumption on spark-submit client.

2016-05-12 Thread Harsh J
How many CPU cores are on that machine? Read http://qr.ae/8Uv3Xq You can also confirm the above by running the pmap utility on your process and most of the virtual memory would be under 'anon'. On Fri, 13 May 2016 09:11 jone, wrote: > The virtual memory is 9G When i run

Re: High virtual memory consumption on spark-submit client.

2016-05-12 Thread Mich Talebzadeh
Can you please do the following: run jps | grep SparkSubmit to find the PID, then send the output of ps aux | grep <pid> and top -p <PID>, and the output of free. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

High virtual memory consumption on spark-submit client.

2016-05-12 Thread jone
The virtual memory is 9G when I run org.apache.spark.examples.SparkPi in yarn-cluster mode with default configurations. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

Re: Error while running jar using spark-submit on another machine

2016-05-03 Thread nsalian
Thank you for the question. What is different on this machine as compared to the ones where the job succeeded? - Neelesh S. Salian Cloudera -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-while-running-jar-using-spark-submit-on-another-machine

spark-submit not adding application jar to classpath

2016-04-18 Thread Saif.A.Ellafi
Hi, I am submitting a jar file with spark-submit which has some content inside src/main/resources. I am unable to access those resources, since the application jar is not being added to the classpath. This works fine if I also include the application jar in the --driver-class-path entry
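A sketch of the workaround this post describes (class and jar names illustrative): spark-submit --class com.example.Main --driver-class-path my-app.jar my-app.jar -- naming the application jar on --driver-class-path as well makes its src/main/resources content reachable via getClass.getResource on the driver.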

Re: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-14 Thread Andrei
…SparkContext initialization later. > > So generally for yarn-client, maybe you can skip spark-submit and directly > launch the spark application with some configuration setup before new > SparkContext. > > Not sure about your error, have you set up YARN_CONF_DIR? >

RE: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-13 Thread Sun, Rui
will be used in SparkContext initialization later. So generally for yarn-client, maybe you can skip spark-submit and directly launch the spark application with some configuration setup before new SparkContext. Not sure about your error, have you set up YARN_CONF_DIR? From: Andrei [mailto:faithlessfri
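A minimal sketch of that direct-launch idea for yarn-client on the Spark 1.x API (assumes YARN_CONF_DIR/HADOOP_CONF_DIR are exported so the application can locate the cluster; app name illustrative):

  import org.apache.spark.{SparkConf, SparkContext}

  object DirectLaunch {
    def main(args: Array[String]): Unit = {
      // Settings that spark-submit would otherwise inject via the command line
      val conf = new SparkConf().setMaster("yarn-client").setAppName("direct-launch")
      val sc = new SparkContext(conf)
      println(sc.parallelize(1 to 10).sum()) // trivial sanity check
      sc.stop()
    }
  }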

Re: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-13 Thread Andrei
…concentrate on the following question: how does submitting an application via `spark-submit` with "yarn-client" mode differ from setting the same mode directly in `SparkConf`? On Wed, Apr 13, 2016 at 5:06 AM, Sun, Rui <rui@intel.com> wrote: > Spark configurations specified at the c

RE: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-12 Thread Sun, Rui
Spark configurations specified at the command line for spark-submit should be passed to the JVM inside Julia process. You can refer to https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L267 and https://github.com/apache

Re: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-12 Thread Andrei
> > One part is passing the command line options, like “--master”, from the > JVM launched by spark-submit to the JVM where SparkContext resides Since I have full control over both - JVM and Julia parts - I can pass whatever options to both. But what exactly should be passed? Currently

RE: How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-12 Thread Sun, Rui
One part is passing the command line options, like “--master”, from the JVM launched by spark-submit to the JVM where SparkContext resides, in the case that the two JVMs are not the same. For pySpark & SparkR, when running scripts in client deployment modes (standalone client and yarn cl

How does spark-submit handle Python scripts (and how to repeat it)?

2016-04-11 Thread Andrei
via `spark-submit`. In `SparkSubmit` class I see that for Python a special class `PythonRunner` is launched, so I tried to do similar `JuliaRunner`, which essentially does the following: val pb = new ProcessBuilder(Seq("julia", juliaScript)) val process = pb.start() process.waitFor
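A runnable sketch of that runner idea (assumes julia is on the PATH and the script path arrives as the first program argument; this is an illustration, not SparkSubmit's actual plumbing):

  import scala.collection.JavaConverters._

  object JuliaRunner {
    def main(args: Array[String]): Unit = {
      val juliaScript = args(0)
      // java.lang.ProcessBuilder expects a java.util.List, hence asJava
      val pb = new ProcessBuilder(Seq("julia", juliaScript).asJava)
      pb.inheritIO()                      // surface the child's stdout/stderr in this console
      val exitCode = pb.start().waitFor() // wait for the Julia script to finish
      sys.exit(exitCode)
    }
  }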

Re: println not appearing in libraries when running job using spark-submit --master local

2016-03-28 Thread Kevin Peng
main(args: Array[String]) { >> ... >> SocialUtil.triggerAndWait(triggerUrl) >> ... >> >> The SocialUtil object is included in a seperate jar. I launched the >> spark-submit command using --jars passing the SocialUtil jar. Inside the >> trigg

Re: println not appearing in libraries when running job using spark-submit --master local

2016-03-28 Thread Ted Yu
main(args: Array[String]) { > ... > SocialUtil.triggerAndWait(triggerUrl) > ... > > The SocialUtil object is included in a seperate jar. I launched the > spark-submit command using --jars passing the SocialUtil jar. Inside the > triggerAndWait function I have a println stat

println not appearing in libraries when running job using spark-submit --master local

2016-03-28 Thread kpeng1
Hi All, I am currently trying to debug a spark application written in Scala. I have a main method: def main(args: Array[String]) { ... SocialUtil.triggerAndWait(triggerUrl) ... The SocialUtil object is included in a separate jar. I launched the spark-submit command using --jars

Re: spark-submit reset JVM

2016-03-20 Thread Ted Yu
Not that I know of. Can you be a little more specific on which JVM(s) you want restarted (assuming spark-submit is used to start a second job) ? Thanks On Sun, Mar 20, 2016 at 6:20 AM, Udo Fholl <udofholl1...@gmail.com> wrote: > Hi all, > > Is there a way for spark-submit to

spark-submit reset JVM

2016-03-20 Thread Udo Fholl
Hi all, Is there a way for spark-submit to restart the JVM in the worker machines? Thanks. Udo.

Re: Hive query works in spark-shell not spark-submit

2016-03-14 Thread Mich Talebzadeh
…unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect.foreach(println) println("\n Running the query \n") val rs = HiveContext.sql("show databases") rs.collect.foreach(println) println ("\nFinished at"); sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'd

Hive query works in spark-shell not spark-submit

2016-03-14 Thread rhuang
Hi all, I have several Hive queries that work in spark-shell, but they don't work in spark-submit. In fact, I can't even show all databases. The following works in spark-shell: import org.apache.spark._ import org.apache.spark.sql._ object ViewabilityFetchInsertDailyHive { def main
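A minimal runnable sketch of the difference: spark-shell builds a Hive-aware context for you, while a submitted application must create its own HiveContext (Spark 1.x API; hive-site.xml must be on the driver classpath, e.g. in Spark's conf directory):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  object ShowDatabases {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("ShowDatabases"))
      val hiveContext = new HiveContext(sc) // a plain SQLContext would not see the metastore
      hiveContext.sql("show databases").collect().foreach(println)
      sc.stop()
    }
  }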

Spark-submit, Spark 1.6, how to get status of Job?

2016-03-14 Thread Emmanuel
Hello, when I used to submit a job with Spark 1.4, it would return a job ID and a status (RUNNING, FAILED, or something like this). I just upgraded to 1.6 and there is no status returned by spark-submit. Is there a way to get this information back? When I submit a job I want to know which one it

spark-submit returns nothing with spark 1.6

2016-03-12 Thread Emmanuel
Hello, when I used to submit a job with Spark 1.4, it would return a job ID and a status (RUNNING, FAILED, or something like this). I just upgraded to 1.6 and there is no status returned by spark-submit. Is there a way to get this information back? When I submit a job I want to know which one it
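One avenue worth noting (hedged; applies to standalone/Mesos cluster deploy mode only): spark-submit itself accepts --status <submission-id> to query a previously submitted driver and --kill <submission-id> to stop it, e.g. spark-submit --master spark://<master-url> --status <submission-id>.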

spark-submit with cluster deploy mode fails with ClassNotFoundException (jars are not passed around properly?)

2016-03-11 Thread Hiroyuki Yamada
Hi, I am trying to work with spark-submit in cluster deploy mode on a single node, but I keep getting ClassNotFoundException as shown below (in this case, snakeyaml.jar is not found by the spark cluster). === 16/03/12 14:19:12 INFO Remoting: Starting remoting 16/03/12 14:19:12 INFO Remoting

Spark Submit: Convert to Marathon REST API

2016-03-01 Thread Ashish Soni
Hi All, Can someone please help me translate the spark-submit invocation below into a Marathon JSON request? docker run -it --rm -e SPARK_MASTER="mesos://10.0.2.15:5050" -e SPARK_IMAGE="spark_driver:latest" spark_driver:latest /opt/spark/bin/spark-submit --name

How to add a typesafe config file which is located on HDFS to spark-submit (cluster-mode)?

2016-02-22 Thread Johannes Haaß
Hi, I have a Spark (Spark 1.5.2) application that streams data from Kafka to HDFS. My application contains two Typesafe config files to configure certain things like Kafka topic etc. Now I want to run my application with spark-submit (cluster mode) in a cluster. The jar file with all dependencies
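A commonly used sketch for this situation (not from this thread's resolution; file names illustrative): ship the config with --files and point Typesafe Config at the localized copy, e.g. spark-submit --master yarn-cluster --files hdfs:///conf/app.conf --conf "spark.driver.extraJavaOptions=-Dconfig.file=app.conf" app-assembly.jar -- in cluster mode YARN localizes --files into each container's working directory, so the bare file name resolves on the driver.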

spark-submit: remote protocol vs --py-files

2016-02-12 Thread Jeff Henrikson
Spark users, I am testing different cluster spinup and batch submission jobs. Using the sequenceiq/spark docker package, I have succeeded in submitting "fat egg" (analogous to "fat jar") style python code remotely over YARN. spark-submit --py-files is able to transm

Re: Spark Submit

2016-02-12 Thread Ashish Soni
It works as below: spark-submit --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml" --conf spark.executor.memory=512m Thanks all for the quick help. On Fri, Feb 12, 2016 at 10:59 AM, Diwakar Dhanuskodi < diwakar.dhanusk...@gmail.com> wrote: > Try >

Spark Submit

2016-02-12 Thread Ashish Soni
Hi All, How do I pass multiple configuration parameters with spark-submit? Please help, I am trying as below: spark-submit --conf "spark.executor.memory=512m spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml" Thanks,

Re: Spark Submit

2016-02-12 Thread Ted Yu
Have you tried specifying multiple '--conf key=value' ? Cheers On Fri, Feb 12, 2016 at 7:44 AM, Ashish Soni <asoni.le...@gmail.com> wrote: > Hi All , > > How do i pass multiple configuration parameter while spark submit > > Please help i am trying as below >

Re: Spark Submit

2016-02-12 Thread Jacek Laskowski
>> Hi All , >> >> How do i pass multiple configuration parameter while spark submit >> >> Please help i am trying as below >> >> spark-submit --conf "spark.executor.memory=512m >> spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml" >> >> Thanks, >> > >

Re: Spark Submit

2016-02-12 Thread Diwakar Dhanuskodi
Try  spark-submit  --conf "spark.executor.memory=512m" --conf "spark.executor.extraJavaOptions=x" --conf "Dlog4j.configuration=log4j.xml" Sent from Samsung Mobile. Original message From: Ted Yu <yuzhih...@gmail.com> Date:12/02/2016
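Putting the thread's working answer in general shape: pass one Spark property per --conf flag, quoting any value that contains spaces or -D options, e.g. spark-submit --conf spark.executor.memory=512m --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml" --class com.example.Main app.jar (class and jar names illustrative).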

Re: Fwd: how to submit multiple jar files when using spark-submit script in shell?

2016-01-12 Thread Jim Lohse
…wrote: Question is: Looking for all the ways to specify a set of jars using --jars on spark-submit. I know this is old but I am about to submit a proposed docs change on --jars, and I had an issue with --jars today. When this user

Fwd: how to submit multiple jar files when using spark-submit script in shell?

2016-01-11 Thread UMESH CHAUDHARY
…Question is: Looking for all the ways to specify a set of jars using --jars > on spark-submit > > I know this is old but I am about to submit a proposed docs change on > --jars, and I had an issue with --jars today > > When this user submitted the following command line, is that a prop

Re: how to submit multiple jar files when using spark-submit script in shell?

2016-01-11 Thread jiml
Question is: Looking for all the ways to specify a set of jars using --jars on spark-submit. I know this is old but I am about to submit a proposed docs change on --jars, and I had an issue with --jars today. When this user submitted the following command line, is that a proper way to reference
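For reference, --jars takes a single comma-separated list rather than repeated flags; a sketch with purely illustrative names: spark-submit --class com.example.Main --jars dep1.jar,dep2.jar app.jar (the application jar still goes last, as its own argument).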

How to compile Python and use spark-submit

2016-01-08 Thread Ascot Moss
Hi, Instead of using Spark-shell, does anyone know how to build .zip (or .egg) for Python and use Spark-submit to run? Regards

Re: How to compile Python and use spark-submit

2016-01-08 Thread Denny Lee
Per http://spark.apache.org/docs/latest/submitting-applications.html: For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files we recommend packaging them into a .zip or .egg
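A minimal sketch of that, with file names purely illustrative: zip -r deps.zip mypackage/ && spark-submit --py-files deps.zip main.py -- the zip is shipped with the job and placed on the Python path of the driver and executors.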

Re: Spark submit does automatically upload the jar to cluster?

2015-12-29 Thread jiml
…link to it: "spark-submit does not pass the JAR along to the Driver, but the Driver will pass it to the executors. I ended up putting the JAR in HDFS and passing an hdfs:// path to spark-submit. This is a subtle difference from Spark on YARN, which does pass the JAR along to the Driver automatical

Re: Spark submit does automatically upload the jar to cluster?

2015-12-28 Thread jiml
the spark-submit docs: I must admit, as a limitation on this, it confuses me in the Spark docs that for spark.executor.extraClassPath it says: "Users typically should not need to set this option." I assume they mean most people will get the classpath out through a driver config option. I know most

1.5.2 prebuilt for 2.4 spark-submit standalone Python scripts not running

2015-12-26 Thread peteranolaN
at PROVIDED that I remove the sc initialisation. Otherwise, if I try to run any Python script using spark-submit, I get the verbose error message I show below and no output. I am not able to fix this. Any assistance would be very gratefully received. My machine runs Windows 10 HOME, with 8GB ram

Re: spark-submit is ignoring "--executor-cores"

2015-12-22 Thread Siva
Thanks a lot Saisai and Zhan, I see DefaultResourceCalculator currently being used for Capacity scheduler. We will change it to DominantResourceCalculator. Thanks, Sivakumar Bhavanari. On Mon, Dec 21, 2015 at 5:56 PM, Zhan Zhang wrote: > BTW: It is not only a Yarn-webui

Re: spark-submit for dependent jars

2015-12-21 Thread Shixiong Zhu
…Kumar <mrajaf...@gmail.com> wrote: > >> Hi Jeff and Satish, >> >> I have modified script and executed. Please find below command >> >> ./spark-submit --master local --class test.Main --jars >> /home/user/download/jar/ojdbc7.jar >> /home//t

spark-submit is ignoring "--executor-cores"

2015-12-21 Thread Siva
Hi Everyone, Observing a strange problem while submitting a spark streaming job in yarn-cluster mode through spark-submit: all the executors are using only 1 vcore, irrespective of the value of the --executor-cores parameter. Are there any config parameters that override the --executor-cores value? Thanks

Re: spark-submit for dependent jars

2015-12-21 Thread Madabhattula Rajesh Kumar
Hi Jeff and Satish, I have modified the script and executed it. Please find the command below: ./spark-submit --master local --class test.Main --jars /home/user/download/jar/ojdbc7.jar /home//test/target/spark16-0.0.1-SNAPSHOT.jar Still I'm getting the same exception: Exception in thread "

Re: spark-submit for dependent jars

2015-12-21 Thread Jeff Zhang
Put /test/target/spark16-0.0.1-SNAPSHOT.jar as the last argument ./spark-submit --master local --class test.Main --jars /home/user/download/jar/ojdbc7.jar /test/target/spark16-0.0.1-SNAPSHOT.jar On Mon, Dec 21, 2015 at 9:15 PM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote:

Re: spark-submit for dependent jars

2015-12-21 Thread satish chandra j
Hi Rajesh, Could you please try giving your cmd as mentioned below: ./spark-submit --master local --class <main-class> --jars <dependent-jars> <application-jar> Regards, Satish Chandra On Mon, Dec 21, 2015 at 6:45 PM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi, > > How to add dependent jars in spark

Re: spark-submit for dependent jars

2015-12-21 Thread Jeff Zhang
Please make sure this is the correct JDBC URL: jdbc:oracle:thin:@<host>:1521:xxx On Mon, Dec 21, 2015 at 9:54 PM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi Jeff and Satish, > > I have modified script and executed. Please find below command > > ./spark-

Re: spark-submit is ignoring "--executor-cores"

2015-12-21 Thread Saisai Shao
…streaming job in > yarn-cluster mode through spark-submit. All the executors are using only 1 > Vcore irrespective value of the parameter --executor-cores. > > Are there any config parameters overrides --executor-cores value? > > Thanks, > Sivakumar Bhavanari. >

Re: spark-submit is ignoring "--executor-cores"

2015-12-21 Thread Saisai Shao
ated? >> >> Thanks >> Saisai >> >> On Tue, Dec 22, 2015 at 9:08 AM, Siva <sbhavan...@gmail.com> wrote: >> >>> Hi Everyone, >>> >>> Observing a strange problem while submitting spark streaming job in >>> yarn-cluster mode

Re: spark-submit is ignoring "--executor-cores"

2015-12-21 Thread Siva
>> Hi Everyone, >> >> Observing a strange problem while submitting spark streaming job in >> yarn-cluster mode through spark-submit. All the executors are using only 1 >> Vcore irrespective value of the parameter --executor-cores. >> >> Are there any config parameters overrides --executor-cores value? >> >> Thanks, >> Sivakumar Bhavanari. >> > >

Re: spark-submit is ignoring "--executor-cores"

2015-12-21 Thread Zhan Zhang
BTW: it is not only a YARN web UI issue. In the Capacity Scheduler, vcores are ignored. If you want YARN to honor vcore requests, you have to use DominantResourceCalculator as Saisai suggested. Thanks. Zhan Zhang On Dec 21, 2015, at 5:30 PM, Saisai Shao
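For reference, that switch lives in capacity-scheduler.xml: set yarn.scheduler.capacity.resource-calculator to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator and refresh the ResourceManager's queues, after which vcore requests such as --executor-cores count toward allocation.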

get parameters of spark-submit

2015-12-21 Thread Bonsen
1. I wrote my Scala class and packaged it (the HDFS files' paths are not hard-coded; I just use the paths from spark-submit's parameters). 2. Then, if I invoke it like this: ${SPARK_HOME}/bin/spark-submit --master <master> <application-jar> hdfs://<path1> hdfs://<path2> what should I do to get the two HDFS files' paths in my Scala class's c

spark-submit for dependent jars

2015-12-21 Thread Madabhattula Rajesh Kumar
Hi, How do I add dependent jars to the spark-submit command? For example: the Oracle JDBC jar. Could you please help me resolve this issue? I have a standalone cluster, one master and one slave. I have used the command below but it is not working: ./spark-submit --master local --class test.Main /test/target/spark16

Re: get parameters of spark-submit

2015-12-21 Thread Jeff Zhang
I don't understand your question. These parameters are passed to your program as args of the main function. On Mon, Dec 21, 2015 at 9:09 PM, Bonsen <hengbohe...@126.com> wrote: > 1.I code my scala class and pack.(not input the hdfs files' paths,just use > the paths from "spark-sub
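A minimal sketch of what that means for the original question (object name illustrative): everything placed after the application jar on the spark-submit line arrives in args:

  object TwoPaths {
    def main(args: Array[String]): Unit = {
      val firstPath  = args(0) // first hdfs:// URI given after the application jar
      val secondPath = args(1) // second hdfs:// URI
      println("input: " + firstPath + ", output: " + secondPath)
    }
  }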

Spark Submit - java.lang.IllegalArgumentException: requirement failed

2015-12-11 Thread Afshartous, Nick
Hi, I'm trying to run a streaming job on a single-node EMR 4.1/Spark 1.5 cluster. It's throwing an IllegalArgumentException right away on submit. Attaching full output from the console. Thanks for any insights. -- Nick 15/12/11 16:44:43 WARN util.NativeCodeLoader: Unable to load

spark-submit problems with --packages and --deploy-mode cluster

2015-12-11 Thread Greg Hill
I'm using Spark 1.5.0 with the standalone scheduler, and for the life of me I can't figure out why this isn't working. I have an application that works fine with --deploy-mode client that I'm trying to get to run in cluster mode so I can use --supervise. I ran into a few issues with my

Re: Spark Submit - java.lang.IllegalArgumentException: requirement failed

2015-12-11 Thread Jean-Baptiste Onofré
Hi Nick, the localizedPath must not be null; that's why the requirement fails. In the SparkConf used by spark-submit (default in conf/spark-defaults.conf), do you have all properties defined, especially spark.yarn.keytab? Thanks, Regards JB On 12/11/2015 05:49 PM, Afshartous, Nick

Re: Spark Submit - java.lang.IllegalArgumentException: requirement failed

2015-12-11 Thread Afshartous, Nick
…on spark-submit, then the error no longer appears. Just wondering if there's any difference between using client versus cluster mode if the submit is being done on the master node. Thanks for any suggestions, -- Nick From: Jean-Baptiste Onofré <

Getting ParquetDecodingException when I am running my spark application from spark-submit

2015-11-24 Thread Kapil Raaj
The relevant error lines are: Caused by: parquet.io.ParquetDecodingException: Can't read value in column [roll_key] BINARY at value 19600 out of 4814, 19600 out of 19600 in currentPage. repetition level: 0, definition level: 1 Caused by: org.apache.spark.SparkException: Job aborted due to stage

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-22 Thread Ted Yu
…running on your machine? >>> >>> Thanks >>> >>> On Nov 20, 2015, at 6:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: >>> >>> Hi, >>> I'm seeing a strange problem. I have a spark cluster in standalone mode. >>> I submit spark j

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-22 Thread Sudhanshu Janghel
…pastebin the stack trace of the process running on your machine? >>>> >>>> Thanks >>>> >>>>> On Nov 20, 2015, at 6:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: >>>>> >>>>> Hi, >>>>> I'm seei

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-22 Thread Ted Yu
…at 6:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: >> >> Hi, >> I'm seeing a strange problem. I have a spark cluster in standalone mode. >> I submit spark jobs from a remote node as follows from the terminal >> >> spark-submit --master spark://10.1.40.18:7077 -

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Ted Yu
…shutdown when spark stopped > > I hope this helps > > Stephane > > On Fri, Nov 20, 2015 at 7:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: > >> Hi, >> I'm seeing a strange problem. I have a spark cluster in standalone mode. >> I submit s

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Vikram Kone
…On Nov 20, 2015, at 6:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: > > Hi, > I'm seeing a strange problem. I have a spark cluster in standalone mode. I > submit spark jobs from a remote node as follows from the terminal > > spark-submit --master spark://10.1

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Vikram Kone
…<vikramk...@gmail.com> wrote: > > Hi, > I'm seeing a strange problem. I have a spark cluster in standalone mode. I > submit spark jobs from a remote node as follows from the terminal > > spark-submit --master spark://10.1.40.18:7

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Stéphane Verlet
> Why does CTRL-C in the terminal running spark-submit kill the app in > spark master correctly w/o any explicit shutdown hooks in the code? Can you > explain why we need to add the shutdown hook to kill it when launched via a > shell script? > For the second issue, I'm not using any thre

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread varun sharma
>> On Nov 20, 2015, at 6:46 PM, Vikram Kone <vikramk...@gmail.com> wrote: >> >> Hi, >> I'm seeing a strange problem. I have a spark cluster in standalone mode. >> I submit spark jobs from a remote node as follows from the terminal >> >> spark-sub

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Stéphane Verlet
from the terminal > > spark-submit --master spark://10.1.40.18:7077 --class com.test.Ping > spark-jobs.jar > > when the app is running , when I press ctrl-C on the console terminal, > then the process is killed and so is the app in the spark master UI. When I > go to spark mas

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Vikram Kone
Thanks for the info, Stephane. Why does CTRL-C in the terminal running spark-submit kill the app in spark master correctly w/o any explicit shutdown hooks in the code? Can you explain why we need to add the shutdown hook to kill it when launched via a shell script? For the second issue, I'm

Re: How to kill spark applications submitted using spark-submit reliably?

2015-11-20 Thread Ted Yu
I > submit spark jobs from a remote node as follows from the terminal > > spark-submit --master spark://10.1.40.18:7077 --class com.test.Ping > spark-jobs.jar > > when the app is running , when I press ctrl-C on the console terminal, then > the process is killed and so is th

Re: spark-submit is throwing NPE when trying to submit a random forest model

2015-11-19 Thread Joseph Bradley
I have a random forest model that I am trying to load during streaming using > following code. The code is working fine when I am running the code from > Eclipse but getting NPE when running the code using spark-submit. > > JavaStreamingContext jssc = new JavaStreamingCon

spark-submit is throwing NPE when trying to submit a random forest model

2015-11-19 Thread Rachana Srivastava
Issue: I have a random forest model that I am trying to load during streaming using the following code. The code works fine when I run it from Eclipse, but I get an NPE when running it using spark-submit. JavaStreamingContext jssc = new JavaStreamingContext(jsc

Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Sonal, SparkPi couldn't run either; it stuck to the screen with no output. hadoop-user@yks-hadoop-m01:/usr/local/spark$ ./bin/run-example SparkPi On Tue, Nov 17, 2015 at 12:22 PM, Steve Loughran wrote: > 48 hours is one of those kerberos warning times (as is 24h, 72h and 7

Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Our hadoop NFS Gateway seems to have been malfunctioning. I restarted it, and spark jobs have now resumed successfully. Problem solved.

Re: spark-submit stuck and no output in console

2015-11-17 Thread Sonal Goyal
I would suggest a couple of things to try A. Try running the example program with master as local[*]. See if spark can run locally or not. B. Check spark master and worker logs. C. Check if normal hadoop jobs can be run properly on the cluster. D. Check spark master webui and see health of

Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Anyone experienced this issue as well? On Mon, Nov 16, 2015 at 8:06 PM, Kayode Odeyemi wrote: > > Or are you saying that the Java process never even starts? > > > Exactly. > > Here's what I got back from jstack as expected: > > hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$

Re: spark-submit stuck and no output in console

2015-11-17 Thread Sonal Goyal
> My env is a YARN cluster made of 7 nodes (6 datanodes/ > node manager, 1 namenode/resource manager). > > On the namenode, is where I executed the spark-submit job while on one of > the datanodes, I executed 'hadoop fs -put /binstore /user/hadoop-user/' > to dump 1TB of data int

Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Thanks for the reply, Sonal. I'm on JDK 7 (/usr/lib/jvm/java-7-oracle). My env is a YARN cluster made of 7 nodes (6 datanodes/node managers, 1 namenode/resource manager). The namenode is where I executed the spark-submit job, while on one of the datanodes I executed 'hadoop fs -put /binstore

Re: spark-submit stuck and no output in console

2015-11-17 Thread Sonal Goyal
Could it be jdk related ? Which version are you on? Best Regards, Sonal Founder, Nube Technologies Reifier at Strata Hadoop World Reifier at Spark Summit 2015

Re: spark-submit stuck and no output in console

2015-11-17 Thread Steve Loughran
On 17 Nov 2015, at 09:54, Kayode Odeyemi wrote: Initially, I submitted 2 jobs to the YARN cluster which ran for 2 days and suddenly stopped. Nothing in the logs shows the root cause. 48 hours is one of those kerberos warning times (as is

Re: spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
> On Mon, Nov 16, 2015 at 5:50 AM, Kayode Odeyemi <drey...@gmail.com> wrote: > >> ./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g >> ~/migration-profiles-0.1-SNAPSHOT.jar >> >> is stuck and outputs nothing to the console. >

Re: spark-submit stuck and no output in console

2015-11-16 Thread Ted Yu
Which release of Spark are you using ? Can you take stack trace and pastebin it ? Thanks On Mon, Nov 16, 2015 at 5:50 AM, Kayode Odeyemi <drey...@gmail.com> wrote: > ./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g > ~/migration-profiles-0.1-SNAPSHOT.jar

Re: spark-submit stuck and no output in console

2015-11-16 Thread Jonathan Kelly
p "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@" > > > On Mon, Nov 16, 2015 at 5:22 PM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Which release of Spark are you using ? >> >> Can you take stack trace and pastebin it ? >> >>

spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g ~/migration-profiles-0.1-SNAPSHOT.jar is stuck and outputs nothing to the console. What could be the cause of this? Current max heap size is 1.75g and it's only using 1g.

Re: spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
> Or are you saying that the Java process never even starts? Exactly. Here's what I got back from jstack as expected: hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 31316: Unable to open socket file: target process not responding or HotSpot VM not loaded The -F option can be

Re: sqlCtx.sql('some_hive_table') works in pyspark but not spark-submit

2015-11-08 Thread Deng Ching-Mallete
…invokeMethod(AbstractCommand.java:133) > at py4j.commands.CallCommand.execute(CallCommand.java:79) > at py4j.GatewayConnection.run(GatewayConnection.java:207) > at java.lang.Thread.run(Thread.java:745) > > False > Traceback (most recent call last): > File "

sqlCtx.sql('some_hive_table') works in pyspark but not spark-submit

2015-11-07 Thread YaoPau
…Traceback (most recent call last): File "/home/me/pyspark/pyspark_library_walkthrough.py", line 46, in print row_objects[0].dma_code -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/sqlCtx-sql-some-hive-table-works-in-pyspark-but-not-spark-submit-tp2

[jira] Ankit shared "SPARK-11213: Documentation for remote spark Submit for R Scripts from 1.5 on CDH 5.4" with you

2015-10-22 Thread Ankit (JIRA)
Ankit shared an issue with you: "Documentation for remote spark Submit for R Scripts from 1.5 on CDH 5.4" (Key: SPARK-11213)
