Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
Thanks, Corey, for your answer. Do you mean that final status: SUCCEEDED in the terminal logs means that the YARN RM could clean up the resources after the application finished (an application finishing does not necessarily mean it succeeded or failed)? With that logic it totally makes sense. Basically the

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Marcelo Vanzin
This might be an issue with how pyspark propagates the error back to the AM. I'm pretty sure this does not happen for Scala / Java apps. Have you filed a bug?

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Marcelo Vanzin
BTW, this is most probably caused by this line in PythonRunner.scala: System.exit(process.waitFor()). The YARN backend doesn't like applications calling System.exit().
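For readers unfamiliar with the pattern Marcelo is pointing at, here is a minimal, self-contained Java analogue (an assumption-laden sketch, not the actual PythonRunner.scala code; it assumes a POSIX `true` binary on the PATH):

```java
import java.io.IOException;

public class ExitPropagation {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Launch a trivial child process; POSIX `true` exits with code 0.
        Process child = new ProcessBuilder("true").start();
        int code = child.waitFor();
        System.out.println("child exit code: " + code);
        // This is the flagged pattern: terminating the JVM abruptly with the
        // child's exit code. Inside a YARN AM, an abrupt System.exit() can
        // skip the normal unregister path, so the RM may not learn that the
        // user code actually failed.
        System.exit(code);
    }
}
```

The sketch only reproduces the shape of the problem: the exit code is propagated, but nothing between `waitFor()` and `System.exit()` tells YARN what happened.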

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
But then how can we programmatically (in Java) find out whether the job is making progress? Or whether the job has failed or succeeded? Is looking at the application log files the only way of knowing the job's final status (failed/succeeded)? Because when a job fails, the Job History Server does not have much info about
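One common approach to this question is polling the ResourceManager for the application report until a terminal state is reached. Below is a minimal sketch of that polling loop in plain Java; the `fetchState` iterator is a hypothetical stand-in for repeated calls to a real client such as YARN's YarnClient.getApplicationReport(), so only the loop structure is the point:

```java
import java.util.Iterator;
import java.util.List;

public class StatusPoller {
    // Local enum mirroring YARN's YarnApplicationState values (simplified).
    enum AppState { SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Poll until the application reaches a terminal state, then return it.
    static AppState pollUntilDone(Iterator<AppState> fetchState) {
        AppState state = AppState.SUBMITTED;
        while (fetchState.hasNext()) {
            state = fetchState.next();
            System.out.println("current state: " + state);
            if (state == AppState.FINISHED || state == AppState.FAILED
                    || state == AppState.KILLED) {
                break;
            }
            // A real poller would Thread.sleep(...) between RM queries here.
        }
        return state;
    }

    public static void main(String[] args) {
        // Stub sequence standing in for successive ResourceManager reports.
        List<AppState> reports = List.of(
                AppState.ACCEPTED, AppState.RUNNING, AppState.FINISHED);
        System.out.println("final state: " + pollUntilDone(reports.iterator()));
    }
}
```

Note that, as this thread shows, the terminal *state* alone is not enough: the application report's separate final status field is what distinguishes success from failure.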

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
Thanks a lot for the feedback, Marcelo. I've just filed a bug: SPARK-9416 (https://issues.apache.org/jira/browse/SPARK-9416).

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-28 Thread Elkhan Dadashov
I run Spark in yarn-cluster mode, and yes, log aggregation is enabled. In the YARN aggregated logs I can see the job status correctly. The issue is that the YARN client log (which is written to stdout in the terminal) states that the job has succeeded even though the job has failed. As the user is not testing whether the YARN RM

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-27 Thread Elkhan Dadashov
Any updates on this bug? Why do the Spark logs not agree on the job's final status (one saying that the job has failed, the other stating that it has succeeded)? Thanks.

Re: [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-27 Thread Corey Nolet
Elkhan, What does the ResourceManager say about the final status of the job? Spark jobs that run as YARN applications can fail but still successfully clean up their resources and give them back to the YARN cluster. Because of this, there's a difference between your code throwing an exception in

[ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode

2015-07-23 Thread Elkhan Dadashov
Hi all, While running the Spark word count Python example with an intentional mistake in *Yarn cluster mode*, the Spark terminal states the final status as SUCCEEDED, but the log files show the correct result, indicating that the job failed. Why do the terminal log output and the application log output contradict each other? If