Spark Job/Stage names

2015-09-29 Thread Nithin Asokan
I'm interested to know whether anyone has found a way to set custom
job/stage names for a Spark application.

I believe I can use *sparkContext.setCallSite(String)* to update job/stage
names, but it does not let me set each stage name individually; setting this
value applies the same text to every job and stage name in the application.
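
For context, here is a minimal sketch of what I mean (assuming nothing
beyond the public setCallSite/clearCallSite methods on SparkContext); every
job triggered while a call site is set picks up that same label:

import org.apache.spark.{SparkConf, SparkContext}

object CallSiteNames {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("callsite-demo"))
    val data = sc.parallelize(1 to 1000)

    // Every job (and all of its stages) started while this is set shows this text.
    sc.setCallSite("count input")
    data.count()

    // Relabelling only affects jobs triggered afterwards; there is no
    // per-stage override within a single job.
    sc.setCallSite("double and sum input")
    data.map(_ * 2).reduce(_ + _)

    sc.clearCallSite() // restore the default call-site text
    sc.stop()
  }
}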

Has anyone done something like this before, or does anyone have thoughts on
custom stage names?

On a side note, what is the benefit of giving an RDD a friendly name
(RDD.name())?

Thanks,
Nithin


Re: Executor Lost Failure

2015-09-29 Thread Nithin Asokan
Try increasing memory for the executors (--conf spark.executor.memory=3g or
--executor-memory). Here is something I noted in your logs:

15/09/29 06:32:03 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block rdd_2_1813 in memory.
15/09/29 06:32:03 WARN MemoryStore: Not enough space to cache rdd_2_1813 in memory! (computed 840.0 B so far)
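
For example, the same setting can be applied in code as well; this is only a
minimal sketch, and it sets the same spark.executor.memory property that the
flags above set:

import org.apache.spark.{SparkConf, SparkContext}

object BiggerExecutors {
  def main(args: Array[String]): Unit = {
    // Equivalent to --executor-memory 3g (or --conf spark.executor.memory=3g)
    // on spark-submit: each executor JVM gets a 3 GB heap, giving MemoryStore
    // more room to cache blocks like rdd_2_1813 above.
    val conf = new SparkConf()
      .setAppName("bigger-executors")
      .set("spark.executor.memory", "3g")
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}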

On Tue, Sep 29, 2015 at 11:02 AM Anup Sawant 
wrote:

> Hi all,
> Any idea why I am getting 'Executor heartbeat timed out'? I am fairly new
> to Spark, so I have limited knowledge of its internals. The job was running
> for a day or so on 102 GB of data with 40 workers.
> -Best,
> Anup.
>
> 15/09/29 06:32:03 ERROR TaskSchedulerImpl: Lost executor driver on
> localhost: Executor heartbeat timed out after 395987 ms
> 15/09/29 06:32:03 WARN MemoryStore: Failed to reserve initial memory
> threshold of 1024.0 KB for computing block rdd_2_1813 in memory.
> 15/09/29 06:32:03 WARN MemoryStore: Not enough space to cache rdd_2_1813
> in memory! (computed 840.0 B so far)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1782.0 in stage 2713.0
> (TID 9101184, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 ERROR TaskSetManager: Task 1782 in stage 2713.0 failed 1
> times; aborting job
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1791.0 in stage 2713.0
> (TID 9101193, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1800.0 in stage 2713.0
> (TID 9101202, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1764.0 in stage 2713.0
> (TID 9101166, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1773.0 in stage 2713.0
> (TID 9101175, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1809.0 in stage 2713.0
> (TID 9101211, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1794.0 in stage 2713.0
> (TID 9101196, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1740.0 in stage 2713.0
> (TID 9101142, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1803.0 in stage 2713.0
> (TID 9101205, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1812.0 in stage 2713.0
> (TID 9101214, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1785.0 in stage 2713.0
> (TID 9101187, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1767.0 in stage 2713.0
> (TID 9101169, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1776.0 in stage 2713.0
> (TID 9101178, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1797.0 in stage 2713.0
> (TID 9101199, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1779.0 in stage 2713.0
> (TID 9101181, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1806.0 in stage 2713.0
> (TID 9101208, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1788.0 in stage 2713.0
> (TID 9101190, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1761.0 in stage 2713.0
> (TID 9101163, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1755.0 in stage 2713.0
> (TID 9101157, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1796.0 in stage 2713.0
> (TID 9101198, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1778.0 in stage 2713.0
> (TID 9101180, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1787.0 in stage 2713.0
> (TID 9101189, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1805.0 in stage 2713.0
> (TID 9101207, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1790.0 in stage 2713.0
> (TID 9101192, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1781.0 in stage 2713.0
> (TID 9101183, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost task 1808.0 in stage 2713.0
> (TID 9101210, localhost): ExecutorLostFailure (executor driver lost)
> 15/09/29 06:32:03 WARN TaskSetManager: Lost 

Re: How to set System environment variables in Spark

2015-09-29 Thread Nithin Asokan
--conf is used to pass any Spark configuration property that starts with *spark.**

You can also use "--driver-java-options" to pass any system properties you
would like to set on the driver program.
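
For example (a minimal sketch that just reuses the property name from your
message; the flag is the one mentioned above), the driver can then read it
as an ordinary JVM system property:

// Submitted with:
//   spark-submit --driver-java-options "-Dcom.w1.p1.config.runOnEnv=dev" ...
object RunOnEnvCheck {
  def main(args: Array[String]): Unit = {
    // The -D option shows up as a plain system property in the driver JVM.
    val runOnEnv = sys.props.getOrElse("com.w1.p1.config.runOnEnv", "unset")
    println(s"com.w1.p1.config.runOnEnv = $runOnEnv")
  }
}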

On Tue, Sep 29, 2015 at 2:30 PM swetha  wrote:

>
> Hi,
>
> How do I set system environment variables when submitting a job? Suppose I
> have the environment variable shown below. I have been trying to specify
> -Dcom.w1.p1.config.runOnEnv=dev and --conf -Dcom.w1.p1.config.runOnEnv=dev,
> but it does not seem to be working. How do I set an environment variable
> when submitting a job in Spark?
>
>
> -Dcom.w1.p1.config.runOnEnv=dev
>
> Thanks,
> Swetha
>
>
>


Re: Potential NPE while exiting spark-shell

2015-08-31 Thread Nithin Asokan
SPARK-5869 appears to have the same exception and is fixed in 1.3.0. I
double-checked the CDH package to see whether it carries the patch:

https://github.com/cloudera/spark/blob/cdh5.4.4-release/core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala#L161

In my case, my YARN application fails after submission, so the block manager
is never registered. This causes an NPE while cleaning up the local
directories.
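
To illustrate what I think is happening, here is a simplified, hypothetical
sketch of the cleanup path (not the actual DiskBlockManager source): the
shutdown hook dereferences an id that is only assigned when the block
manager registers, so it is still null in this scenario.

// Hypothetical sketch only; names and structure are simplified, not Spark's code.
class BlockManagerStub {
  // Assigned only after a successful registration with the driver.
  var blockManagerId: String = _
}

class DiskCleanupHook(bm: BlockManagerStub) {
  def doStop(): Unit = {
    // When the YARN application dies before registration, blockManagerId is
    // still null and this dereference throws the NPE seen in the trace below.
    if (!bm.blockManagerId.startsWith("driver")) {
      println("deleting Spark local dirs")
    }
  }
}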

On Mon, Aug 31, 2015 at 1:48 PM Akhil Das 
wrote:

> Looks like you are hitting this:
> https://issues.apache.org/jira/browse/SPARK-5869
> Try updating your Spark version.
>
> Thanks
> Best Regards
>
> On Tue, Sep 1, 2015 at 12:09 AM, nasokan  wrote:
>
>> I'm currently using Spark 1.3.0 on a YARN cluster deployed through CDH 5.4.
>> My cluster does not have a 'default' queue, and launching 'spark-shell'
>> submits a YARN application that gets killed immediately because the queue
>> does not exist. However, the spark-shell session keeps going after throwing
>> a bunch of errors while creating the SQL context. Upon issuing an 'exit'
>> command, there appears to be an NPE from DiskBlockManager with the
>> following stack trace:
>>
>> ERROR Utils: Uncaught exception in thread delete Spark local dirs
>> java.lang.NullPointerException
>>     at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
>> Exception in thread "delete Spark local dirs" java.lang.NullPointerException
>>     at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>>     at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
>>
>> I believe the problem surfaces from a shutdown hook that tries to clean up
>> the local directories. In this specific case, because the YARN application
>> was not submitted successfully, the block manager was never registered; as
>> a result it does not have a valid blockManagerId, as seen here:
>>
>> https://github.com/apache/spark/blob/v1.3.0/core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala#L161
>>
>> Has anyone faced this issue before? Could this be a problem with the way
>> the shutdown hook currently behaves?
>>
>> Note: I referenced source from the Apache Spark repo rather than Cloudera's.
>>
>>
>>
>