Before we dig too far into this, the thing that jumps out at me most quickly
is the groupByKey, which could be causing some problems - what's the
distribution of your keys like? Try replacing the groupByKey with a count()
and see whether the pipeline works up until that stage.
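For example (a rough, untested sketch - `pairs` stands in for whatever
key/value RDD your map()/filter() produces, before the groupByKey):

    // 1) Replace the groupByKey().saveAsTextFile() tail with a plain
    //    count() to confirm everything before the shuffle succeeds.
    println(pairs.count())

    // 2) Rough skew check: count records per key with a map-side
    //    combine (reduceByKey), then look at the heaviest keys.
    pairs.mapValues(_ => 1L)
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false)
      .take(20)
      .foreach(println)

If a handful of keys dominate those counts, groupByKey will pull every value
for each of those keys into a single task, which is a classic way to blow up
a stage.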
Also, 1G of driver memory is a bit small for a job with 90 executors...
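I'd try bumping it, e.g. swapping in something like

    --driver-memory 4g

with the rest of your flags unchanged (4g is just an arbitrary starting
point, not a magic number).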

On Thu, Jan 21, 2016 at 2:40 PM, Arun Luthra <arun.lut...@gmail.com> wrote:

>
>
> 16/01/21 21:52:11 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 16/01/21 21:52:14 WARN MetricsSystem: Using default name DAGScheduler for
> source because spark.app.id is not set.
>
> spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
>
> 16/01/21 21:52:16 WARN DomainSocketFactory: The short-circuit local reads
> feature cannot be used because libhadoop cannot be loaded.
>
> 16/01/21 21:52:52 WARN MemoryStore: Not enough space to cache broadcast_4
> in memory! (computed 60.2 MB so far)
>
> 16/01/21 21:52:52 WARN MemoryStore: Persisting block broadcast_4 to disk
> instead.
>
> [Stage 1:====================================================>(2260 + 7) /
> 2262]16/01/21 21:57:24 WARN TaskSetManager: Lost task 1440.1 in stage 1.0
> (TID 4530, --): TaskCommitDenied (Driver denied task commit) for job: 1,
> partition: 1440, attempt: 4530
>
> [Stage 1:====================================================>(2260 + 6) /
> 2262]16/01/21 21:57:27 WARN TaskSetManager: Lost task 1488.1 in stage 1.0
> (TID 4531, --): TaskCommitDenied (Driver denied task commit) for job: 1,
> partition: 1488, attempt: 4531
>
> [Stage 1:====================================================>(2261 + 4) /
> 2262]16/01/21 21:57:39 WARN TaskSetManager: Lost task 1982.1 in stage 1.0
> (TID 4532, --): TaskCommitDenied (Driver denied task commit) for job: 1,
> partition: 1982, attempt: 4532
>
> 16/01/21 21:57:57 WARN TaskSetManager: Lost task 2214.0 in stage 1.0 (TID
> 4482, --): TaskCommitDenied (Driver denied task commit) for job: 1,
> partition: 2214, attempt: 4482
>
> 16/01/21 21:57:57 WARN TaskSetManager: Lost task 2168.0 in stage 1.0 (TID
> 4436, --): TaskCommitDenied (Driver denied task commit) for job: 1,
> partition: 2168, attempt: 4436
>
>
> I am running with:
>
>     spark-submit --class "myclass" \
>       --num-executors 90 \
>       --driver-memory 1g \
>       --executor-memory 60g \
>       --executor-cores 8 \
>       --master yarn-client \
>       --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
>       my.jar
>
>
> There are 2262 input files totaling just 98.6G. The DAG is basically
> textFile().map().filter().groupByKey().saveAsTextFile().
>
> On Thu, Jan 21, 2016 at 2:14 PM, Holden Karau <hol...@pigscanfly.ca>
> wrote:
>
>> Can you post more of your log? How big are the partitions? What is the
>> action you are performing?
>>
>> On Thu, Jan 21, 2016 at 2:02 PM, Arun Luthra <arun.lut...@gmail.com>
>> wrote:
>>
>>> Example warning:
>>>
>>> 16/01/21 21:57:57 WARN TaskSetManager: Lost task 2168.0 in stage 1.0
>>> (TID 4436, XXXXXXX): TaskCommitDenied (Driver denied task commit) for job:
>>> 1, partition: 2168, attempt: 4436
>>>
>>>
>>> Is there a solution for this? Increase driver memory? I'm using just 1G
>>> driver memory but ideally I won't have to increase it.
>>>
>>> The RDD being processed has 2262 partitions.
>>>
>>> Arun
>>>
>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau