Appu K created SPARK-18784:
------------------------------

             Summary: Managed memory leak - spark-2.0.2
                 Key: SPARK-18784
                 URL: https://issues.apache.org/jira/browse/SPARK-18784
             Project: Spark
          Issue Type: Bug
    Affects Versions: 2.0.2
         Environment: spark : 2.0.2
            Reporter: Appu K


I’ve just run into an issue where a job reports a "Managed memory leak" 
warning with Spark version 2.0.2.

—————————————————
2016-12-08 16:31:25,231 [Executor task launch worker-0] 
(TaskMemoryManager.java:381) WARN leak 46.2 MB memory from 
org.apache.spark.util.collection.ExternalAppendOnlyMap@22719fb8
2016-12-08 16:31:25,232 [Executor task launch worker-0] (Logging.scala:66) WARN 
Managed memory leak detected; size = 48442112 bytes, TID = 1
—————————————————
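
As a quick sanity check that the two warnings describe the same leak, here is
a small sketch that parses the lines quoted above and compares the sizes. It
is not part of the original program. The regular expressions and the helper
names (parse_leaks, bytes_to_mib) are my own, and it assumes the "MB" figure
in the first warning is 1024-based.

```python
import re

# The two WARN lines from the log above, joined back onto single lines.
LOG_LINES = [
    "2016-12-08 16:31:25,231 [Executor task launch worker-0] "
    "(TaskMemoryManager.java:381) WARN leak 46.2 MB memory from "
    "org.apache.spark.util.collection.ExternalAppendOnlyMap@22719fb8",
    "2016-12-08 16:31:25,232 [Executor task launch worker-0] "
    "(Logging.scala:66) WARN Managed memory leak detected; "
    "size = 48442112 bytes, TID = 1",
]

# Hypothetical patterns written to match only the message text shown above.
CONSUMER_RE = re.compile(r"WARN leak ([\d.]+) MB memory from (\S+)")
SUMMARY_RE = re.compile(
    r"Managed memory leak detected; size = (\d+) bytes, TID = (\d+)"
)


def parse_leaks(lines):
    """Return ([(consumer, size_in_mb)], [(task_id, size_in_bytes)])."""
    consumers, tasks = [], []
    for line in lines:
        m = CONSUMER_RE.search(line)
        if m:
            consumers.append((m.group(2), float(m.group(1))))
        m = SUMMARY_RE.search(line)
        if m:
            tasks.append((int(m.group(2)), int(m.group(1))))
    return consumers, tasks


def bytes_to_mib(n):
    """Convert bytes to 1024-based megabytes (assumed unit of the log)."""
    return n / (1024 * 1024)


consumers, tasks = parse_leaks(LOG_LINES)
for task_id, size in tasks:
    print(f"TID {task_id}: {size} bytes = {bytes_to_mib(size):.1f} MB")
for consumer, mb in consumers:
    print(f"{consumer}: {mb} MB")
```

If the rounded per-task size matches the per-consumer figure, the whole leak
reported for TID 1 is attributable to the single ExternalAppendOnlyMap
instance named in the first warning.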


The program itself is very basic.

Program: https://gist.github.com/kutt4n/87cfcd4e794b1865b6f880412dd80bbf
Debug Log: https://gist.github.com/kutt4n/ba3cf8129999dced34ceadc588856edc


TaskMemoryManager.java:381 says that it is normal to see leaked memory if one 
of the tasks failed. In this case, however, the debug log does not make it 
apparent which task failed or why it failed.

When the TSV file is small, the issue does not occur. In this particular 
case, the input is a 21 MB file of clickstream data from Wikipedia, available 
at https://ndownloader.figshare.com/files/5036392

I am not sure whether this is a duplicate of SPARK-18181. If so, please close 
this issue and it can be tracked there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
