Appu K created SPARK-18784:
------------------------------
Summary: Managed memory leak - spark-2.0.2
Key: SPARK-18784
URL: https://issues.apache.org/jira/browse/SPARK-18784
Project: Spark
Issue Type: Bug
Affects Versions: 2.0.2
Environment: spark : 2.0.2
Reporter: Appu K
I’ve just run into an issue where a job reports "Managed memory leak" warnings
with Spark version 2.0.2
—————————————————
2016-12-08 16:31:25,231 [Executor task launch worker-0]
(TaskMemoryManager.java:381) WARN leak 46.2 MB memory from
org.apache.spark.util.collection.ExternalAppendOnlyMap@22719fb8
2016-12-08 16:31:25,232 [Executor task launch worker-0] (Logging.scala:66) WARN
Managed memory leak detected; size = 48442112 bytes, TID = 1
—————————————————
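As a quick sanity check, the two warnings above appear to describe the same allocation: 48442112 bytes is roughly 46.2 MB if MB is taken as 1024 * 1024 bytes. Below is a minimal Python sketch of that check. The helper name `parse_leak_warning` is hypothetical (it is not part of Spark), and the regex assumes the warning keeps exactly the format quoted above.

```python
import re

# Hypothetical helper, not part of Spark. The pattern assumes the
# "Managed memory leak detected" warning format quoted in this report.
LEAK_RE = re.compile(
    r"Managed memory leak detected; size = (\d+) bytes, TID = (\d+)"
)


def parse_leak_warning(text):
    """Return the leaked size and task ID from a log excerpt, or None."""
    match = LEAK_RE.search(text)
    if match is None:
        return None
    size_bytes = int(match.group(1))
    return {
        "size_bytes": size_bytes,
        # Assumption: MB here means 1024 * 1024 bytes.
        "size_mb": size_bytes / (1024 * 1024),
        "tid": int(match.group(2)),
    }


log = (
    "2016-12-08 16:31:25,232 [Executor task launch worker-0] "
    "(Logging.scala:66) WARN\n"
    "Managed memory leak detected; size = 48442112 bytes, TID = 1"
)
info = parse_leak_warning(log)
print(round(info["size_mb"], 1), info["tid"])  # prints: 46.2 1
```

Running the same helper over the full debug log would list every TID that reported a leak, which may help narrow down the task involved.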
The program itself is very basic.
Program: https://gist.github.com/kutt4n/87cfcd4e794b1865b6f880412dd80bbf
Debug Log: https://gist.github.com/kutt4n/ba3cf8129999dced34ceadc588856edc
TaskMemoryManager.java:381 says that it's normal to see leaked memory if one of
the tasks failed. In this case, however, the debug log does not make it apparent
which task failed or why.
When the TSV file itself is small, the issue doesn’t occur. In this particular
case, the file is 21 MB of clickstream data from Wikipedia, available at
https://ndownloader.figshare.com/files/5036392
Not sure if this is a duplicate of 18181. If so, please close this and it can
be tracked there.