Hi,

Did you actually run into trouble from this, beyond the log message? In most cases the warning can be ignored, because TaskMemoryManager automatically releases the memory when the task completes. In fact, Spark downgraded this message in SPARK-18557: https://issues.apache.org/jira/browse/SPARK-18557
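For reference, the warning usually shows up when an early-terminating action such as take() follows a shuffle: the shuffle read side aggregates into an ExternalAppendOnlyMap, and if the task stops consuming the iterator early, the map's pages are still registered when the task ends, so TaskMemoryManager logs them as "leaked" before freeing them itself. A hypothetical sketch of the kind of program that triggers it (this is an assumption, not the code from your gist; the input path is also made up):

```scala
// Hypothetical sketch (NOT the gist's actual program): a shuffle
// (reduceByKey) followed by take(). take() may stop pulling from the
// shuffle iterator before it is exhausted, so the ExternalAppendOnlyMap
// built on the read side still holds memory at task end, which
// TaskMemoryManager reports as a managed memory leak and then releases.
import org.apache.spark.sql.SparkSession

object ClickstreamTake {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-take")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Path is an assumption for illustration only.
    val lines = sc.textFile("clickstream.tsv")

    val counts = lines
      .map(_.split("\t"))
      .filter(_.length >= 2)
      .map(cols => (cols(0), 1L))
      .reduceByKey(_ + _) // shuffle: read side uses ExternalAppendOnlyMap

    // take() can return before the shuffle iterator is fully consumed,
    // which is what triggers the "Managed memory leak detected" warning.
    counts.take(5).foreach(println)

    spark.stop()
  }
}
```

Since the memory is freed by TaskMemoryManager at task completion anyway, this is noisy rather than harmful, which is why the message was downgraded.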
// maropu

On Thu, Dec 8, 2016 at 8:10 PM, Appu K <kut...@gmail.com> wrote:
> Hello,
>
> I've just run into an issue where a job is giving me "Managed memory
> leak" warnings with Spark 2.0.2:
>
> —————————————————
> 2016-12-08 16:31:25,231 [Executor task launch worker-0]
> (TaskMemoryManager.java:381) WARN leak 46.2 MB memory from
> org.apache.spark.util.collection.ExternalAppendOnlyMap@22719fb8
> 2016-12-08 16:31:25,232 [Executor task launch worker-0] (Logging.scala:66)
> WARN Managed memory leak detected; size = 48442112 bytes, TID = 1
> —————————————————
>
> The program itself is very basic, and take() appears to be causing the
> issue.
>
> Program: https://gist.github.com/kutt4n/87cfcd4e794b1865b6f880412dd80bbf
> Debug log: https://gist.github.com/kutt4n/ba3cf8129999dced34ceadc588856edc
>
> The comment at TaskMemoryManager.java:381 says it is normal to see leaked
> memory if one of the tasks failed. In this case it is not apparent from
> the debug log which task failed, or why.
>
> When the TSV file is small, the issue doesn't occur. In this particular
> case the file is 21 MB of Wikipedia clickstream data, available at
> https://ndownloader.figshare.com/files/5036392
>
> Where could I read more about managed memory leaks? Any pointers on what
> might be the issue would be very helpful.
>
> thanks
> appu

--
---
Takeshi Yamamuro