[ 
https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931768#comment-15931768
 ] 

Sean Owen commented on SPARK-19644:
-----------------------------------

I don't see any GC time here. 
Isn't this simply the driver holding on to metrics for lots of old jobs and 
stages? Not much memory is used here compared to normal operation. 
Unless you've already tried things like restricting the number of retained 
jobs in the UI, I don't think there is evidence of a problem. The dumps don't 
show anything that odd. 
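
For reference, restricting the retained UI data mentioned above might look 
roughly like the sketch below. The property names are standard Spark 
configuration keys; the object name, the application name, and the value 100 
are illustrative assumptions, not settings taken from this issue.

```scala
import org.apache.spark.SparkConf

// Hypothetical sketch: lower how much job, stage, and batch history the
// driver keeps for the UI. The value "100" is an arbitrary illustration,
// not a recommendation from this thread.
object RetainedUiDataSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("retained-ui-data-sketch")             // assumed name
      .set("spark.ui.retainedJobs", "100")               // finished jobs kept
      .set("spark.ui.retainedStages", "100")             // finished stages kept
      .set("spark.streaming.ui.retainedBatches", "100")  // streaming batches kept

    // Print the resulting settings so they can be checked before use.
    println(conf.toDebugString)
  }
}
```

The same keys can also be passed with --conf on spark-submit. If old-generation 
usage still grows with these limits in place, that would be stronger evidence 
of a real leak.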

> Memory leak in Spark Streaming
> ------------------------------
>
>                 Key: SPARK-19644
>                 URL: https://issues.apache.org/jira/browse/SPARK-19644
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.0.2
>         Environment: 3 AWS EC2 c3.xLarge
> Number of cores - 3
> Number of executors 3 
> Memory to each executor 2GB
>            Reporter: Deenbandhu Agarwal
>            Priority: Critical
>              Labels: memory_leak, performance
>         Attachments: Dominator_tree.png, heapdump.png, Path2GCRoot.png
>
>
> I am using Spark Streaming in production for some aggregation and for 
> fetching data from Cassandra and saving data back to Cassandra. 
> I see a gradual increase in old generation heap capacity from 1161216 Bytes 
> to 1397760 Bytes over a period of six hours.
> After 50 hours of processing, instances of class 
> scala.collection.immutable.$colon$colon increased to 12,811,793, which is a 
> huge number. 
> I think this is a clear case of a memory leak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
