[ 
https://issues.apache.org/jira/browse/SPARK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931771#comment-15931771
 ] 

Deenbandhu Agarwal commented on SPARK-19644:
--------------------------------------------

Yes, I have tried restricting the number of jobs retained in the UI to 200. 
Moreover, the default number of retained batches is 1000 and my batch 
interval is 10 s, so 1000 batches cover about 10,000 s, which is a little 
under 3 hours, but memory keeps accumulating after that. I think something 
else is causing the problem.
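
For reference, the arithmetic behind that estimate can be sketched as below. This is only an illustration: it assumes the settings in question are the Spark properties `spark.ui.retainedJobs` and `spark.streaming.ui.retainedBatches`, and the helper name `retention_window_seconds` is made up for this example. The values (200 jobs, 1000 batches, 10 s interval) are the ones stated in the comment.

```python
# Assumed Spark properties bounding how much UI history the driver keeps.
# 200 is the value tried in the comment; 1000 is the stated default.
ui_retention_conf = {
    "spark.ui.retainedJobs": 200,
    "spark.streaming.ui.retainedBatches": 1000,
}

BATCH_INTERVAL_S = 10  # batch interval stated in the comment


def retention_window_seconds(retained_batches, batch_interval_s):
    """Time span covered by the retained batches (hypothetical helper)."""
    return retained_batches * batch_interval_s


window_s = retention_window_seconds(
    ui_retention_conf["spark.streaming.ui.retainedBatches"], BATCH_INTERVAL_S
)
window_h = window_s / 3600
print(f"{window_s} s = {window_h:.2f} h")  # 10000 s = 2.78 h
```

If UI retention were the only source of growth, the heap should level off once this window has passed, which is why continued growth beyond it points to another cause.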

> Memory leak in Spark Streaming
> ------------------------------
>
>                 Key: SPARK-19644
>                 URL: https://issues.apache.org/jira/browse/SPARK-19644
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.0.2
>         Environment: 3 AWS EC2 c3.xLarge
> Number of cores - 3
> Number of executors 3 
> Memory to each executor 2GB
>            Reporter: Deenbandhu Agarwal
>            Priority: Critical
>              Labels: memory_leak, performance
>         Attachments: Dominator_tree.png, heapdump.png, Path2GCRoot.png
>
>
> I am using Spark Streaming in production for some aggregation, fetching data 
> from Cassandra and saving data back to Cassandra. 
> I see a gradual increase in old generation heap capacity from 1161216 bytes 
> to 1397760 bytes over a period of six hours.
> After 50 hours of processing, the number of instances of class 
> scala.collection.immutable.$colon$colon increased to 12,811,793, which is a 
> huge number. 
> I think this is a clear case of a memory leak.
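
To put the reported old-generation figures in proportion, here is a rough calculation. It takes the two numbers and the six-hour period exactly as given in the description; the variable names are illustrative only, and the units are whatever the original measurement used.

```python
# Old generation capacity as reported in the issue description.
OLD_GEN_START = 1_161_216
OLD_GEN_END = 1_397_760
PERIOD_HOURS = 6

growth = OLD_GEN_END - OLD_GEN_START          # absolute increase
growth_pct = 100 * growth / OLD_GEN_START     # relative increase in percent
growth_per_hour = growth / PERIOD_HOURS       # average increase per hour

print(f"growth: {growth} ({growth_pct:.2f}%), about {growth_per_hour:.0f} per hour")
```

This is an average over the whole period and says nothing about whether the growth was steady or came in steps.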



