[jira] [Commented] (SPARK-26395) Spark Thrift server memory leak

2019-02-19 Thread Konstantinos Andrikopoulos (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772058#comment-16772058
 ] 

Konstantinos Andrikopoulos commented on SPARK-26395:


Setting the property spark.appStateStore.asyncTracking.enable to false made 
the situation somewhat better in 2 out of our 3 Thrift Server instances. 
However, as I understand it, we should not be facing this issue at all once 
this property is set to false. 
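For anyone trying the same workaround, one way to apply the setting, assuming the Thrift Server is launched through the standard scripts (which read spark-defaults.conf on the driver host), is:

{noformat}
# spark-defaults.conf on the Thrift Server (driver) host.
# Disables asynchronous tracking/cleanup in the app status store;
# the Thrift Server must be restarted for the change to take effect.
spark.appStateStore.asyncTracking.enable   false
{noformat}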

> Spark Thrift server memory leak
> ---
>
> Key: SPARK-26395
> URL: https://issues.apache.org/jira/browse/SPARK-26395
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.2
>Reporter: Konstantinos Andrikopoulos
>Priority: Major
>
> We are running Thrift Server in standalone mode and have observed that the 
> heap of the driver is constantly increasing. After analysing a heap dump, 
> the issue appears to be that the ElementTrackingStore keeps growing due to 
> the addition of RDDOperationGraphWrapper objects that are never cleaned up.
> The ElementTrackingStore defines the addTrigger method, where you can set 
> thresholds that trigger cleanup, but in practice it is only used for the 
> ExecutorSummaryWrapper, JobDataWrapper and StageDataWrapper classes, via 
> the following Spark properties:
>  * spark.ui.retainedDeadExecutors
>  * spark.ui.retainedJobs
>  * spark.ui.retainedStages
> So the RDDOperationGraphWrapper, which is added in the onJobStart method of 
> the AppStatusListener class [kvstore.write(uigraph) #line 291], is not 
> cleaned up; it constantly accumulates, causing a memory leak.
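To make the mechanism the description refers to concrete, here is an illustrative sketch of the addTrigger pattern applied to the wrapper class in question. This is not actual Spark source: the reuse of spark.ui.retainedStages as the threshold and the cleanup helper are hypothetical.

{noformat}
// Illustrative sketch only, not Spark source. ElementTrackingStore.addTrigger
// registers an action that fires once the number of stored instances of a
// class crosses a threshold; the description's point is that no such trigger
// is set up for RDDOperationGraphWrapper. The threshold key and the helper
// below are hypothetical.
kvstore.addTrigger(classOf[RDDOperationGraphWrapper],
    conf.getInt("spark.ui.retainedStages", 1000)) { count =>
  cleanupRDDOperationGraphs(count)  // hypothetical helper: drop the oldest graphs
}
{noformat}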






[jira] [Commented] (SPARK-26395) Spark Thrift server memory leak

2019-02-15 Thread t oo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769704#comment-16769704
 ] 

t oo commented on SPARK-26395:
--

Just wanted to add that I face the same issue with the Thrift Server in 
standalone mode on Spark 2.3.0: the heap of the driver constantly increases, 
and it eventually leads to the Spark Thrift Server process crashing.




[jira] [Commented] (SPARK-26395) Spark Thrift server memory leak

2019-02-12 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766547#comment-16766547
 ] 

Marcelo Vanzin commented on SPARK-26395:


The code that cleans up stages does clean up the RDD graphs:

{noformat}
  if (!hasMoreAttempts) {
    kvstore.delete(classOf[RDDOperationGraphWrapper], s.info.stageId)
  }
{noformat}

Are you sure stages are being properly cleaned up in your case? SPARK-25837 
could cause stage cleanup to be really slow; that will be fixed in 2.3.3.
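For context, the snippet above is reached through the stage-cleanup action that AppStatusListener registers on the ElementTrackingStore. A simplified sketch of that wiring (paraphrased, not the verbatim source):

{noformat}
// Simplified sketch, paraphrased rather than verbatim Spark source: the delete
// shown above lives inside cleanupStages(), which is registered as a threshold
// trigger on StageDataWrapper, keyed off spark.ui.retainedStages. Once more
// stages are stored than the configured limit, the oldest stages are removed
// and their RDD operation graphs are deleted along with them.
kvstore.addTrigger(classOf[StageDataWrapper], maxRetainedStages) { count =>
  cleanupStages(count)
}
{noformat}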
