[jira] [Commented] (MESOS-9852) Slow memory growth in master due to deferred deletion of offer filters and timers.

2019-08-11 Thread longfei (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904878#comment-16904878
 ] 

longfei commented on MESOS-9852:


[~bmahler]  Sorry, I may have forgotten to press the "Add" button.



"you can tune the master's flags to reduce the amount of framework"

Do you mean the max_*_tasks_per_framework flags? Could this history really 
take up hundreds of MBs? I'll give it a try.

> Slow memory growth in master due to deferred deletion of offer filters and 
> timers.
> --
>
> Key: MESOS-9852
> URL: https://issues.apache.org/jira/browse/MESOS-9852
> Project: Mesos
> Issue Type: Bug
> Components: allocation, master
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
> Priority: Critical
> Labels: resource-management
> Fix For: 1.5.4, 1.6.3, 1.7.3, 1.8.1, 1.9.0
>
> Attachments: _tmp_libprocess.Do1MrG_profile (1).dump, 
> _tmp_libprocess.Do1MrG_profile (1).svg, _tmp_libprocess.Do1MrG_profile 
> 24hours.dump, _tmp_libprocess.Do1MrG_profile 24hours.svg, screenshot-1.png, 
> statistics
>
>
> The allocator does not keep a handle to the offer filter timer, which means 
> it cannot remove the timer overhead (in this case memory) when removing the 
> offer filter earlier (e.g. due to revive):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1338-L1352
> In addition, the offer filter is allocated on the heap but not deleted until 
> the timer fires (which might take forever!):
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1321
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L1408-L1413
> https://github.com/apache/mesos/blob/1.8.0/src/master/allocator/mesos/hierarchical.cpp#L2249
> We'll need to try to backport this to all active release branches.
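
The ownership problem described in the issue can be illustrated with a small, self-contained sketch. This is not the Mesos or libprocess code: TimerQueue, Allocator, OfferFilter, and reviveExample are hypothetical stand-ins invented for illustration. The sketch only assumes the general idea from the description, namely that if the allocator keeps a handle to each filter's expiry timer, it can cancel the timer and free the filter as soon as the filter is removed early (e.g. on revive), instead of leaving both in memory until the timer fires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical toy timer service (NOT libprocess). It hands out an id for
// each scheduled callback so the caller can cancel it later.
class TimerQueue {
public:
  using Id = std::uint64_t;

  Id schedule(std::function<void()> callback) {
    const Id id = next_++;
    pending_.emplace(id, std::move(callback));
    return id;
  }

  // Drops the pending callback and whatever state it captured.
  bool cancel(Id id) { return pending_.erase(id) > 0; }

  std::size_t size() const { return pending_.size(); }

private:
  Id next_ = 0;
  std::map<Id, std::function<void()>> pending_;
};

// Hypothetical stand-in for an offer filter.
struct OfferFilter {
  double seconds;
};

// Hypothetical allocator that stores the timer handle next to the filter,
// so that both are released together.
class Allocator {
public:
  explicit Allocator(TimerQueue& timers) : timers_(timers) {}

  void addFilter(int frameworkId, double seconds) {
    const int filterId = nextFilter_++;
    // On expiry the timer removes exactly this filter.
    const TimerQueue::Id timer =
        timers_.schedule([this, filterId]() { filters_.erase(filterId); });
    filters_.emplace(
        filterId,
        Entry{frameworkId,
              std::make_unique<OfferFilter>(OfferFilter{seconds}),
              timer});
  }

  // Early removal: cancel the timer and free the filter immediately.
  void revive(int frameworkId) {
    for (auto it = filters_.begin(); it != filters_.end();) {
      if (it->second.frameworkId == frameworkId) {
        timers_.cancel(it->second.timer);
        it = filters_.erase(it);  // unique_ptr frees the filter here
      } else {
        ++it;
      }
    }
  }

  std::size_t filterCount() const { return filters_.size(); }

private:
  struct Entry {
    int frameworkId;
    std::unique_ptr<OfferFilter> filter;
    TimerQueue::Id timer;
  };

  TimerQueue& timers_;
  int nextFilter_ = 0;
  std::map<int, Entry> filters_;
};

// Returns {remaining filters, remaining timers} after reviving framework 1.
std::pair<std::size_t, std::size_t> reviveExample() {
  TimerQueue timers;
  Allocator allocator(timers);
  allocator.addFilter(1, 5.0);
  allocator.addFilter(1, 3600.0);  // long timeout: would linger without cancel
  allocator.addFilter(2, 5.0);
  allocator.revive(1);
  return {allocator.filterCount(), timers.size()};
}
```

Calling reviveExample() from a main function leaves one filter and one pending timer (those of framework 2). Without the stored handle, the two timers for framework 1 and the state they reference would stay alive until their timeouts elapsed.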



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (MESOS-9852) Slow memory growth in master due to deferred deletion of offer filters and timers.

2019-08-11 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904850#comment-16904850
 ] 

Benjamin Mahler commented on MESOS-9852:


[~carlone] I'm not sure whether you intended to reply to my message, but I 
noticed you attached the additional 24-hour data. Looking at it, the memory 
growth appears to be mostly due to task history. If you don't care about the 
task history, you can tune the master's flags to reduce the amount of framework 
/ task history stored.
