[ 
https://issues.apache.org/jira/browse/SLIDER-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated SLIDER-1190:
------------------------------
    Parent Issue: SLIDER-1216  (was: SLIDER-1185)

> Provide solution to possible memory issues with storing app diagnostics for 
> large no of containers
> --------------------------------------------------------------------------------------------------
>
>                 Key: SLIDER-1190
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1190
>             Project: Slider
>          Issue Type: Sub-task
>          Components: appmaster, client
>    Affects Versions: Slider 0.91
>            Reporter: Gour Saha
>             Fix For: Slider 1.0.0
>
>
> [~billie.rinaldi] raised a very important point in SLIDER-1187 about a 
> potential memory issue. I wanted to capture her point and my initial 
> thoughts on it. Let's use this JIRA to discuss the topic further and find 
> the best solution.
>
> Billie's question: Do you think this will cause memory issues for long-lived 
> AMs?
>
> Gour's initial thoughts: I agree with you that any list which only grows 
> over time is a concern for possible memory issues. However, I checked the 
> size of a single container diagnostics payload and it hovers between 4 and 
> 5 KB. So for about 100,000 containers it will end up consuming ~500 MB. This 
> is at the borderline of acceptability for a 1 GB AM container. However, for 
> most production clusters I have seen, the minimum container size is set to 
> 4 GB or higher. Either way, 100K containers for a single app (even if 
> running for years) is very unlikely but not impossible. We can do a couple 
> of things here:
> 1) Provide an API which can be triggered to drop all container diagnostics 
> of the old/dead containers except the n most recent ones (n can be passed 
> as a parameter to the API).
> 2) Add logic where the AM will cap the number of old/dead containers 
> retained to a limit of, say, 10,000 (which will be configurable per 
> application).
>
> Nevertheless, if an app is created with 100K+ containers we can still be 
> hosed, but here we are stretching our imaginations too much.
>
> Anyway, I don't think we should use this patch to solve this. I am going to 
> create a new sub-task for this possible memory issue.
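
The two options above can be sketched roughly as follows. This is only an illustration in Python, not Slider code: the class name CompletedContainerDiagnostics, its methods, and the helper estimated_megabytes are all hypothetical, and the 5 KB payload size and 10,000 default cap are simply the figures quoted in the description.

```python
from collections import OrderedDict


class CompletedContainerDiagnostics:
    """Bounded store for diagnostics of old/dead containers (hypothetical)."""

    def __init__(self, max_entries=10_000):
        if max_entries < 0:
            raise ValueError("max_entries must be non-negative")
        self.max_entries = max_entries
        # Insertion order doubles as recency: oldest entry first.
        self._entries = OrderedDict()  # container id -> diagnostics text

    def record(self, container_id, diagnostics):
        """Option 2: store one payload, evicting the oldest beyond the cap."""
        # Re-recording an id moves it to the most-recent end.
        self._entries.pop(container_id, None)
        self._entries[container_id] = diagnostics
        self.trim(self.max_entries)

    def trim(self, keep_most_recent):
        """Option 1: drop all but the n most recent entries; return the count dropped."""
        keep = max(keep_most_recent, 0)
        dropped = 0
        while len(self._entries) > keep:
            self._entries.popitem(last=False)  # remove the oldest entry
            dropped += 1
        return dropped

    def container_ids(self):
        """Retained container ids, oldest first."""
        return list(self._entries)

    def __len__(self):
        return len(self._entries)


def estimated_megabytes(containers, payload_kb=5):
    """Rough footprint, taking 1 MB = 1000 KB as the estimate in the description does."""
    return containers * payload_kb / 1000
```

With these assumptions, estimated_megabytes(100_000) gives 500.0, matching the ~500 MB figure, while a cap of 10,000 entries bounds the same store at about 50 MB. Calling trim(n) on demand corresponds to option 1, and the automatic eviction inside record corresponds to option 2.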



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
