[ https://issues.apache.org/jira/browse/SLIDER-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gour Saha updated SLIDER-1190:
------------------------------
    Parent Issue: SLIDER-1216  (was: SLIDER-1185)

> Provide solution to possible memory issues with storing app diagnostics for large no of containers
> --------------------------------------------------------------------------------------------------
>
>                 Key: SLIDER-1190
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1190
>             Project: Slider
>          Issue Type: Sub-task
>          Components: appmaster, client
>    Affects Versions: Slider 0.91
>            Reporter: Gour Saha
>             Fix For: Slider 1.0.0
>
>
> [~billie.rinaldi] raised a very important point on a potential memory issue
> in SLIDER-1187.
> I wanted to capture her point and my initial thoughts on it. Let's use this
> JIRA to discuss this topic further and find the best solution.
>
> Billie's question: Do you think this will cause memory issues for long-lived
> AMs?
>
> Gour's initial thoughts: I agree with you that any list which only grows over
> time is a concern for possible memory issues. However, I checked the size of
> a single container diagnostics payload and it is typically between 4 and 5 KB,
> so about 100,000 containers would consume roughly 400-500 MB. This is at the
> borderline of acceptability for a 1 GB AM container. However, on most
> production clusters I have seen, the minimum container size is set to 4 GB or
> higher. Either way, 100K containers for a single app (even if running for
> years) is very unlikely but not impossible. We can do a couple of things here:
> 1) Provide an API which can be triggered to drop all container diagnostics of
> the old/dead containers except the n most recent ones (n can be passed as a
> parameter to the API).
> 2) Add logic where the AM will cap the number of old/dead containers at a
> limit of, say, 10,000 (which will be configurable per application).
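The two options above could be sketched roughly as follows. This is a minimal illustration only, not Slider code: the class `BoundedDiagnosticsHistory`, its methods, and the use of plain strings for diagnostics payloads are all hypothetical names and simplifications chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical bounded store for diagnostics of old/dead containers.
 * Not part of Slider; names and types are assumptions for illustration.
 */
final class BoundedDiagnosticsHistory {
    // Oldest entry at the head, newest at the tail.
    private final Deque<String> history = new ArrayDeque<>();
    private final int maxEntries;

    BoundedDiagnosticsHistory(int maxEntries) {
        if (maxEntries < 0) {
            throw new IllegalArgumentException("maxEntries must be >= 0");
        }
        this.maxEntries = maxEntries;
    }

    /** Option 2: record a payload, evicting the oldest entries beyond the cap. */
    synchronized void record(String diagnostics) {
        history.addLast(diagnostics);
        retainMostRecent(maxEntries);
    }

    /**
     * Option 1: drop everything except the n most recent entries.
     * Returns the number of entries that were dropped.
     */
    synchronized int retainMostRecent(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be >= 0");
        }
        int dropped = 0;
        while (history.size() > n) {
            history.removeFirst();
            dropped++;
        }
        return dropped;
    }

    synchronized int size() {
        return history.size();
    }

    /** Copy of the retained entries, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(history);
    }
}
```

With a cap of 10,000 and the 4-5 KB per-container figure quoted above, the retained history would stay at roughly 40-50 MB regardless of how long the AM runs.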
> Nevertheless, if an app is created with 100K+ containers we can still be
> hosed, but here we are stretching our imaginations too much.
> Anyway, I don't think we should use this patch to solve this. I am going to
> create a new sub-task for this possible memory issue.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)