[jira] [Commented] (MAPREDUCE-6847) Job history server should release jobs from cache after a fixed duration

Weiwei Yang (JIRA) Wed, 15 Feb 2017 14:47:59 -0800

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868749#comment-15868749
 ]


Weiwei Yang commented on MAPREDUCE-6847:
----------------------------------------

Hello [~jlowe]

At present, the JHS cache works as follows (assume, for example, that it is 
allowed to cache 5 jobs or an equivalent number of tasks):

# The user clicks job1, job2 ... job5, and JHS caches these 5 jobs in memory
# JHS keeps all of these jobs in the cache
# A long time passes
# Job1 through job5 are now quite outdated. The user clicks job6, and the JHS 
cache evicts one job, but it still holds 5 jobs: 1 new and 4 old

This is not a problem if the jobs are small, but if the jobs are large, e.g. 
100k tasks each, 5 cached jobs will consume roughly 1.2G * 5 = 6G of memory or 
more. Is this really necessary? The patch simply expires jobs that have sat in 
the cache for a while, so that the cache holds only recent jobs that users may 
still access (the chance of an old job being accessed is small). Does that 
make sense to you?
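
To make the proposal concrete, here is a minimal sketch of a cache that evicts 
by size and also expires entries that have not been accessed for a fixed 
duration. It is not the code in the attached patch and not the actual JHS 
implementation: the class ExpiringJobCache and the parameters maxEntries, 
ttlMillis and clock are hypothetical names chosen for illustration, and the 
sketch counts jobs only, not tasks.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch: size-bounded LRU cache with expire-after-access. */
final class ExpiringJobCache<K, V> {

  /** A cached value plus the time it was last read or written. */
  private static final class Slot<T> {
    final T value;
    long lastAccessMillis;

    Slot(T value, long now) {
      this.value = value;
      this.lastAccessMillis = now;
    }
  }

  private final int maxEntries;      // assumed limit, e.g. 5 jobs
  private final long ttlMillis;      // assumed idle time before expiry
  private final LongSupplier clock;  // injected so tests can control time

  // accessOrder=true: iteration runs from least to most recently accessed.
  private final LinkedHashMap<K, Slot<V>> map =
      new LinkedHashMap<>(16, 0.75f, true);

  ExpiringJobCache(int maxEntries, long ttlMillis, LongSupplier clock) {
    this.maxEntries = maxEntries;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  synchronized V get(K key) {
    evictExpired();
    Slot<V> slot = map.get(key);  // also moves the entry to the "recent" end
    if (slot == null) {
      return null;
    }
    slot.lastAccessMillis = clock.getAsLong();
    return slot.value;
  }

  synchronized void put(K key, V value) {
    evictExpired();
    map.put(key, new Slot<>(value, clock.getAsLong()));
    // Size-based eviction: drop least recently accessed entries first.
    Iterator<K> it = map.keySet().iterator();
    while (map.size() > maxEntries && it.hasNext()) {
      it.next();
      it.remove();
    }
  }

  synchronized int size() {
    evictExpired();
    return map.size();
  }

  /** Time-based eviction: drop entries idle for ttlMillis or longer. */
  private void evictExpired() {
    long now = clock.getAsLong();
    Iterator<Map.Entry<K, Slot<V>>> it = map.entrySet().iterator();
    while (it.hasNext()) {
      if (now - it.next().getValue().lastAccessMillis >= ttlMillis) {
        it.remove();
      } else {
        break;  // entries are in access order, so the rest are newer
      }
    }
  }
}
```

With maxEntries = 5, clicking job6 still evicts only one job, as in the steps 
above, but once ttlMillis passes without access the remaining old jobs are 
dropped too and their heap can be reclaimed. A library cache with an 
expire-after-access option (for example Guava's 
CacheBuilder.expireAfterAccess) would be another way to get the same 
behavior.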

> Job history server should release jobs from cache after a fixed duration
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6847
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6847
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: MAPREDUCE-6847.01.patch
>
>
> We found that the history server consumes a lot of memory when there are 
> large jobs (more than 100k tasks in a single job). Currently the JHS cache 
> evicts entries only by size; it would be better to add time-based expiration 
> as well, to reduce heap usage when no one has accessed a job for some time.



