[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3901:
-----------------------------------------------
       Resolution: Fixed
    Fix Version/s: 0.23.2
     Release Note: Modified JobHistory records in YARN to lazily load job and task reports so as to improve UI response times.
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

I just committed this to trunk, branch-0.23 and branch-0.23.2. Thanks Sid!

> lazy load JobHistory Task and TaskAttempt details
> -------------------------------------------------
>
>                 Key: MAPREDUCE-3901
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>             Fix For: 0.23.2
>
>         Attachments: MR3901.txt, MR3901_v2.txt, MR3901_v3.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very
> slow for large jobs. Some of this time is spent parsing the history file. A
> good chunk is spent pre-creating lots of objects which may never be used.
> Those can be created when required, bringing down the load times of job
> history pages and calls such as getJobReport to approximately the history
> file parse time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Attachment: MR3901_v3.txt

Trying again. The previous patch should've been ok.
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Attachment: MR3901_v2.txt

Updated to fix the very valid FindBugs warnings.
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details
[ https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-3901:
--------------------------------------
    Attachment: MR3901.txt

Straightforward patch. Adds a couple of unit tests for Completed{Job/Task/TaskAttempt}. Also fixes the completedJobCache in jobHistory to be an LRU cache.

Numbers when loading a 70MB, 11700 task history file (10 node cluster):
    ParseTime: ~4.5 seconds
    Creating all Task objects: ~11.3 seconds (this comes down to ~4 seconds with a patch for MAPREDUCE-2855)
    Loading the full job: ~15.8 seconds

The patch defers task and task attempt creation till they're required:
    ParseTime: remains the same, 4.5 seconds
    Creating all task objects: <200ms (loaded in the UI execution path)
    Loading the full job: <5 seconds (for the UI and getJobReport)
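
For readers who have not opened the attachment, the two ideas in this update (deferring task object creation until first use, and bounding the completed-job cache with LRU eviction) can be sketched roughly as below. This is a minimal illustration, not the code in MR3901.txt: every class and method name here (LazyHistorySketch, ParsedTaskInfo, TaskView, LazyCompletedJob, LruCache) is hypothetical, and the LRU cache is assumed to be built on java.util.LinkedHashMap in access order, which may differ from what the patch does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the MAPREDUCE-3901 patch. */
public class LazyHistorySketch {

    /** Stand-in for the raw per-task record produced by parsing the history file. */
    static final class ParsedTaskInfo {
        final String taskId;
        ParsedTaskInfo(String taskId) { this.taskId = taskId; }
    }

    /** Stand-in for the heavier task object that the UI or an RPC call consumes. */
    static final class TaskView {
        final String taskId;
        TaskView(ParsedTaskInfo info) { this.taskId = info.taskId; }
    }

    /** Keeps the parsed records and builds task objects only on first request. */
    static final class LazyCompletedJob {
        private final List<ParsedTaskInfo> parsed;
        private Map<String, TaskView> tasks; // stays null until getTasks() is called
        private int buildCount = 0;

        LazyCompletedJob(List<ParsedTaskInfo> parsed) { this.parsed = parsed; }

        /** Cheap: answered from the parsed records without building task objects. */
        int getTotalTasks() { return parsed.size(); }

        /** Builds every task object on the first call, then reuses the result. */
        synchronized Map<String, TaskView> getTasks() {
            if (tasks == null) {
                Map<String, TaskView> built = new LinkedHashMap<>();
                for (ParsedTaskInfo info : parsed) {
                    built.put(info.taskId, new TaskView(info));
                }
                tasks = built;
                buildCount++;
            }
            return tasks;
        }

        synchronized boolean tasksLoaded() { return tasks != null; }
        synchronized int getBuildCount() { return buildCount; }
    }

    /** Bounded cache that evicts the least recently accessed entry. Not thread-safe. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        LruCache(int maxEntries) {
            super(16, 0.75f, true); // true selects access order instead of insertion order
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    public static void main(String[] args) {
        List<ParsedTaskInfo> parsed = new ArrayList<>();
        for (int i = 0; i < 11700; i++) {
            parsed.add(new ParsedTaskInfo("task_" + i));
        }
        LazyCompletedJob job = new LazyCompletedJob(parsed);
        System.out.println("total tasks: " + job.getTotalTasks());
        System.out.println("tasks loaded before first use: " + job.tasksLoaded());
        System.out.println("task objects built: " + job.getTasks().size());

        LruCache<String, LazyCompletedJob> cache = new LruCache<>(2);
        cache.put("job_1", job);
        cache.put("job_2", new LazyCompletedJob(new ArrayList<>()));
        cache.get("job_1"); // touching job_1 makes job_2 the eldest entry
        cache.put("job_3", new LazyCompletedJob(new ArrayList<>()));
        System.out.println("cached jobs: " + cache.keySet());
    }
}
```

The point of the split is that a summary call such as getTotalTasks() never pays for task object creation, which matches the intent stated above of keeping job-level loads close to the parse time. A cache shared across request threads would need external synchronization or a concurrent alternative.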