[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-27 Thread Vinod Kumar Vavilapalli (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3901:
---

   Resolution: Fixed
Fix Version/s: 0.23.2
 Release Note: Modified JobHistory records in YARN to lazily load job and 
task reports so as to improve UI response times.
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I just committed this to trunk, branch-0.23 and branch-0.23.2. Thanks Sid!

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 0.23.2
>
> Attachments: MR3901.txt, MR3901_v2.txt, MR3901_v3.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Status: Patch Available  (was: Open)

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt, MR3901_v3.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Attachment: MR3901_v3.txt

Trying again. The previous patch should have been OK.

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt, MR3901_v3.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Status: Open  (was: Patch Available)

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Status: Patch Available  (was: Open)

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Attachment: MR3901_v2.txt

Updated to fix the very valid FindBugs warnings.

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-23 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Status: Open  (was: Patch Available)

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt, MR3901_v2.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Status: Patch Available  (was: Open)

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.





[jira] [Updated] (MAPREDUCE-3901) lazy load JobHistory Task and TaskAttempt details

2012-02-22 Thread Siddharth Seth (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-3901:
--

Attachment: MR3901.txt

Straightforward patch. Adds a couple of unit tests for 
Completed{Job/Task/TaskAttempt}.
Also changes the completedJobCache in JobHistory to be an LRU cache.
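
For readers unfamiliar with the LRU idea mentioned above, here is a minimal
sketch of one common way to build such a cache in Java, using an access-ordered
LinkedHashMap. The class name LruCacheSketch and the job keys are hypothetical
and are not taken from the patch; the actual completedJobCache may be
implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch {

    /** Bounded map that drops the least recently accessed entry when full. */
    public static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public LruCache(int maxEntries) {
            // accessOrder = true: iteration order runs from least to most
            // recently accessed, so the "eldest" entry is the LRU one.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called by LinkedHashMap after each insertion.
            return size() > maxEntries;
        }
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("job_1", "parsed job 1"); // hypothetical keys and values
        cache.put("job_2", "parsed job 2");
        cache.get("job_1");                 // touch job_1, so job_2 is now eldest
        cache.put("job_3", "parsed job 3"); // evicts job_2
        System.out.println(cache.keySet()); // prints [job_1, job_3]
    }
}
```

Note that LinkedHashMap is not thread-safe, and in access-order mode even get()
modifies the map, so a cache shared between request threads would need external
synchronization (for example Collections.synchronizedMap).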

Numbers when loading a 70 MB, 11,700-task history file (10-node cluster):

ParseTime: ~4.5 seconds
Creating all Task objects: ~11.3 seconds (this comes down to ~4 seconds with a 
patch for MAPREDUCE-2855)
Loading the full job: ~15.8 seconds

The patch defers task and task attempt creation until they are required.
ParseTime: remains the same, ~4.5 seconds
Creating all Task objects: <200 ms (loaded in the UI execution path)
Loading the full job: <5 seconds (for the UI and getJobReport)
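
The deferral described above can be illustrated with a small sketch. This is
not the code from the patch: LazyJobSketch, TaskView and the counter are
hypothetical names used only to show the pattern, in which the cheap parsed
data is kept eagerly and the heavier per-task objects are built on first
request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LazyJobSketch {

    /** Hypothetical stand-in for a heavyweight per-task object. */
    public static final class TaskView {
        public final String taskId;
        TaskView(String taskId) { this.taskId = taskId; }
    }

    private final List<String> parsedTaskIds; // cheap result of parsing
    private Map<String, TaskView> tasks;      // null until first requested
    private int taskViewsBuilt = 0;           // for illustration only

    public LazyJobSketch(List<String> parsedTaskIds) {
        this.parsedTaskIds = new ArrayList<>(parsedTaskIds);
    }

    /** Summary data is answered from the parsed data; no TaskView is built. */
    public int getTotalTasks() {
        return parsedTaskIds.size();
    }

    /** Builds the task objects on the first call and reuses them afterwards. */
    public synchronized Map<String, TaskView> getTasks() {
        if (tasks == null) {
            Map<String, TaskView> built = new LinkedHashMap<>();
            for (String id : parsedTaskIds) {
                built.put(id, new TaskView(id));
                taskViewsBuilt++;
            }
            tasks = Collections.unmodifiableMap(built);
        }
        return tasks;
    }

    public synchronized int getTaskViewsBuilt() {
        return taskViewsBuilt;
    }

    public static void main(String[] args) {
        LazyJobSketch job =
            new LazyJobSketch(Arrays.asList("task_0", "task_1", "task_2"));
        // A summary request touches only the parsed data.
        System.out.println(job.getTotalTasks() + " tasks, built so far: "
            + job.getTaskViewsBuilt());
        job.getTasks();
        job.getTasks(); // second call reuses the objects built by the first
        System.out.println("built after two getTasks() calls: "
            + job.getTaskViewsBuilt());
    }
}
```

In this sketch a request that needs only summary data pays just the parse
cost, which matches the goal stated in the issue description; the synchronized
accessor is one simple way to keep the one-time construction safe across
threads.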

> lazy load JobHistory Task and TaskAttempt details
> -
>
> Key: MAPREDUCE-3901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver, mrv2
>Affects Versions: 0.23.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: MR3901.txt
>
>
> The job history UI and MRClientProtocol calls routed via JobHistory are very 
> slow for large jobs. Some of this time is spent parsing the history file. A 
> good chunk is spent pre-creating lots of objects which may never be used. 
> Those can be created when required - bringing down the load times of job 
> history pages and getJobReport etc. calls to approximately the history file 
> parse time.
