[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-06-03 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5268:
--

   Resolution: Fixed
Fix Version/s: 0.23.9, 2.1.0-beta
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Thanks, Karthik!  I committed this to trunk, branch-2, and branch-0.23.

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Fix For: 2.1.0-beta, 0.23.9
>
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268.patch, 
> mr-5268.patch, mr-5268.patch, mr-5268-prelim.patch
>
>
> The history server can easily take many minutes to start up when there are a 
> significant number of jobs to scan in the done directory.  However, the 
> scanning of files is not the bottleneck; rather, it's the heavy use of 
> ConcurrentSkipListMap.size in HistoryFileManager.  
> ConcurrentSkipListMap.size is a very expensive operation, especially on maps 
> with many entries, as it has to traverse every entry to compute the size.  We 
> should avoid calling this method or at least minimize its use.
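
To put a rough number on the cost described above, here is a back-of-the-envelope sketch in Java. It assumes, purely for illustration, that size() is called once after each insertion while the done directory is scanned, and that each call visits every entry currently in the map. The class and method names are hypothetical and are not taken from HistoryFileManager.

```java
// Hypothetical estimate, not code from Hadoop: counts how many entries are
// visited in total if size() is called once after each of n insertions into
// an initially empty ConcurrentSkipListMap (1 + 2 + ... + n visits).
final class SizeScanCost {

  /** Total entry visits for n insertions, assuming one full scan per insertion. */
  static long estimateVisits(long n) {
    return n * (n + 1) / 2;
  }

  public static void main(String[] args) {
    for (long n : new long[] {1_000L, 100_000L, 1_000_000L}) {
      System.out.println(n + " jobs -> about " + estimateVisits(n) + " entry visits");
    }
  }
}
```

Under that assumption the work grows quadratically with the number of jobs, which is consistent with startup being dominated by size() rather than by the file scan itself.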

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-31 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268.patch

Thanks for the patient reviews, Jason.

My bad, I forgot to rename the corresponding test; that is fixed now. Also, the 
tests now use MRBuilderUtils instead of FakeJob.

I reviewed the patch carefully; hopefully this takes care of everything.

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268.patch, 
> mr-5268.patch, mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-31 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268.patch

Updated the helper class name to JobIdHistoryFileInfoMap

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268.patch, 
> mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-30 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268.patch

Updated the patch to address Jason's comments:
# Use AtomicInteger instead of synchronized methods
# The wrapper no longer uses templates (generics)

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268.patch, 
> mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-29 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Status: Patch Available  (was: Open)

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.0.4-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-29 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268.patch

Updated the patch to abstract the collection and its size into a local utility 
class, with minimal changes to JobListCache.

I considered writing a full-blown ConcurrentSkipListMapWithSize that extends 
ConcurrentSkipListMap and adding it to hadoop-common, but decided against it 
as that seemed a little overboard.

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-29 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Status: Open  (was: Patch Available)

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.0.4-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-29 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Status: Patch Available  (was: Open)

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 2.0.4-alpha, 0.23.7
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-29 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268.patch

Updated patch with tests.

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268.patch, mr-5268-prelim.patch



[jira] [Updated] (MAPREDUCE-5268) Improve history server startup performance

2013-05-28 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-5268:


Attachment: mr-5268-prelim.patch

The least intrusive way seems to be to maintain the size of the 
{{ConcurrentSkipListMap}} outside the map, in {{JobListCache}}. Uploading a 
preliminary patch with the changes:
# Add an AtomicInteger field {{size}} to JobListCache, incremented when we add 
an entry and decremented when we remove one.
# I haven't used locks around cache#insert/delete and the size update, as I 
think it might be okay not to conform to the maximum size exactly. If we choose 
to synchronize, we should probably use int for size instead of AtomicInteger.
# I am still working on tests for JobListCache and will try to update the patch 
tomorrow.

IIUC, we are using {{ConcurrentSkipListMap}} now, and used {{TreeMap}} in one 
of the previous versions, to evict items in the order of {{JobId}}. Is the 
ordering a requirement? If it is not, we could maybe use something like Guava's 
Cache.  
Thoughts?
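
A minimal sketch of the approach outlined in this comment: the element count is kept in an AtomicInteger next to the ConcurrentSkipListMap so that the cache never has to call size() on the map. This is not the code from mr-5268-prelim.patch. The class name, the method names, and the simplified "evict the lowest key" policy are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of tracking the map size outside the map.
final class CountedSkipListCache<K extends Comparable<K>, V> {
  private final ConcurrentSkipListMap<K, V> cache = new ConcurrentSkipListMap<>();
  private final AtomicInteger size = new AtomicInteger();
  private final int maxSize;

  CountedSkipListCache(int maxSize) {
    this.maxSize = maxSize;
  }

  /** Inserts the mapping if absent; returns the existing value, or null if inserted. */
  V addIfAbsent(K key, V value) {
    V old = cache.putIfAbsent(key, value);
    // Count only successful insertions, then trim if the limit was exceeded.
    if (old == null && size.incrementAndGet() > maxSize) {
      evictLowestKeys();
    }
    return old;
  }

  /** Removes the mapping and decrements the counter only if something was removed. */
  V remove(K key) {
    V removed = cache.remove(key);
    if (removed != null) {
      size.decrementAndGet();
    }
    return removed;
  }

  V get(K key) {
    return cache.get(key);
  }

  /** Constant-time read of the tracked count; may briefly lag the map under contention. */
  int size() {
    return size.get();
  }

  // Simplified eviction in key order. No lock is taken, so the cache may
  // overshoot maxSize for a moment, as the comment above anticipates.
  private void evictLowestKeys() {
    while (size.get() > maxSize) {
      Map.Entry<K, V> first = cache.firstEntry();
      if (first == null) {
        break;
      }
      // remove(key, value) succeeds only if this exact mapping is still
      // present, so each removed entry is counted exactly once.
      if (cache.remove(first.getKey(), first.getValue())) {
        size.decrementAndGet();
      }
    }
  }
}
```

For example, with maxSize set to 2, adding keys 1, 2 and 3 leaves keys 2 and 3 in the cache, and size() returns 2 without traversing the map.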

> Improve history server startup performance
> --
>
> Key: MAPREDUCE-5268
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5268
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Affects Versions: 0.23.7, 2.0.4-alpha
>Reporter: Jason Lowe
>Assignee: Karthik Kambatla
> Attachments: mr-5268-prelim.patch
