[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926072#action_12926072
 ] 

Hudson commented on MAPREDUCE-1881:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See 
[https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/])


 Improve TaskTrackerInstrumentation
 --

 Key: MAPREDUCE-1881
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1881
 Project: Hadoop Map/Reduce
 Issue Type: Improvement
 Reporter: Matei Zaharia
 Assignee: Matei Zaharia
 Priority: Minor
 Attachments: mapreduce-1881-v2.patch, mapreduce-1881-v2b.patch, 
 mapreduce-1881-v3.patch, mapreduce-1881-v4.patch, mapreduce-1881.patch


 The TaskTrackerInstrumentation class provides a useful way to capture key 
 events at the TaskTracker for use in various reporting tools, but it is 
 currently rather limited, because only one TaskTrackerInstrumentation can be 
 added to a given TaskTracker and this object receives minimal information 
 about tasks (only their IDs). I propose enhancing the functionality through 
 two changes:
 # Support a comma-separated list of TaskTrackerInstrumentation classes rather 
 than just a single one in the JobConf, and report events to all of them.
 # Make the reportTaskLaunch and reportTaskEnd methods in 
 TaskTrackerInstrumentation receive a reference to a whole Task object rather 
 than just its TaskAttemptID. It might also be useful to make the latter 
 receive the task's final state, i.e. failed, killed, or successful.
 I'm just posting this here to get a sense of whether this is a good idea. If 
 people think it's okay, I will make a patch against trunk that implements 
 these changes.
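
 A minimal sketch of the second change, under stated assumptions: TaskInfo, 
 FinalState, Instrumentation and FailedMapRecorder are hypothetical stand-ins 
 written for illustration, not the Hadoop classes, and the method shapes are 
 only one way the proposal could look. It shows a listener that needs both the 
 task type and the final state, which an attempt ID alone would not provide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the Hadoop classes.
enum FinalState { SUCCEEDED, FAILED, KILLED }

class TaskInfo {
    final String attemptId;
    final boolean isMap;

    TaskInfo(String attemptId, boolean isMap) {
        this.attemptId = attemptId;
        this.isMap = isMap;
    }
}

// Callbacks receive the whole task (and, on completion, its final state)
// instead of only an attempt ID. The default bodies are no-ops.
class Instrumentation {
    void reportTaskLaunch(TaskInfo task) { }
    void reportTaskEnd(TaskInfo task, FinalState state) { }
}

// Example listener: records the attempt IDs of failed map tasks.
class FailedMapRecorder extends Instrumentation {
    final List<String> failedMaps = new ArrayList<>();

    @Override
    void reportTaskEnd(TaskInfo task, FinalState state) {
        if (task.isMap && state == FinalState.FAILED) {
            failedMaps.add(task.attemptId);
        }
    }
}
```

 Keeping the base-class callbacks as no-ops means a listener only overrides the 
 events it cares about.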

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-20 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900840#action_12900840
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

Thanks, Arun!








[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-16 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898992#action_12898992
 ] 

Luke Lu commented on MAPREDUCE-1881:


One nit for the test code: you could have used a mock (we have Mockito in 
trunk) to avoid manually writing an instrumentation class for verification.
I also suggest that you refactor the instrumentation object creation code into 
a static factory method so that you can unit test the expected behavior as well.

The code looks fine otherwise, if the above concerns (especially the latter) 
are addressed and hudson finds no related issues.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-16 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899031#action_12899031
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

By instrumentation creation code, do you mean the one in TaskTracker? I can do 
that.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-16 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899035#action_12899035
 ] 

Luke Lu commented on MAPREDUCE-1881:


Yes, I meant something like: 

{{static TaskTrackerInstrumentation createInstrumentation(JobConf conf);}} in 
the TaskTracker class or
{{static TaskTrackerInstrumentation create(JobConf conf);}} in the 
TaskTrackerInstrumentation class

Though I prefer the latter, either way is fine, as long as you write a unit 
test for it.
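
A rough sketch of the second option, assuming stand-in types: Instrumentation 
and NoopInstrumentation are hypothetical, and java.util.Properties stands in 
for JobConf so the example runs without Hadoop on the classpath. The property 
name is the one mentioned elsewhere in this thread; the no-op default is an 
assumption of the sketch only.

```java
import java.util.Properties;

// Hypothetical stand-in for the real base class; not the Hadoop API.
class Instrumentation {
    static final String KEY = "mapreduce.tasktracker.instrumentation";

    // Static factory: reads the class name from the configuration and
    // instantiates it through its no-argument constructor. Because it is
    // static and has no other dependencies, a unit test can call it directly.
    static Instrumentation create(Properties conf) {
        String className =
            conf.getProperty(KEY, NoopInstrumentation.class.getName());
        try {
            Class<? extends Instrumentation> cls =
                Class.forName(className).asSubclass(Instrumentation.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot create instrumentation: " + className, e);
        }
    }
}

// Default used by this sketch only (an assumption, not Hadoop's default).
class NoopInstrumentation extends Instrumentation { }
```

A test can then check both the happy path and the failure path (a class that 
is missing or is not an Instrumentation subclass) without starting a 
TaskTracker.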




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-16 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899205#action_12899205
 ] 

Luke Lu commented on MAPREDUCE-1881:


v4 looks fine to me. Thanks Matei.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-15 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898787#action_12898787
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

BTW, the test failures in the previous Hudson output seem to be unrelated to 
this patch. Let me know if it looks good to commit with the new additions.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898264#action_12898264
 ] 

Hadoop QA commented on MAPREDUCE-1881:
--

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12451711/mapreduce-1881-v2b.patch
against trunk revision 984707.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/607/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/607/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/607/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/607/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897178#action_12897178
 ] 

Luke Lu commented on MAPREDUCE-1881:


I have no issue with the statusUpdate method. I get where you're coming from :) 
But I question whether many users will want to do the same thing. I'm curious 
whether many useful instrumentation classes are being written. Adding features 
(especially redundant ones), IMO, doesn't necessarily make Hadoop better, but 
rather more bloated and harder to maintain. You know, perfection is attained 
not when no more can be added, but when no more can be removed.

Another thing about the patch is that if the instrumentation class is specified 
as an empty string, it silently defaults to the composite class with an empty 
list (essentially a noop instrumentation). This changes the existing behavior, 
in which an exception would be thrown.
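
One way to keep the stricter behavior with a list-valued property is to reject 
a value that yields no class names. This is only a sketch: 
InstrumentationConfig and parseClassNames are hypothetical names, not part of 
the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the patch under review.
class InstrumentationConfig {
    // Splits a comma-separated property value into trimmed class names.
    // A null, empty or blank value is rejected instead of silently
    // producing an empty (no-op) list.
    static List<String> parseClassNames(String value) {
        List<String> names = new ArrayList<>();
        if (value != null) {
            for (String part : value.split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        if (names.isEmpty()) {
            throw new IllegalArgumentException(
                "No instrumentation class specified: '" + value + "'");
        }
        return names;
    }
}
```

With this check, a misconfigured empty property fails loudly at startup, as it 
would with a single class name.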




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-11 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897334#action_12897334
 ] 

Philip Zeyliger commented on MAPREDUCE-1881:


I'll chime in that I'm using the instrumentation classes and find them a useful 
way to listen to some events that are otherwise hard to get at.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-11 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897348#action_12897348
 ] 

Arun C Murthy commented on MAPREDUCE-1881:
--

I'm trying to understand the proposal... please help me.

Currently you can define multiple 'sinks' for the same data via 
CompositeContext. Thus you can define multiple listeners and each will get the 
same data. Is that sufficient for this use case?






[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897367#action_12897367
 ] 

Luke Lu commented on MAPREDUCE-1881:


The instrumentation class is related to but not dependent on metrics 
frameworks. Some of the events are actually not collected in the regular 
metrics, so there is an expert level config property 
mapreduce.tasktracker.instrumentation to specify a subclass for 
TaskTrackerInstrumentation which contains all the overridable callbacks. The 
default value for the property is the TaskTrackerMetricsInst class which 
currently implements the Updater interface to collect tasktracker metrics in 
the mapred metrics context. Similarly for metrics v2, 
TaskTrackerMetricsSource would be the default.

Matei and others want to use the overridable instrumentation property to hook 
in other listeners for things that are not strictly metrics related, like 
statusUpdate, which is useful for his project, which does two-level scheduling 
:) He can achieve this with the addition of the statusUpdate method in 
TaskTrackerInstrumentation. To make adding more instrumentation classes (while 
preserving the existing instrumentation like metrics) slightly easier (IMO, a 
user-defined composite class is just as easy), he wants to make the property a 
list of classes so that the events are fired for each instance of the 
specified classes.

The latter part of the patch would add a composite instrumentation class that 
dispatches all the events to all the instances of the specified instrumentation 
classes. Currently the patch lacks unit tests for the composite class. I can 
see problems down the road in maintaining the class, like making sure it does 
not block in one of the delegates (which can potentially do RPCs, etc.) and 
that it properly handles exceptions in the delegate objects.
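
To make the exception-handling concern concrete, here is a minimal sketch of a 
composite that isolates failures per delegate. Instrumentation and 
CompositeInstrumentation are hypothetical stand-ins, and statusUpdate takes a 
plain attempt ID string here purely for brevity; none of this is the code from 
the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real base class; not the Hadoop API.
class Instrumentation {
    void statusUpdate(String attemptId) { }
}

// Forwards each event to every delegate. A failure in one delegate is
// recorded and does not prevent the remaining delegates from being called.
class CompositeInstrumentation extends Instrumentation {
    private final List<Instrumentation> delegates;
    final List<RuntimeException> failures = new ArrayList<>();

    CompositeInstrumentation(List<Instrumentation> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    void statusUpdate(String attemptId) {
        for (Instrumentation d : delegates) {
            try {
                d.statusUpdate(attemptId);
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
    }
}
```

This only covers exceptions. A delegate that blocks (for example, in an RPC) 
would still stall the loop, so that concern would need a separate mechanism 
such as handing events to another thread.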






[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897442#action_12897442
 ] 

Luke Lu commented on MAPREDUCE-1881:


The jobtracker and tasktracker instrumentation was introduced in HADOOP-3772, 
which contains more background info.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-10 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897047#action_12897047
 ] 

Luke Lu commented on MAPREDUCE-1881:


Having to route through a specific implementation of a composite object could 
lead to situations that users cannot override without changing library code. 
Currently, we can measure the overhead of instrumentation by comparing with a 
noop instrumentation. Forcing it through the composite object incurs the 
overhead of a loop construct and doubles the number of method calls, which may 
or may not be acceptable for a given user application (it's not you or I who 
should decide whether it's acceptable or not.)

IMO, you don't even need the composite class in the official Hadoop source to 
support multiple listeners; it adds minor convenience as well as a maintenance 
burden for Hadoop developers. The user instrumentation feature is supposedly 
only for experts who know how to write a more complex instrumentation class 
than a trivial composite class.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-10 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897051#action_12897051
 ] 

Luke Lu commented on MAPREDUCE-1881:


BTW, the v2b patch looks fine besides the necessity question.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-10 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12897114#action_12897114
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

By necessity, do you mean why should Hadoop provide this feature rather than 
letting users implement it themselves? The answer is pretty simple -- since 
many users will want to do the same thing, it makes sense to put it into the 
platform instead of asking them all to reinvent it. The goal of the JIRA 
process is not to minimize changes to Hadoop, it's to make Hadoop better. One 
can imagine many useful instrumentation classes being written that people will 
combine (already, lots of people are using the default metrics one).

I actually opened this issue because I'm working on a project where I want to 
programmatically launch a TaskTracker with an extra instrumentation class on 
top of the ones the user configured in mapred-site.xml. I could do it by 
setting the parameter to a composite class, and then passing it the old 
parameter, but it felt more natural to add support for multiple instrumentation 
objects and just append to the user's list. I care more about the second part 
of the issue (statusUpdate callback) though, because my project can't work at 
all without that.
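
A minimal sketch of the "append to the user's list" idea, assuming the 
setting is a comma-separated string. java.util.Properties stands in for the 
Hadoop configuration object here, and the InstrumentationConfig helper, the 
key name and the class names in the usage note are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hadoop.
class InstrumentationConfig {
    // Appends className to the comma-separated list stored under key,
    // preserving whatever the user already configured.
    static void appendClass(Properties conf, String key, String className) {
        String existing = conf.getProperty(key, "").trim();
        String updated = existing.isEmpty() ? className : existing + "," + className;
        conf.setProperty(key, updated);
    }
}
```

For example, calling appendClass(conf, "job.tracker.instrumentation", 
"com.example.Extra") on a configuration that already lists 
com.example.UserMetrics leaves both class names in the list.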




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-09 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12896663#action_12896663
 ] 

Luke Lu commented on MAPREDUCE-1881:


Having to be routed via the default composite object is really what I object 
to. There are many reasons why it's a bad idea. However, there is a simple fix 
to your patch: if there is only one class specified in the config, use it as 
the top-level class instead of a delegate in the default composite class. This 
way, convenience is preserved for users (arguably experts :) who want to mess 
with the tasktracker implementation), and the default composite class 
implementation is overridable.
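
The suggested fix could look roughly like the sketch below. It is not taken 
from any of the attached patches: Instrumentation is a hypothetical 
single-method stand-in for TaskTrackerInstrumentation, and 
InstrumentationSelector is an invented name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, single-method stand-in for TaskTrackerInstrumentation.
interface Instrumentation {
    void reportTaskEnd(String taskId);
}

// Forwards each event to every delegate.
class CompositeInstrumentation implements Instrumentation {
    private final List<Instrumentation> delegates;

    CompositeInstrumentation(List<? extends Instrumentation> delegates) {
        this.delegates = new ArrayList<>(delegates); // defensive copy
    }

    @Override
    public void reportTaskEnd(String taskId) {
        for (Instrumentation d : delegates) {
            d.reportTaskEnd(taskId);
        }
    }
}

// Hypothetical selection logic: a single configured class is used directly
// as the top-level object; anything else is wrapped in the composite.
class InstrumentationSelector {
    static Instrumentation select(List<? extends Instrumentation> configured) {
        if (configured.size() == 1) {
            return configured.get(0);
        }
        // Zero delegates yields a composite that does nothing.
        return new CompositeInstrumentation(configured);
    }
}
```

With this shape, a user who configures exactly one class (possibly their own 
composite) never has calls routed through the default composite.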




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-09 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12896769#action_12896769
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

Sure, I can do that. I think the biggest risk with the composite object is 
adding a method to TaskTrackerInstrumentation that we forget to add in the 
composite. While this is bad, it would arguably get noticed faster if the 
composite object is used by default than if it isn't. Any other opinions on 
this? Any other reasons to avoid it?

I really see the composite object as not much different than having a for loop 
at every call site. Supporting multiple instrumentation objects is clearly 
useful (the same way that most classes in the JDK with events support multiple 
listeners). The question then is how to do it. Since the instrumentation 
interface isn't designed in such a way that an implementation knows who called 
it (i.e. can tell whether it went through a composite object), it seems OK to 
use a composite object to route calls to a list of implementations.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-06 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12896181#action_12896181
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

The point of this change is to allow the user to specify a comma-separated list 
of classes in the job.tracker.instrumentation field instead of a single class. 
Asking users to specify the composite class, and then go set another property 
somewhere else, is needlessly inconvenient. What is the problem with the 
current approach? If the user only specifies one instrumentation class (as they 
do today), only that one class will be used, and the behavior will be exactly 
the same as today (except that calls get routed through the composite object 
first). If the user lists multiple classes, multiple classes will be used.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-05 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895845#action_12895845
 ] 

Luke Lu commented on MAPREDUCE-1881:


The problem with the v2 patch is that the composite instrumentation class is 
always used and not pluggable. IMO, I would not change the semantics of 
job.tracker.instrumentation to a list of classes. You can add a composite 
instrumentation class that looks for 
job.tracker.instrumentation.composite.classes (or something like that). BTW, 
although the composite class is a convenience, users wanting this feature can 
already implement it without changing the Hadoop code.
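
A rough sketch of such a pluggable composite, under several assumptions: 
java.util.Properties stands in for the Hadoop configuration, Instrumentation 
is a hypothetical single-method stand-in for TaskTrackerInstrumentation, and 
the property name is the tentative one from the comment above rather than a 
confirmed key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical, single-method stand-in for TaskTrackerInstrumentation.
interface Instrumentation {
    void reportTaskEnd(String taskId);
}

// A composite that discovers its delegates from its own property, leaving
// the meaning of the main instrumentation setting (one class) unchanged.
class ConfiguredComposite implements Instrumentation {
    // Tentative name quoted from the comment; not a confirmed Hadoop key.
    static final String CLASSES_KEY = "job.tracker.instrumentation.composite.classes";

    private final List<Instrumentation> delegates = new ArrayList<>();

    ConfiguredComposite(Properties conf) {
        for (String name : conf.getProperty(CLASSES_KEY, "").split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // unset property or stray comma
            }
            try {
                // Each delegate needs an accessible no-argument constructor.
                Object instance = Class.forName(trimmed).getDeclaredConstructor().newInstance();
                delegates.add((Instrumentation) instance);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot instantiate " + trimmed, e);
            }
        }
    }

    int delegateCount() {
        return delegates.size();
    }

    @Override
    public void reportTaskEnd(String taskId) {
        for (Instrumentation d : delegates) {
            d.reportTaskEnd(taskId);
        }
    }
}

// Example delegate that counts the events it receives.
class CountingInstrumentation implements Instrumentation {
    static int ended = 0;

    @Override
    public void reportTaskEnd(String taskId) {
        ended++;
    }
}
```

The trade-off discussed in this thread is visible here: the composite is 
opt-in and replaceable, but the user has to set two properties instead of one.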




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-03 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895073#action_12895073
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

Sounds good, I will add a composite class then. I used for loops because other 
listener systems in Hadoop, such as the JobTrackerListener, use them as well.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-08-02 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894602#action_12894602
 ] 

Luke Lu commented on MAPREDUCE-1881:


The main point of having an instrumentation class is to hide the implementation 
details behind the instrumentation interface. Using for loops in the 
instrumentation client code is really jarring. I'd recommend implementing a 
composite instrumentation class if you want to send events to multiple 
instrumentation implementations. That way client code is not changed and you 
can implement more advanced logic (like only sending events to a subset of 
instrumentation objects based on some rules) without changing client code.
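
The composite approach described above can be sketched as follows. This is an 
illustration only, not code from the patch: Instrumentation is a hypothetical 
stand-in for TaskTrackerInstrumentation (the real class has more methods and 
does not take plain strings), and RecordingInstrumentation exists only to 
show the fan-out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TaskTrackerInstrumentation.
interface Instrumentation {
    void reportTaskLaunch(String taskId);
    void reportTaskEnd(String taskId);
}

// Forwards every event to each delegate, so the calling code keeps a single
// Instrumentation reference and needs no loops of its own.
class CompositeInstrumentation implements Instrumentation {
    private final List<Instrumentation> delegates;

    CompositeInstrumentation(List<? extends Instrumentation> delegates) {
        this.delegates = new ArrayList<>(delegates); // defensive copy
    }

    @Override
    public void reportTaskLaunch(String taskId) {
        for (Instrumentation d : delegates) {
            d.reportTaskLaunch(taskId);
        }
    }

    @Override
    public void reportTaskEnd(String taskId) {
        for (Instrumentation d : delegates) {
            d.reportTaskEnd(taskId);
        }
    }
}

// Example delegate that records the events it sees, in order.
class RecordingInstrumentation implements Instrumentation {
    final List<String> events = new ArrayList<>();

    @Override
    public void reportTaskLaunch(String taskId) {
        events.add("launch:" + taskId);
    }

    @Override
    public void reportTaskEnd(String taskId) {
        events.add("end:" + taskId);
    }
}
```

Rule-based routing, as mentioned in the comment, would go inside the 
composite's loops, again without touching the calling code.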




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-07-26 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12892410#action_12892410
 ] 

Scott Chen commented on MAPREDUCE-1881:
---

+1 The patch looks good to me.




[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation

2010-06-20 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880594#action_12880594
 ] 

Matei Zaharia commented on MAPREDUCE-1881:
--

One other suggestion: A statusUpdate callback should be added to 
TaskTrackerInstrumentation to let it know when a task changes state (e.g. from 
RUNNING to COMMIT_PENDING). If this is done, then there's probably no need to 
modify reportTaskLaunch and reportTaskEnd (which means no changes to existing 
clients).
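
One way to picture the proposed callback, as a sketch under assumptions: the 
statusUpdate signature, the TaskState enum and the string task IDs below are 
hypothetical simplifications, not the signature used in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical subset of task states, for illustration only.
enum TaskState { RUNNING, COMMIT_PENDING, SUCCEEDED, FAILED, KILLED }

// Hypothetical stand-in for the TaskTrackerInstrumentation base class.
class Instrumentation {
    void reportTaskLaunch(String taskId) {}

    void reportTaskEnd(String taskId) {}

    // New callback. The empty default body means existing subclasses keep
    // compiling and behaving as before.
    void statusUpdate(String taskId, TaskState newState) {}
}

// A subclass that only cares about state changes overrides just the callback.
class StateLogger extends Instrumentation {
    final List<String> log = new ArrayList<>();

    @Override
    void statusUpdate(String taskId, TaskState newState) {
        log.add(taskId + "->" + newState);
    }
}
```

Because the existing methods are left untouched, final states such as FAILED 
or KILLED can be observed through the callback alone.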
