[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-08-23 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-479:


  Resolution: Fixed
Hadoop Flags: [Incompatible change, Reviewed]
  Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks Jiaqi!

 Add reduce ID to shuffle clienttrace
 

 Key: MAPREDUCE-479
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-479
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Jiaqi Tan
Assignee: Jiaqi Tan
Priority: Minor
 Fix For: 0.21.0

 Attachments: HADOOP-6013.patch, MAPREDUCE-479-1.patch, 
 MAPREDUCE-479-2.patch, MAPREDUCE-479-3.patch, MAPREDUCE-479-4.patch, 
 MAPREDUCE-479.patch


 Current clienttrace messages from shuffles note only the destination map ID 
 but not the source reduce ID. Having both source and destination ID of each 
 shuffle enables full tracing of execution. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-08-06 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Release Note: Adds Reduce Attempt ID to ClientTrace log messages, and adds 
Reduce Attempt ID to HTTP query string sent to mapOutputServlet. Extracts 
partition number from attempt ID.   (was: Adds Reduce Attempt ID to ClientTrace 
log messages, and adds Reduce Attempt ID to HTTP query string sent to 
mapOutputServlet.)
  Status: Patch Available  (was: Open)

Ran a microbenchmark of shuffle durations with and without the added reduce 
attempt ID transmission and reduce partition number extraction. Shuffle times 
before and after this patch are statistically comparable (chi-squared test for 
similarity of the shuffle-time distributions, p-value 0.23, so the null 
hypothesis that the two distributions are the same is not rejected); thus this 
patch has no measurable performance impact.
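
For readers unfamiliar with this kind of comparison, here is a minimal sketch of a chi-squared homogeneity test on two binned sets of shuffle durations. It is not the benchmark that was actually run: the bin counts are made up for illustration, and the function names (chi2_sf, homogeneity_test) are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(X >= x) of a chi-squared distribution with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def homogeneity_test(before_counts, after_counts):
    """Compare two histograms that share the same bins.

    Returns (statistic, degrees_of_freedom, p_value). The null hypothesis is
    that both histograms were drawn from the same distribution.
    """
    if len(before_counts) != len(after_counts):
        raise ValueError("histograms must use the same bins")
    total_before = sum(before_counts)
    total_after = sum(after_counts)
    grand_total = total_before + total_after
    statistic = 0.0
    used_bins = 0
    for b, a in zip(before_counts, after_counts):
        column = b + a
        if column == 0:
            continue  # an empty bin carries no information
        used_bins += 1
        expected_before = column * total_before / grand_total
        expected_after = column * total_after / grand_total
        statistic += (b - expected_before) ** 2 / expected_before
        statistic += (a - expected_after) ** 2 / expected_after
    if used_bins < 2:
        raise ValueError("need at least two non-empty bins")
    df = used_bins - 1
    return statistic, df, chi2_sf(statistic, df)


# Illustrative (made-up) counts of shuffles per duration bin.
before = [12, 40, 33, 15]
after = [10, 44, 30, 16]
stat, df, p = homogeneity_test(before, after)
print(f"chi2={stat:.3f} df={df} p={p:.3f}")
```

A large p-value means the data give no evidence that the two distributions differ; it does not prove that they are identical.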




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-16 Thread Chris Douglas (JIRA)

Chris Douglas updated MAPREDUCE-479:


Status: Open  (was: Patch Available)

bq. That would be suboptimal... it's not actually a parameter in the request 
and maintaining it as a necessary side-effect requires future versions to 
preserve it.

I remain opposed to adding a string to the query to be logged on the remote 
side. If you want to make the case for _replacing_ the partition with the 
attempt ID, and extracting the partition from it on the TaskTracker side, I 
would be +0 on that approach.
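
As an illustration of that alternative, the sketch below pulls the partition out of a reduce attempt ID string. It assumes the usual attempt_<jobtracker id>_<job number>_<m|r>_<task number>_<attempt number> layout and that a reduce's task number is its partition. The names AttemptId, parse_attempt_id and reduce_partition are hypothetical; Hadoop has its own TaskAttemptID class, which this sketch does not use.

```python
from typing import NamedTuple


class AttemptId(NamedTuple):
    jt_identifier: str
    job_number: int
    is_map: bool
    task_number: int
    attempt_number: int


def parse_attempt_id(text):
    """Split an attempt ID string into its fields (assumed layout, see above)."""
    parts = text.split("_")
    if len(parts) != 6 or parts[0] != "attempt" or parts[3] not in ("m", "r"):
        raise ValueError(f"not a task attempt ID: {text!r}")
    return AttemptId(
        jt_identifier=parts[1],
        job_number=int(parts[2]),
        is_map=(parts[3] == "m"),
        task_number=int(parts[4]),
        attempt_number=int(parts[5]),
    )


def reduce_partition(text):
    """Return the partition served to a reduce attempt, derived from its ID."""
    attempt = parse_attempt_id(text)
    if attempt.is_map:
        raise ValueError(f"expected a reduce attempt, got a map attempt: {text!r}")
    return attempt.task_number


# Made-up attempt ID, for illustration only.
print(reduce_partition("attempt_200907160000_0001_r_000005_0"))
```

With this approach the request carries a single value, and the serving side can recover both the partition to send and the attempt to log.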




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Status: Open  (was: Patch Available)

Will submit a new patch that adds the reduce attempt ID, to eliminate the 
assumption that no two attempts of the same task will run on the same host, in 
case that assumption breaks under post-0.20 scheduling.




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Release Note: Adds Reduce Attempt ID to ClientTrace log messages, and adds 
Reduce Attempt ID to HTTP query string sent to mapOutputServlet.  (was: Adds 
Reduce ID to ClientTrace log messages. Explicitly uses new mapreduce.JobID for 
compatibility with updated TaskID constructor.)
  Status: Patch Available  (was: Open)

I would prefer adding the reduce attempt ID to the HTTP query string because 
this eliminates the need for assuming that no two attempts of the same task can 
run on the same node; I can see scenarios where a custom scheduler may break 
this assumption and make tracing very complicated. The incremental cost in 
terms of additional network traffic of adding the reduce attempt ID should be 
minimal and much smaller than the total data shuffled in a typical job. 
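
To make the size argument concrete, here is a rough sketch that builds a shuffle request URL with and without an extra attempt ID parameter and measures the difference. The path /mapOutput, the parameter names (job, map, reduce, and especially reduceAttempt), the host, port and IDs are all assumptions chosen for illustration, not taken from the patch.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def shuffle_url(host, port, job, map_attempt, partition, reduce_attempt=None):
    """Build a map-output fetch URL; reduce_attempt is the optional extra field."""
    params = {"job": job, "map": map_attempt, "reduce": str(partition)}
    if reduce_attempt is not None:
        # Hypothetical parameter name for the reduce attempt ID.
        params["reduceAttempt"] = reduce_attempt
    return f"http://{host}:{port}/mapOutput?{urlencode(params)}"


# Made-up identifiers, for illustration only.
job = "job_200907130000_0001"
map_attempt = "attempt_200907130000_0001_m_000007_0"
reduce_attempt = "attempt_200907130000_0001_r_000003_0"

plain = shuffle_url("tt-host", 50060, job, map_attempt, 3)
traced = shuffle_url("tt-host", 50060, job, map_attempt, 3, reduce_attempt)

# The serving side can read the attempt ID back out of the query string.
query = parse_qs(urlsplit(traced).query)
print(query["reduceAttempt"][0])

# Extra bytes added to each request line by the new parameter.
print(len(traced) - len(plain))
```

The overhead is a few dozen bytes per fetch, paid once per map output requested, regardless of how much data that fetch returns.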




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Status: Patch Available  (was: Open)




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-13 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Attachment: MAPREDUCE-479-2.patch

Updated, correct patch.




[jira] Updated: (MAPREDUCE-479) Add reduce ID to shuffle clienttrace

2009-07-12 Thread Jiaqi Tan (JIRA)

Jiaqi Tan updated MAPREDUCE-479:


Attachment: MAPREDUCE-479.patch

Cleaned up patch for new branched tree
