[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Patch Available  (was: Open)

test-patch, contrib-tests passed on my local machine.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
> mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with its own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to choose the right scheduler (or even the 
> right set of scheduler features) for a given workload. Hence a simulator 
> which can predict how well a particular scheduler works for some specific 
> workload by quickly iterating over schedulers and/or scheduler features 
> would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from a production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc.). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (CPU, memory, disk I/O, network I/O, network topology), 
> etc.
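
The trace-driven idea described above can be illustrated with a small discrete-event loop. This is only a sketch under stated assumptions, not Mumak's actual design: the trace layout (job id, submit time, list of task durations), the function name simulate_fifo, and the single pool of identical slots are all hypothetical, and a plain FIFO policy stands in for the real schedulers.

```python
import heapq
from collections import deque


def simulate_fifo(trace, num_slots):
    """Replay a job trace on a virtual cluster using a FIFO policy.

    trace: list of (job_id, submit_time, [task_duration, ...]) tuples
           (a hypothetical layout, not the real trace format).
    Returns {job_id: turnaround_time}, measured in virtual time.
    """
    events = []  # min-heap of (virtual_time, seq, kind, job_id)
    seq = 0      # tie-breaker so heap entries never compare beyond seq
    tasks_of, remaining, submitted = {}, {}, {}
    for job_id, submit_time, durations in trace:
        heapq.heappush(events, (submit_time, seq, "submit", job_id))
        seq += 1
        tasks_of[job_id] = list(durations)
        remaining[job_id] = len(durations)
        submitted[job_id] = submit_time

    pending = deque()   # (job_id, duration) waiting for a free slot
    free = num_slots
    turnaround = {}
    while events:
        # Jump the virtual clock straight to the next event.
        now, _, kind, job_id = heapq.heappop(events)
        if kind == "submit":
            pending.extend((job_id, d) for d in tasks_of[job_id])
        else:  # a task finished and released its slot
            free += 1
            remaining[job_id] -= 1
            if remaining[job_id] == 0:
                turnaround[job_id] = now - submitted[job_id]
        # The scheduling decision: hand free slots to the oldest tasks.
        while free > 0 and pending:
            jid, duration = pending.popleft()
            free -= 1
            heapq.heappush(events, (now + duration, seq, "finish", jid))
            seq += 1
    return turnaround
```

Because the clock only advances from event to event, a long trace replays quickly, and swapping the inner scheduling loop for another policy is what would let different schedulers be compared on the same workload.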

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: (was: mapreduce-728-20090918-6.patch)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
> mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-6.patch

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
> mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-6.patch

Fixed problem introduced by MAPREDUCE-980.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
> mapreduce-728-20090918-6.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: (was: mapreduce-728-20090918-4.patch)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-5.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Commented: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757625#action_12757625
 ] 

Hong Tang commented on MAPREDUCE-728:
-

Given that the patch was broken by the very late check-in of MAPREDUCE-980, 
I'd like to request an extension so that this patch can go into 0.21.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918-5.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-849) Renaming of configuration property names in mapreduce

2009-09-18 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal updated MAPREDUCE-849:
-

Release Note: Rename and categorize configuration keys into the following 
groups: cluster, jobtracker, tasktracker, job, client. Constants are defined 
for all keys in Java, and code is changed to use the constants instead of 
string literals. All old keys are deprecated except those in examples and 
tests. The change is incompatible because old config keys in examples are no 
longer supported.
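
The deprecation scheme in the release note can be sketched as a lookup table that redirects old keys to new ones. This is a minimal illustration, not the code from the patch: the Conf class, the DEPRECATED_KEYS table and the two key pairs are assumptions chosen for the example, and the authoritative list of renames is the one attached to the issue.

```python
import warnings

# Illustrative old -> new mapping; NOT taken from the MAPREDUCE-849 patch.
DEPRECATED_KEYS = {
    "mapred.job.tracker": "mapreduce.jobtracker.address",
    "mapred.reduce.tasks": "mapreduce.job.reduces",
}

# Named constants, so callers do not repeat string literals.
JT_ADDRESS = "mapreduce.jobtracker.address"
NUM_REDUCES = "mapreduce.job.reduces"


class Conf:
    """Hypothetical configuration store that honours deprecated keys."""

    def __init__(self):
        self._props = {}

    def _resolve(self, key):
        # Redirect a deprecated key to its replacement and warn the caller.
        new_key = DEPRECATED_KEYS.get(key)
        if new_key is None:
            return key
        warnings.warn(f"{key} is deprecated; use {new_key}",
                      DeprecationWarning, stacklevel=3)
        return new_key

    def set(self, key, value):
        self._props[self._resolve(key)] = value

    def get(self, key, default=None):
        return self._props.get(self._resolve(key), default)
```

With such a table, a value written under an old key is readable under the new one and vice versa. Keys left out of the table, as the release note says is the case for examples, simply stop being recognised under their old names, which is the incompatibility being flagged.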

> Renaming of configuration property names in mapreduce
> -
>
> Key: MAPREDUCE-849
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-849
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: Config changes.xls, Config changes.xls, patch-849-1.txt, 
> patch-849-2.txt, patch-849-3.txt, patch-849.txt
>
>
> In line with HDFS-531, property names in configuration files should be 
> standardized in MAPREDUCE. 




[jira] Commented: (MAPREDUCE-849) Renaming of configuration property names in mapreduce

2009-09-18 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757617#action_12757617
 ] 

Sharad Agarwal commented on MAPREDUCE-849:
--

bq. Why is this incompatible.
We deprecated all old keys except those in examples and tests. I assume we 
don't need to bother about tests. We marked the change incompatible because 
the old keys in examples are not deprecated. 

> Renaming of configuration property names in mapreduce
> -
>
> Key: MAPREDUCE-849
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-849
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: Config changes.xls, Config changes.xls, patch-849-1.txt, 
> patch-849-2.txt, patch-849-3.txt, patch-849.txt
>
>
> In line with HDFS-531, property names in configuration files should be 
> standardized in MAPREDUCE. 




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757584#action_12757584
 ] 

Hudson commented on MAPREDUCE-980:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
Modify JobHistory to use Avro for serialization.


> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events in JSON format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-1011) Git and Subversion ignore of build.properties

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757585#action_12757585
 ] 

Hudson commented on MAPREDUCE-1011:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
Add build.properties to svn and git ignore. (omalley)


> Git and Subversion ignore of build.properties
> -
>
> Key: MAPREDUCE-1011
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1011
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-1011.patch
>
>
> Currently ant test-patch can't use build.properties, because it counts as a 
> non-pristine directory. I'll add build.properties to the subversion 
> properties.




[jira] Commented: (MAPREDUCE-912) apache license header missing for some java files

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757580#action_12757580
 ] 

Hudson commented on MAPREDUCE-912:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
Add and standardize Apache license headers. Contributed by Chad Metcalf


> apache license header missing for some java files
> -
>
> Key: MAPREDUCE-912
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-912
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amareshwari Sriramadasu
>Assignee: Chad Metcalf
> Fix For: 0.21.0
>
> Attachments: 912.patch
>
>
> The following files do not have the Apache license header:
> src/java/org/apache/hadoop/mapred/lib/db/DBWritable.java
> src/java/org/apache/hadoop/mapreduce/Counters.java
> src/test/mapred/org/apache/hadoop/mapred/lib/db/TestConstructQuery.java
> src/examples/org/apache/hadoop/examples/WordCount.java




[jira] Commented: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757583#action_12757583
 ] 

Hudson commented on MAPREDUCE-954:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
Change Map-Reduce context objects to be interfaces. (acmurthy)


> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.




[jira] Commented: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757581#action_12757581
 ] 

Hudson commented on MAPREDUCE-639:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
Change Terasort example to reflect the 2009 updates. (omalley)


> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.




[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757582#action_12757582
 ] 

Hudson commented on MAPREDUCE-679:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/])
XML-based metrics as JSP servlet for JobTracker. Contributed by Aaron 
Kimball.


> XML-based metrics as JSP servlet for JobTracker
> ---
>
> Key: MAPREDUCE-679
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-679
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Fix For: 0.21.0
>
> Attachments: example-jobtracker-completed-job.xml, 
> example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, 
> MAPREDUCE-679.3.patch, MAPREDUCE-679.4.patch, MAPREDUCE-679.5.patch, 
> MAPREDUCE-679.6.patch, MAPREDUCE-679.7.patch, MAPREDUCE-679.patch
>
>
> In HADOOP-4559, a general REST API for reporting metrics was proposed but 
> work seems to have stalled. In the interim, we have a simple XML translation 
> of the existing JobTracker status page which provides the same metrics 
> (including the tables of running/completed/failed jobs) as the human-readable 
> page. This is a relatively lightweight addition to provide some 
> machine-understandable metrics reporting.




[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757562#action_12757562
 ] 

Hadoop QA commented on MAPREDUCE-972:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12419318/MAPREDUCE-972.3.patch
  against trunk revision 816782.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/55/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/55/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/55/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/55/console

This message is automatically generated.

> distcp can timeout during rename operation to s3
> 
>
> Key: MAPREDUCE-972
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, 
> MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can 
> perform very slowly, which may cause task timeout.
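
One common way to keep a task from timing out during a long blocking call is to run the call on a helper thread and report progress while waiting. The sketch below illustrates that general pattern only; it is not taken from the attached patches. `ProgressReporter`, `KeepAlive`, and `callWithProgress` are hypothetical names introduced for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a task's "I am still alive" callback. */
interface ProgressReporter {
  void progress();
}

class KeepAlive {
  /**
   * Runs a slow blocking operation (for example a copy + delete rename)
   * on a helper thread and calls reporter.progress() once per interval
   * until the operation finishes. Exceptions from the operation propagate.
   */
  static <T> T callWithProgress(Callable<T> slowOp, ProgressReporter reporter,
                                long intervalMs) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<T> result = pool.submit(slowOp);
      while (true) {
        try {
          // Wait at most one interval for the operation to complete.
          return result.get(intervalMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException stillRunning) {
          // Not done yet: signal that the task is alive, then wait again.
          reporter.progress();
        }
      }
    } finally {
      pool.shutdownNow();
    }
  }
}
```

In a real task the reporter would presumably forward to whatever progress callback the framework provides; that wiring is left out here.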

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757559#action_12757559
 ] 

Hadoop QA commented on MAPREDUCE-980:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420063/MAPREDUCE-980.patch
  against trunk revision 816782.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/console

This message is automatically generated.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-639:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this.

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757555#action_12757555
 ] 

Hadoop QA commented on MAPREDUCE-639:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420080/m-639.patch
  against trunk revision 816782.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 2329 javac compiler warnings (more 
than the trunk's current 2289 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/2/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/2/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/2/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/2/console

This message is automatically generated.

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-912) apache license header missing for some java files

2009-09-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-912:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1

I committed this. Thanks, Chad!

> apache license header missing for some java files
> -
>
> Key: MAPREDUCE-912
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-912
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amareshwari Sriramadasu
>Assignee: Chad Metcalf
> Fix For: 0.21.0
>
> Attachments: 912.patch
>
>
> The following files do not have the Apache license header:
> src/java/org/apache/hadoop/mapred/lib/db/DBWritable.java
> src/java/org/apache/hadoop/mapreduce/Counters.java
> src/test/mapred/org/apache/hadoop/mapred/lib/db/TestConstructQuery.java
> src/examples/org/apache/hadoop/examples/WordCount.java

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-639:


Attachment: m-639.patch

Added a few findbugs suppressions and a fix.

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757540#action_12757540
 ] 

Arun C Murthy commented on MAPREDUCE-639:
-

+1

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-954.
-

  Resolution: Fixed
Release Note: Changed Map-Reduce context objects to be interfaces.

I just committed this.

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-954:


Attachment: MAPREDUCE-954.patch

Fixed javadoc warnings:

{noformat}
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 42 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
{noformat}

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757525#action_12757525
 ] 

Arun C Murthy commented on MAPREDUCE-954:
-

bq.  I'm not sure that WrappedMapper and WrappedReducer belong in a "lib" 
package, since the classes in "lib" are user-facing, and these are framework 
classes. (People might see them and wonder how they can use a WrappedMapper in 
their application, for example.) They would be better in the task package I 
think.

Owen & I went back and forth on this - finally Owen convinced me it's better to 
put it in lib.

bq. Can we move org.apache.hadoop.mapreduce.task to 
org.apache.hadoop.mapreduce.server.task to better emphasise that this is 
non-user code. This reflects the packaging of HDFS more, where things that run 
on the cluster are under a "server" package. We should have another JIRA to 
move org.apache.hadoop.mapreduce.task.reduce to 
org.apache.hadoop.mapreduce.server.task.reduce.

We already have the new shuffle in org.apache.hadoop.mapreduce.task.reduce, 
thus  org.apache.hadoop.mapreduce.task seemed logical.

bq. It's a shame JobContextImpl is public and in org.apache.hadoop.mapreduce 
since users shouldn't be exposed to it. Can we move it to another package?

Done.

bq. Since Job extends JobContextImpl you don't need the changes that change the 
*_ATTR constants (e.g. OUTPUT_FORMAT_CLASS_ATTR) to JobContextImpl.*_ATTR - 
they can be referred to directly.

Done.

* What's the compatibility story for previous releases? Would a 0.20 MR 
program written to the new ("mapreduce" package) API work with the new 
interfaces unchanged? What about a 0.20 program using the old MR API - will it 
continue to work with the old MR API with these changes?

Yes to both.
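
To illustrate the design point being discussed (user code sees an interface, the framework keeps the implementation elsewhere, and wrappers become straightforward), here is a minimal sketch. All names in it (`SketchContext`, `SketchContextImpl`, `CountingContext`, `SketchMapper`) are hypothetical and are not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** User-facing contract: only what application code needs to see. */
interface SketchContext {
  String getJobName();
  void write(String key, String value);
}

/** Framework-side implementation; it could live in a non-user package. */
class SketchContextImpl implements SketchContext {
  private final String jobName;
  private final List<String> records = new ArrayList<String>();

  SketchContextImpl(String jobName) { this.jobName = jobName; }

  public String getJobName() { return jobName; }

  public void write(String key, String value) { records.add(key + "=" + value); }

  /** Framework-only accessor; deliberately absent from the interface. */
  List<String> getRecords() { return records; }
}

/** A wrapper is easy to write because user code depends only on the interface. */
class CountingContext implements SketchContext {
  private final SketchContext delegate;
  private int writes = 0;

  CountingContext(SketchContext delegate) { this.delegate = delegate; }

  public String getJobName() { return delegate.getJobName(); }

  public void write(String key, String value) {
    writes++;
    delegate.write(key, value);
  }

  int getWrites() { return writes; }
}

/** Application code: it sees SketchContext and nothing else. */
class SketchMapper {
  static void map(String line, SketchContext context) {
    for (String word : line.split(" ")) {
      context.write(word, "1");
    }
  }
}
```

Because `SketchMapper` never names the implementation class, implementation details such as `getRecords()` do not leak into the public surface.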

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1010) Adding tests for changes in archives.

2009-09-18 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1010:
-

Status: Patch Available  (was: Open)

> Adding tests for changes in archives.
> -
>
> Key: MAPREDUCE-1010
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1010
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.20.1
>Reporter: Mahadev konar
>Assignee: Mahadev konar
>Priority: Minor
> Fix For: 0.20.2
>
> Attachments: MAPREDUCE-1010.patch
>
>
> Created this jira so that the tests can be added for HADOOP-6047. The test 
> cases for hadoop archives are in mapreduce.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-954:


Status: Open  (was: Patch Available)

Uh, it's javadoc warnings - fixing it.

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1010) Adding tests for changes in archives.

2009-09-18 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1010:
-

Attachment: MAPREDUCE-1010.patch

This patch adds tests for HADOOP-6097. 


> Adding tests for changes in archives.
> -
>
> Key: MAPREDUCE-1010
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1010
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.20.1
>Reporter: Mahadev konar
>Assignee: Mahadev konar
>Priority: Minor
> Fix For: 0.20.2
>
> Attachments: MAPREDUCE-1010.patch
>
>
> Created this jira so that the tests can be added for HADOOP-6047. The test 
> cases for hadoop archives are in mapreduce.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757514#action_12757514
 ] 

Owen O'Malley commented on MAPREDUCE-954:
-

+1 except for the JavaDoc warning.

We should also only include org.apache.hadoop.mapreduce.task's JavaDoc in the 
javadoc-dev target, but we can do that in a followup.

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-661) distcp doesn't ignore read failures with -i in setup() before MapReduce job is started

2009-09-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-661:


Status: Open  (was: Patch Available)

The current patch causes TestCopyFiles to fail

> distcp doesn't ignore read failures with -i in setup() before MapReduce job 
> is started
> --
>
> Key: MAPREDUCE-661
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-661
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.21.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: d_ignore_read_failures.patch, 
> d_ignore_read_failures661.patch, d_ignore_read_failures661.v1.patch, 
> MR661.patch
>
>
> After HADOOP-3873, file checksums are checked in setup() before the actual 
> MapReduce job is started. When getFileChecksum() fails with a 
> SocketTimeoutException when called from setup(), distcp fails even though the 
> -i option is specified by the user. Similar to how map tasks ignore read 
> failures, setup() should also ignore them and continue processing the 
> remaining files.
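
The requested behaviour can be sketched as a loop that skips unreadable files only when the ignore flag is set. This is an illustration under assumptions, not the code in the attached patches: `ChecksumSource`, `SetupSketch`, and `collectReadable` are hypothetical names, and the checksum lookup is reduced to a single method.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the per-file checksum lookup done in setup(). */
interface ChecksumSource {
  String checksumOf(String path) throws IOException;
}

class SetupSketch {
  /**
   * Returns the paths whose checksum could be read. A read failure is
   * skipped only when ignoreReadFailures is true (the -i case);
   * otherwise the first failure is rethrown and setup aborts.
   */
  static List<String> collectReadable(List<String> paths, ChecksumSource source,
                                      boolean ignoreReadFailures) throws IOException {
    List<String> readable = new ArrayList<String>();
    for (String path : paths) {
      try {
        source.checksumOf(path);
        readable.add(path);
      } catch (IOException e) {
        // SocketTimeoutException is an IOException, so it lands here too.
        if (!ignoreReadFailures) {
          throw e;
        }
        System.err.println("Skipping " + path + ": " + e.getMessage());
      }
    }
    return readable;
  }
}
```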

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-1011) Git and Subversion ignore of build.properties

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved MAPREDUCE-1011.
--

   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]

The trivial change is committed.

> Git and Subversion ignore of build.properties
> -
>
> Key: MAPREDUCE-1011
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1011
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-1011.patch
>
>
> Currently ant test-patch can't use build.properties, because it counts as a 
> non-pristine directory. I'll add build.properties to the subversion 
> properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-954:


Status: Patch Available  (was: Open)

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-954:


Attachment: MAPREDUCE-954.patch

Final patch, passes all tests and output of test-patch is:

{noformat}
 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 42 new or 
modified tests.
 [exec]
 [exec] -1 javadoc.  The javadoc tool appears to have generated 1 
warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
{noformat}

The javac warning is due to the fact that this patch introduces some new 
deprecated classes in org.apache.hadoop.mapred.

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch, MAPREDUCE-954.patch, MAPREDUCE-954.patch, 
> MAPREDUCE-954.patch
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects as 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1011) Git and Subversion ignore of build.properties

2009-09-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757501#action_12757501
 ] 

Chris Douglas commented on MAPREDUCE-1011:
--

+1

> Git and Subversion ignore of build.properties
> -
>
> Key: MAPREDUCE-1011
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1011
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: m-1011.patch
>
>
> Currently ant test-patch can't use build.properties, because it counts as a 
> non-pristine directory. I'll add build.properties to the subversion 
> properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1011) Git and Subversion ignore of build.properties

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-1011:
-

Attachment: m-1011.patch

> Git and Subversion ignore of build.properties
> -
>
> Key: MAPREDUCE-1011
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1011
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: m-1011.patch
>
>
> Currently ant test-patch can't use build.properties, because it counts as a 
> non-pristine directory. I'll add build.properties to the subversion 
> properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-1011) Git and Subversion ignore of build.properties

2009-09-18 Thread Owen O'Malley (JIRA)
Git and Subversion ignore of build.properties
-

 Key: MAPREDUCE-1011
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1011
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: build
Affects Versions: 0.21.0
Reporter: Owen O'Malley
Assignee: Owen O'Malley


Currently ant test-patch can't use build.properties, because it counts as a 
non-pristine directory. I'll add build.properties to the subversion properties.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-932) Rumen needs a job trace sorter

2009-09-18 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-932:


Status: Open  (was: Patch Available)

> Rumen needs a job trace sorter
> --
>
> Key: MAPREDUCE-932
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Dick King
>Assignee: Dick King
> Attachments: MAPREDUCE-932--2009-09-18-PM.patch, 
> MAPREDUCE-932--2009-09-18.patch, patch-932--2009-08-31--1702.patch
>
>
> Rumen reads job history logs and produces job traces.  The jobs in a job 
> trace do not occur in any promised order.  Certain tools need the jobs to be 
> ordered by job submission time.  We should include, in Rumen, a tool to sort 
> job traces.
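
As a rough illustration of what such a sorter does, the sketch below orders job records by submission time. It is an in-memory toy, not the tool in the attached patches: `TracedJob` and `TraceSorter` are hypothetical names, and a real trace may be too large to sort in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, minimal view of one job in a trace. */
class TracedJob {
  final String jobId;
  final long submitTimeMs;

  TracedJob(String jobId, long submitTimeMs) {
    this.jobId = jobId;
    this.submitTimeMs = submitTimeMs;
  }
}

class TraceSorter {
  /** Returns a new list ordered by submission time; the input is not modified. */
  static List<TracedJob> sortBySubmitTime(List<TracedJob> jobs) {
    List<TracedJob> sorted = new ArrayList<TracedJob>(jobs);
    // Collections.sort is stable, so jobs with equal submit times
    // keep their original relative order.
    Collections.sort(sorted, new Comparator<TracedJob>() {
      public int compare(TracedJob a, TracedJob b) {
        return Long.compare(a.submitTimeMs, b.submitTimeMs);
      }
    });
    return sorted;
  }
}
```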

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-932) Rumen needs a job trace sorter

2009-09-18 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-932:


Status: Patch Available  (was: Open)

> Rumen needs a job trace sorter
> --
>
> Key: MAPREDUCE-932
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Dick King
>Assignee: Dick King
> Attachments: MAPREDUCE-932--2009-09-18-PM.patch, 
> MAPREDUCE-932--2009-09-18.patch, patch-932--2009-08-31--1702.patch
>
>
> Rumen reads job history logs and produces job traces.  The jobs in a job 
> trace do not occur in any promised order.  Certain tools need the jobs to be 
> ordered by job submission time.  We should include, in Rumen, a tool to sort 
> job traces.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-932) Rumen needs a job trace sorter

2009-09-18 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-932:


Attachment: MAPREDUCE-932--2009-09-18-PM.patch

Hudson's findbugs found a file opening leak that mine didn't.  I fixed it.

> Rumen needs a job trace sorter
> --
>
> Key: MAPREDUCE-932
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Dick King
>Assignee: Dick King
> Attachments: MAPREDUCE-932--2009-09-18-PM.patch, 
> MAPREDUCE-932--2009-09-18.patch, patch-932--2009-08-31--1702.patch
>
>
> Rumen reads job history logs and produces job traces.  The jobs in a job 
> trace do not occur in any promised order.  Certain tools need the jobs to be 
> ordered by job submission time.  We should include, in Rumen, a tool to sort 
> job traces.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757492#action_12757492
 ] 

Chris Douglas commented on MAPREDUCE-728:
-

MAPREDUCE-980 broke the patch in an unrecoverable way. This cannot make feature 
freeze.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918-5.patch, mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with its own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to predict the right scheduler, or even the 
> right set of scheduler features, for a given workload. Hence a simulator 
> which can predict how well a particular scheduler works for some specific 
> workload by quickly iterating over schedulers and/or scheduler features 
> would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from a production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual job 
> turnaround time, throughput, fairness, capacity guarantee, etc). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These 
> include very large-scale clusters (thousands of nodes), the detailed 
> characteristics of the workload thrown at the clusters, job or task 
> failures, data locality, and cluster hardware (cpu, memory, disk i/o, 
> network i/o, network topology) etc.
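
The description above (a job trace plus a cluster definition, replayed in virtual time) can be illustrated with a small discrete-event loop. This is only a sketch under strong assumptions and is not Mumak's implementation: the class `MiniSchedulerSim`, the `Job` fields, the single pool of identical slots, and the FIFO policy are all hypothetical simplifications chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

public class MiniSchedulerSim {
    /** Hypothetical trace entry: a job made of identical tasks. */
    public static final class Job {
        public final String id;
        public final long submitTime;   // virtual time of submission
        public final long taskDuration; // virtual run time of each task
        int unscheduled;                // tasks not yet given a slot
        int running;                    // tasks currently holding a slot
        public long finishTime = -1;    // set when the last task completes

        public Job(String id, long submitTime, int taskCount, long taskDuration) {
            if (taskCount <= 0) throw new IllegalArgumentException("taskCount must be > 0");
            this.id = id;
            this.submitTime = submitTime;
            this.taskDuration = taskDuration;
            this.unscheduled = taskCount;
        }
    }

    /** One simulation event: either a job arrival or a task completion. */
    private static final class Event {
        final long time; final boolean arrival; final Job job; final long seq;
        Event(long time, boolean arrival, Job job, long seq) {
            this.time = time; this.arrival = arrival; this.job = job; this.seq = seq;
        }
    }

    /** Replays the trace on a virtual cluster with the given number of slots. */
    public static void simulate(List<Job> trace, int slots) {
        // Events are ordered by virtual time; seq breaks ties deterministically.
        PriorityQueue<Event> events = new PriorityQueue<>(
            Comparator.comparingLong((Event e) -> e.time).thenComparingLong(e -> e.seq));
        long seq = 0;
        for (Job j : trace) events.add(new Event(j.submitTime, true, j, seq++));

        Deque<Job> fifo = new ArrayDeque<>(); // jobs that still have unscheduled tasks
        int free = slots;
        while (!events.isEmpty()) {
            Event e = events.poll();
            long now = e.time; // the virtual clock jumps straight to the next event
            if (e.arrival) {
                fifo.add(e.job);
            } else {
                free++;
                e.job.running--;
                if (e.job.unscheduled == 0 && e.job.running == 0) e.job.finishTime = now;
            }
            // Scheduling policy under test (here: plain FIFO over free slots).
            while (free > 0 && !fifo.isEmpty()) {
                Job head = fifo.peek();
                head.unscheduled--;
                head.running++;
                free--;
                events.add(new Event(now + head.taskDuration, false, head, seq++));
                if (head.unscheduled == 0) fifo.poll();
            }
        }
    }
}
```

Swapping the inner `while` loop for a different policy is the point of such a harness: the same trace can be replayed against several schedulers and the resulting `finishTime` values compared. Failures, data locality, and hardware modelling are deliberately left out of this sketch.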




[jira] Updated: (MAPREDUCE-1010) Adding tests for changes in archives.

2009-09-18 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated MAPREDUCE-1010:
-

  Component/s: harchive
 Priority: Minor  (was: Major)
Affects Version/s: 0.20.1

> Adding tests for changes in archives.
> -
>
> Key: MAPREDUCE-1010
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1010
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Affects Versions: 0.20.1
>Reporter: Mahadev konar
>Assignee: Mahadev konar
>Priority: Minor
> Fix For: 0.20.2
>
>
> Created this jira so that the tests can be added for HADOOP-6047. The test 
> cases for hadoop archives are in mapreduce.




[jira] Created: (MAPREDUCE-1010) Adding tests for changes in archives.

2009-09-18 Thread Mahadev konar (JIRA)
Adding tests for changes in archives.
-

 Key: MAPREDUCE-1010
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1010
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Mahadev konar
 Fix For: 0.20.2


Created this jira so that the tests can be added for HADOOP-6047. The test 
cases for hadoop archives are in mapreduce.




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Open  (was: Patch Available)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918-5.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-5.patch

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918-5.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: (was: mapreduce-728-20090918-4.patch)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-4.patch

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918-4.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Patch Available  (was: Open)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-4.patch

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918-4.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Doug Cutting (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Cutting updated MAPREDUCE-980:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-3.patch

This patch restores some changes that were lost between the 20090917-4 and 
20090918 patches.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918-3.patch, mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Open  (was: Patch Available)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>



[jira] Updated: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker

2009-09-18 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-679:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

+1 

I've just committed this. Thanks Aaron!

> XML-based metrics as JSP servlet for JobTracker
> ---
>
> Key: MAPREDUCE-679
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-679
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Fix For: 0.21.0
>
> Attachments: example-jobtracker-completed-job.xml, 
> example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, 
> MAPREDUCE-679.3.patch, MAPREDUCE-679.4.patch, MAPREDUCE-679.5.patch, 
> MAPREDUCE-679.6.patch, MAPREDUCE-679.7.patch, MAPREDUCE-679.patch
>
>
> In HADOOP-4559, a general REST API for reporting metrics was proposed but 
> work seems to have stalled. In the interim, we have a simple XML translation 
> of the existing JobTracker status page which provides the same metrics 
> (including the tables of running/completed/failed jobs) as the human-readable 
> page. This is a relatively lightweight addition to provide some 
> machine-understandable metrics reporting.




[jira] Commented: (MAPREDUCE-893) Provide an ability to refresh queue configuration without restart.

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757474#action_12757474
 ] 

Hudson commented on MAPREDUCE-893:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #57 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/57/]).
Provides an ability to refresh queue configuration without restarting the 
JobTracker. Contributed by Vinod Kumar Vavilapalli and Rahul Kumar Singh.


> Provide an ability to refresh queue configuration without restart.
> --
>
> Key: MAPREDUCE-893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Hemanth Yamijala
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-893-20090915.1.txt, 
> MAPREDUCE-893-20090917.2.txt, MAPREDUCE-893-20090917.4.txt, 
> MAPREDUCE-893-20090918.2.txt, MAPREDUCE-893-20090918.over.849.patch, 
> MAPREDUCE-893-20090918.txt, MAPREDUCE-893-7.patch
>
>
> While administering a cluster using multiple queues, administrators feel a 
> need to refresh queue properties on the fly without needing to restart the 
> JobTracker. This is partially supported for some properties such as queue 
> ACLs (HADOOP-5396) and state (HADOOP-5913). The idea is to extend the 
> facility to refresh other queue properties as well, including scheduler 
> properties.




[jira] Commented: (MAPREDUCE-923) Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar file name with spaces

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757476#action_12757476
 ] 

Hudson commented on MAPREDUCE-923:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #57 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/57/]).
Sqoop classpath breaks for jar files with a plus sign in their names. 
Contributed by Aaron Kimball.


> Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar 
> file name with spaces
> ---
>
> Key: MAPREDUCE-923
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-923
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/sqoop
>Affects Versions: 0.20.1
>Reporter: Kevin Weil
>Assignee: Aaron Kimball
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-923.patch
>
>
> In findThisJar, sqoop runs URLDecoder.decode on the resulting jar, which has 
> the effect of replacing any + signs in the path with a space.  This obviously 
> breaks the classpath variable that it's trying to set, and the 
> sqoop-generated code fails to compile.  Ironically, Cloudera's hadoop distro 
> is the one that puts + characters in jar files, and so exhibits the bug.  
> Here is an example from running sqoop with log4j at debug level.  Note the 
> space in the very last term, which should read hadoop-0.20.0+61-sqoop.jar 
> rather than hadoop-0.20.0 61-sqoop.jar.
> 09/08/27 18:00:07 DEBUG orm.CompilationManager: Invoking javac with args: 
> -sourcepath ./ -d /tmp/sqoop/compile/ -classpath 
> /usr/lib/hadoop-0.20/conf:/usr/java/jdk1.6.0_06/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-fairscheduler.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-scribe-log4j.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-3.8.1.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/usr/local/hadoop/lib/hadoop-gpl-compression.jar:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0
>  61-sqoop.jar
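
The behaviour described above can be reproduced in isolation. The sketch below is not the attached patch; it only contrasts java.net.URLDecoder, which applies form decoding and turns '+' into a space, with java.net.URI, which decodes percent escapes but leaves '+' alone. The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

final class JarPathDecoding {
    // Form-style decoding: '+' is treated as an encoded space.
    static String decodeAsForm(String path) {
        try {
            return URLDecoder.decode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // URI-style decoding: percent escapes are decoded, '+' stays literal.
    static String decodeAsUri(String path) {
        try {
            return new URI("file:" + path).getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String jar = "/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0+61-sqoop.jar";
        System.out.println(decodeAsForm(jar)); // path ends in "hadoop-0.20.0 61-sqoop.jar"
        System.out.println(decodeAsUri(jar));  // path is unchanged
    }
}
```

Whether the committed fix uses URI, drops the decode step, or does something else is not stated in this thread.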




[jira] Commented: (MAPREDUCE-775) Add input/output formatters for Vertica clustered ADBMS.

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757475#action_12757475
 ] 

Hudson commented on MAPREDUCE-775:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #57 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/57/]).
Add native and streaming support for Vertica as an input or output format 
taking advantage of parallel read and write properties of the DBMS. Contributed 
by Omer Trajman.


> Add input/output formatters for Vertica clustered ADBMS.
> 
>
> Key: MAPREDUCE-775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-775
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Omer Trajman
>Assignee: Omer Trajman
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-775.2.patch, MAPREDUCE-775.3.patch, 
> MAPREDUCE-775.4.patch, MAPREDUCE-775.patch
>
>
> Add native support for Vertica as an input or output format taking advantage 
> of parallel read and write properties of the DBMS.
>  
> On the input side allow for parametrized queries (a la prepared statements) 
> and create a split for each combination of parameters.  Also support 
> generating the parameter list from a SQL statement.  For example: return 
> metrics for all dimensions that meet criteria X with one input split for each 
> dimension.  Divide the read among any number of hosts in the Vertica cluster.
>  
> On the output side, support Vertica streaming load to any number of hosts in 
> the Vertica cluster.  Output may be to a different cluster than input.
>  
> Also includes Input and Output formatters that support streaming interface.
> Code has been tested and run on live systems under 19 and 20.  Patch for 21 
> with new API will be ready end of this week.  
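
One way to picture "a split for each combination of parameters" is as a cross product over the per-parameter value lists. That reading is an assumption, and the sketch below is not taken from the patch: it uses plain lists of strings in place of whatever split and query classes the Vertica formatters define.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ParameterSplitSketch {
    // Returns every combination that takes one value from each parameter list.
    // Each returned combination would back one (hypothetical) input split.
    static List<List<String>> combinations(List<List<String>> parameterValues) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> values : parameterValues) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String value : values) {
                    List<String> combo = new ArrayList<>(prefix);
                    combo.add(value);
                    next.add(combo);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> params = Arrays.asList(
            Arrays.asList("2008", "2009"),   // illustrative values for a first query parameter
            Arrays.asList("us", "eu"));      // illustrative values for a second query parameter
        for (List<String> combo : combinations(params)) {
            System.out.println(combo);       // one line per would-be split
        }
    }
}
```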




[jira] Commented: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757469#action_12757469
 ] 

Chris Douglas commented on MAPREDUCE-728:
-

{noformat}
 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 30 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] -1 release audit.  The applied patch generated 180 release 
audit warnings (more than the trunk's current 176 warnings).
{noformat}

These files need license headers:
{noformat}
src/contrib/mumak/src/java/org/apache/hadoop/mapred/SimulatorClock.java
src/contrib/mumak/src/java/org/apache/hadoop/mapred/SimulatorJobStory.java
{noformat}

The two .gz files don't need license headers, of course.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with their own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to accurately choose a single scheduler or even 
> a set of desired features to predict the right scheduler (or features) for a 
> given workload. Hence a simulator which can predict how well a particular 
> scheduler works for some specific workload by quickly iterating over 
> schedulers and/or scheduler features would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (cpu, memory, disk i/o, network i/o, network topology) 
> etc.
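
The idea of replaying a trace against "virtual simulated time" can be sketched as a small discrete-event loop. This is only an illustration of the general technique, not Mumak's design: the Event class, the submit helper and the fixed per-job runtime are assumptions, and no scheduler or cluster contention is modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class VirtualClockSketch {
    static final class Event {
        final long timeMs;       // virtual time at which the event fires
        final String jobId;
        final boolean isSubmit;  // true: job submission, false: job completion
        final long runtimeMs;    // only meaningful for submissions

        Event(long timeMs, String jobId, boolean isSubmit, long runtimeMs) {
            this.timeMs = timeMs;
            this.jobId = jobId;
            this.isSubmit = isSubmit;
            this.runtimeMs = runtimeMs;
        }
    }

    static Event submit(long timeMs, String jobId, long runtimeMs) {
        return new Event(timeMs, jobId, true, runtimeMs);
    }

    // Replays the submissions in virtual-time order. The clock jumps straight
    // to the next event, so hours of trace time cost no wall-clock waiting.
    static List<String> run(List<Event> submissions) {
        PriorityQueue<Event> queue =
            new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.timeMs));
        queue.addAll(submissions);
        List<String> log = new ArrayList<>();
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            long now = e.timeMs;
            if (e.isSubmit) {
                log.add(now + " " + e.jobId + " submit");
                // A submission schedules its own completion event.
                queue.add(new Event(now + e.runtimeMs, e.jobId, false, 0));
            } else {
                log.add(now + " " + e.jobId + " finish");
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> log = run(Arrays.asList(
            submit(2000, "job_2", 1000),
            submit(0, "job_1", 5000)));
        log.forEach(System.out::println);
    }
}
```

A real simulator would replace the fixed runtime with decisions made by the scheduler under test, which is where the factors listed above (locality, failures, topology) would enter.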




[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker

2009-09-18 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757470#action_12757470
 ] 

Aaron Kimball commented on MAPREDUCE-679:
-

unrelated test failure.

> XML-based metrics as JSP servlet for JobTracker
> ---
>
> Key: MAPREDUCE-679
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-679
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: example-jobtracker-completed-job.xml, 
> example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, 
> MAPREDUCE-679.3.patch, MAPREDUCE-679.4.patch, MAPREDUCE-679.5.patch, 
> MAPREDUCE-679.6.patch, MAPREDUCE-679.7.patch, MAPREDUCE-679.patch
>
>
> In HADOOP-4559, a general REST API for reporting metrics was proposed but 
> work seems to have stalled. In the interim, we have a simple XML translation 
> of the existing JobTracker status page which provides the same metrics 
> (including the tables of running/completed/failed jobs) as the human-readable 
> page. This is a relatively lightweight addition to provide some 
> machine-understandable metrics reporting.




[jira] Updated: (MAPREDUCE-999) Improve Sqoop test speed and refactor tests

2009-09-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-999:


Status: Patch Available  (was: Open)

> Improve Sqoop test speed and refactor tests
> ---
>
> Key: MAPREDUCE-999
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-999
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-999.2.patch, MAPREDUCE-999.patch
>
>
> Sqoop's tests take a long time to run, but this can be improved (by a factor 
> of 2 or more) by taking advantage of {{jobclient.completion.poll.interval}}.
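
Why a shorter poll interval speeds up tests that run many short jobs can be seen from a small calculation. The sketch below is a model, not Sqoop or JobClient code, and the 5000 ms and 100 ms intervals are illustrative assumptions: a client that only checks for completion every pollMs cannot notice a finished job before its next check.

```java
final class PollIntervalSketch {
    // Time at which a client polling every pollMs (first check at pollMs)
    // first observes a job that actually finished at finishMs.
    static long observedCompletionMs(long finishMs, long pollMs) {
        long polls = Math.max(1, (finishMs + pollMs - 1) / pollMs); // ceiling division
        return polls * pollMs;
    }

    public static void main(String[] args) {
        long finishMs = 1200; // a short local test job
        System.out.println(observedCompletionMs(finishMs, 5000)); // 5000
        System.out.println(observedCompletionMs(finishMs, 100));  // 1200
    }
}
```

In this model a 1.2 second job costs 5 seconds per test at the longer interval, so a suite made of many such jobs is dominated by polling delay rather than by work.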




[jira] Updated: (MAPREDUCE-999) Improve Sqoop test speed and refactor tests

2009-09-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-999:


Attachment: MAPREDUCE-999.2.patch

Updated patch sync'd to trunk

> Improve Sqoop test speed and refactor tests
> ---
>
> Key: MAPREDUCE-999
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-999
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-999.2.patch, MAPREDUCE-999.patch
>
>
> Sqoop's tests take a long time to run, but this can be improved (by a factor 
> of 2 or more) by taking advantage of {{jobclient.completion.poll.interval}}.




[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757464#action_12757464
 ] 

Hadoop QA commented on MAPREDUCE-679:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420056/MAPREDUCE-679.7.patch
  against trunk revision 816735.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/115/console

This message is automatically generated.

> XML-based metrics as JSP servlet for JobTracker
> ---
>
> Key: MAPREDUCE-679
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-679
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: jobtracker
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: example-jobtracker-completed-job.xml, 
> example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, 
> MAPREDUCE-679.3.patch, MAPREDUCE-679.4.patch, MAPREDUCE-679.5.patch, 
> MAPREDUCE-679.6.patch, MAPREDUCE-679.7.patch, MAPREDUCE-679.patch
>
>
> In HADOOP-4559, a general REST API for reporting metrics was proposed but 
> work seems to have stalled. In the interim, we have a simple XML translation 
> of the existing JobTracker status page which provides the same metrics 
> (including the tables of running/completed/failed jobs) as the human-readable 
> page. This is a relatively lightweight addition to provide some 
> machine-understandable metrics reporting.




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757466#action_12757466
 ] 

Owen O'Malley commented on MAPREDUCE-980:
-

This looks like a good change. I love it when we get to rip out code.

+1

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using JSON format.  This can 
> be modified to use Avro instead. 




[jira] Updated: (MAPREDUCE-999) Improve Sqoop test speed and refactor tests

2009-09-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-999:


Status: Open  (was: Patch Available)

> Improve Sqoop test speed and refactor tests
> ---
>
> Key: MAPREDUCE-999
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-999
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-999.2.patch, MAPREDUCE-999.patch
>
>
> Sqoop's tests take a long time to run, but this can be improved (by a factor 
> of 2 or more) by taking advantage of {{jobclient.completion.poll.interval}}.




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Patch Available  (was: Open)

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with their own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to accurately choose a single scheduler or even 
> a set of desired features to predict the right scheduler (or features) for a 
> given workload. Hence a simulator which can predict how well a particular 
> scheduler works for some specific workload by quickly iterating over 
> schedulers and/or scheduler features would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (cpu, memory, disk i/o, network i/o, network topology) 
> etc.




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918-2.patch

New patch that resolves a conflict in src/contrib/build-contrib.xml.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918-2.patch, 
> mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with their own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to accurately choose a single scheduler or even 
> a set of desired features to predict the right scheduler (or features) for a 
> given workload. Hence a simulator which can predict how well a particular 
> scheduler works for some specific workload by quickly iterating over 
> schedulers and/or scheduler features would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (cpu, memory, disk i/o, network i/o, network topology) 
> etc.




[jira] Commented: (MAPREDUCE-932) Rumen needs a job trace sorter

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757455#action_12757455
 ] 

Hadoop QA commented on MAPREDUCE-932:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12420051/MAPREDUCE-932--2009-09-18.patch
  against trunk revision 816735.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

-1 release audit.  The applied patch generated 181 release audit warnings 
(more than the trunk's current 179 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/1/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/1/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/1/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/1/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/1/console

This message is automatically generated.

> Rumen needs a job trace sorter
> --
>
> Key: MAPREDUCE-932
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-932
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Dick King
>Assignee: Dick King
> Attachments: MAPREDUCE-932--2009-09-18.patch, 
> patch-932--2009-08-31--1702.patch
>
>
> Rumen reads job history logs and produces job traces.  The jobs in a job 
> trace do not occur in any promised order.  Certain tools need the jobs to be 
> ordered by job submission time.  We should include, in Rumen, a tool to sort 
> job traces.
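
The ordering requirement itself is simple to state in code. The sketch below is not Rumen's sorter: the Job class and its fields are hypothetical stand-ins for a trace record, and it sorts in memory, which a tool working on large traces may not be able to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TraceSortSketch {
    /** Hypothetical trace record: only the fields needed for ordering. */
    static final class Job {
        final String id;
        final long submitTimeMs;

        Job(String id, long submitTimeMs) {
            this.id = id;
            this.submitTimeMs = submitTimeMs;
        }
    }

    // Returns a new list ordered by submission time; ties are broken by job id
    // so the output is deterministic. The input list is left untouched.
    static List<Job> sortBySubmitTime(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingLong((Job j) -> j.submitTimeMs)
                              .thenComparing(j -> j.id));
        return sorted;
    }

    public static void main(String[] args) {
        List<Job> trace = Arrays.asList(
            new Job("job_0003", 9000),
            new Job("job_0001", 1000),
            new Job("job_0002", 1000));
        for (Job j : sortBySubmitTime(trace)) {
            System.out.println(j.submitTimeMs + " " + j.id);
        }
    }
}
```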




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757456#action_12757456
 ] 

Hadoop QA commented on MAPREDUCE-980:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420061/MAPREDUCE-980.patch
  against trunk revision 816735.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/console

This message is automatically generated.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using JSON format.  This can 
> be modified to use Avro instead. 




[jira] Updated: (MAPREDUCE-972) distcp can timeout during rename operation to s3

2009-09-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-972:


Status: Patch Available  (was: Open)

Attempting to cycle patch again..

> distcp can timeout during rename operation to s3
> 
>
> Key: MAPREDUCE-972
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, 
> MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can 
> perform very slowly, which may cause task timeout.
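
A common way to keep a task from timing out during one long blocking call is to signal progress from a background timer while the call runs. The thread does not say whether the attached patches take this approach, so the sketch below is only an assumption about the general technique; ProgressSink is a hypothetical stand-in for the task's progress callback, and the slow rename is simulated with a sleep.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class KeepAliveSketch {
    /** Hypothetical stand-in for a task's "I am still alive" callback. */
    interface ProgressSink {
        void progress();
    }

    // Runs slowOperation on the calling thread while a timer thread pings
    // the sink every intervalMs. The timer is always stopped afterwards.
    static void runWithHeartbeat(Runnable slowOperation, ProgressSink sink, long intervalMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(sink::progress, 0, intervalMs, TimeUnit.MILLISECONDS);
        try {
            slowOperation.run();
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        AtomicInteger pings = new AtomicInteger();
        Runnable slowRename = () -> {
            try {
                Thread.sleep(300); // simulated copy + delete
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        runWithHeartbeat(slowRename, pings::incrementAndGet, 50);
        System.out.println("progress signalled " + pings.get() + " times");
    }
}
```

The heartbeat hides a genuinely hung operation as well as a slow one, so a real implementation would probably want an upper bound on how long it keeps pinging.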




[jira] Updated: (MAPREDUCE-972) distcp can timeout during rename operation to s3

2009-09-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-972:


Status: Open  (was: Patch Available)

> distcp can timeout during rename operation to s3
> 
>
> Key: MAPREDUCE-972
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, 
> MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can 
> perform very slowly, which may cause task timeout.




[jira] Commented: (MAPREDUCE-972) distcp can timeout during rename operation to s3

2009-09-18 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757448#action_12757448
 ] 

Aaron Kimball commented on MAPREDUCE-972:
-

That Hadoop QA output seems a bit confused; it's a failing build applying 
MAPREDUCE-973. Can anyone explain what's up there?

> distcp can timeout during rename operation to s3
> 
>
> Key: MAPREDUCE-972
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-972
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-972.2.patch, MAPREDUCE-972.3.patch, 
> MAPREDUCE-972.patch
>
>
> rename() in S3 is implemented as copy + delete. The S3 copy operation can 
> perform very slowly, which may cause task timeout.




[jira] Updated: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-639:


Attachment: m-639.patch

Updated to reflect the moving trunk...

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.




[jira] Commented: (MAPREDUCE-277) Job history counters should be available on the UI.

2009-09-18 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757433#action_12757433
 ] 

Nigel Daley commented on MAPREDUCE-277:
---

No *automated* tests for this?

> Job history counters should be available on the UI.
> -
>
> Key: MAPREDUCE-277
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-277
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Affects Versions: 0.20.1
>Reporter: Amareshwari Sriramadasu
>Assignee: Jothi Padmanabhan
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: HADOOP-3200-20080915.1.txt, mapred-277-v1.2.patch, 
> mapred-277-v1.4.patch, mapred-277-v2.patch, mapred-277.patch
>
>
> Job history is logging counters. But they are not visible on the UI. 
> Job history parser and UI should be modified to view counters.




[jira] Created: (MAPREDUCE-1009) Forrest documentation needs to be updated to describe features provided for supporting hierarchical queues

2009-09-18 Thread Hemanth Yamijala (JIRA)
Forrest documentation needs to be updated to describe features provided for 
supporting hierarchical queues
---

 Key: MAPREDUCE-1009
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1009
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.21.0
Reporter: Hemanth Yamijala
Priority: Blocker
 Fix For: 0.21.0


Forrest documentation must be updated to describe how to set up and use 
hierarchical queues in the framework and the capacity scheduler.




[jira] Updated: (MAPREDUCE-893) Provide an ability to refresh queue configuration without restart.

2009-09-18 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-893:
---

  Resolution: Fixed
Release Note: 
Extended the framework to refresh queue properties to support refresh of 
scheduler properties.
Implemented the refresh operation for capacity scheduler properties like queue 
capacities.
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Vinod and Rahul!

> Provide an ability to refresh queue configuration without restart.
> --
>
> Key: MAPREDUCE-893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Hemanth Yamijala
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-893-20090915.1.txt, 
> MAPREDUCE-893-20090917.2.txt, MAPREDUCE-893-20090917.4.txt, 
> MAPREDUCE-893-20090918.2.txt, MAPREDUCE-893-20090918.over.849.patch, 
> MAPREDUCE-893-20090918.txt, MAPREDUCE-893-7.patch
>
>
> While administering a cluster using multiple queues, administrators feel a 
> need to refresh queue properties on the fly without needing to restart the 
> JobTracker. This is partially supported for some properties such as queue 
> ACLs (HADOOP-5396) and state (HADOOP-5913). The idea is to extend the 
> facility to refresh other queue properties as well, including scheduler 
> properties.
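
Refreshing properties without a restart can be pictured as swapping one immutable snapshot for another once the new values pass validation. The following is a minimal sketch of that idea only; RefreshableQueueConfig and its methods are hypothetical names, and the rule that capacities are percentages summing to at most 100 is an assumption made for the example, not something stated in this issue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration; not the JobTracker's or the capacity scheduler's code.
final class RefreshableQueueConfig {
    // Readers always see one complete, immutable snapshot of the capacities.
    private final AtomicReference<Map<String, Integer>> capacities;

    RefreshableQueueConfig(Map<String, Integer> initial) {
        capacities = new AtomicReference<>(validatedCopy(initial));
    }

    /** Returns the configured capacity (percent) of the given queue. */
    int capacityOf(String queue) {
        Integer c = capacities.get().get(queue);
        if (c == null) {
            throw new IllegalArgumentException("unknown queue: " + queue);
        }
        return c;
    }

    /** Validates the reloaded values, then swaps them in atomically. */
    void refresh(Map<String, Integer> reloaded) {
        capacities.set(validatedCopy(reloaded));
    }

    // Assumed rule for this sketch: non-negative percentages summing to at most 100.
    private static Map<String, Integer> validatedCopy(Map<String, Integer> m) {
        int total = 0;
        for (int c : m.values()) {
            if (c < 0) {
                throw new IllegalArgumentException("negative capacity: " + c);
            }
            total += c;
        }
        if (total > 100) {
            throw new IllegalArgumentException("capacities exceed 100%: " + total);
        }
        return Collections.unmodifiableMap(new HashMap<>(m));
    }
}
```

Because validation happens before the swap, a bad reload leaves the previous configuration in effect.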




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Attachment: mapreduce-728-20090918.patch

Patch that fixes problems caused by API changes in MAPREDUCE-777.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with their own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to accurately choose a single scheduler or even 
> a set of desired features to predict the right scheduler (or features) for a 
> given workload. Hence a simulator which can predict how well a particular 
> scheduler works for some specific workload by quickly iterating over 
> schedulers and/or scheduler features would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from a production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc.). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (cpu, memory, disk i/o, network i/o, network topology) 
> etc.
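
The core of such a simulator is an event queue ordered by virtual time, with the clock jumping from one event to the next instead of waiting in real time. The toy sketch below replays a tiny trace on a fixed number of slots under FIFO order. It only illustrates the idea and is not Mumak's design; ToySchedulerSim, Task and simulate are hypothetical names, and the model ignores locality, failures and heartbeats.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical toy simulator; names are illustrative, not Mumak's.
final class ToySchedulerSim {
    /** A task from the trace: when it is submitted and how long it runs. */
    static final class Task {
        final String id;
        final long submitTime;
        final long runtime;

        Task(String id, long submitTime, long runtime) {
            this.id = id;
            this.submitTime = submitTime;
            this.runtime = runtime;
        }
    }

    /** A submit or finish event at a point in virtual time. */
    private static final class Event {
        final long time;
        final long seq;       // tie-breaker so equal-time events keep insertion order
        final boolean finish;
        final Task task;

        Event(long time, long seq, boolean finish, Task task) {
            this.time = time;
            this.seq = seq;
            this.finish = finish;
            this.task = task;
        }
    }

    /** Replays the trace on the given number of slots; returns "id@finishTime" entries. */
    static List<String> simulate(List<Task> trace, int slots) {
        PriorityQueue<Event> events = new PriorityQueue<>(
            Comparator.comparingLong((Event e) -> e.time).thenComparingLong(e -> e.seq));
        long seq = 0;
        for (Task t : trace) {
            events.add(new Event(t.submitTime, seq++, false, t));
        }
        Queue<Task> waiting = new ArrayDeque<>();
        int free = slots;
        List<String> log = new ArrayList<>();
        while (!events.isEmpty()) {
            Event e = events.poll();
            long now = e.time; // the virtual clock jumps straight to the next event
            if (e.finish) {
                free++;
                log.add(e.task.id + "@" + now);
            } else {
                waiting.add(e.task);
            }
            // FIFO policy: start waiting tasks while slots are free.
            while (free > 0 && !waiting.isEmpty()) {
                Task next = waiting.remove();
                free--;
                events.add(new Event(now + next.runtime, seq++, true, next));
            }
        }
        return log;
    }
}
```

Swapping the FIFO loop for another policy is where different schedulers would plug in.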




[jira] Updated: (MAPREDUCE-728) Mumak: Map-Reduce Simulator

2009-09-18 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-728:


Status: Open  (was: Patch Available)

Patch broken due to MAPREDUCE-777.

> Mumak: Map-Reduce Simulator
> ---
>
> Key: MAPREDUCE-728
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-728
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Hong Tang
> Fix For: 0.21.0
>
> Attachments: 19-jobs.topology.json.gz, 19-jobs.trace.json.gz, 
> mapreduce-728-20090917-3.patch, mapreduce-728-20090917-4.patch, 
> mapreduce-728-20090917.patch, mapreduce-728-20090918.patch, mumak.png
>
>
> h3. Vision:
> We want to build a Simulator to simulate large-scale Hadoop clusters, 
> applications and workloads. This would be invaluable in furthering Hadoop by 
> providing a tool for researchers and developers to prototype features (e.g. 
> pluggable block-placement for HDFS, Map-Reduce schedulers etc.) and predict 
> their behaviour and performance with a reasonable amount of confidence, 
> thereby aiding rapid innovation.
> 
> h3. First Cut: Simulator for the Map-Reduce Scheduler
> The Map-Reduce Scheduler is a fertile area of interest with at least four 
> schedulers, each with their own set of features, currently in existence: 
> Default Scheduler, Capacity Scheduler, Fairshare Scheduler & Priority 
> Scheduler.
> Each scheduler's scheduling decisions are driven by many factors, such as 
> fairness, capacity guarantee, resource availability, data-locality etc.
> Given that, it is non-trivial to accurately choose a single scheduler or even 
> a set of desired features to predict the right scheduler (or features) for a 
> given workload. Hence a simulator which can predict how well a particular 
> scheduler works for some specific workload by quickly iterating over 
> schedulers and/or scheduler features would be quite useful.
> So, the first cut is to implement a simulator for the Map-Reduce scheduler 
> which takes as input a job trace derived from a production workload and a 
> cluster definition, and simulates the execution of the jobs as defined in 
> the trace in this virtual cluster. As output, the detailed job execution 
> trace (recorded in relation to virtual simulated time) could then be analyzed 
> to understand various traits of individual schedulers (individual jobs' 
> turnaround time, throughput, fairness, capacity guarantee, etc.). To support 
> this, we would need a simulator which could accurately model the conditions 
> of the actual system which would affect a scheduler's decisions. These include 
> very large-scale clusters (thousands of nodes), the detailed characteristics 
> of the workload thrown at the clusters, job or task failures, data locality, 
> and cluster hardware (cpu, memory, disk i/o, network i/o, network topology) 
> etc.




[jira] Commented: (MAPREDUCE-893) Provide an ability to refresh queue configuration without restart.

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757410#action_12757410
 ] 

Hadoop QA commented on MAPREDUCE-893:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420052/MAPREDUCE-893-7.patch
  against trunk revision 816735.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 47 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h1.grid.sp2.yahoo.net/1/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h1.grid.sp2.yahoo.net/1/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h1.grid.sp2.yahoo.net/1/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h1.grid.sp2.yahoo.net/1/console

This message is automatically generated.

> Provide an ability to refresh queue configuration without restart.
> --
>
> Key: MAPREDUCE-893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Hemanth Yamijala
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-893-20090915.1.txt, 
> MAPREDUCE-893-20090917.2.txt, MAPREDUCE-893-20090917.4.txt, 
> MAPREDUCE-893-20090918.2.txt, MAPREDUCE-893-20090918.over.849.patch, 
> MAPREDUCE-893-20090918.txt, MAPREDUCE-893-7.patch
>
>
> While administering a cluster using multiple queues, administrators feel a 
> need to refresh queue properties on the fly without needing to restart the 
> JobTracker. This is partially supported for some properties such as queue 
> ACLs (HADOOP-5396) and state (HADOOP-5913). The idea is to extend the 
> facility to refresh other queue properties as well, including scheduler 
> properties.




[jira] Updated: (MAPREDUCE-1007) MAPREDUCE-777 breaks the UI for hierarchical queues.

2009-09-18 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-1007:
-

Attachment: MAPREDUCE-1007.patch

Attaching a preliminary patch. It needs more testing and must be applied on 
top of MAPREDUCE-893.

> MAPREDUCE-777 breaks the UI for hierarchical queues.
> 
>
> Key: MAPREDUCE-1007
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1007
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: rahul k singh
>Priority: Blocker
> Attachments: MAPREDUCE-1007.patch
>
>
> MAPREDUCE-777 breaks the JobTracker UI for hierarchical queues.




[jira] Commented: (MAPREDUCE-664) distcp with -delete option does not display number of files deleted from the target that were not present on source

2009-09-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757401#action_12757401
 ] 

Chris Douglas commented on MAPREDUCE-664:
-

bq. No automated unit test when there really could have been. How do you ensure 
this never breaks? 

Good point. A test for this should be included in MAPREDUCE-1008.

> distcp with -delete option does not display number of files deleted from the 
> target that were not present on source 
> 
>
> Key: MAPREDUCE-664
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-664
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 0.20.1
>Reporter: Suhas Gogate
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: d_deletedPathsCount.patch, d_deletedPathsCount664.patch
>
>
> distcp with -delete option should provide information on total number of 
> files deleted from the target that were not present on the source. 




[jira] Updated: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-639:


Status: Open  (was: Patch Available)

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.




[jira] Updated: (MAPREDUCE-639) Update the TeraSort to reflect the new benchmark rules for '09

2009-09-18 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated MAPREDUCE-639:


Status: Patch Available  (was: Open)

Resubmitting since it got dropped by hudson.

> Update the TeraSort to reflect the new benchmark rules for '09
> --
>
> Key: MAPREDUCE-639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.21.0
>
> Attachments: m-639.patch, timeline-png.tgz
>
>
> The terabyte sort rules have been changed and the example should be updated 
> to match them.




[jira] Created: (MAPREDUCE-1008) distcp needs CLI unit tests

2009-09-18 Thread Chris Douglas (JIRA)
distcp needs CLI unit tests
---

 Key: MAPREDUCE-1008
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1008
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: distcp
Reporter: Chris Douglas


The distcp tool is often used in automated environments and includes many 
diagnostic messages. It would be helpful to catch changes to the CLI and 
validate the correctness of output messages.




[jira] Commented: (MAPREDUCE-654) Add an option -count to distcp for displaying some info about the src files

2009-09-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757400#action_12757400
 ] 

Chris Douglas commented on MAPREDUCE-654:
-

We don't have any CLI tests for distcp, as far as I know. Adding some would be 
a good idea, since this tool is often included in automated environments.

Filed MAPREDUCE-1008

> Add an option -count to distcp for displaying some info about the src files
> ---
>
> Key: MAPREDUCE-654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-654
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 0.21.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: d_count.patch, d_count654.patch, d_count_v1.patch, 
> M654-2.patch
>
>
> Add an option -count to distcp for displaying metadata about src files, such 
> as the number of files to be copied and the total size of those files.
> With -count, distcp doesn't do any copy; it just displays the info and exits.
> This is useful specifically when used with -update.
>  distcp -update -count *  
>   would display the number of files to be updated and the total size of the 
> copy that needs to be done (by comparing the file sizes and checksums at src 
> and dst). Based on this info, users could allocate the number of nodes needed 
> for the actual update job.
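
The comparison described above can be sketched as follows: a source file counts toward the update if it is missing at the destination or if its size or checksum differs. This is a minimal illustration under that assumption, not distcp's actual code; UpdateCountSketch, FileInfo and countPending are hypothetical names, and file listings are modeled as in-memory maps rather than FileSystem calls.

```java
import java.util.Map;

// Hypothetical illustration of the -update -count comparison; not distcp code.
final class UpdateCountSketch {
    /** Size and checksum of one file, as a listing might report them. */
    static final class FileInfo {
        final long size;
        final String checksum;

        FileInfo(long size, String checksum) {
            this.size = size;
            this.checksum = checksum;
        }
    }

    /**
     * Compares source and destination listings keyed by relative path.
     * Returns {number of files to copy, total bytes to copy}.
     */
    static long[] countPending(Map<String, FileInfo> src, Map<String, FileInfo> dst) {
        long files = 0;
        long bytes = 0;
        for (Map.Entry<String, FileInfo> e : src.entrySet()) {
            FileInfo s = e.getValue();
            FileInfo d = dst.get(e.getKey());
            boolean upToDate = d != null
                && d.size == s.size
                && d.checksum.equals(s.checksum);
            if (!upToDate) {
                files++;
                bytes += s.size;
            }
        }
        return new long[] {files, bytes};
    }
}
```

The two totals are what a user would look at when deciding how many nodes to allocate for the real copy.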




[jira] Updated: (MAPREDUCE-923) Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar file name with spaces

2009-09-18 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-923:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

+1

I've just committed this. Thanks Aaron!

> Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar 
> file name with spaces
> ---
>
> Key: MAPREDUCE-923
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-923
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/sqoop
>Affects Versions: 0.20.1
>Reporter: Kevin Weil
>Assignee: Aaron Kimball
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-923.patch
>
>
> In findThisJar, sqoop runs URLDecoder.decode on the resulting jar, which has 
> the effect of replacing any + signs in the path with a space.  This obviously 
> breaks the classpath variable that it's trying to set, and the 
> sqoop-generated code fails to compile.  Ironically, Cloudera's hadoop distro 
> is the one that puts + characters in jar files, and so exhibits the bug.  
> Here is an example from running sqoop with log4j at debug level.  Note the 
> space in the very last term, which should read hadoop-0.20.0+61-sqoop.jar 
> rather than hadoop-0.20.0 61-sqoop.jar.
> 09/08/27 18:00:07 DEBUG orm.CompilationManager: Invoking javac with args: 
> -sourcepath ./ -d /tmp/sqoop/compile/ -classpath 
> /usr/lib/hadoop-0.20/conf:/usr/java/jdk1.6.0_06/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-fairscheduler.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-scribe-log4j.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-3.8.1.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/usr/local/hadoop/lib/hadoop-gpl-compression.jar:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0
>  61-sqoop.jar
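
The behaviour is easy to reproduce in isolation: java.net.URLDecoder implements HTML form decoding, in which a plus sign stands for a space. The sketch below shows the effect on the jar name from the log, along with one possible workaround (escaping literal plus signs before decoding). The workaround is an assumption for illustration and is not necessarily what the committed patch does; PlusSignDecodeDemo and its methods are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not Sqoop code.
final class PlusSignDecodeDemo {
    private PlusSignDecodeDemo() {}

    /** Plain form decoding: '+' becomes ' ' and %XX escapes are resolved. */
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Escapes literal '+' first so it survives decoding; %XX escapes still resolve. */
    static String decodeKeepingPlus(String s) {
        return decode(s.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        String jar = "hadoop-0.20.0+61-sqoop.jar";
        System.out.println(decode(jar));
        System.out.println(decodeKeepingPlus(jar));
    }
}
```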




[jira] Commented: (MAPREDUCE-849) Renaming of configuration property names in mapreduce

2009-09-18 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757392#action_12757392
 ] 

Hemanth Yamijala commented on MAPREDUCE-849:


Umm. I am confused. Why is this incompatible? We are using the backwards 
compatible feature provided in HADOOP-6105, right? So old keys work just the 
same; they are deprecated, that's all, isn't it? Or did I miss something?

> Renaming of configuration property names in mapreduce
> -
>
> Key: MAPREDUCE-849
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-849
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: Config changes.xls, Config changes.xls, patch-849-1.txt, 
> patch-849-2.txt, patch-849-3.txt, patch-849.txt
>
>
> In-line with HDFS-531, property names in configuration files should be 
> standardized in MAPREDUCE. 




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757390#action_12757390
 ] 

Jothi Padmanabhan commented on MAPREDUCE-980:
-

Got it. Thanks.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-893) Provide an ability to refresh queue configuration without restart.

2009-09-18 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757393#action_12757393
 ] 

rahul k singh commented on MAPREDUCE-893:
-

Ran ant test on the existing patch. All passed.


> Provide an ability to refresh queue configuration without restart.
> --
>
> Key: MAPREDUCE-893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Hemanth Yamijala
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-893-20090915.1.txt, 
> MAPREDUCE-893-20090917.2.txt, MAPREDUCE-893-20090917.4.txt, 
> MAPREDUCE-893-20090918.2.txt, MAPREDUCE-893-20090918.over.849.patch, 
> MAPREDUCE-893-20090918.txt, MAPREDUCE-893-7.patch
>
>
> While administering a cluster using multiple queues, administrators feel a 
> need to refresh queue properties on the fly without needing to restart the 
> JobTracker. This is partially supported for some properties such as queue 
> ACLs (HADOOP-5396) and state (HADOOP-5913). The idea is to extend the 
> facility to refresh other queue properties as well, including scheduler 
> properties.




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757389#action_12757389
 ] 

Doug Cutting commented on MAPREDUCE-980:


> should the EventWriter.Version be something different than "Avro-Binary"? 
> Something that will help us keep track of schema evolutions

The entire schema is included in the file.  If the schema changes, Avro can 
still read old data.  We don't need to update the file version if we, e.g., add 
a field.  If we make such a fundamental change to the schema that Avro's 
automatic versioning cannot handle it, then we could change the version string 
to be "Avro-Binary-v2" or something.  Or we could examine the schema itself to 
determine which version it is.
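
To make the point about added fields concrete, here is a toy model of what happens when a reader with a newer schema reads a record written with an older one: fields the writer did not know about take the reader's defaults, and fields the reader no longer declares are dropped. This sketch does not use the Avro API and covers only that one rule; SchemaResolutionSketch and resolve are hypothetical names, and records are modeled as plain maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of reader/writer schema resolution; it does not use the Avro API.
final class SchemaResolutionSketch {
    private SchemaResolutionSketch() {}

    /**
     * Projects a record written with an older schema onto the reader's fields.
     * readerDefaults maps each reader field to the default used when the
     * writer did not supply that field.
     */
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            // Prefer the written value; fall back to the reader's default.
            out.put(name, written.containsKey(name) ? written.get(name) : field.getValue());
        }
        return out; // fields present only in the written record are ignored
    }
}
```

Under this model, adding a field with a default needs no change to the file's version string, which matches the reasoning above.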

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Created: (MAPREDUCE-1007) MAPREDUCE-777 breaks the UI for hierarchical queues.

2009-09-18 Thread rahul k singh (JIRA)
MAPREDUCE-777 breaks the UI for hierarchical queues.


 Key: MAPREDUCE-1007
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1007
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: rahul k singh
Priority: Blocker


MAPREDUCE-777 breaks the JobTracker UI for hierarchical queues.




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757383#action_12757383
 ] 

Jothi Padmanabhan commented on MAPREDUCE-980:
-

I do not know if I am being naive here, but should the EventWriter.Version be 
something different than "Avro-Binary"? Something that will help us keep track 
of schema evolutions, like "1.0" ? Or is the version used for a different 
purpose? 

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-923) Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar file name with spaces

2009-09-18 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757367#action_12757367
 ] 

Aaron Kimball commented on MAPREDUCE-923:
-

We confirmed via email that this fixes it for him.

> Sqoop's ORM uses URLDecoder on a file, which replaces plus signs in a jar 
> file name with spaces
> ---
>
> Key: MAPREDUCE-923
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-923
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/sqoop
>Affects Versions: 0.20.1
>Reporter: Kevin Weil
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-923.patch
>
>
> In findThisJar, sqoop runs URLDecoder.decode on the resulting jar, which has 
> the effect of replacing any + signs in the path with a space.  This obviously 
> breaks the classpath variable that it's trying to set, and the 
> sqoop-generated code fails to compile.  Ironically, Cloudera's hadoop distro 
> is the one that puts + characters in jar files, and so exhibits the bug.  
> Here is an example from running sqoop with log4j at debug level.  Note the 
> space in the very last term, which should read hadoop-0.20.0+61-sqoop.jar 
> rather than hadoop-0.20.0 61-sqoop.jar.
> 09/08/27 18:00:07 DEBUG orm.CompilationManager: Invoking javac with args: 
> -sourcepath ./ -d /tmp/sqoop/compile/ -classpath 
> /usr/lib/hadoop-0.20/conf:/usr/java/jdk1.6.0_06/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-fairscheduler.jar:/usr/lib/hadoop-0.20/lib/hadoop-0.20.0+61-scribe-log4j.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-3.8.1.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/usr/local/hadoop/lib/hadoop-gpl-compression.jar:/usr/lib/hadoop-0.20/hadoop-0.20.0+61-core.jar:/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0
>  61-sqoop.jar
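
The behaviour described above can be reproduced with the JDK alone. The sketch below is not the MAPREDUCE-923 patch; the class and method names are hypothetical, and escaping "+" before decoding is just one possible workaround, shown for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it is not part of Sqoop.
class PlusSignDemo {

    // Form-style decoding: URLDecoder turns every '+' into a space.
    static String decodeForm(String path) {
        try {
            return URLDecoder.decode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Illustrative workaround (an assumption, not the committed fix):
    // escape literal '+' as %2B first so that decoding restores it.
    static String decodeKeepingPlus(String path) {
        return decodeForm(path.replace("+", "%2B"));
    }

    public static void main(String[] args) {
        String jar = "/usr/lib/hadoop-0.20/contrib/sqoop/hadoop-0.20.0+61-sqoop.jar";
        System.out.println(decodeForm(jar));        // '+' replaced by a space
        System.out.println(decodeKeepingPlus(jar)); // '+' preserved
    }
}
```

Percent-escapes such as %20 are still decoded in both variants; only the handling of a literal "+" differs.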




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757356#action_12757356
 ] 

Doug Cutting commented on MAPREDUCE-980:


> should we change the encoder to json [ ... ]

That was my initial instinct too, but Eric and Owen both indicated to me that 
they preferred that we use binary.

Eric's comment is:

https://issues.apache.org/jira/browse/MAPREDUCE-157?focusedCommentId=12745279&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12745279

Owen has indicated this in offline discussions.  The idea is that one can 
easily use Avro to dump the binary as JSON, but that the binary is smaller and 
faster.

It's a trivial change to make if we prefer JSON instead of binary.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-849) Renaming of configuration property names in mapreduce

2009-09-18 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757354#action_12757354
 ] 

Nigel Daley commented on MAPREDUCE-849:
---

No release note for this massively incompatible change?

> Renaming of configuration property names in mapreduce
> -
>
> Key: MAPREDUCE-849
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-849
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: Config changes.xls, Config changes.xls, patch-849-1.txt, 
> patch-849-2.txt, patch-849-3.txt, patch-849.txt
>
>
> In-line with HDFS-531, property names in configuration files should be 
> standardized in MAPREDUCE. 




[jira] Commented: (MAPREDUCE-990) Making distributed cache getters in JobContext never return null

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757352#action_12757352
 ] 

Hadoop QA commented on MAPREDUCE-990:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420045/MAPREDUCE-990.patch
  against trunk revision 816735.

-1 @author.  The patch appears to contain  @author tags which the Hadoop 
community has agreed to not allow in code contributions.

+1 tests included.  The patch appears to include  new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/114/console

This message is automatically generated.

> Making distributed cache getters in JobContext never return null
> 
>
> Key: MAPREDUCE-990
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-990
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Minor
> Attachments: MAPREDUCE-990.patch, MAPREDUCE-990.patch.txt
>
>
> MAPREDUCE-898 moved distributed cache setters and getters into Job and 
> JobContext.  Since the API is new, I'd like to propose that those getters 
> never return null, but instead always return an array, even if it's empty.
> If people don't like this change, I can instead merely update the javadoc to 
> reflect the fact that null may be returned.
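
A minimal sketch of the proposed contract, assuming a simplified stand-in class: CacheContextSketch and its field are hypothetical and are not Hadoop's JobContext, and the defensive copy is an extra choice that the proposal does not mention.

```java
import java.net.URI;

// Hypothetical stand-in for a job context; not Hadoop's JobContext.
class CacheContextSketch {

    // May be null when no cache files were configured.
    private final URI[] cacheFiles;

    CacheContextSketch(URI[] cacheFiles) {
        this.cacheFiles = cacheFiles;
    }

    // Never returns null: an unset value is reported as an empty array.
    // A copy is returned so callers cannot modify the internal state.
    URI[] getCacheFiles() {
        return cacheFiles == null ? new URI[0] : cacheFiles.clone();
    }
}
```

With this contract a caller can iterate over the result directly, for example `for (URI u : ctx.getCacheFiles()) { ... }`, without a null check.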




[jira] Commented: (MAPREDUCE-990) Making distributed cache getters in JobContext never return null

2009-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757351#action_12757351
 ] 

Hadoop QA commented on MAPREDUCE-990:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420045/MAPREDUCE-990.patch
  against trunk revision 816664.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/113/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/113/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/113/console

This message is automatically generated.

> Making distributed cache getters in JobContext never return null
> 
>
> Key: MAPREDUCE-990
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-990
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Minor
> Attachments: MAPREDUCE-990.patch, MAPREDUCE-990.patch.txt
>
>
> MAPREDUCE-898 moved distributed cache setters and getters into Job and 
> JobContext.  Since the API is new, I'd like to propose that those getters 
> never return null, but instead always return an array, even if it's empty.
> If people don't like this change, I can instead merely update the javadoc to 
> reflect the fact that null may be returned.




[jira] Commented: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757350#action_12757350
 ] 

Jothi Padmanabhan commented on MAPREDUCE-980:
-

Just another clarification -- since the patch is using Avro 1.1, should we 
change the encoder to json instead of binary, so that tools that scrape the 
logs instead of using EventReaders are still supported? 

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Updated: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Doug Cutting (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Cutting updated MAPREDUCE-980:
---

Attachment: MAPREDUCE-980.patch

Improved javadoc a bit.

> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-277) Job history counters should be avaible on the UI.

2009-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757342#action_12757342
 ] 

Hudson commented on MAPREDUCE-277:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #56 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/56/])
. Makes job history counters available on the job history viewers. 
Contributed by Jothi Padmanabhan.


> Job history counters should be avaible on the UI.
> -
>
> Key: MAPREDUCE-277
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-277
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Affects Versions: 0.20.1
>Reporter: Amareshwari Sriramadasu
>Assignee: Jothi Padmanabhan
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: HADOOP-3200-20080915.1.txt, mapred-277-v1.2.patch, 
> mapred-277-v1.4.patch, mapred-277-v2.patch, mapred-277.patch
>
>
> Job history is logging counters. But they are not visible on the UI. 
> Job history parser and UI should be modified to view counters.




[jira] Updated: (MAPREDUCE-980) Modify JobHistory to use Avro for serialization instead of raw JSON

2009-09-18 Thread Doug Cutting (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Cutting updated MAPREDUCE-980:
---

Attachment: MAPREDUCE-980.patch

> Don't we need Javadocs for the newly introduced public constructors in 
> Counter and CounterGroup?

Added.


> Modify JobHistory to use Avro for serialization instead of raw JSON
> ---
>
> Key: MAPREDUCE-980
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-980
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Jothi Padmanabhan
>Assignee: Doug Cutting
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch, MAPREDUCE-980.patch, MAPREDUCE-980.patch, 
> MAPREDUCE-980.patch
>
>
> MAPREDUCE-157 modifies JobHistory to log events using Json Format.  This can 
> be modified to use Avro instead. 




[jira] Commented: (MAPREDUCE-892) command line tool to list all tasktrackers and their status

2009-09-18 Thread Dmytro Molkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757339#action_12757339
 ] 

Dmytro Molkov commented on MAPREDUCE-892:
-

Can someone please review this patch?

> command line tool to list all tasktrackers and their status
> ---
>
> Key: MAPREDUCE-892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: MAPRED-892.patch.3, MAPREDUCE-892.patch, 
> MAPREDUCE-892.patch, MAPREDUCE-892.patch.1
>
>
> The "hadoop mradmin -report" could list all the tasktrackers that the 
> JobTracker knows about. It will also list a brief status summary for each 
> TaskTracker. (This is similar to the hadoop dfsadmin -report command that 
> lists all Datanodes)




[jira] Commented: (MAPREDUCE-1003) trunk build fails when -Declipse.home is set

2009-09-18 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757335#action_12757335
 ] 

Nigel Daley commented on MAPREDUCE-1003:


Umm, test-patch DID find this.  It was just ignored by the committer and 
contributor!

> trunk build fails when -Declipse.home is set
> 
>
> Key: MAPREDUCE-1003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1003
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Giridharan Kesavan
>Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: MR1003.patch.txt
>
>
> compile:
>  [echo] contrib: eclipse-plugin 
> [javac] Compiling 45 source files to 
> /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h3.grid.sp2.yahoo.net/trunk/build/contrib/eclipse-plugin/classes
> [javac] 
> /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h3.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin/src/java/org/apache/hadoop/eclipse/server/HadoopJob.java:54:
>  constant expression required
> [javac] case JobStatus.PREP:
> [javac]   ^
> [javac] 
> /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h3.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin/src/java/org/apache/hadoop/eclipse/server/HadoopJob.java:56:
>  constant expression required
> [javac] case JobStatus.RUNNING:
> [javac]   ^
> [javac] 
> /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h3.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin/src/java/org/apache/hadoop/eclipse/server/HadoopJob.java:58:
>  constant expression required
> [javac] case JobStatus.FAILED:
> [javac]   ^
> [javac] 
> /grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h3.grid.sp2.yahoo.net/trunk/src/contrib/eclipse-plugin/src/java/org/apache/hadoop/eclipse/server/HadoopJob.java:60:
>  constant expression required
> [javac] case JobStatus.SUCCEEDED:
> [javac]   ^
> [javac] Note: Some input files use or override a deprecated API.
> [javac] Note: Recompile with -Xlint:deprecation for details.
> [javac] Note: Some input files use unchecked or unsafe operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
> [javac] 4 errors
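
The log does not say why the case labels stopped being constants. One way javac produces "constant expression required" is when a case label names a static final field whose initializer is a method call rather than a literal. The sketch below reproduces that situation with a hypothetical StatusSketch class (not Hadoop's JobStatus; the numeric values are illustrative) and shows an if/else chain that compiles where the switch would not.

```java
// Hypothetical stand-in; not org.apache.hadoop.mapred.JobStatus.
class StatusSketch {

    enum State {
        RUNNING(1), SUCCEEDED(2), FAILED(3), PREP(4);

        private final int value;

        State(int value) {
            this.value = value;
        }

        int getValue() {
            return value;
        }
    }

    // Initialized by method calls, so these are NOT compile-time constants
    // even though they are static final ints.
    static final int RUNNING = State.RUNNING.getValue();
    static final int SUCCEEDED = State.SUCCEEDED.getValue();
    static final int FAILED = State.FAILED.getValue();
    static final int PREP = State.PREP.getValue();

    static String describe(int runState) {
        // "switch (runState) { case PREP: ... }" would be rejected here with
        // "constant expression required"; plain comparisons are accepted.
        if (runState == PREP) {
            return "preparing";
        } else if (runState == RUNNING) {
            return "running";
        } else if (runState == FAILED) {
            return "failed";
        } else if (runState == SUCCEEDED) {
            return "succeeded";
        }
        return "unknown";
    }
}
```

Whether the actual MR1003.patch.txt takes this route is not stated in the thread; the sketch only illustrates the compiler rule.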




[jira] Commented: (MAPREDUCE-777) A method for finding and tracking jobs from the new API

2009-09-18 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757336#action_12757336
 ] 

Nigel Daley commented on MAPREDUCE-777:
---

Committers and contributors need to look at failures more closely.  The 
contrib tests got a -1 because this patch broke the eclipse plugin build!  
See 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/45/console

MAPREDUCE-1003 was filed for this issue and is now fixed.

> A method for finding and tracking jobs from the new API
> ---
>
> Key: MAPREDUCE-777
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-777
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: m-777.patch, patch-777-1.txt, patch-777-10.txt, 
> patch-777-11.txt, patch-777-12.txt, patch-777-13.txt, patch-777-14.txt, 
> patch-777-15.txt, patch-777-16.txt, patch-777-17.txt, patch-777-2.txt, 
> patch-777-3.txt, patch-777-4.txt, patch-777-5.txt, patch-777-6.txt, 
> patch-777-7.txt, patch-777-8.txt, patch-777-9.txt, patch-777.txt
>
>
> We need to create a replacement interface for the JobClient API in the new 
> interface. In particular, the user needs to be able to query and track jobs 
> that were launched by other processes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


