[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-04 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Status: Open  (was: Patch Available)

> Job History log file format is not friendly for external tools.
> ---
>
> Key: MAPREDUCE-157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Affects Versions: 0.20.1
>Reporter: Owen O'Malley
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: mapred-157-4Sep.patch, mapred-157-prelim.patch, 
> MAPREDUCE-157-avro.patch
>
>
> Currently, parsing the job history logs with external tools is very difficult 
> because of the format. The most critical problem is that newlines aren't 
> escaped in the strings. That makes using tools like grep, sed, and awk very 
> tricky.
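
A minimal sketch of the kind of escaping the description asks for. It is not the format used by any of the attached patches, and the helper names escape_value and unescape_value are hypothetical. The only point is that each record stays on one physical line, so line-oriented tools such as grep, sed, and awk see whole records.

```python
def escape_value(value: str) -> str:
    """Escape backslashes and line breaks so the value fits on one line."""
    return (value.replace("\\", "\\\\")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def unescape_value(escaped: str) -> str:
    """Reverse escape_value by scanning for backslash escapes."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "")
        out.append({"n": "\n", "r": "\r", "\\": "\\"}.get(nxt, nxt))
    return "".join(out)


# Hypothetical history value containing an embedded newline.
raw = "stage one failed\nretrying on another node"
line = escape_value(raw)
```

Backslashes are escaped first so that a literal backslash followed by "n" in the input is not confused with an escaped newline on the way back.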

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)

2009-09-04 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-956:


  Component/s: task
Affects Version/s: 0.21.0

> Shuffle should be broken down to only two phases (copy/reduce) instead of 
> three (copy/sort/reduce)
> --
>
> Key: MAPREDUCE-956
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Jothi Padmanabhan
>
> For progress calculation and display on the UI, the shuffle, in its current 
> form, is decomposed into three phases (copy/sort/reduce). The sort phase is 
> no longer applicable. I think we should reduce the number of phases to two 
> and assign a 50% weight to each of the copy and reduce phases. Thoughts?
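
As a rough illustration of the proposed weighting (a sketch only, not the actual progress code; the function name and its arguments are hypothetical), overall progress under a two-phase scheme would be the equally weighted sum of the copy and reduce phase progress:

```python
def two_phase_progress(copy_progress: float, reduce_progress: float) -> float:
    """Combine per-phase progress (each in [0, 1]) using 50% weights."""
    for p in (copy_progress, reduce_progress):
        if not 0.0 <= p <= 1.0:
            raise ValueError("phase progress must be between 0 and 1")
    return 0.5 * copy_progress + 0.5 * reduce_progress
```

Under this assumption, a task that has finished copying and is halfway through reduce would report 0.5 * 1.0 + 0.5 * 0.5 = 0.75.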




[jira] Created: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)

2009-09-04 Thread Jothi Padmanabhan (JIRA)
Shuffle should be broken down to only two phases (copy/reduce) instead of three 
(copy/sort/reduce)
--

 Key: MAPREDUCE-956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jothi Padmanabhan


For progress calculation and display on the UI, the shuffle, in its current 
form, is decomposed into three phases (copy/sort/reduce). The sort phase is no 
longer applicable. I think we should reduce the number of phases to two and 
assign a 50% weight to each of the copy and reduce phases. 
Thoughts?




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751685#action_12751685
 ] 

Hudson commented on MAPREDUCE-370:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #17 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/17/])
. Update MultipleOutputs to use the API, merge functionality
of MultipleOutputFormat. Contributed by Amareshwari Sriramadasu


> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Updated: (MAPREDUCE-892) command line tool to list all tasktrackers and their status

2009-09-04 Thread Dmytro Molkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Molkov updated MAPREDUCE-892:


Attachment: MAPREDUCE-892.patch

Please review change

> command line tool to list all tasktrackers and their status
> ---
>
> Key: MAPREDUCE-892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: MAPREDUCE-892.patch
>
>
> The "hadoop mradmin -report" command could list all the TaskTrackers that the 
> JobTracker knows about, along with a brief status summary for each 
> TaskTracker. (This is similar to the "hadoop dfsadmin -report" command, which 
> lists all Datanodes.)




[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-370:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1

The test failure, TestNodeRefresh.testMRExcludeHostsAcrossRestarts, is not 
related.

I committed this. Thanks, Amareshwari!

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Updated: (MAPREDUCE-892) command line tool to list all tasktrackers and their status

2009-09-04 Thread Dmytro Molkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Molkov updated MAPREDUCE-892:


Attachment: (was: 
0001-First-iteration-of-adding-the-task-tracker-reporting.patch)

> command line tool to list all tasktrackers and their status
> ---
>
> Key: MAPREDUCE-892
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-892
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
>
> The "hadoop mradmin -report" command could list all the TaskTrackers that the 
> JobTracker knows about, along with a brief status summary for each 
> TaskTracker. (This is similar to the "hadoop dfsadmin -report" command, which 
> lists all Datanodes.)




[jira] Commented: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

2009-09-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751671#action_12751671
 ] 

Hudson commented on MAPREDUCE-936:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #16 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/16/])
. Allow a load difference for fairshare scheduler.
(Zheng Shao via dhruba)


> Allow a load difference in fairshare scheduler
> --
>
> Key: MAPREDUCE-936
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch
>
>
> The problem we are facing: It takes a long time for all tasks of a job to get 
> scheduled on the cluster, even if the cluster is almost empty.
> There are two reasons that together lead to this situation:
> 1. The load factor makes sure each TT runs the same number of tasks. (This is 
> the part that this patch tries to change).
> 2. The scheduler tries to schedule map tasks locally (first node-local, then 
> rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
> mapred.fairscheduler.localitywait.rack, both are around 10 sec in our conf), 
> and accumulated wait time (JobInfo.localityWait). The accumulated wait time 
> is reset to 0 whenever a non-local map task is scheduled. That means it takes 
> N * wait_time to schedule N non-local map tasks.
> Because of 1, many TTs will not be able to take more tasks, even if they 
> have free slots. As a result, many of the map tasks cannot be scheduled 
> locally.
> Because of 2, it's really hard to schedule a non-local task.
> As a result, sometimes we are seeing that it takes more than 2 minutes to 
> schedule all the mappers of a job.
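
The cost of reason 2 can be sketched with a little arithmetic. This is a simplified model of the behaviour described above, not the fair scheduler's code: it assumes every non-local map task has to wait out the full locality wait, because the accumulated wait is reset to 0 after each one. The function name is hypothetical, and the 10 second wait is the value quoted from the reporter's configuration.

```python
def time_to_schedule_non_local(num_tasks: int, wait_time_sec: float) -> float:
    """Total delay when the accumulated wait resets after each non-local task."""
    elapsed = 0.0
    for _ in range(num_tasks):
        accumulated_wait = 0.0             # reset after the previous non-local task
        accumulated_wait += wait_time_sec  # must build up to the threshold again
        elapsed += accumulated_wait
    return elapsed


delay = time_to_schedule_non_local(13, 10.0)
```

Under these assumptions, 13 non-local map tasks with a 10 second wait take 130 seconds, which is in line with the "more than 2 minutes" reported above.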




[jira] Commented: (MAPREDUCE-903) Adding AVRO jar to eclipse classpath

2009-09-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751668#action_12751668
 ] 

Hudson commented on MAPREDUCE-903:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #15 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/15/])
. Add Avro jar to eclipse classpath. Contributed by Philip Zeyliger.


> Adding AVRO jar to eclipse classpath
> 
>
> Key: MAPREDUCE-903
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-903
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-903.patch
>
>
> Avro is missing from the eclipse classpath, which caused Eclipse to whine.  
> Easy fix.




[jira] Updated: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

2009-09-04 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-936:
---

   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Zheng.

> Allow a load difference in fairshare scheduler
> --
>
> Key: MAPREDUCE-936
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch
>
>
> The problem we are facing: It takes a long time for all tasks of a job to get 
> scheduled on the cluster, even if the cluster is almost empty.
> There are two reasons that together lead to this situation:
> 1. The load factor makes sure each TT runs the same number of tasks. (This is 
> the part that this patch tries to change).
> 2. The scheduler tries to schedule map tasks locally (first node-local, then 
> rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
> mapred.fairscheduler.localitywait.rack, both are around 10 sec in our conf), 
> and accumulated wait time (JobInfo.localityWait). The accumulated wait time 
> is reset to 0 whenever a non-local map task is scheduled. That means it takes 
> N * wait_time to schedule N non-local map tasks.
> Because of 1, many TTs will not be able to take more tasks, even if they 
> have free slots. As a result, many of the map tasks cannot be scheduled 
> locally.
> Because of 2, it's really hard to schedule a non-local task.
> As a result, sometimes we are seeing that it takes more than 2 minutes to 
> schedule all the mappers of a job.




[jira] Updated: (MAPREDUCE-903) Adding AVRO jar to eclipse classpath

2009-09-04 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-903:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Assignee: Philip Zeyliger
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Philip!

> Adding AVRO jar to eclipse classpath
> 
>
> Key: MAPREDUCE-903
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-903
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-903.patch
>
>
> Avro is missing from the eclipse classpath, which caused Eclipse to whine.  
> Easy fix.




[jira] Created: (MAPREDUCE-955) CombineFileRecordReader should pass a InputSplit in the constructor instead of CombineFileSplit

2009-09-04 Thread Namit Jain (JIRA)
CombineFileRecordReader should pass a InputSplit in the constructor instead of 
CombineFileSplit
---

 Key: MAPREDUCE-955
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-955
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Namit Jain


The specific reader can always cast the class as needed.




[jira] Commented: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751552#action_12751552
 ] 

Arun C Murthy commented on MAPREDUCE-954:
-

+1

We should bite the bullet and fix it sooner rather than later. I've run into 
issues with this in MAPREDUCE-901 etc.

> The new interface's Context objects should be interfaces
> 
>
> Key: MAPREDUCE-954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Owen O'Malley
> Fix For: 0.21.0
>
>
> When I was doing HADOOP-1230, I was persuaded to make the Context objects 
> classes. I think that was a serious mistake. It caused a lot of information 
> leakage into the public classes.




[jira] Created: (MAPREDUCE-954) The new interface's Context objects should be interfaces

2009-09-04 Thread Owen O'Malley (JIRA)
The new interface's Context objects should be interfaces


 Key: MAPREDUCE-954
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-954
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client
Reporter: Owen O'Malley
 Fix For: 0.21.0


When I was doing HADOOP-1230, I was persuaded to make the Context objects 
classes. I think that was a serious mistake. It caused a lot of information 
leakage into the public classes.




[jira] Commented: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

2009-09-04 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751540#action_12751540
 ] 

Matei Zaharia commented on MAPREDUCE-936:
-

+1 looks good, feel free to commit it. Thanks Zheng!

> Allow a load difference in fairshare scheduler
> --
>
> Key: MAPREDUCE-936
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch
>
>
> The problem we are facing: It takes a long time for all tasks of a job to get 
> scheduled on the cluster, even if the cluster is almost empty.
> There are two reasons that together lead to this situation:
> 1. The load factor makes sure each TT runs the same number of tasks. (This is 
> the part that this patch tries to change).
> 2. The scheduler tries to schedule map tasks locally (first node-local, then 
> rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
> mapred.fairscheduler.localitywait.rack, both are around 10 sec in our conf), 
> and accumulated wait time (JobInfo.localityWait). The accumulated wait time 
> is reset to 0 whenever a non-local map task is scheduled. That means it takes 
> N * wait_time to schedule N non-local map tasks.
> Because of 1, many TTs will not be able to take more tasks, even if they 
> have free slots. As a result, many of the map tasks cannot be scheduled 
> locally.
> Because of 2, it's really hard to schedule a non-local task.
> As a result, sometimes we are seeing that it takes more than 2 minutes to 
> schedule all the mappers of a job.




[jira] Commented: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-04 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751510#action_12751510
 ] 

Konstantin Boudnik commented on MAPREDUCE-943:
--

If the cause of the timeouts is connected to MAPREDUCE-873, then this JIRA 
should be converted to a sub-task of it. Otherwise, it is confusing at first.

> TestNodeRefresh timesout occasionally
> -
>
> Key: MAPREDUCE-943
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amareshwari Sriramadasu
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPRED-943-v1.0.patch
>
>
> TestNodeRefresh times out occasionally.
> One of the Hudson patch builds that timed out: 
> http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Commented: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-04 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751506#action_12751506
 ] 

Konstantin Boudnik commented on MAPREDUCE-943:
--

So, removing the test case seems to be the only way of 'fixing' the problem?

> TestNodeRefresh timesout occasionally
> -
>
> Key: MAPREDUCE-943
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amareshwari Sriramadasu
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPRED-943-v1.0.patch
>
>
> TestNodeRefresh times out occasionally.
> One of the Hudson patch builds that timed out: 
> http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751490#action_12751490
 ] 

Hadoop QA commented on MAPREDUCE-157:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418622/mapred-157-4Sep.patch
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 27 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/41/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/41/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/41/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/41/console

This message is automatically generated.

> Job History log file format is not friendly for external tools.
> ---
>
> Key: MAPREDUCE-157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Affects Versions: 0.20.1
>Reporter: Owen O'Malley
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: mapred-157-4Sep.patch, mapred-157-prelim.patch, 
> MAPREDUCE-157-avro.patch
>
>
> Currently, parsing the job history logs with external tools is very difficult 
> because of the format. The most critical problem is that newlines aren't 
> escaped in the strings. That makes using tools like grep, sed, and awk very 
> tricky.




[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751460#action_12751460
 ] 

Hadoop QA commented on MAPREDUCE-856:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12418620/MAPREDUCE-856-20090904.1.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 23 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/8/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/8/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/8/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/8/console

This message is automatically generated.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
> MAPREDUCE-856-20090904.txt
>
>





[jira] Commented: (MAPREDUCE-864) Enhance JobClient API implementations to look at history files to get information about jobs that are not in memory

2009-09-04 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751445#action_12751445
 ] 

Sharad Agarwal commented on MAPREDUCE-864:
--

Had an offline discussion with Devaraj and Hemanth. Apart from the issue 
mentioned above, one more issue was identified with this JIRA: consistency. 
This can't be done just for Job data. To be consistent, all job client APIs 
should return data from HDFS if it is not found in the job tracker's memory. 
Consider an API like getJob(Jobid). To be consistent, it should also look in 
HDFS for completed jobs if the data is not in the job tracker. Looking into the 
HDFS completed jobs folder and building up the job structures *efficiently* is 
a non-trivial thing to do at this point.

So we agree that a better approach at this point would be:

Retain the contract that job clients will *only* see information that is in 
the Jobtracker's memory. Clients will get the very basic information about 
completed jobs from the job tracker's retired cache (MAPREDUCE-817). 
Clients which need to drill down into completed jobs' *TASK* level information 
will need to use the History parser. The assumption here is that such clients 
will be very few, and mostly these clients want to analyze completed jobs. So 
it is better for them to use the History parser directly and keep the job 
client interface clean.
The only minor concern here is that many clients may just need to look at the 
counters, which are currently not cached in the retired job info. They would 
have to go through the History parser path to retrieve them. There should be 
an easy way to get those. The proposal is to add counters to the retired job 
cache. The idea is to cache only the job level information, and not any task 
level information, in the retired jobs cache. A quick estimate of the memory 
consumption: assuming 100 counters per job and 200 bytes per counter, 1000 
retired jobs come to 100*200*1000 bytes = 20 MB, which is quite manageable.
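
The memory estimate can be reproduced as a short calculation. This is only a worked version of the arithmetic in the comment; the function name is hypothetical, and the per-counter size and counts are the assumed figures quoted above, not measurements.

```python
def retired_cache_counter_bytes(retired_jobs: int,
                                counters_per_job: int,
                                bytes_per_counter: int) -> int:
    """Estimated memory used by job-level counters in the retired job cache."""
    return retired_jobs * counters_per_job * bytes_per_counter


estimate = retired_cache_counter_bytes(1000, 100, 200)
estimate_mb = estimate / 1_000_000   # decimal megabytes
```

With these inputs the estimate is 20,000,000 bytes, i.e. 20 MB in decimal units. The cost scales linearly with each of the three inputs, so doubling the number of retired jobs kept in the cache doubles the estimate.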

> Enhance JobClient API implementations to look at history files to get 
> information about jobs that are not in memory
> ---
>
> Key: MAPREDUCE-864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-864
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: jobtracker
>Reporter: Devaraj Das
>Assignee: Sharad Agarwal
> Fix For: 0.21.0
>
>
> MAPREDUCE-817 added an API to get the JobHistory URL from the JobTracker. 
> This is useful in two ways:
> 1) Users can use this API to get the URL, copy the history files to their 
> local disk, and, do processing on them
> 2) APIs like JobSubmissionProtocol.getJobCounters, can read a part of the 
> history file, and then return the information to the caller (if the job is 
> not there in JT memory). This would  mimic most of the 
> CompletedJobsStatusStore functionality.




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751443#action_12751443
 ] 

Hadoop QA commented on MAPREDUCE-370:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418584/patch-370-5.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/40/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/40/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/40/console

This message is automatically generated.

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Commented: (MAPREDUCE-181) Secure job submission

2009-09-04 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751438#action_12751438
 ] 

Amar Kamat commented on MAPREDUCE-181:
--

Here is the final proposal :
# Here is how the handshake happens for job submission
 ## jobclient asks the jobtracker for a new jobid (jobtracker maintains a 
mapping from job-id to user-name [ugi]. This user is the owner of the job and 
will be allowed to submit the job)
 ## using the Input-split, the jobclient constructs a split _meta-info_ for the 
jobtracker to be able to create the task->node locality cache. 
  {code}
   job-split-meta-info :
   - split-location (location of the actual split/raw-bytes)
   - split class (used to reinstantiate the split object)
   - split-info (array of individual split meta-info)

   split-meta-info :
   - locations (hostnames where this split is local)
   - start offset (start in raw-bytes)
   - length (total bytes in the corresponding raw-bytes)
   - data-size : total data that will be processed in this split
  {code}
 ## with this new id, the jobclient upload job.xml, job.split, job.jar and 
achives/libs to a staging area (/user/_user-name_/.staging/_jobid_/). job.xml 
is staged to support (jobtracker.getJobFile()) api. 
 ## after the upload is done, the jobclient submits a job by passing job-id, 
job-conf and job-split-meta-info via rpc.
 ## jobtracker does the following things upon a submitjob request
  ### validate conf (includes queue checks, acl checks etc. along with user-name 
[conf.username and owner match] and ownership checks [caller of getnewid() and 
submitjob()])
  ### serialize conf to mapred.system.dir/jobid/job.xml (for restarts)
  ### serialize split-meta-info to mapred.system.dir/jobid/job.split
  ### start the job, i.e. create jobinprogress
 ## when a tt comes asking for a task, the jobtracker passes the split-metainfo 
(along with split-location and split-classname). Tasktracker uses this metainfo 
for reading the split raw-bytes. 
 ## tasktracker now localizes the job.jar from 
/user/_user-name_/.staging/_job-id_/job.jar and then unjars it. This is done 
using the job-conf (having user-credentials)
 ## mapred.system.dir can now be 700 and only accessible to mapred daemons 
 ## readFields() in jobconf caps the total characters in jobconf. This prevents 
users from passing huge job-confs. For now the limit is 3*1024*1024 chars
 ## job-split metainfo is also capped in readFields() to accept split meta-info 
< 10mb.
 ## since jobtracker.getNewJobId() maintains a mapping from jobid to username, 
the jobtracker needs to clean up this mapping upon some timeout. One way to 
time out is to use a thread which periodically cleans up this mapping.
 ## Upon job completion, jobcleanup code cleans up the staging folder i.e 
/user/_user-name_/.staging/_job-id_/.
 ## if the jobclient crashes or fails to submit job then the temp files 
/user/_user-name_/.staging/_job-id_/ are not deleted as this can be used for 
debugging purposes.

# Upon restart the mapred.system.dir can be completely trusted and hence no 
checking is done here.
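
The ownership part of the handshake above (job-id to user-name mapping, the check at submit time, and the timeout-based cleanup) can be sketched roughly as follows. This is only an illustration under assumptions: the class JobIdRegistry, its method names, and the "job_N" id format are hypothetical and are not taken from the patch; time is passed in explicitly so the expiry logic is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: remembers which user was handed each job id. */
final class JobIdRegistry {
    /** Owner of an id plus the time the id was issued. */
    private static final class Entry {
        final String user;
        final long issuedAtMillis;

        Entry(String user, long issuedAtMillis) {
            this.user = user;
            this.issuedAtMillis = issuedAtMillis;
        }
    }

    private final Map<String, Entry> owners = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private int nextId = 1;

    JobIdRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Hand out a new id and record who asked for it (id format is made up). */
    synchronized String getNewJobId(String user, long nowMillis) {
        String jobId = "job_" + (nextId++);
        owners.put(jobId, new Entry(user, nowMillis));
        return jobId;
    }

    /** Submit-time check: only the user who requested the id may submit. */
    boolean maySubmit(String jobId, String user) {
        Entry e = owners.get(jobId);
        return e != null && e.user.equals(user);
    }

    /** Periodic cleanup: drop ids older than the timeout; returns how many were dropped. */
    int expire(long nowMillis) {
        int before = owners.size();
        owners.values().removeIf(e -> nowMillis - e.issuedAtMillis >= timeoutMillis);
        return before - owners.size();
    }
}
```

A cleanup thread, as suggested in the proposal, would simply call expire() with the current time at a fixed interval.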

> Secure job submission 
> --
>
> Key: MAPREDUCE-181
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-181
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: hadoop-3578-branch-20-example-2.patch, 
> hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, 
> HADOOP-3578-v2.7.patch, MAPRED-181-v3.8.patch
>
>
> Currently the jobclient accesses the {{mapred.system.dir}} to add job 
> details. Hence the {{mapred.system.dir}} has the permissions of 
> {{rwx-wx-wx}}. This could be a security loophole where the job files might 
> get overwritten/tampered after the job submission. 




[jira] Updated: (MAPREDUCE-763) Capacity scheduler should clean up reservations if it runs tasks on nodes other than where it has made reservations

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-763:


Affects Version/s: 0.21.0
Fix Version/s: 0.21.0

> Capacity scheduler should clean up reservations if it runs tasks on nodes 
> other than where it has made reservations
> ---
>
> Key: MAPREDUCE-763
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-763
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/capacity-sched
>Affects Versions: 0.21.0
>Reporter: Hemanth Yamijala
>Assignee: Sreekanth Ramakrishnan
> Fix For: 0.21.0
>
>
> Currently the capacity scheduler makes a reservation on nodes for high memory 
> jobs that cannot run at the time. It could happen that in the 
> meantime other tasktrackers become free to run the tasks of this job. Ideally 
> in the next heartbeat from the reserved TTs the reservation should be 
> removed. Otherwise it could unnecessarily block capacity for a while (until 
> the TT has enough slots free to run a task of this job).




[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-04 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Attachment: mapred-157-4Sep.patch

Patch for changing History to use JSON format.  Some notes about the patch:

All history information is logged using events. 
A version event is prepended to all history files.
History Viewer and History Parser have been cleaned up, and duplication of code 
between the jsp files and HistoryViewer has been removed.
History files are named JobID_username. Filters on the UI page will now be 
based only on JobID and user name.
History Viewer now takes a history file as an argument instead of the output 
directory.
All events are made up of new API objects, including counters. As a result I 
had to open up a couple of constructors in Counters to public.

Hadoop-Vaidya has been changed to use the new History Viewer, but has not been 
tested with it.
A temporary fix has been put in for Rumen to get it to compile; it still works 
only with the old history format and not the new one.
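
To illustrate why a JSON-based event format helps line-oriented tools, here is a minimal sketch of writing one event per line with newlines escaped. The class name HistoryLine and the field names "type", "jobid" and "message" are assumptions for illustration only; the actual event schema is defined by the patch and is not reproduced here.

```java
/** Hypothetical sketch: one JSON object per history line, with newlines escaped. */
final class HistoryLine {
    /** Escapes the characters that would break a one-event-per-line format. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a single-line event; the field names are illustrative only. */
    static String event(String type, String jobId, String message) {
        return "{\"type\":\"" + escape(type) + "\",\"jobid\":\"" + escape(jobId)
            + "\",\"message\":\"" + escape(message) + "\"}";
    }
}
```

Because a raw newline can never appear inside an event, tools such as grep, sed and awk can treat each line as one complete record.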

> Job History log file format is not friendly for external tools.
> ---
>
> Key: MAPREDUCE-157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Affects Versions: 0.20.1
>Reporter: Owen O'Malley
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: mapred-157-4Sep.patch, mapred-157-prelim.patch, 
> MAPREDUCE-157-avro.patch
>
>
> Currently, parsing the job history logs with external tools is very difficult 
> because of the format. The most critical problem is that newlines aren't 
> escaped in the strings. That makes using tools like grep, sed, and awk very 
> tricky.




[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-04 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Fix Version/s: 0.21.0
Affects Version/s: 0.20.1
   Status: Patch Available  (was: Open)

> Job History log file format is not friendly for external tools.
> ---
>
> Key: MAPREDUCE-157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Affects Versions: 0.20.1
>Reporter: Owen O'Malley
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: mapred-157-4Sep.patch, mapred-157-prelim.patch, 
> MAPREDUCE-157-avro.patch
>
>
> Currently, parsing the job history logs with external tools is very difficult 
> because of the format. The most critical problem is that newlines aren't 
> escaped in the strings. That makes using tools like grep, sed, and awk very 
> tricky.




[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751425#action_12751425
 ] 

Vinod K V commented on MAPREDUCE-856:
-

In general, it seems like JobConf is one of the few deprecated classes 
that are still being used by the classes in the new 'mapreduce' package. Apparently, 
as Amareshwari says, JobContext is the object that should be used by the mapred 
framework instead of JobConf, but it cannot be done right away. At any rate, we 
should really think about how we are going to move forward regarding 
this. Will open an issue for the same.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
> MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Status: Patch Available  (was: Open)

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
> MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Attachment: MAPREDUCE-856-20090904.1.txt

Updated patch fixing the compilation. Had to modify the MapOutput class's 
constructor to use the (deprecated) JobConf instead of Configuration as the 
code needs to get the user name of the job. Checked with Jothi, who wrote this 
class, to ensure it's fine doing this.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
> MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Status: Open  (was: Patch Available)

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.txt
>
>





[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751418#action_12751418
 ] 

Hadoop QA commented on MAPREDUCE-856:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12418615/MAPREDUCE-856-20090904.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 23 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/7/testReport/
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/7/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/7/console

This message is automatically generated.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Status: Patch Available  (was: Open)

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Attachment: MAPREDUCE-856-20090904.txt

Updating the patch, synced with trunk.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.txt
>
>





[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Status: Open  (was: Patch Available)

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt
>
>





[jira] Commented: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-04 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751409#action_12751409
 ] 

Amar Kamat commented on MAPREDUCE-943:
--

@Konstantin Yes the testcase got stuck because JobClient.startTracker() failed. 

> TestNodeRefresh timesout occasionally
> -
>
> Key: MAPREDUCE-943
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amareshwari Sriramadasu
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPRED-943-v1.0.patch
>
>
> TestNodeRefresh times out occasionally.
> One of the hudson patch builds with the timeout is at 
> http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Commented: (MAPREDUCE-930) rumen should interpret job history log input paths with respect to default FS, not local FS

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751404#action_12751404
 ] 

Hadoop QA commented on MAPREDUCE-930:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418567/M930-0.patch
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/6/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/6/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/6/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/6/console

This message is automatically generated.

> rumen should interpret job history log input paths with respect to default 
> FS, not local FS
> ---
>
> Key: MAPREDUCE-930
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-930
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Dick King
>Assignee: Dick King
>Priority: Minor
> Attachments: M930-0.patch, patch-930.patch
>
>
> This allows job history log file/directory names that don't specify a file 
> system to use the configured default FS instead of the local FS [when the 
> configured default is not the local].
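
The behaviour described above can be approximated with plain java.net.URI resolution. This is a simplified sketch, not the code from the patch: the class DefaultFsPaths and the example namenode address are hypothetical, and real Hadoop path qualification also takes the working directory into account for relative paths.

```java
import java.net.URI;

/** Hypothetical sketch: qualify a path against a default file system URI. */
final class DefaultFsPaths {
    static URI qualify(String path, URI defaultFs) {
        URI u = URI.create(path);
        // A path that already names a scheme (file:, hdfs:, ...) is left alone.
        if (u.getScheme() != null) {
            return u;
        }
        // Otherwise borrow the scheme and authority of the default file system.
        return defaultFs.resolve(u);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:8020/");  // assumed address
        System.out.println(qualify("/history/logs", defaultFs));
        System.out.println(qualify("file:///tmp/logs", defaultFs));
    }
}
```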




[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751403#action_12751403
 ] 

Hadoop QA commented on MAPREDUCE-856:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12418592/MAPREDUCE-856-20090903.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 23 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/39/console

This message is automatically generated.

> Localized files from DistributedCache should have right access-control
> --
>
> Key: MAPREDUCE-856
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: tasktracker
>Reporter: Arun C Murthy
>Assignee: Vinod K V
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
> MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
> MAPREDUCE-856-20090903.txt
>
>





[jira] Updated: (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure

2009-09-04 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-901:


Attachment: MAPREDUCE-901.patch

This is almost there, sans two important pieces:

# It looks like I need to change ReduceContext to take TaskMetrics in lieu of 
Counters.Counter for its inputCounter; a change to a public api... sigh!
# As Devaraj noted, I need to fix JobInProgress to store the incoming 
TaskMetrics values in its Counters.

> Move Framework Counters into a TaskMetric structure
> ---
>
> Key: MAPREDUCE-901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Owen O'Malley
>Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: 901_1.patch, 901_1.patch, MAPREDUCE-901.patch, 
> MAPREDUCE-901.patch
>
>
> I think we should move all of the Counters that the framework updates into a 
> single class called TaskMetrics. TaskMetrics would have specific fields for 
> each of the metrics like input records, input bytes, output records, etc.
> It would both reduce the serialized size of the heartbeats (by shrinking the 
> Counters down to just the user's counters) and decrease the latency for 
> updates to the JobTracker (since Counters are sent at most 1/minute instead 
> of 1/heartbeat).
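
A rough way to see the size argument: a structure with fixed fields only has to send the values, while generic counters send a name with every value. The sketch below is an assumption-laden illustration; TaskMetricsSketch, its three fields and both wire layouts are invented for this example and do not reflect the real Counters or TaskMetrics serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch comparing fixed-field metrics with named counters. */
final class TaskMetricsSketch {
    long inputRecords;
    long inputBytes;
    long outputRecords;

    /** Fixed fields: three longs on the wire, no names. */
    byte[] serialize() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeLong(inputRecords);
            out.writeLong(inputBytes);
            out.writeLong(outputRecords);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Generic counters: a count, then each value travels with its name. */
    static byte[] serializeNamed(String[] names, long[] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(names.length);
            for (int i = 0; i < names.length; i++) {
                out.writeUTF(names[i]);
                out.writeLong(values[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

In this toy layout three fixed fields cost 24 bytes per heartbeat, whereas a single named counter already costs more than that once its name is included.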




[jira] Updated: (MAPREDUCE-953) Generate configuration dump for hierarchial queue configuration

2009-09-04 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-953:


Priority: Blocker  (was: Major)

> Generate configuration dump for hierarchial queue configuration
> ---
>
> Key: MAPREDUCE-953
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-953
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: rahul k singh
>Priority: Blocker
>
> Generate configuration dump for hierarchial queue configuration




[jira] Created: (MAPREDUCE-953) Generate configuration dump for hierarchial queue configuration

2009-09-04 Thread rahul k singh (JIRA)
Generate configuration dump for hierarchial queue configuration
---

 Key: MAPREDUCE-953
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-953
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: rahul k singh


Generate configuration dump for hierarchial queue configuration




[jira] Commented: (MAPREDUCE-898) Change DistributedCache to use new api.

2009-09-04 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751399#action_12751399
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-898:
---

The -1 on core tests is due to MAPREDUCE-943.
The -1 on javac is because of deprecation warnings.

> Change DistributedCache to use new api.
> ---
>
> Key: MAPREDUCE-898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-898
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-898-1.txt, patch-898-2.txt, patch-898-3.txt, 
> patch-898-4.txt, patch-898.txt
>
>





[jira] Updated: (MAPREDUCE-890) After HADOOP-4491, the user who started mapred system is not able to run job.

2009-09-04 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-890:


Attachment: MAPREDUCE-890-20090904.txt

Attaching a patch which should be applied over the latest patch at MAPREDUCE-856. 
It follows a slight modification of the second proposal above, which is
bq. Set drwxrws--- on $job_cache and $dist_cache in all cases. This means 
user's tasks CAN potentially create unwarranted files/dirs in the $job_cache or 
$dist_cache or delete their own files themselves.
So now, the user-directory taskTracker/$user will have "2570 user-owner 
task-tracker-group" and everything under it will have "2570/0770 user-owner 
task-tracker-group".

> After HADOOP-4491, the user who started mapred system is not able to run job.
> -
>
> Key: MAPREDUCE-890
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-890
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Reporter: Karam Singh
>Assignee: Vinod K V
> Attachments: MAPREDUCE-890-20090904.txt
>
>
> Even the setup and cleanup tasks of the job fail due to an exception: it fails 
> to create the job and related directories under 
> mapred.local.dir/taskTracker/jobcache.
> Directories are created as:
> [dr-xrws--- mapred   hadoop  ]  job_200908190916_0002
> mapred cannot write under this. Even manually, I failed to touch a file.
> mapred is the user who started the mr cluster.




[jira] Commented: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751394#action_12751394
 ] 

Hadoop QA commented on MAPREDUCE-936:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418583/MAPREDUCE-936.2.patch
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/38/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/38/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/38/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/38/console

This message is automatically generated.

> Allow a load difference in fairshare scheduler
> --
>
> Key: MAPREDUCE-936
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Zheng Shao
>Assignee: Zheng Shao
> Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch
>
>
> The problem we are facing: It takes a long time for all tasks of a job to get 
> scheduled on the cluster, even if the cluster is almost empty.
> There are two reasons that together lead to this situation:
> 1. The load factor makes sure each TT runs the same number of tasks. (This is 
> the part that this patch tries to change).
> 2. The scheduler tries to schedule map tasks locally (first node-local, then 
> rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
> mapred.fairscheduler.localitywait.rack, both are around 10 sec in our conf), 
> and accumulated wait time (JobInfo.localityWait). The accumulated wait time 
> is reset to 0 whenever a non-local map task is scheduled. That means it takes 
> N * wait_time to schedule N non-local map tasks.
> Because of 1, a lot of TTs will not be able to take more tasks, even if they 
> have free slots. As a result, a lot of the map tasks cannot be scheduled 
> locally.
> Because of 2, it's really hard to schedule a non-local task.
> As a result, sometimes we are seeing that it takes more than 2 minutes to 
> schedule all the mappers of a job.
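
The "N * wait_time" estimate in point 2 can be checked with a toy model. This is a deliberately simplified sketch: the class LocalityWaitModel is hypothetical, it assumes a single job with no local slots available, a single wait threshold rather than separate node and rack waits, and an assumed heartbeat interval.

```java
/** Hypothetical toy model of the accumulated locality wait being reset. */
final class LocalityWaitModel {
    /** Time until the given number of non-local maps have all been scheduled. */
    static long millisToSchedule(int nonLocalTasks, long waitMillis, long heartbeatMillis) {
        if (heartbeatMillis <= 0) {
            throw new IllegalArgumentException("heartbeatMillis must be positive");
        }
        long clock = 0;
        long accumulated = 0;
        int scheduled = 0;
        while (scheduled < nonLocalTasks) {
            clock += heartbeatMillis;
            accumulated += heartbeatMillis;
            if (accumulated >= waitMillis) {
                scheduled++;
                accumulated = 0;  // the reset described in point 2
            }
        }
        return clock;
    }

    public static void main(String[] args) {
        // 12 non-local maps, 10 sec wait, 1 sec heartbeat (all assumed values).
        System.out.println(millisToSchedule(12, 10_000L, 1_000L) + " ms");
    }
}
```

With a 10 second wait, only 12 non-local map tasks are enough to reach the two minutes mentioned in the description.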




[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-370:
--

Status: Open  (was: Patch Available)

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Updated: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-370:
--

Status: Patch Available  (was: Open)

re-submitting for hudson

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Resolved: (MAPREDUCE-951) MAP_INPUT_BYTES counter is missing

2009-09-04 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu resolved MAPREDUCE-951.
---

Resolution: Duplicate

Fixed by HADOOP-5710. The counter is now FileInputFormat.BYTES_READ.

> MAP_INPUT_BYTES counter is missing
> --
>
> Key: MAPREDUCE-951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-951
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 0.21.0
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Blocker
> Fix For: 0.21.0
>
>
> Looks like we lost it during one of the merges during the project split: 
> http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/TaskCounter.java?r1=776174&r2=785392&diff_format=h




[jira] Created: (MAPREDUCE-952) Previously removed Task.Counter reintroduced by MAPREDUCE-318

2009-09-04 Thread Arun C Murthy (JIRA)
Previously removed Task.Counter reintroduced by MAPREDUCE-318
-

 Key: MAPREDUCE-952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.21.0
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Priority: Blocker
 Fix For: 0.21.0


HADOOP-5717 introduced org.apache.hadoop.mapreduce.TaskCounters in lieu of the 
older org.apache.hadoop.mapred.Task.Counter (see http://tinyurl.com/m4uwgj for 
the patch). However, MAPREDUCE-318 seems to have accidentally re-introduced it.




[jira] Created: (MAPREDUCE-951) MAP_INPUT_BYTES counter is missing

2009-09-04 Thread Arun C Murthy (JIRA)
MAP_INPUT_BYTES counter is missing
--

 Key: MAPREDUCE-951
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-951
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.21.0
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Priority: Blocker
 Fix For: 0.21.0


Looks like we lost it during one of the merges during the project split: 
http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/java/org/apache/hadoop/mapreduce/TaskCounter.java?r1=776174&r2=785392&diff_format=h




[jira] Commented: (MAPREDUCE-777) A method for finding and tracking jobs from the new API

2009-09-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751350#action_12751350
 ] 

Arun C Murthy commented on MAPREDUCE-777:
-

In general, we should treat this as an opportunity to clean up the 
job-submission interface (currently JobClient); the goal is not 
feature-by-feature compatibility. I'll try to take a closer look at the 
interfaces added to org.apache.hadoop.mapreduce.Job soon, but I thought I 
should spell out the underlying vision.

> A method for finding and tracking jobs from the new API
> ---
>
> Key: MAPREDUCE-777
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-777
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: m-777.patch, patch-777-1.txt, patch-777-2.txt, 
> patch-777-3.txt, patch-777.txt
>
>
> We need to create a replacement for the JobClient API in the new API. In 
> particular, the user needs to be able to query and track jobs 
> that were launched by other processes.




[jira] Commented: (MAPREDUCE-777) A method for finding and tracking jobs from the new API

2009-09-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751345#action_12751345
 ] 

Arun C Murthy commented on MAPREDUCE-777:
-

I think this is on the right track... I'm happy to see this shaping up well!

I'm a little unsure about JobSubmitter being a separate class. It seems to me 
that since a Job can 'submit' itself (o.a.h.mapreduce.Job.submit), it shouldn't 
need another class (JobSubmitter) for that functionality. Maybe JobSubmitter 
(or JobSubmissionHelper) should have only static methods? Anyway, it's a minor 
issue. Thoughts?
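
To make the suggestion concrete, here is a rough sketch of a job that submits itself and delegates to a helper with only static methods. It is written in Python for brevity and is not the Hadoop API or the contents of the patch: the method names check_name and build_request, and the dictionary used as a submission request, are hypothetical stand-ins.

```python
class JobSubmissionHelper:
    """Hypothetical helper with only static methods; never instantiated."""

    @staticmethod
    def check_name(name):
        # Reject obviously invalid jobs before anything is sent anywhere.
        if not name:
            raise ValueError("job name must not be empty")

    @staticmethod
    def build_request(name, conf):
        # Copy the configuration so later changes to the job do not leak in.
        return {"name": name, "conf": dict(conf)}


class Job:
    """Hypothetical job object whose submit() is the only public entry point."""

    def __init__(self, name, conf=None):
        self.name = name
        self.conf = conf or {}
        self.submitted = False

    def submit(self):
        if self.submitted:
            raise RuntimeError("job already submitted")
        JobSubmissionHelper.check_name(self.name)
        request = JobSubmissionHelper.build_request(self.name, self.conf)
        self.submitted = True
        return request


if __name__ == "__main__":
    job = Job("wordcount", {"reduces": 2})
    print(job.submit())
```

With this shape, callers only ever deal with Job, and the helper carries no state of its own, so there is nothing to construct or pass around.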

> A method for finding and tracking jobs from the new API
> ---
>
> Key: MAPREDUCE-777
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-777
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: client
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: m-777.patch, patch-777-1.txt, patch-777-2.txt, 
> patch-777-3.txt, patch-777.txt
>
>
> We need to create a replacement for the JobClient API in the new API. In 
> particular, the user needs to be able to query and track jobs 
> that were launched by other processes.




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751344#action_12751344
 ] 

Hadoop QA commented on MAPREDUCE-370:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418357/patch-370-4.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/5/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/5/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/5/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/5/console

This message is automatically generated.

> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> ---
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
> patch-370-4.txt, patch-370-5.txt, patch-370.txt
>
>





[jira] Commented: (MAPREDUCE-898) Change DistributedCache to use new api.

2009-09-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751336#action_12751336
 ] 

Hadoop QA commented on MAPREDUCE-898:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418580/patch-898-4.txt
  against trunk revision 811134.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 2236 javac compiler warnings (more 
than the trunk's current 2226 warnings).

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/37/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/37/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/37/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/37/console

This message is automatically generated.

> Change DistributedCache to use new api.
> ---
>
> Key: MAPREDUCE-898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-898
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-898-1.txt, patch-898-2.txt, patch-898-3.txt, 
> patch-898-4.txt, patch-898.txt
>
>

