[jira] Updated: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-07 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-943:
--

Issue Type: Sub-task  (was: Bug)
Parent: MAPREDUCE-873

 TestNodeRefresh timesout occasionally
 -

 Key: MAPREDUCE-943
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPRED-943-v1.0.patch


 TestNodeRefresh times out occasionally.
 One of the Hudson patch builds that timed out: 
 http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-07 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das resolved MAPREDUCE-943.
---

Resolution: Fixed

I just committed this. Thanks, Amar!

 TestNodeRefresh timesout occasionally
 -

 Key: MAPREDUCE-943
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPRED-943-v1.0.patch


 TestNodeRefresh times out occasionally.
 One of the Hudson patch builds that timed out: 
 http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Created: (MAPREDUCE-957) Set mapred.job.name for a pipes job

2009-09-07 Thread Ramya R (JIRA)
Set mapred.job.name for a pipes job
---

 Key: MAPREDUCE-957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-957
 Project: Hadoop Map/Reduce
  Issue Type: Wish
  Components: pipes
Affects Versions: 0.20.1
Reporter: Ramya R
Priority: Minor


Currently mapred.job.name is not set for a pipes job. It will be useful if this 
value is set when a pipes job is submitted.




[jira] Commented: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-07 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752019#action_12752019
 ] 

Devaraj Das commented on MAPREDUCE-943:
---

Should have added that I also agree that the testcase which times out is no 
longer needed.

 TestNodeRefresh timesout occasionally
 -

 Key: MAPREDUCE-943
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPRED-943-v1.0.patch


 TestNodeRefresh times out occasionally.
 One of the Hudson patch builds that timed out: 
 http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Updated: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-861:


Attachment: MAPREDUCE-861-4.patch

Incorporated all the comments except the following:

1. In DeprecatedHierarchyBuilder we are still not checking whether ACLs are 
disabled before parsing them. Note, though, that this is being done for the 
QueueHierarchyBuilder.
Many testcases, especially in TestQueueManager, are written with the assumption 
that a Map<String, AccessControlList> is always created for the Queue object, 
especially when mapred.acls.enabled is set to true using conf.set. There are 
lots of NullPointerExceptions if we don't generate this empty object. Hence I am 
not accommodating this comment: it would be a significant change to the 
testcases, it concerns deprecated code, and keeping the empty 
Map<String, AccessControlList> doesn't affect the overall behaviour at all.

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.<queue-name>.<property-name>. For example, 
 mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.




[jira] Commented: (MAPREDUCE-943) TestNodeRefresh timesout occasionally

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752020#action_12752020
 ] 

Hudson commented on MAPREDUCE-943:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #18 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/18/])
. Removes a testcase in TestNodeRefresh that doesn't make sense in the new 
Job recovery model. Contributed by Amar Kamat.


 TestNodeRefresh timesout occasionally
 -

 Key: MAPREDUCE-943
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-943
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Amareshwari Sriramadasu
Assignee: Amar Kamat
 Fix For: 0.21.0

 Attachments: MAPRED-943-v1.0.patch


 TestNodeRefresh times out occasionally.
 One of the Hudson patch builds that timed out: 
 http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/26/testReport/org.apache.hadoop.mapred/TestNodeRefresh/testMRExcludeHostsAcrossRestarts/




[jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752021#action_12752021
 ] 

rahul k singh commented on MAPREDUCE-861:
-

There was an error in the above patch; attaching a new one.

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.<queue-name>.<property-name>. For example, 
 mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.




[jira] Updated: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-861:


Attachment: MAPREDUCE-861-5.patch

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch, MAPREDUCE-861-5.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.<queue-name>.<property-name>. For example, 
 mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.




[jira] Updated: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-861:


Status: Patch Available  (was: Open)

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch, MAPREDUCE-861-5.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.<queue-name>.<property-name>. For example, 
 mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.




[jira] Commented: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-07 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752023#action_12752023
 ] 

Hemanth Yamijala commented on MAPREDUCE-856:


I verified the changes. My only comment is that in the changes related to 
synchronization of user localization, we are repeating work related to a user 
every time job localization happens. A suggestion is to keep the synchronization 
on user name, but make the value a state variable that indicates the 
status of localization, and check it before beginning to localize.
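
The suggestion above could look roughly like the following. This is only a 
sketch under stated assumptions: UserLocalizer, UserState and 
setUpUserDirectories are hypothetical names for illustration, not the actual 
TaskTracker code or the code in the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; class and method names are illustrative only.
class UserLocalizer {
    // Per-user state object: serves both as the lock and as the "already done" flag.
    private static final class UserState {
        boolean localized = false;
    }

    private final ConcurrentMap<String, UserState> states = new ConcurrentHashMap<>();
    // Counts how often the expensive per-user work actually ran.
    private final AtomicInteger userSetupCount = new AtomicInteger();

    // Called on every job localization; per-user work runs at most once per user.
    void localizeJob(String user) {
        UserState state = states.computeIfAbsent(user, u -> new UserState());
        synchronized (state) {
            if (!state.localized) {
                setUpUserDirectories(user);
                state.localized = true;
            }
        }
        // Job-specific localization would follow here.
    }

    // Placeholder for the expensive per-user work.
    private void setUpUserDirectories(String user) {
        userSetupCount.incrementAndGet();
    }

    int getUserSetupCount() {
        return userSetupCount.get();
    }
}
```

Synchronizing on the per-user state object keeps localization for different 
users independent, while the flag avoids repeating the per-user work for each 
job.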

 Localized files from DistributedCache should have right access-control
 --

 Key: MAPREDUCE-856
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Fix For: 0.21.0

 Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
 MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
 MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
 MAPREDUCE-856-20090904.txt







[jira] Resolved: (MAPREDUCE-860) Modify Queue APIs to support a hierarchy of queues

2009-09-07 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh resolved MAPREDUCE-860.
-

Resolution: Duplicate

 Modify Queue APIs to support a hierarchy of queues
 --

 Key: MAPREDUCE-860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-860
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Hemanth Yamijala
Assignee: rahul k singh

 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the APIs related to queues.




[jira] Commented: (MAPREDUCE-860) Modify Queue APIs to support a hierarchy of queues

2009-09-07 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752025#action_12752025
 ] 

rahul k singh commented on MAPREDUCE-860:
-

This issue is being resolved as part of MAPREDUCE-861. Hence closing this as 
duplicate.

 Modify Queue APIs to support a hierarchy of queues
 --

 Key: MAPREDUCE-860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-860
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: jobtracker
Reporter: Hemanth Yamijala
Assignee: rahul k singh

 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the APIs related to queues.




[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-07 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Status: Patch Available  (was: Open)

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 0.20.1
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan
 Fix For: 0.21.0

 Attachments: mapred-157-4Sep.patch, mapred-157-7Sep.patch, 
 mapred-157-prelim.patch, MAPREDUCE-157-avro.patch


 Currently, parsing the job history logs with external tools is very difficult 
 because of the format. The most critical problem is that newlines aren't 
 escaped in the strings. That makes using tools like grep, sed, and awk very 
 tricky.




[jira] Updated: (MAPREDUCE-924) TestPipes must not directly invoke 'main' of pipes as an exit from main could cause the testcase to crash.

2009-09-07 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-924:
---

Description: 
TestPipes invokes the main method of the program running pipes. In 
MAPREDUCE-421, a change was made to the Pipes command runner to invoke 
System.exit after completion. This itself is a valid change because the pipes 
command runner is in itself a user facing program. But when combined with a 
testcase, it causes the testcase to crash rather than providing feedback on 
whether the test ran correctly or not.

The testcase should be modified to use Tool instead of running main directly.

  was:
TestPipes crashes on trunk due to MAPREDUCE-421.
Testcase should be modified to use Tool insteadof running main directly.

Summary: TestPipes must not directly invoke 'main' of pipes as an exit 
from main could cause the testcase to crash.  (was: TestPipes crashes on trunk)

 TestPipes must not directly invoke 'main' of pipes as an exit from main could 
 cause the testcase to crash.
 --

 Key: MAPREDUCE-924
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-924
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: pipes
Affects Versions: 0.20.1
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.20.1

 Attachments: patch-924-0.20.txt, patch-924.txt


 TestPipes invokes the main method of the program running pipes. In 
 MAPREDUCE-421, a change was made to the Pipes command runner to invoke 
 System.exit after completion. This itself is a valid change because the pipes 
 command runner is in itself a user facing program. But when combined with a 
 testcase, it causes the testcase to crash rather than providing feedback on 
 whether the test ran correctly or not.
 The testcase should be modified to use Tool instead of running main directly.
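
 A minimal sketch of the separation described above, assuming a simplified 
 stand-in for the command runner (CommandRunnerSketch and its --fail flag are 
 hypothetical, not the actual pipes code). In Hadoop this role is played by the 
 Tool interface together with ToolRunner; the sketch avoids that dependency so 
 it stays self-contained.

```java
// Hypothetical stand-in for a command runner; not the actual pipes Submitter.
class CommandRunnerSketch {
    // All work happens here; the result is returned instead of exiting the JVM,
    // so a testcase can call run() and assert on the exit code.
    int run(String[] args) {
        for (String arg : args) {
            if (arg.equals("--fail")) { // hypothetical flag that simulates a failed job
                return 1;
            }
        }
        return 0;
    }

    // Only the user-facing entry point calls System.exit.
    public static void main(String[] args) {
        System.exit(new CommandRunnerSketch().run(args));
    }
}
```

 A testcase that calls run() gets a return value it can check, whereas a 
 testcase that calls main() has its JVM terminated by System.exit.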




[jira] Commented: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)

2009-09-07 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752048#action_12752048
 ] 

Ravi Gummadi commented on MAPREDUCE-956:


We could call the phases the Shuffle phase and the Reduce phase. But we need to 
investigate how we want to update progress in the shuffle phase, because 
updating it based only on 'copy of map outputs' would not be correct: there 
could be some merges that take time after all map outputs are copied to this 
reduce node (even though some merges happen while map outputs are still being 
copied).

 Shuffle should be broken down to only two phases (copy/reduce) instead of 
 three (copy/sort/reduce)
 --

 Key: MAPREDUCE-956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: task
Affects Versions: 0.21.0
Reporter: Jothi Padmanabhan

 For the progress calculations and displaying on the UI, shuffle, in its 
 current form,  is decomposed into three phases (copy/sort/reduce). Actually, 
 the sort phase is no longer applicable. I think we should just reduce the 
 number of phases to two and assign 50% weight to each of the copy and reduce 
 phases. Thoughts?
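
 The proposed weighting can be sketched as below. ReduceProgressSketch and its 
 method are hypothetical names for illustration, not Hadoop's progress classes; 
 each argument is assumed to be the completed fraction of that phase.

```java
// Hypothetical helper; names are illustrative, not Hadoop's progress classes.
class ReduceProgressSketch {
    static final float COPY_WEIGHT = 0.5f;
    static final float REDUCE_WEIGHT = 0.5f;

    // Each argument is the fraction of that phase completed, clamped to [0, 1].
    static float overall(float copyFraction, float reduceFraction) {
        return COPY_WEIGHT * clamp(copyFraction) + REDUCE_WEIGHT * clamp(reduceFraction);
    }

    private static float clamp(float f) {
        return Math.max(0f, Math.min(1f, f));
    }
}
```

 With these weights, a task that has finished copying but has not started 
 reducing reports 50% overall. How the copy fraction itself should account for 
 merges is left open here.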




[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-07 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Status: Patch Available  (was: Open)

 Localized files from DistributedCache should have right access-control
 --

 Key: MAPREDUCE-856
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Fix For: 0.21.0

 Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
 MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
 MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
 MAPREDUCE-856-20090904.txt, MAPREDUCE-856-20090907.1.txt, 
 MAPREDUCE-856-20090907.txt







[jira] Updated: (MAPREDUCE-856) Localized files from DistributedCache should have right access-control

2009-09-07 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-856:


Attachment: MAPREDUCE-856-20090907.1.txt

Updated patch fixing the test failures reported by Hudson.

 Localized files from DistributedCache should have right access-control
 --

 Key: MAPREDUCE-856
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-856
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: tasktracker
Reporter: Arun C Murthy
Assignee: Vinod K V
 Fix For: 0.21.0

 Attachments: MAPREDUCE-856-20090820.txt, MAPREDUCE-856-20090821.txt, 
 MAPREDUCE-856-20090825.3.txt, MAPREDUCE-856-20090827.txt, 
 MAPREDUCE-856-20090903.txt, MAPREDUCE-856-20090904.1.txt, 
 MAPREDUCE-856-20090904.txt, MAPREDUCE-856-20090907.1.txt, 
 MAPREDUCE-856-20090907.txt







[jira] Commented: (MAPREDUCE-841) Protect Job Tracker against memory exhaustion due to very large InputSplit or JobConf objects

2009-09-07 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752074#action_12752074
 ] 

Devaraj Das commented on MAPREDUCE-841:
---

BTW for the splits part, MAPREDUCE-181 (http://tinyurl.com/legzp9) is 
introducing some changes.

 Protect Job Tracker against memory exhaustion due to very large InputSplit or 
 JobConf objects
 -

 Key: MAPREDUCE-841
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-841
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.20.1
Reporter: Hong Tang
 Fix For: 0.21.0


 JobTracker only needs to examine a subset of information contained by 
 InputSplit or JobConf objects. But currently JobTracker loads the complete 
 user-defined InputSplit and JobConf objects in memory. This design leaves 
 JobTracker susceptible to memory exhaustion, particularly when bugs in user 
 code result in very large input splits or job conf objects 
 (e.g. PIG-901).




[jira] Updated: (MAPREDUCE-876) Sqoop import of large tables can time out

2009-09-07 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-876:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

+1

I've just committed this. Thanks Aaron!

 Sqoop import of large tables can time out
 -

 Key: MAPREDUCE-876
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-876
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-876.2.patch, MAPREDUCE-876.patch


 Related to MAPREDUCE-875, Sqoop should use a background thread to ensure that 
 progress is being reported while a database does external work for the 
 MapReduce task.
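
 One way such a background thread could be structured is sketched below. This 
 is an illustration only: ProgressKeepAlive and ProgressSink are hypothetical 
 names, and ProgressSink stands in for whatever progress callback the task 
 exposes; it is not the code in the attached patches.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; ProgressSink stands in for the task's progress callback.
class ProgressKeepAlive implements AutoCloseable {
    interface ProgressSink {
        void progress();
    }

    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "progress-keepalive");
            t.setDaemon(true); // do not keep the JVM alive because of this thread
            return t;
        });

    // Starts reporting progress immediately and then every periodMillis.
    ProgressKeepAlive(ProgressSink sink, long periodMillis) {
        scheduler.scheduleAtFixedRate(sink::progress, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Stops the background reporting once the external work has finished.
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

 Wrapping the long-running database call in a try-with-resources block around 
 a ProgressKeepAlive instance keeps progress flowing for exactly as long as 
 the external work runs.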




[jira] Updated: (MAPREDUCE-918) Test hsqldb server should be memory-only.

2009-09-07 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-918:


   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

+1

I've just committed this. Thanks Aaron!

 Test hsqldb server should be memory-only.
 -

 Key: MAPREDUCE-918
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-918
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-918.patch


 Sqoop launches a standalone hsqldb server for unit tests, but it currently 
 writes its database to disk and uses a connect string of {{//localhost}}. If 
 multiple test instances are running concurrently, one test server may serve 
 the other instance of the unit tests, causing race conditions.




[jira] Commented: (MAPREDUCE-944) Extend FairShare scheduler to fair-share memory usage in the cluster

2009-09-07 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752105#action_12752105
 ] 

Vinod K V commented on MAPREDUCE-944:
-

I see in the patch attached that only one concrete implementation 
CapBasedLoadManager is done for the LoadManager which in turn doesn't take into 
account any resource usage. I guess you are planning a proper implementation 
for this feature regarding fair-share of memory usage in another JIRA.

Some points are still not dealt with in this JIRA. I bring them up to find out 
whether you are planning, or have already thought about, any of this.
 - Job configuration about how users specify the resource usage. Some memory 
related configuration properties were added to the framework while working on 
memory monitoring on TTs as well as memory usage based scheduling in 
CapacityTaskScheduler. You may want to reuse some/all of it.
 - Capturing the scheduling decisions involved when we are not able to find a 
task from a Schedulable because of lack of resources on a given TaskTracker.

Regarding the latter, the current patch just returns null, which is similar to 
the decision CapacityTaskScheduler used to take in previous versions, i.e. 
block the TT till it can be given a task from the job at the head of the 
queue/pool. Some time back, we investigated how this approach works with 
FairScheduler and realized some important implications. For example, because the 
order of jobs might change significantly in consecutive iterations of 
FairScheduler, just returning null may not work at all. Eventually we may end 
up waiting for a long time if a significant number of jobs ask for a large 
amount of resources.

Thoughts?

 Extend FairShare scheduler to fair-share memory usage in the cluster
 

 Key: MAPREDUCE-944
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-944
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Reporter: dhruba borthakur
 Attachments: LoadManager.txt


 The FairShare Scheduler has an extensible LoadManager API to regulate 
 allocating new tasks on a particular TaskTracker. Along similar lines, it would 
 be nice if the FairShare Scheduler could have a pluggable policy to regulate 
 new tasks from a particular job. This would allow one to skip scheduling tasks 
 of a job that is eating a large percentage of memory in the cluster, i.e. 
 fair-share of memory resources among jobs. 




[jira] Commented: (MAPREDUCE-181) Secure job submission

2009-09-07 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752117#action_12752117
 ] 

Devaraj Das commented on MAPREDUCE-181:
---

For now, let's keep it simple - don't implement the points to do with 
maintaining/cleaning-up jobID-userName mappings. This should be looked at, in 
a bigger picture, once we have the authentication implemented. Also, rather 
than time-based expiry I think it would be better to have limits on number of 
queued jobs per user and the max queued jobs overall.
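
A rough sketch of the kind of limits meant here, assuming a simple admission 
check at submission time. QueuedJobLimiter and its methods are hypothetical 
names for illustration; nothing like this exists in the current patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical admission check; limits and names are illustrative only.
class QueuedJobLimiter {
    private final int maxPerUser;
    private final int maxOverall;
    private final Map<String, Integer> queuedPerUser = new HashMap<>();
    private int queuedOverall = 0;

    QueuedJobLimiter(int maxPerUser, int maxOverall) {
        this.maxPerUser = maxPerUser;
        this.maxOverall = maxOverall;
    }

    // Records the job and returns true if both limits allow it; otherwise rejects it.
    synchronized boolean tryEnqueue(String user) {
        int current = queuedPerUser.getOrDefault(user, 0);
        if (current >= maxPerUser || queuedOverall >= maxOverall) {
            return false;
        }
        queuedPerUser.put(user, current + 1);
        queuedOverall++;
        return true;
    }

    // Called when a queued job starts running or is removed from the queue.
    synchronized void dequeue(String user) {
        Integer current = queuedPerUser.get(user);
        if (current == null) {
            return; // nothing queued for this user
        }
        if (current == 1) {
            queuedPerUser.remove(user);
        } else {
            queuedPerUser.put(user, current - 1);
        }
        queuedOverall--;
    }
}
```

Unlike time-based expiry, this bounds the bookkeeping by count, so no 
background cleanup is needed.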

 Secure job submission 
 --

 Key: MAPREDUCE-181
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-181
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amar Kamat
Assignee: Amar Kamat
 Attachments: hadoop-3578-branch-20-example-2.patch, 
 hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, 
 HADOOP-3578-v2.7.patch, MAPRED-181-v3.8.patch


 Currently the jobclient accesses the {{mapred.system.dir}} to add job 
 details. Hence the {{mapred.system.dir}} has the permissions of 
 {{rwx-wx-wx}}. This could be a security loophole where the job files might 
 get overwritten/tampered after the job submission. 




[jira] Commented: (MAPREDUCE-898) Change DistributedCache to use new api.

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752122#action_12752122
 ] 

Hudson commented on MAPREDUCE-898:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #19 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/19/])
. Changes DistributedCache to use the new API. Contributed by Amareshwari 
Sriramadasu.


 Change DistributedCache to use new api.
 ---

 Key: MAPREDUCE-898
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-898
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-898-1.txt, patch-898-2.txt, patch-898-3.txt, 
 patch-898-4.txt, patch-898.txt







[jira] Commented: (MAPREDUCE-918) Test hsqldb server should be memory-only.

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752123#action_12752123
 ] 

Hudson commented on MAPREDUCE-918:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #19 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/19/])
. Test hsqldb server should be memory-only. Contributed by Aaron Kimball.


 Test hsqldb server should be memory-only.
 -

 Key: MAPREDUCE-918
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-918
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Fix For: 0.21.0

 Attachments: MAPREDUCE-918.patch


 Sqoop launches a standalone hsqldb server for unit tests, but it currently 
 writes its database to disk and uses a connect string of {{//localhost}}. If 
 multiple test instances are running concurrently, the server started by one 
 instance may end up serving another instance's unit tests, causing race conditions.




[jira] Commented: (MAPREDUCE-764) TypedBytesInput's readRaw() does not preserve custom type codes

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752182#action_12752182
 ] 

Hudson commented on MAPREDUCE-764:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #20 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/20/])
. TypedBytesInput's readRaw() does not preserve custom type codes. 
Contributed by Klaas Bosteels.


 TypedBytesInput's readRaw() does not preserve custom type codes
 ---

 Key: MAPREDUCE-764
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-764
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Affects Versions: 0.21.0
Reporter: Klaas Bosteels
Assignee: Klaas Bosteels
Priority: Blocker
 Fix For: 0.21.0

 Attachments: MAPREDUCE-764.patch, MAPREDUCE-764.patch


 The typed bytes format supports byte sequences of the form {{custom type 
 code length bytes}}. When reading such a sequence via 
 {{TypedBytesInput}}'s {{readRaw()}} method, however, the returned sequence 
 currently is {{0 length bytes}} (0 is the type code for a bytes array), 
 which leads to bugs such as the one described 
 [here|http://dumbo.assembla.com/spaces/dumbo/tickets/54].
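
A minimal sketch of the intended behaviour, assuming a record is laid out as a one-byte type code, a 4-byte big-endian length, and then the payload. The class and method names are hypothetical and this is not the code from the attached patch; it only illustrates returning the raw record with its original type code rather than 0.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class RawTypedBytesSketch {
    // Reads one "<code> <length> <bytes>" record and returns it verbatim.
    // Layout (1-byte code, 4-byte big-endian length) is an assumption of this sketch.
    public static byte[] readRawPreservingCode(ByteBuffer in) {
        byte code = in.get();               // custom type code
        int length = in.getInt();           // ByteBuffer is big-endian by default
        byte[] payload = new byte[length];
        in.get(payload);

        ByteBuffer out = ByteBuffer.allocate(1 + 4 + length);
        out.put(code);                      // the reported bug corresponds to writing 0 here
        out.putInt(length);
        out.put(payload);
        return out.array();
    }

    public static void main(String[] args) {
        byte[] record = {(byte) 100, 0, 0, 0, 2, 7, 9};
        byte[] raw = readRawPreservingCode(ByteBuffer.wrap(record));
        System.out.println(Arrays.equals(record, raw)); // prints "true"
    }
}
```

With this behaviour, a consumer that dispatches on the first byte still sees the custom code after a raw read.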




[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-07 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Status: Open  (was: Patch Available)

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 0.20.1
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan
 Fix For: 0.21.0

 Attachments: mapred-157-4Sep.patch, mapred-157-7Sep.patch, 
 mapred-157-prelim.patch, MAPREDUCE-157-avro.patch


 Currently, parsing the job history logs with external tools is very difficult 
 because of the format. The most critical problem is that newlines aren't 
 escaped in the strings. That makes using tools like grep, sed, and awk very 
 tricky.
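
To illustrate the newline problem, here is a small sketch of escaping a value so that each record stays on one line. It is an assumption-laden illustration, not the approach taken by the attached patches; the {{JOBNAME}} key and the class name are hypothetical.

```java
public class HistoryEscapeSketch {
    // Escape backslashes first, then line breaks, so the mapping stays reversible.
    public static String escape(String value) {
        return value.replace("\\", "\\\\")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    public static void main(String[] args) {
        String jobName = "nightly\nreport";   // a value containing a raw newline
        // Emits the whole record on a single line, with the newline shown as backslash-n.
        System.out.println("JOBNAME=\"" + escape(jobName) + "\"");
    }
}
```

Once every record occupies exactly one line, line-oriented tools such as grep, sed, and awk can process the log without special handling.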




[jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-07 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-157:


Attachment: mapred-157-7Sep-v1.patch

Now, sqoop's ivy.xml needs to be updated too!

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 0.20.1
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan
 Fix For: 0.21.0

 Attachments: mapred-157-4Sep.patch, mapred-157-7Sep-v1.patch, 
 mapred-157-7Sep.patch, mapred-157-prelim.patch, MAPREDUCE-157-avro.patch






[jira] Commented: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752230#action_12752230
 ] 

Hudson commented on MAPREDUCE-936:
--

Integrated in Hadoop-Mapreduce-trunk #75 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/75/])


 Allow a load difference in fairshare scheduler
 --

 Key: MAPREDUCE-936
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Affects Versions: 0.20.1, 0.21.0, 0.22.0
Reporter: Zheng Shao
Assignee: Zheng Shao
 Fix For: 0.21.0

 Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch


 The problem we are facing: It takes a long time for all tasks of a job to get 
 scheduled on the cluster, even if the cluster is almost empty.
 There are two reasons that together lead to this situation:
 1. The load factor makes sure each TT runs the same number of tasks. (This is 
 the part that this patch tries to change).
 2. The scheduler tries to schedule map tasks locally (first node-local, then 
 rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
 mapred.fairscheduler.localitywait.rack, both are around 10 sec in our conf), 
 and accumulated wait time (JobInfo.localityWait). The accumulated wait time 
 is reset to 0 whenever a non-local map task is scheduled. That means it takes 
 N * wait_time to schedule N non-local map tasks.
 Because of 1, a lot of TT will not be able to take more tasks, even if they 
 have free slots. As a result, a lot of the map tasks cannot be scheduled 
 locally.
 Because of 2, it's really hard to schedule a non-local task.
 As a result, sometimes we are seeing that it takes more than 2 minutes to 
 schedule all the mappers of a job.
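
The N * wait_time effect from point 2 can be worked through with a short sketch. The task count below is an assumed figure chosen for illustration, and the class and method names are hypothetical; only the 10-second wait comes from the report.

```java
public class LocalityWaitSketch {
    // Lower bound on the time needed to place `nonLocalTasks` non-local tasks when
    // the accumulated wait is reset to 0 after every non-local assignment.
    public static long minSchedulingDelayMs(int nonLocalTasks, long waitMs) {
        return (long) nonLocalTasks * waitMs;
    }

    public static void main(String[] args) {
        // 12 non-local tasks (assumed) with a 10-second locality wait
        System.out.println(minSchedulingDelayMs(12, 10_000L) / 1000 + " s"); // prints "120 s"
    }
}
```

Under these assumptions a dozen non-local map tasks are already enough to reach the two-minute delay mentioned above.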




[jira] Commented: (MAPREDUCE-370) Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752231#action_12752231
 ] 

Hudson commented on MAPREDUCE-370:
--

Integrated in Hadoop-Mapreduce-trunk #75 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/75/])


 Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
 ---

 Key: MAPREDUCE-370
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-370-1.txt, patch-370-2.txt, patch-370-3.txt, 
 patch-370-4.txt, patch-370-5.txt, patch-370.txt







[jira] Commented: (MAPREDUCE-372) Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752232#action_12752232
 ] 

Hudson commented on MAPREDUCE-372:
--

Integrated in Hadoop-Mapreduce-trunk #75 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/75/])


 Change org.apache.hadoop.mapred.lib.ChainMapper/Reducer to use new api.
 ---

 Key: MAPREDUCE-372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-372
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-372-1.txt, patch-372.txt







[jira] Commented: (MAPREDUCE-903) Adding AVRO jar to eclipse classpath

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752233#action_12752233
 ] 

Hudson commented on MAPREDUCE-903:
--

Integrated in Hadoop-Mapreduce-trunk #75 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/75/])


 Adding AVRO jar to eclipse classpath
 

 Key: MAPREDUCE-903
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-903
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Philip Zeyliger
Assignee: Philip Zeyliger
 Fix For: 0.21.0

 Attachments: MAPREDUCE-903.patch


 Avro is missing from the eclipse classpath, which caused Eclipse to whine.  
 Easy fix.




[jira] Commented: (MAPREDUCE-318) Refactor reduce shuffle code

2009-09-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752234#action_12752234
 ] 

Hudson commented on MAPREDUCE-318:
--

Integrated in Hadoop-Mapreduce-trunk #75 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/75/])


 Refactor reduce shuffle code
 

 Key: MAPREDUCE-318
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-318
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.21.0

 Attachments: HADOOP-5233_api.patch, HADOOP-5233_part0.patch, 
 mapred-318-14Aug.patch, mapred-318-20Aug.patch, mapred-318-24Aug.patch, 
 mapred-318-3Sep-v1.patch, mapred-318-3Sep.patch, mapred-318-common.patch


 The reduce shuffle code has become very complex and entangled. I think we 
 should move it out of ReduceTask and into a separate package 
 (org.apache.hadoop.mapred.task.reduce). Details to follow.




[jira] Commented: (MAPREDUCE-945) Test programs support only default queue.

2009-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752252#action_12752252
 ] 

Hadoop QA commented on MAPREDUCE-945:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418797/mapreduce-945-2.patch
  against trunk revision 812209.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/12/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/12/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/12/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/12/console

This message is automatically generated.

 Test programs support only default queue.
 -

 Key: MAPREDUCE-945
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Suman Sehgal
 Attachments: mapreduce-945-1.patch, mapreduce-945-2.patch


 None of the test programs seems to support the concept of queues. These 
 programs look for the default queue only, even if some other queue is 
 specified when running them.




[jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752275#action_12752275
 ] 

Hadoop QA commented on MAPREDUCE-861:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418777/MAPREDUCE-861-5.patch
  against trunk revision 812002.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 40 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 4 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/42/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/42/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/42/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/42/console

This message is automatically generated.

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch, MAPREDUCE-861-5.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.queue-name.property-name, for example 
 mapred.queue.queue-name.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.
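
To make the flat naming scheme concrete, the sketch below groups keys of the form mapred.queue.queue-name.property-name by queue. It is only an illustration under the assumption that queue names contain no dots; the class name, queue names, and values are hypothetical, and it does not represent the new format proposed in this JIRA.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueueConfigSketch {
    static final String PREFIX = "mapred.queue.";

    // Groups "mapred.queue.<queue-name>.<property-name>" entries by queue name.
    // Assumes queue names contain no '.', which is exactly what breaks down
    // once queues form a hierarchy.
    public static Map<String, Map<String, String>> groupByQueue(Map<String, String> flat) {
        Map<String, Map<String, String>> queues = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : flat.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(PREFIX)) {
                continue;                       // not a queue property
            }
            String rest = key.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            if (dot < 0) {
                continue;                       // no property part
            }
            String queue = rest.substring(0, dot);
            String property = rest.substring(dot + 1);
            queues.computeIfAbsent(queue, q -> new LinkedHashMap<>())
                  .put(property, e.getValue());
        }
        return queues;
    }

    public static void main(String[] args) {
        Map<String, String> flat = new LinkedHashMap<>();
        flat.put("mapred.queue.research.acl-submit-job", "alice,bob"); // hypothetical queue
        flat.put("mapred.queue.default.acl-submit-job", "*");
        System.out.println(groupByQueue(flat));
    }
}
```

Every property repeats the full prefix and queue name, and the parser has to guess where the queue name ends, which hints at why the flat scheme gets harder to manage with nested queues.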




[jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.

2009-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752278#action_12752278
 ] 

Hadoop QA commented on MAPREDUCE-157:
-

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12418824/mapred-157-7Sep-v1.patch
  against trunk revision 812209.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/13/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/13/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/13/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/13/console

This message is automatically generated.

 Job History log file format is not friendly for external tools.
 ---

 Key: MAPREDUCE-157
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Affects Versions: 0.20.1
Reporter: Owen O'Malley
Assignee: Jothi Padmanabhan
 Fix For: 0.21.0

 Attachments: mapred-157-4Sep.patch, mapred-157-7Sep-v1.patch, 
 mapred-157-7Sep.patch, mapred-157-prelim.patch, MAPREDUCE-157-avro.patch






[jira] Created: (MAPREDUCE-959) JobConf::setWorkingDirectory requires that the default FileSystem is reachable

2009-09-07 Thread Chris Douglas (JIRA)
JobConf::setWorkingDirectory requires that the default FileSystem is reachable
--

 Key: MAPREDUCE-959
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-959
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, test
Reporter: Chris Douglas
Priority: Minor


If mapred.working.dir is not set, JobConf::setWorkingDirectory will attempt to 
obtain the default working directory for the default FileSystem. In trunk at 
least, if the default fs is HDFS and not reachable, set will fail:
{noformat}
java.net.UnknownHostException: unknown host: notahost
java.lang.RuntimeException: java.net.UnknownHostException: unknown host: notahost
  at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:541)
  at org.apache.hadoop.mapred.JobConf.setWorkingDirectory(JobConf.java:522)
  at org.apache.hadoop.conf.TestJobConf.testSetWorkingDir(TestJobConf.java:67)
Caused by: java.net.UnknownHostException: unknown host: notahost
  at org.apache.hadoop.ipc.Client$Connection.init(Client.java:216)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:876)
  at org.apache.hadoop.ipc.Client.call(Client.java:746)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:223)
  at $Proxy4.getProtocolVersion(Unknown Source)
  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:366)
  at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:169)
  at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:276)
  at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:235)
  at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:83)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1430)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:69)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1458)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1446)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:190)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:98)
  at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:537)
{noformat}




[jira] Created: (MAPREDUCE-960) Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

2009-09-07 Thread Chris Douglas (JIRA)
Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader


 Key: MAPREDUCE-960
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-960
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chris Douglas
Assignee: Chris Douglas


KeyValueLineRecordReader effects the copy from the line to the key/value by 
creating separate arrays:
{noformat}
  int keyLen = pos;
  byte[] keyBytes = new byte[keyLen];
  System.arraycopy(line, 0, keyBytes, 0, keyLen);
  int valLen = lineLen - keyLen - 1;
  byte[] valBytes = new byte[valLen];
  System.arraycopy(line, pos + 1, valBytes, 0, valLen);
  key.set(keyBytes);
  value.set(valBytes);
{noformat}
Since {{set}} itself triggers another copy and {{Text}} has a {{set}} overload taking {{byte[], off, len}}, 
the intermediate copy can be avoided.
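
A sketch of the copy-free variant, using a small stand-in class in place of {{Text}} so that the example is self-contained. {{Buf}} and {{setKeyValue}} are hypothetical names, and this is not the attached patch; it only shows each half of the line being copied once, directly from the line buffer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyValueSplitSketch {
    /** Minimal stand-in for Text: set() copies the given range into its own buffer. */
    public static final class Buf {
        private byte[] bytes = new byte[0];

        public void set(byte[] src, int off, int len) {
            bytes = Arrays.copyOfRange(src, off, off + len);
        }

        @Override
        public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    // Splits line[0..lineLen) at separator index pos without intermediate arrays.
    public static void setKeyValue(Buf key, Buf value, byte[] line, int lineLen, int pos) {
        key.set(line, 0, pos);                        // key: bytes before the separator
        value.set(line, pos + 1, lineLen - pos - 1);  // value: bytes after the separator
    }

    public static void main(String[] args) {
        byte[] line = "user\t42".getBytes(StandardCharsets.UTF_8);
        Buf key = new Buf();
        Buf value = new Buf();
        setKeyValue(key, value, line, line.length, 4); // the tab is at index 4
        System.out.println(key + " -> " + value);      // prints "user -> 42"
    }
}
```

Compared with the snippet above, this drops the two temporary arrays and the two {{System.arraycopy}} calls per record.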




[jira] Updated: (MAPREDUCE-960) Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

2009-09-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-960:


Attachment: M960-0.patch

Removed the intermediate buffer, and also removed 
{{KeyValueLineRecordReader::getKeyClass}}, which was accidentally copied from 
mapred in MAPREDUCE-655.

 Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader
 

 Key: MAPREDUCE-960
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-960
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chris Douglas
Assignee: Chris Douglas
 Attachments: M960-0.patch






[jira] Updated: (MAPREDUCE-960) Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

2009-09-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-960:


Status: Patch Available  (was: Open)

 Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader
 

 Key: MAPREDUCE-960
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-960
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Chris Douglas
Assignee: Chris Douglas
 Attachments: M960-0.patch






[jira] Updated: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-830:


Attachment: M830-3.patch

* Fixed mapreduce.lib.input.LineRecordReader (I missed the filePosition updates 
in the last patch)
* Added a unit test for the mapreduce code
* Patched KeyValueLineRecordReader::isSplittable in mapred and mapreduce

 Providing BZip2 splitting support for Text data
 ---

 Key: MAPREDUCE-830
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-830
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Abdul Qadeer
Assignee: Abdul Qadeer
 Fix For: 0.21.0

 Attachments: M830-2.patch, M830-3.patch, MapReduce-830-version1.patch


 HADOOP-4012 (https://issues.apache.org/jira/browse/HADOOP-4012) is providing 
 support for handling BZip2 compressed data such that the compressed input file 
 can be split at arbitrary points. This JIRA uses that functionality in 
 LineRecordReader. The benefit of this work is that, if a user provides 
 BZip2 compressed Text data, it will be split by Hadoop and hence processed 
 by multiple mappers, so BZip2 compressed data will be able to fully utilize 
 the cluster. Currently, a BZip2 compressed Text file goes to one mapper and 
 is not split. The enhancement in this JIRA therefore provides splitting 
 support and a considerable performance gain.




[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-07 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752304#action_12752304
 ] 

Chris Douglas commented on MAPREDUCE-830:
-

(also includes a workaround for MAPREDUCE-959, which was getting irritating, 
and updates the unit tests to JUnit4 semantics)

 Providing BZip2 splitting support for Text data
 ---

 Key: MAPREDUCE-830
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-830
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Abdul Qadeer
Assignee: Abdul Qadeer
 Fix For: 0.21.0

 Attachments: M830-2.patch, M830-3.patch, MapReduce-830-version1.patch






[jira] Updated: (MAPREDUCE-960) Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

2009-09-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-960:


Affects Version/s: 0.21.0

 Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader
 

 Key: MAPREDUCE-960
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-960
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Chris Douglas
Assignee: Chris Douglas
 Attachments: M960-0.patch






[jira] Commented: (MAPREDUCE-960) Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader

2009-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752322#action_12752322
 ] 

Hadoop QA commented on MAPREDUCE-960:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418866/M960-0.patch
  against trunk revision 812287.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/43/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/43/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/43/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/43/console

This message is automatically generated.

 Unnecessary copy in mapreduce.lib.input.KeyValueLineRecordReader
 

 Key: MAPREDUCE-960
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-960
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Chris Douglas
Assignee: Chris Douglas
 Attachments: M960-0.patch






[jira] Commented: (MAPREDUCE-28) TestQueueManager takes too long and times out some times

2009-09-07 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752327#action_12752327
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-28:
-

After discussion with Rahul and looking at the test cases which were written for 
MAPREDUCE-861, the path forward would be to just test the semantic meaning of 
the configured acls in {{TestQueueManager}}; the state and acl refresh is 
actually taken care of in the test case introduced in {{MAPREDUCE-861}}.

 TestQueueManager takes too long and times out some times
 

 Key: MAPREDUCE-28
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu

 TestQueueManager takes a long time to run and sometimes times out.
 See the failure at 
 http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/.
 Looking at the console output, the test was timed out before it finished.
 On my machine, the test takes about 5 minutes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-944) Extend FairShare scheduler to fair-share memory usage in the cluster

2009-09-07 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-944:
---

Attachment: LoadManager2.txt

Incorporated Matei's review comments.

Vinod: The goal that we have in mind is slightly different from what the 
capacity scheduler has done (please correct me if I am wrong). Unlike the 
capacity scheduler, there is no assumption that the user knows upfront, before 
submitting a job, how much memory/CPU/network that job will need. A realtime 
stream of resource usage will be fed into the new LoadManager, which can then 
dynamically decide whether to run another task on that machine or not. Your 
feedback and experience with the capacity scheduler are very valuable 
here... let's continue the conversation via MAPREDUCE-961.

 Extend FairShare scheduler to fair-share memory usage in the cluster
 

 Key: MAPREDUCE-944
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-944
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Reporter: dhruba borthakur
 Attachments: LoadManager.txt, LoadManager2.txt


 The FairShare Scheduler has an extensible LoadManager API to regulate 
 allocating new tasks on a particular TaskTracker. Along similar lines, it would 
 be nice if the FairShare Scheduler could have a pluggable policy to regulate 
 new tasks from a particular job. This will allow one to skip scheduling tasks 
 of a job that is consuming a large percentage of memory in the cluster, i.e. 
 fair-share of memory resources among jobs. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-961) ResourceAwareLoadManager to dynamically decide new tasks based on current CPU/memory load on TaskTracker(s)

2009-09-07 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-961:
---

Tags: fb

 ResourceAwareLoadManager to dynamically decide new tasks based on current 
 CPU/memory load on TaskTracker(s)
 ---

 Key: MAPREDUCE-961
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-961
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/fair-share
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 Design and develop a ResourceAwareLoadManager for the FairShare scheduler that 
 dynamically decides how many maps/reduces to run on a particular machine 
 based on the CPU/memory/disk IO/network usage on that machine. The amount of 
 resources currently used on each task tracker is fed into the 
 ResourceAwareLoadManager in real time by an entity that is external to 
 Hadoop.
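
 A rough sketch of the decision described above, under stated assumptions. It 
 does not use the fair scheduler's LoadManager API. ResourceAwareSketch, Usage, 
 report and canAssignTask are hypothetical names, the thresholds are arbitrary, 
 and only CPU and memory are modeled. The external feed is assumed to call 
 report() whenever it has a new sample for a task tracker.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ResourceAwareSketch {
  /** Hypothetical usage sample; both values are fractions in [0, 1]. */
  static final class Usage {
    final double cpu;
    final double memory;

    Usage(double cpu, double memory) {
      this.cpu = cpu;
      this.memory = memory;
    }
  }

  // Latest sample per task tracker, written by the feed and read by the scheduler.
  private final Map<String, Usage> latest = new ConcurrentHashMap<>();
  private final double maxCpu;
  private final double maxMemory;

  ResourceAwareSketch(double maxCpu, double maxMemory) {
    this.maxCpu = maxCpu;
    this.maxMemory = maxMemory;
  }

  /** Called by the external feed whenever a new sample arrives. */
  void report(String tracker, Usage usage) {
    latest.put(tracker, usage);
  }

  /** Allows another task only if the latest sample is under both thresholds. */
  boolean canAssignTask(String tracker) {
    Usage u = latest.get(tracker);
    if (u == null) {
      return false; // no data yet: refuse, which is an assumption of this sketch
    }
    return u.cpu < maxCpu && u.memory < maxMemory;
  }
}
```

 Refusing tasks when no sample has arrived is only one possible default; 
 falling back to the static slot count would be an equally reasonable choice.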

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-944) Extend FairShare scheduler to fair-share memory usage in the cluster

2009-09-07 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated MAPREDUCE-944:
---

Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Patch Available  (was: Open)

 Extend FairShare scheduler to fair-share memory usage in the cluster
 

 Key: MAPREDUCE-944
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-944
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Reporter: dhruba borthakur
 Fix For: 0.21.0

 Attachments: LoadManager.txt, LoadManager2.txt


 The FairShare Scheduler has an extensible LoadManager API to regulate 
 allocating new tasks on a particular TaskTracker. Along similar lines, it would 
 be nice if the FairShare Scheduler could have a pluggable policy to regulate 
 new tasks from a particular job. This will allow one to skip scheduling tasks 
 of a job that is consuming a large percentage of memory in the cluster, i.e. 
 fair-share of memory resources among jobs. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.

2009-09-07 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752351#action_12752351
 ] 

Hemanth Yamijala commented on MAPREDUCE-861:


This is getting really close, sans the issues with test-patch.

- I still think QueueManager needs more Javadoc, particularly 
explaining some aspects of hierarchical queues.
- Also, the mapred-queues.xml can be better presented.
- Not 100% sure about this, but will LOG.fatal cause an exception to be 
thrown? If yes, can you check if that's the intended usage where you are using it?
- ACLs are stored inconsistently between the deprecated and hierarchical 
configurations. Specifically, in the deprecated configuration, they should be 
stored with the same key as in the hierarchical case, as the QueueManager 
treats them equally.

 Modify queue configuration format and parsing to support a hierarchy of 
 queues.
 ---

 Key: MAPREDUCE-861
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Hemanth Yamijala
Assignee: rahul k singh
 Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch, 
 MAPREDUCE-861-3.patch, MAPREDUCE-861-4.patch, MAPREDUCE-861-5.patch


 MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce 
 framework. This JIRA is for defining changes to the configuration related to 
 queues. 
 The current format for defining a queue and its properties is as follows: 
 mapred.queue.queue-name.property-name. For example, 
 mapred.queue.queue-name.acl-submit-job. The reason for using this verbose 
 format was to be able to reuse the Configuration parser in Hadoop. However, 
 administrators currently using the queue configuration have already indicated 
 a very strong desire for a more manageable format. Since this becomes more 
 unwieldy with hierarchical queues, this may be a good time to introduce a new 
 format for representing queue configuration.
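
 To make the current flat format concrete, here is a small sketch that groups 
 keys of the form mapred.queue.queue-name.property-name by queue. It is not 
 QueueManager code. FlatQueueKeys and group are hypothetical names, and the 
 sketch assumes the property name is whatever follows the last dot.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FlatQueueKeys {
  static final String PREFIX = "mapred.queue.";

  /** Groups flat mapred.queue.<queue-name>.<property-name> entries by queue name. */
  static Map<String, Map<String, String>> group(Map<String, String> conf) {
    Map<String, Map<String, String>> queues = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String key = e.getKey();
      if (!key.startsWith(PREFIX)) {
        continue; // not a queue property
      }
      String rest = key.substring(PREFIX.length());
      int dot = rest.lastIndexOf('.');
      if (dot <= 0 || dot == rest.length() - 1) {
        continue; // no queue-name/property-name pair in this key
      }
      String queue = rest.substring(0, dot);
      String property = rest.substring(dot + 1);
      queues.computeIfAbsent(queue, q -> new LinkedHashMap<>())
            .put(property, e.getValue());
    }
    return queues;
  }
}
```

 Every property of every queue repeats the full prefix and queue name, which is 
 the verbosity the description refers to. A nested queue would have to encode 
 its whole path in each key as well.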

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.