[jira] Updated: (MAPREDUCE-1543) Log messages of JobACLsManager should use security logging of HADOOP-6586

2010-03-17 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-1543:
--

Attachment: mr-1543-v1.9.2.patch

Attaching a patch for trunk that incorporates Hemanth's review comments. 
test-patch and ant tests passed. The log4j changes are commented out but give 
some insight into how to configure the mapreduce audit logs to write to a 
separate log file.
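For illustration, routing a dedicated audit logger to its own file in log4j.properties might look like the fragment below. The logger name, appender name, and file name here are assumptions for the sketch, not necessarily what the patch uses:

```properties
# Hypothetical log4j.properties fragment: send SecurityLogger output
# to a separate daily-rolling audit log instead of the daemon's log.
log4j.logger.SecurityLogger=INFO,DRFAS
log4j.additivity.SecurityLogger=false

log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/security-audit.log
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

Setting additivity to false keeps the audit events out of the daemon's main log.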

> Log messages of JobACLsManager should use security logging of HADOOP-6586
> -
>
> Key: MAPREDUCE-1543
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1543
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Vinod K V
>Assignee: Amar Kamat
> Fix For: 0.22.0
>
> Attachments: mapreduce-1543-y20s-3.patch, mapreduce-1543-y20s.patch, 
> mr-1543-v1.9.2.patch
>
>
> {{JobACLsManager}} added in MAPREDUCE-1307 logs the successes and failures 
> w.r.t job-level authorization in the corresponding Daemons' logs. The log 
> messages should instead use security logging of HADOOP-6586.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-158) mapred.userlog.retain.hours killing long running tasks

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu resolved MAPREDUCE-158.
---

Resolution: Duplicate

> mapred.userlog.retain.hours killing long running tasks
> --
>
> Key: MAPREDUCE-158
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-158
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: 0.19.2-dev, r753365 
>Reporter: Billy Pearson
> Attachments: hadoop-5600.patch
>
>
> One can reproduce the scenario by configuring mapred.userlog.retain.hours to 
> 1hr, and running tasks that take more than an hour.
> More info on closed ticket HADOOP-5591.




[jira] Resolved: (MAPREDUCE-81) missing userlogs

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu resolved MAPREDUCE-81.
--

Resolution: Duplicate

> missing userlogs
> 
>
> Key: MAPREDUCE-81
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-81
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Christian Kunz
>
> I noticed that a long-running job taking more than 1 day can have reduce 
> tasks with missing stderr and stdout log directories (the syslog files exist).
> I wonder whether it has to do with the default value of 12 hours for 
> mapred.userlog.retain.hours, resulting in deletion of the stderr/stdout 
> directories for reduce tasks idling till they start to actually reduce. 




[jira] Commented: (MAPREDUCE-81) missing userlogs

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846755#action_12846755
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-81:
--

This issue no longer exists: MAPREDUCE-927 solves it by redefining 
"mapred.userlog.retain.hours" to specify the time (in hours) for which the 
user-logs are retained after job completion.

> missing userlogs
> 
>
> Key: MAPREDUCE-81
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-81
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Christian Kunz
>
> I noticed that a long-running job taking more than 1 day can have reduce 
> tasks with missing stderr and stdout log directories (the syslog files exist).
> I wonder whether it has to do with the default value of 12 hours for 
> mapred.userlog.retain.hours, resulting in deletion of the stderr/stdout 
> directories for reduce tasks idling till they start to actually reduce. 




[jira] Commented: (MAPREDUCE-158) mapred.userlog.retain.hours killing long running tasks

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846753#action_12846753
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-158:
---

This issue no longer exists: MAPREDUCE-927 solves it by redefining 
"mapred.userlog.retain.hours" to specify the time (in hours) for which the 
user-logs are retained after job completion.

> mapred.userlog.retain.hours killing long running tasks
> --
>
> Key: MAPREDUCE-158
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-158
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
> Environment: 0.19.2-dev, r753365 
>Reporter: Billy Pearson
> Attachments: hadoop-5600.patch
>
>
> One can reproduce the scenario by configuring mapred.userlog.retain.hours to 
> 1hr, and running tasks that take more than an hour.
> More info on closed ticket HADOOP-5591.




[jira] Updated: (MAPREDUCE-1604) Job acls should be documented in forrest.

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1604:
---

Assignee: Amareshwari Sriramadasu
  Status: Patch Available  (was: Open)

> Job acls should be documented in forrest.
> -
>
> Key: MAPREDUCE-1604
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1604
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation, security
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1604-ydist.txt, patch-1604.txt
>
>
> Job acls introduced in MAPREDUCE-1307 should be documented in forrest.




[jira] Updated: (MAPREDUCE-1604) Job acls should be documented in forrest.

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1604:
---

Attachment: patch-1604.txt

Attaching the patch for trunk.

ant docs passed successfully on my machine.

> Job acls should be documented in forrest.
> -
>
> Key: MAPREDUCE-1604
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1604
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation, security
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1604-ydist.txt, patch-1604.txt
>
>
> Job acls introduced in MAPREDUCE-1307 should be documented in forrest.




[jira] Updated: (MAPREDUCE-1604) Job acls should be documented in forrest.

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1604:
---

Attachment: patch-1604-ydist.txt

Patch for Yahoo! distribution

> Job acls should be documented in forrest.
> -
>
> Key: MAPREDUCE-1604
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1604
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation, security
>Affects Versions: 0.22.0
>Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1604-ydist.txt
>
>
> Job acls introduced in MAPREDUCE-1307 should be documented in forrest.




[jira] Updated: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-17 Thread Tamas Sarlos (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Sarlos updated MAPREDUCE-1593:


Attachment: (was: MAPREDUCE-1593-20100311.patch)

> [Rumen] Improvements to random seed generation 
> ---
>
> Key: MAPREDUCE-1593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Tamas Sarlos
>Assignee: Tamas Sarlos
>Priority: Trivial
> Fix For: 0.21.0, 0.22.0
>
> Attachments: MAPREDUCE-1593-20100311-dup.patch
>
>
> RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
> reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
> the call thread safe as well. Neither of these is an issue with the current 
> client, the mumak simulator, but the changes are small and make the code more 
> useful in the future. Thanks to Chris Douglas for the suggestion.
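The ThreadLocal-wrapped MD5 reuse described above can be sketched as follows. This is an illustrative example of the pattern, not the actual Rumen RandomSeedGenerator code:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Holder {
    // One MessageDigest instance per thread, reused across calls, so we
    // avoid the cost of MessageDigest.getInstance() on every invocation
    // while staying thread safe.
    private static final ThreadLocal<MessageDigest> MD5 =
        ThreadLocal.withInitial(() -> {
            try {
                return MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                // MD5 is guaranteed to be present in every JRE.
                throw new AssertionError(e);
            }
        });

    public static byte[] md5(byte[] input) {
        MessageDigest digest = MD5.get();
        digest.reset(); // clear any state left from a previous call
        return digest.digest(input);
    }
}
```

Each thread gets its own digest object, so no synchronization is needed around the mutable MessageDigest state.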




[jira] Commented: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-17 Thread Tamas Sarlos (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846686#action_12846686
 ] 

Tamas Sarlos commented on MAPREDUCE-1593:
-

Re: -1 tests included: MAPREDUCE-1306 introduced a test case, 
TestRandomSeedGenerator.java. The current patch does not change the API of 
RandomSeedGenerator.java. 

> [Rumen] Improvements to random seed generation 
> ---
>
> Key: MAPREDUCE-1593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Tamas Sarlos
>Assignee: Tamas Sarlos
>Priority: Trivial
> Fix For: 0.21.0, 0.22.0
>
> Attachments: MAPREDUCE-1593-20100311-dup.patch, 
> MAPREDUCE-1593-20100311.patch
>
>
> RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
> reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
> the call thread safe as well. Neither of these is an issue with the current 
> client, the mumak simulator, but the changes are small and make the code more 
> useful in the future. Thanks to Chris Douglas for the suggestion.




[jira] Commented: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846679#action_12846679
 ] 

Hadoop QA commented on MAPREDUCE-1593:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438953/MAPREDUCE-1593-20100311-dup.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/36/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/36/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/36/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/36/console

This message is automatically generated.

> [Rumen] Improvements to random seed generation 
> ---
>
> Key: MAPREDUCE-1593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Tamas Sarlos
>Assignee: Tamas Sarlos
>Priority: Trivial
> Fix For: 0.21.0, 0.22.0
>
> Attachments: MAPREDUCE-1593-20100311-dup.patch, 
> MAPREDUCE-1593-20100311.patch
>
>
> RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
> reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
> the call thread safe as well. Neither of these is an issue with the current 
> client, the mumak simulator, but the changes are small and make the code more 
> useful in the future. Thanks to Chris Douglas for the suggestion.




[jira] Commented: (MAPREDUCE-1605) Support multiple headless users to be able to submit job via gridmix v3

2010-03-17 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846641#action_12846641
 ] 

Hong Tang commented on MAPREDUCE-1605:
--

MAPREDUCE-1376 requires Gridmix3 to be launched by a regular "super user" that 
has properly authenticated with the Kerberos server. In this jira, we would 
like to allow Gridmix3 to be launched by a headless user authenticated with 
Kerberos through a keytab.

The title is a bit misleading: what we want is to launch gridmix3 as a 
headless user; whether jobs are submitted as the real user or as headless 
users is probably irrelevant.

> Support multiple headless users to be able to submit job via gridmix v3
> ---
>
> Key: MAPREDUCE-1605
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1605
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
>





[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846632#action_12846632
 ] 

Hadoop QA commented on MAPREDUCE-1221:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438470/MAPREDUCE-1221-v4.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/528/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/528/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/528/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/528/console

This message is automatically generated.

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) were 
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.




[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846613#action_12846613
 ] 

Scott Chen commented on MAPREDUCE-1221:
---

The definitions of the parameters in the previous comment are not very clear.
Here are the definitions.

*mapreduce.tasktracker.reserved.physicalmemory.mb*
Specifies how much memory on the TT will not be used by tasks.
For example, if we configure this to be 2G and we have 16G of memory, the 
tasks are only allowed to use 14G of memory. If the limit is exceeded, the TT 
will kill the task that consumes the most memory.
We use reserved memory instead of total memory because the cluster may be 
nonuniform.

*mapreduce.map.memory.physical.mb*
Memory limit for one mapper. If the mapper uses more memory than this, it will 
fail.

*mapreduce.reduce.memory.physical.mb*
Memory limit for one reducer. If the reducer uses more memory than this, it 
will fail.
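As an illustration, the three knobs above could be set in mapred-site.xml like this (the values are hypothetical, chosen only to match the 2G/16G example):

```xml
<!-- mapred-site.xml: hypothetical values for the physical-memory limits -->
<property>
  <name>mapreduce.tasktracker.reserved.physicalmemory.mb</name>
  <value>2048</value> <!-- leave 2G of the TT's RAM untouched by tasks -->
</property>
<property>
  <name>mapreduce.map.memory.physical.mb</name>
  <value>1024</value> <!-- a map task exceeding 1G RSS is failed -->
</property>
<property>
  <name>mapreduce.reduce.memory.physical.mb</name>
  <value>2048</value> <!-- a reduce task exceeding 2G RSS is failed -->
</property>
```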

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) were 
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.




[jira] Updated: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-17 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1593:
-

Status: Patch Available  (was: Open)

> [Rumen] Improvements to random seed generation 
> ---
>
> Key: MAPREDUCE-1593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Tamas Sarlos
>Assignee: Tamas Sarlos
>Priority: Trivial
> Fix For: 0.21.0, 0.22.0
>
> Attachments: MAPREDUCE-1593-20100311-dup.patch, 
> MAPREDUCE-1593-20100311.patch
>
>
> RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
> reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
> the call thread safe as well. Neither of these is an issue with the current 
> client, the mumak simulator, but the changes are small and make the code more 
> useful in the future. Thanks to Chris Douglas for the suggestion.




[jira] Updated: (MAPREDUCE-1593) [Rumen] Improvements to random seed generation

2010-03-17 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1593:
-

Status: Open  (was: Patch Available)

> [Rumen] Improvements to random seed generation 
> ---
>
> Key: MAPREDUCE-1593
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1593
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Tamas Sarlos
>Assignee: Tamas Sarlos
>Priority: Trivial
> Fix For: 0.21.0, 0.22.0
>
> Attachments: MAPREDUCE-1593-20100311-dup.patch, 
> MAPREDUCE-1593-20100311.patch
>
>
> RandomSeedGenerator introduced in MAPREDUCE-1306 could be more efficient by 
> reusing the MD5 object across calls. Wrapping the MD5 in a ThreadLocal makes 
> the call thread safe as well. Neither of these is an issue with the current 
> client, the mumak simulator, but the changes are small and make the code more 
> useful in the future. Thanks to Chris Douglas for the suggestion.




[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846541#action_12846541
 ] 

Scott Chen commented on MAPREDUCE-1221:
---

The patch has changed over time. Here is a quick overall summary of what this 
patch does.

The purpose of this patch is to allow the TaskTracker to kill/fail tasks based 
on their RSS memory status.

We can set the following three parameters:
mapreduce.tasktracker.reserved.physicalmemory.mb
mapreduce.map.memory.physical.mb
mapreduce.reduce.memory.physical.mb

They determine the total RSS memory allowed for tasks and the limit for 
individual tasks.
If the total limit is violated, the TaskTracker will kill the task using the 
highest amount of memory to relieve the memory pressure.
If the per-task limit is violated, the TaskTracker will *fail* the task that 
violates the limit.
If the parameters are not set, there is no limit.

The implementation mostly follows the virtual memory limiting logic that we 
already have, and ProcfsBasedProcessTree also allows us to obtain the physical 
memory of tasks.

The tests added are the following:
TestTaskTrackerMemoryManager.testTasksCumulativelyExceedingTTPhysicalLimits()
TestTaskTrackerMemoryManager.testTasksBeyondPhysicalLimits()
They verify the behavior when the total memory limit and the per-task limit 
are triggered.
There is also a slight modification to 
TestTaskTrackerMemoryManager.testTasksWithinLimits() to make sure tasks within 
the physical memory limit run correctly.
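The total-limit behavior described above (when the tasks' combined RSS exceeds the allowed total, kill the single task using the most memory) can be modeled in isolation. This is an illustrative sketch, not the TaskTracker implementation; the class and method names are invented for the example:

```java
import java.util.Map;

public class PhysicalMemoryLimiter {
    // Given each task's RSS in MB and the total RSS allowed for all
    // tasks, return the id of the task to kill -- the one consuming
    // the most memory -- or null if the tasks are within the limit.
    public static String taskToKill(Map<String, Long> rssByTask,
                                    long totalLimitMb) {
        long total = 0;
        String biggest = null;
        long biggestRss = -1;
        for (Map.Entry<String, Long> e : rssByTask.entrySet()) {
            total += e.getValue();
            if (e.getValue() > biggestRss) {
                biggestRss = e.getValue();
                biggest = e.getKey();
            }
        }
        return total > totalLimitMb ? biggest : null;
    }
}
```

Killing (rather than failing) the largest task mirrors the comment's distinction: a task over its own per-task limit has misbehaved and is failed, while a task picked here is just the cheapest way to relieve pressure on the node.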


> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) were 
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.




[jira] Commented: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-17 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846535#action_12846535
 ] 

dhruba borthakur commented on MAPREDUCE-1480:
-

Code looks good to me. 

> CombineFileRecordReader does not properly initialize child RecordReader
> ---
>
> Key: MAPREDUCE-1480
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
> MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch
>
>
> CombineFileRecordReader instantiates child RecordReader instances but never 
> calls their initialize() method to give them the proper TaskAttemptContext.




[jira] Commented: (MAPREDUCE-1569) Mock Contexts & Configurations

2010-03-17 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846524#action_12846524
 ] 

Aaron Kimball commented on MAPREDUCE-1569:
--

New patch looks good; +1.

> Mock Contexts & Configurations
> --
>
> Key: MAPREDUCE-1569
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1569
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/mrunit
>Reporter: Chris White
>Assignee: Chris White
>Priority: Minor
> Attachments: MAPREDUCE-1569.patch, MAPREDUCE-1569.patch, 
> MAPREDUCE-1569.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently the library creates a new Configuration object in the 
> MockMapContext and MocKReduceContext constructors, rather than allowing the 
> developer to configure and pass their own




[jira] Updated: (MAPREDUCE-1543) Log messages of JobACLsManager should use security logging of HADOOP-6586

2010-03-17 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-1543:


Attachment: mapreduce-1543-y20s-3.patch

Uploading a patch again on Amar's behalf. This one incorporates my review 
comments raised earlier. It is still for earlier versions of Hadoop, and not 
for commit here.

> Log messages of JobACLsManager should use security logging of HADOOP-6586
> -
>
> Key: MAPREDUCE-1543
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1543
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: security
>Reporter: Vinod K V
>Assignee: Amar Kamat
> Fix For: 0.22.0
>
> Attachments: mapreduce-1543-y20s-3.patch, mapreduce-1543-y20s.patch
>
>
> {{JobACLsManager}} added in MAPREDUCE-1307 logs the successes and failures 
> w.r.t job-level authorization in the corresponding Daemons' logs. The log 
> messages should instead use security logging of HADOOP-6586.




[jira] Updated: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1221:
--

Status: Open  (was: Patch Available)

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) were 
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.
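To make the proposed physical-memory check concrete: the core of it is summing resident set size (RSS) over a task's process subtree and comparing against a threshold. The sketch below is illustrative only, not Hadoop's actual implementation; the helper names are invented, and a real TaskTracker-side check would read the numbers from /proc rather than from dicts.

```python
def parse_vmrss_kb(status_text):
    """Extract VmRSS (resident memory, in kB) from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # e.g. "VmRSS:    2048 kB"
    return 0

def subtree_rss_kb(root_pid, children, rss_kb):
    """Sum RSS over root_pid and all of its descendants.

    children: dict pid -> list of child pids
    rss_kb:   dict pid -> resident memory in kB (e.g. parsed from /proc)
    """
    total, stack = 0, [root_pid]
    while stack:
        pid = stack.pop()
        total += rss_kb.get(pid, 0)
        stack.extend(children.get(pid, []))
    return total

def over_physical_limit(root_pid, children, rss_kb, limit_kb):
    """True if the task's process subtree exceeds the physical-memory limit."""
    return subtree_rss_kb(root_pid, children, rss_kb) > limit_kb
```

In Hadoop itself, the process-subtree bookkeeping lives in the procfs-based process-tree code used for the existing virtual-memory checks; a physical-memory limit would reuse that machinery with RSS instead of virtual size.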

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1221:
--

Status: Patch Available  (was: Open)

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846506#action_12846506
 ] 

Scott Chen commented on MAPREDUCE-1221:
---

I have run TestJobClient on two different dev boxes. Both worked. I will submit 
this to Hudson again.

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846498#action_12846498
 ] 

Scott Chen commented on MAPREDUCE-1221:
---

I am checking the failed tasks.

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846495#action_12846495
 ] 

Hadoop QA commented on MAPREDUCE-1480:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438662/MAPREDUCE-1480.4.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/35/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/35/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/35/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/35/console

This message is automatically generated.

> CombineFileRecordReader does not properly initialize child RecordReader
> ---
>
> Key: MAPREDUCE-1480
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
> MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch
>
>
> CombineFileRecordReader instantiates child RecordReader instances but never 
> calls their initialize() method to give them the proper TaskAttemptContext.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1603) Add a plugin class for the TaskTracker to determine available slots

2010-03-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846491#action_12846491
 ] 

Steve Loughran commented on MAPREDUCE-1603:
---

That would imply passing up machine metadata: CPU family/version, OS, etc. No 
reason why that couldn't be done, though you'd have to decide whether that is 
something you'd republish every heartbeat or just when the TT first registers. 
Of course, without the JT making decisions on where to route stuff based on 
those features, it's wasted effort. That would also imply you need some plugin 
support for making the decisions as to where to run Mappers and Reducers; right 
now it's fairly straightforward: do it close to the data. 

> Add a plugin class for the TaskTracker to determine available slots
> ---
>
> Key: MAPREDUCE-1603
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1603
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Priority: Minor
>
> Currently the number of available map and reduce slots is determined by the 
> configuration. MAPREDUCE-922 has proposed working things out automatically, 
> but that is going to depend a lot on the specific tasks, which is hard to get 
> right for everyone.
> There is a Hadoop cluster near me that would like to use CPU time from other 
> machines in the room, machines which cannot offer storage, but which will 
> have spare CPU time when they aren't running code scheduled with a grid 
> scheduler. The nodes could run a TT which would report a dynamic number of 
> slots, the number depending upon the current grid workload. 
> I propose we add a plugin point here, so that different people can develop 
> plugin classes that determine the amount of available slots based on 
> workload, RAM, CPU, power budget, thermal parameters, etc. Lots of space for 
> customisation and improvement. And by having it as a plugin: people get to 
> integrate with whatever datacentre schedulers they have without Hadoop itself 
> needing to be altered: the base implementation would be as today: subtract 
> the number of active map and reduce slots from the configured values, push 
> that out. 
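The plugin point proposed above can be sketched as a slot-reporting strategy the TaskTracker consults on each heartbeat. All class and method names below are invented for illustration; this is not Hadoop's real API, just a shape the proposal could take.

```python
class SlotPlugin:
    """Hypothetical plugin interface: report currently available slots."""
    def available_map_slots(self, tracker):
        raise NotImplementedError
    def available_reduce_slots(self, tracker):
        raise NotImplementedError

class StaticSlotPlugin(SlotPlugin):
    """Base behaviour as today: configured maximums minus running tasks."""
    def available_map_slots(self, tracker):
        return tracker["max_map_slots"] - tracker["running_maps"]
    def available_reduce_slots(self, tracker):
        return tracker["max_reduce_slots"] - tracker["running_reduces"]

class GridAwareSlotPlugin(StaticSlotPlugin):
    """Example dynamic policy: offer no slots while an external grid
    scheduler reports this node as busy."""
    def __init__(self, grid_is_busy):
        self.grid_is_busy = grid_is_busy  # callable, e.g. queries the scheduler
    def available_map_slots(self, tracker):
        if self.grid_is_busy():
            return 0
        return super().available_map_slots(tracker)
```

The base implementation matches the current behaviour, so existing clusters would see no change unless they configure a different plugin class.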

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1606) TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task

2010-03-17 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-1606:
-

Attachment: MR1606.20S.1.patch

Updated patch on behalf of Ravi.

The problem with the previous patch was that the test could still time out due 
to setup/cleanup tasks getting done before the kill, so a single map has been 
added to the job.

> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task
> 
>
> Key: MAPREDUCE-1606
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1606
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: MR1606.20S.1.patch, MR1606.20S.patch, MR1606.patch
>
>
> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task. 
> This is because a MiniMRCluster with 0 TaskTrackers is started in the test. In 
> trunk, we can set the config property 
> mapreduce.job.committer.setup.cleanup.needed to false so that we don't get 
> into this issue.
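The trunk workaround mentioned in the description amounts to a small job configuration snippet; the following is a sketch of how that property would be set (property name taken from the description above):

```xml
<!-- Skip the setup/cleanup tasks so the test does not need a slot
     for a JOB_CLEANUP task. -->
<property>
  <name>mapreduce.job.committer.setup.cleanup.needed</name>
  <value>false</value>
</property>
```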

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-03-17 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846445#action_12846445
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1585:


I'm checking the failed unit tests.

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; see 
> HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.
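The encoding proposed above can be illustrated with a short sketch: store URL-encoded names in the archive index so that names containing spaces round-trip safely. Hadoop's actual implementation is in Java (see HADOOP-6591); this Python fragment only mirrors the idea, and the function names are hypothetical.

```python
from urllib.parse import quote, unquote

def encode_index_name(name):
    # Encode everything except '/', so the path structure stays readable
    # in the index file while spaces become %20.
    return quote(name, safe="/")

def decode_index_name(encoded):
    # Reverse the encoding when reading the index back.
    return unquote(encoded)
```

A version marker in the index would let readers distinguish version 2 (encoded) entries from version 1 (raw) entries, which is what ties this task to HADOOP-6591.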

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846431#action_12846431
 ] 

Hadoop QA commented on MAPREDUCE-1221:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438470/MAPREDUCE-1221-v4.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/527/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/527/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/527/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/527/console

This message is automatically generated.

> Kill tasks on a node if the free physical memory on that machine falls below 
> a configured threshold
> ---
>
> Key: MAPREDUCE-1221
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tasktracker
>Affects Versions: 0.22.0
>Reporter: dhruba borthakur
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, 
> MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a 
> task exceeds a set of configured thresholds. I would like to extend this 
> feature to enable killing tasks if the physical memory used by that task 
> exceeds a certain threshold.
> On a certain operating system (guess?), if user space processes start using 
> lots of memory, the machine hangs and dies quickly. This means that we would 
> like to prevent map-reduce jobs from triggering this condition. From my 
> understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was
> designed to address this problem. This works well when most map-reduce jobs 
> are Java jobs and have well-defined -Xmx parameters that specify the max 
> virtual memory for each task. On the other hand, if each task forks off 
> mappers/reducers written in other languages (python/php, etc), the total 
> virtual memory usage of the process-subtree varies greatly. In these cases, 
> it is better to use kill-tasks-using-physical-memory-limits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-889) binary communication formats added to Streaming by HADOOP-1722 should be documented

2010-03-17 Thread Klaas Bosteels (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaas Bosteels updated MAPREDUCE-889:
-

Attachment: MAPREDUCE-889.patch

Thanks Amareshwari, I've added the link. Any other comments?

> binary communication formats added to Streaming by HADOOP-1722 should be 
> documented
> ---
>
> Key: MAPREDUCE-889
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-889
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Amareshwari Sriramadasu
>Assignee: Klaas Bosteels
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-889.patch, MAPREDUCE-889.patch
>
>
> The binary communication formats added to Streaming by HADOOP-1722 should be 
> documented in Forrest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846415#action_12846415
 ] 

Hadoop QA commented on MAPREDUCE-1585:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438889/MAPREDUCE-1585.1.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/34/console

This message is automatically generated.

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; see 
> HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1606) TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task

2010-03-17 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846386#action_12846386
 ] 

Ravi Gummadi commented on MAPREDUCE-1606:
-

As Amareshwari pointed out offline, the trunk patch would probably need more 
changes. Let me investigate more.

> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task
> 
>
> Key: MAPREDUCE-1606
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1606
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: MR1606.20S.patch, MR1606.patch
>
>
> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task. 
> This is because a MiniMRCluster with 0 TaskTrackers is started in the test. In 
> trunk, we can set the config property 
> mapreduce.job.committer.setup.cleanup.needed to false so that we don't get 
> into this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1606) TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task

2010-03-17 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1606:


Attachment: MR1606.patch

Attaching patch for trunk.

> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task
> 
>
> Key: MAPREDUCE-1606
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1606
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: MR1606.20S.patch, MR1606.patch
>
>
> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task. 
> This is because a MiniMRCluster with 0 TaskTrackers is started in the test. In 
> trunk, we can set the config property 
> mapreduce.job.committer.setup.cleanup.needed to false so that we don't get 
> into this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846377#action_12846377
 ] 

Hadoop QA commented on MAPREDUCE-1480:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12438662/MAPREDUCE-1480.4.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/526/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/526/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/526/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/526/console

This message is automatically generated.

> CombineFileRecordReader does not properly initialize child RecordReader
> ---
>
> Key: MAPREDUCE-1480
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.3.patch, 
> MAPREDUCE-1480.4.patch, MAPREDUCE-1480.patch
>
>
> CombineFileRecordReader instantiates child RecordReader instances but never 
> calls their initialize() method to give them the proper TaskAttemptContext.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1606) TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task

2010-03-17 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1606:


Attachment: MR1606.20S.patch

Attaching a patch for an earlier version of Hadoop. Not for commit here.

> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task
> 
>
> Key: MAPREDUCE-1606
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1606
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Ravi Gummadi
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: MR1606.20S.patch
>
>
> TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task. 
> This is because a MiniMRCluster with 0 TaskTrackers is started in the test. In 
> trunk, we can set the config property 
> mapreduce.job.committer.setup.cleanup.needed to false so that we don't get 
> into this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1569) Mock Contexts & Configurations

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846370#action_12846370
 ] 

Hadoop QA commented on MAPREDUCE-1569:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438682/MAPREDUCE-1569.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/33/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/33/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/33/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/33/console

This message is automatically generated.

> Mock Contexts & Configurations
> --
>
> Key: MAPREDUCE-1569
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1569
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/mrunit
>Reporter: Chris White
>Assignee: Chris White
>Priority: Minor
> Attachments: MAPREDUCE-1569.patch, MAPREDUCE-1569.patch, 
> MAPREDUCE-1569.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Currently the library creates a new Configuration object in the 
> MockMapContext and MockReduceContext constructors, rather than allowing the 
> developer to configure and pass their own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-1606) TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task

2010-03-17 Thread Ravi Gummadi (JIRA)
TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task


 Key: MAPREDUCE-1606
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1606
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Fix For: 0.22.0


TestJobACLs may timeout as there are no slots for launching JOB_CLEANUP task. 
This is because a MiniMRCluster with 0 TaskTrackers is started in the test. In 
trunk, we can set the config property 
mapreduce.job.committer.setup.cleanup.needed to false so that we don't get into 
this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-03-17 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Patch Available  (was: Open)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; see 
> HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-03-17 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Open  (was: Patch Available)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> refers to HADOOP-6591).
> This task is to allow the creation of version 2 files whose file names are 
> encoded appropriately. It currently depends on HADOOP-6591.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-889) binary communication formats added to Streaming by HADOOP-1722 should be documented

2010-03-17 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846336#action_12846336
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-889:
---

For the following change, can you add a link to the API documentation?
{code}
+typedbytes: The "typed bytes" format as described in the API 
documentation for the package org.apache.hadoop.typedbytes.
{code}
You can add an external reference in site.xml and link to it from the streaming 
documentation via href. For example, see:
{code}
KeyFieldBasedPartitioner,
 
{code}

> binary communication formats added to Streaming by HADOOP-1722 should be 
> documented
> ---
>
> Key: MAPREDUCE-889
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-889
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Reporter: Amareshwari Sriramadasu
>Assignee: Klaas Bosteels
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-889.patch
>
>
> binary communication formats added to Streaming by HADOOP-1722 should be 
> documented in forrest

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-1605) Support multiple headless users to be able to submit job via gridmix v3

2010-03-17 Thread rahul k singh (JIRA)
Support multiple headless users to be able to submit job via gridmix v3
---

 Key: MAPREDUCE-1605
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1605
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/gridmix
Reporter: rahul k singh




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1320) StringBuffer -> StringBuilder occurence

2010-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846326#action_12846326
 ] 

Hadoop QA commented on MAPREDUCE-1320:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12438126/MAPREDUCE-1320.patch
  against trunk revision 923907.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/347/console

This message is automatically generated.

> StringBuffer -> StringBuilder occurence 
> 
>
> Key: MAPREDUCE-1320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1320
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Kay Kay
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1320.patch, MAPREDUCE-1320.patch
>
>
> A good number of toString() implementations use StringBuffer when the 
> reference clearly does not go out of scope of the method and no concurrency 
> is needed. The patch replaces those occurrences of StringBuffer with 
> StringBuilder. 
> Created against the Map/Reduce project trunk.
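The pattern the patch targets can be sketched as below (a hypothetical class; the field names are made up for illustration). Because the builder never escapes the method, StringBuffer's per-append synchronization is wasted work; StringBuilder has the same API without the locking, so the swap changes no behavior.

```java
// Hypothetical toString() showing the StringBuffer -> StringBuilder swap:
// the builder is method-local, so no other thread can ever see it, and the
// unsynchronized StringBuilder is the idiomatic choice.
public class ToStringExample {
    private final String name = "job_201003170001";
    private final int priority = 1;

    @Override
    public String toString() {
        // Before the patch this line would have read: new StringBuffer()
        StringBuilder sb = new StringBuilder();
        sb.append("name=").append(name)
          .append(", priority=").append(priority);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new ToStringExample()); // name=job_201003170001, priority=1
    }
}
```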

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.