[jira] Commented: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857757#action_12857757 ] Hadoop QA commented on MAPREDUCE-1692: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12441925/patch-1692.txt against trunk revision 933441. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/115/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/115/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/115/console This message is automatically generated. 
Remove TestStreamedMerge from the streaming tests - Key: MAPREDUCE-1692 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Sreekanth Ramakrishnan Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1692-1.patch, MAPREDUCE-1692-1.patch, patch-1692.txt Currently {{TestStreamedMerge}} is never run as part of the streaming test suite; the code paths exercised by the test were removed in HADOOP-1315, so it is better to remove the test case from the code base. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1617) TestBadRecords failed once in our test runs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amar Kamat updated MAPREDUCE-1617: -- Attachment: mr-1617-v1.3.patch Attaching a patch for Yahoo!'s distribution of Hadoop; not to be committed here. TestBadRecords fails because the task JVM takes more than 30 seconds to get the task from the TaskTracker. As this is a [known IPv6 issue|http://tinyurl.com/2a2o9m], this patch simply enables the IPv4 stack for JUnit tests. TestBadRecords passes after the patch. test-patch and ant tests passed with the patch. TestBadRecords failed once in our test runs --- Key: MAPREDUCE-1617 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1617 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: mr-1617-v1.3.patch, TestBadRecords.txt org.apache.hadoop.mapred.TestBadRecords.testBadMapRed failed with the following exception: java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142) at org.apache.hadoop.mapred.TestBadRecords.runMapReduce(TestBadRecords.java:94) at org.apache.hadoop.mapred.TestBadRecords.testBadMapRed(TestBadRecords.java:211)
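For reference, the JVM switch that selects the IPv4 stack is the standard java.net.preferIPv4Stack system property. A build.xml fragment like the following (illustrative only; the actual patch may wire it in differently) would pass it to the forked JUnit JVMs:

```xml
<!-- Hypothetical build.xml fragment: force IPv4 in forked test JVMs to
     work around slow IPv6 name lookups that make the task JVM miss the
     30-second window for fetching its task from the TaskTracker. -->
<junit fork="yes" forkmode="perTest">
  <jvmarg value="-Djava.net.preferIPv4Stack=true"/>
  <!-- classpath and test declarations elided -->
</junit>
```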
[jira] Commented: (MAPREDUCE-1695) capacity scheduler is not included in findbugs/javadoc targets
[ https://issues.apache.org/jira/browse/MAPREDUCE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857768#action_12857768 ] Hemanth Yamijala commented on MAPREDUCE-1695: - Hong, I agree that changing the default behavior to include other contrib projects into findbugs etc. automatically should be a separate JIRA. capacity scheduler is not included in findbugs/javadoc targets -- Key: MAPREDUCE-1695 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1695 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Reporter: Hong Tang Assignee: Hong Tang Attachments: MAPREDUCE-1695-2.patch, MAPREDUCE-1695-3.patch, MAPREDUCE-1695.patch, mr1695-hadoop-findbugs-report-1.html, mr1695-hadoop-findbugs-report-2.html Capacity Scheduler is not included in findbugs/javadoc targets.
[jira] Updated: (MAPREDUCE-1693) Process tree clean up of either a failed task or killed task tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1693: - Attachment: taskchildskilling_1693.patch I have addressed all your comments. Please check it and let me know your comments. Process tree clean up of either a failed task or killed task tests. --- Key: MAPREDUCE-1693 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1693 Project: Hadoop Map/Reduce Issue Type: Task Components: test Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Attachments: taskchildskilling_1693.diff, taskchildskilling_1693.diff, taskchildskilling_1693.patch, taskchildskilling_1693.patch The following scenarios are covered in the test. 1. Run a job which spawns subshells in the tasks. Kill one of the tasks. All the child processes of the killed task must be killed. 2. Run a job which spawns subshells in the tasks. Fail one of the tasks. All the child processes of the failed task must be killed along with the task after its failure. 3. Check process tree cleanup on a particular task-tracker when we use -kill-task and -fail-task with both map and reduce. 4. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Let the job complete. Check if all the child processes are killed; the overall job should fail. 5. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Kill/fail the job while in progress. Check if all the child processes are killed.
[jira] Created: (MAPREDUCE-1710) Process tree clean up of exceeding memory limit tasks.
Process tree clean up of exceeding memory limit tasks. -- Key: MAPREDUCE-1710 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1710 Project: Hadoop Map/Reduce Issue Type: Task Components: test Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Attachments: memorylimittask_1710.patch 1. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Let the job complete. Check if all the child processes are killed; the overall job should fail. 2. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Kill/fail the job while in progress. Check if all the child processes are killed.
[jira] Updated: (MAPREDUCE-1710) Process tree clean up of exceeding memory limit tasks.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1710: - Attachment: memorylimittask_1710.patch Please review it and let me know your comments. Process tree clean up of exceeding memory limit tasks. -- Key: MAPREDUCE-1710 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1710 Project: Hadoop Map/Reduce Issue Type: Task Components: test Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota Attachments: memorylimittask_1710.patch 1. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Let the job complete. Check if all the child processes are killed; the overall job should fail. 2. Submit a job which would spawn child processes, where each of the child processes exceeds the memory limits. Kill/fail the job while in progress. Check if all the child processes are killed.
[jira] Commented: (MAPREDUCE-1641) Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives
[ https://issues.apache.org/jira/browse/MAPREDUCE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857854#action_12857854 ] Dick King commented on MAPREDUCE-1641: -- Perhaps we should allow this, and both localize the file _and_ unarchive it? What do you think? Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives Key: MAPREDUCE-1641 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1641 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Reporter: Amareshwari Sriramadasu Assignee: Dick King Fix For: 0.22.0 The behavior of mapred.cache.files and mapred.cache.archives differs during localization in the following way: if a jar file is added to mapred.cache.files, it will be localized under a unique path on the TaskTracker. If a jar file is added to mapred.cache.archives, it will be localized under a unique path in a directory named after the jar file, and will be unarchived under the same directory. If the same jar file is passed for both configurations, the behavior is undefined, so the job submission should fail. Currently, since the distributed cache processes files before archives, the jar file will just be localized and not unarchived.
[jira] Commented: (MAPREDUCE-1709) mapred.cache.archives is not creating links for long path names
[ https://issues.apache.org/jira/browse/MAPREDUCE-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857859#action_12857859 ] Dick King commented on MAPREDUCE-1709: -- I understand this to be related to MAPREDUCE-1641. I am marking it as a duplicate. mapred.cache.archives is not creating links for long path names --- Key: MAPREDUCE-1709 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1709 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Dick King Assignee: Dick King I got the following complaint: {quote} We specified this {{JobConf}} parameter: {{mapred.cache.archives=/tmp/mchiang/workflows/custommain/lib/tutorial-udf.jar\#udfjar}} However, we do not see a link created here: {{$\{PWD\}/udfjar/tutorial-udf.jar}} {quote} I will look into this and publish detailed reproduction instructions soon.
[jira] Updated: (MAPREDUCE-1683) Remove JNI calls from ClusterStatus cstr
[ https://issues.apache.org/jira/browse/MAPREDUCE-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-1683: - Attachment: MAPREDUCE-1683_part2_yhadoop_20_10.patch We missed a part of the patch to fix the jsps. Remove JNI calls from ClusterStatus cstr Key: MAPREDUCE-1683 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1683 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Chris Douglas Attachments: MAPREDUCE-1683_part2_yhadoop_20_10.patch, MAPREDUCE-1683_yhadoop_20_9.patch, MAPREDUCE-1683_yhadoop_20_S.patch The {{ClusterStatus}} constructor makes two JNI calls to the {{Runtime}} to fetch memory information. {{ClusterStatus}} instances are often created inside the {{JobTracker}} to obtain other, unrelated metrics (sometimes from schedulers' inner loops). Given that this information is related to the {{JobTracker}} process and not the cluster, that the metrics are also available via {{JvmMetrics}}, and that the jsps can gather this information for themselves, these fields can be removed from {{ClusterStatus}}.
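The memory figures in question are presumably fetched with calls like the following, which reach native code; the sketch below (hypothetical class and method names, not the actual patch) shows the lazy alternative of computing them only where a jsp actually needs them:

```java
// Hypothetical sketch: compute JVM heap usage on demand instead of in a
// constructor that schedulers call in their inner loops.
public class JvmMemorySketch {
    /** Used heap of this JVM, via the Runtime calls that go through JNI. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("used heap: " + usedHeapBytes() + " bytes");
    }
}
```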
[jira] Updated: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1538: -- Status: Open (was: Patch Available) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.21.0 Attachments: MAPREDUCE-1538-v2.txt, MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when the size goes up to a configured number, but there is no such limit for the number of subdirectories. Therefore the number of subdirectories may grow large and exceed the system limit, which prevents the TT from creating directories in getLocalCache and fails the tasks.
[jira] Updated: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1538: -- Status: Patch Available (was: Open) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.21.0 Attachments: MAPREDUCE-1538-v2.txt, MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when the size goes up to a configured number, but there is no such limit for the number of subdirectories. Therefore the number of subdirectories may grow large and exceed the system limit, which prevents the TT from creating directories in getLocalCache and fails the tasks.
[jira] Commented: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857961#action_12857961 ] Hadoop QA commented on MAPREDUCE-1538: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12441877/MAPREDUCE-1538-v2.txt against trunk revision 933441. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/116/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/116/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/116/console
TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.21.0 Attachments: MAPREDUCE-1538-v2.txt, MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when the size goes up to a configured number, but there is no such limit for the number of subdirectories. Therefore the number of subdirectories may grow large and exceed the system limit, which prevents the TT from creating directories in getLocalCache and fails the tasks.
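The gist of the fix can be sketched in a few lines. This is a hypothetical helper, not the actual TrackerDistributedCacheManager code: the idea is that cleanup must trigger when either the byte size or the subdirectory count crosses its limit, whereas the original code only watched the size.

```java
// Hypothetical sketch of the cleanup trigger: check the subdirectory
// count as well as the cached byte size, so the cache can never exceed
// the filesystem's per-directory subdirectory limit.
public class CacheCleanupSketch {
    public static boolean needsCleanup(long cachedBytes, long maxBytes,
                                       long subdirCount, long maxSubdirs) {
        return cachedBytes > maxBytes || subdirCount > maxSubdirs;
    }

    public static void main(String[] args) {
        // Only 1 MB cached out of a 1 GB budget, but 20000 subdirectories
        // against a limit of 10000: cleanup is still required.
        System.out.println(needsCleanup(1L << 20, 1L << 30, 20000, 10000));
    }
}
```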
[jira] Updated: (MAPREDUCE-1317) Reducing memory consumption of rumen objects
[ https://issues.apache.org/jira/browse/MAPREDUCE-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1317: - Attachment: mapreduce-1317-yhadoo-20.1xx.patch Patch for the Yahoo! Hadoop 20.1xx branch. Not to be committed. Reducing memory consumption of rumen objects Key: MAPREDUCE-1317 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1317 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.21.0, 0.22.0 Reporter: Hong Tang Assignee: Hong Tang Fix For: 0.21.0 Attachments: mapreduce-1317-20091218.patch, mapreduce-1317-20091222-2.patch, mapreduce-1317-20091222.patch, mapreduce-1317-20091223.patch, mapreduce-1317-yhadoo-20.1xx.patch We have encountered OutOfMemoryErrors in mumak and gridmix when dealing with very large jobs. The purpose of this jira is to optimize the memory consumption of rumen-produced job objects.
[jira] Updated: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated MAPREDUCE-1538: Status: Resolved (was: Patch Available) Fix Version/s: 0.22.0 (was: 0.21.0) Resolution: Fixed I just committed this. Thanks Scott! TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1538-v2.txt, MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when the size goes up to a configured number, but there is no such limit for the number of subdirectories. Therefore the number of subdirectories may grow large and exceed the system limit, which prevents the TT from creating directories in getLocalCache and fails the tasks.
[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
[ https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858045#action_12858045 ] dhruba borthakur commented on MAPREDUCE-1221: - hi arun/amareshwari, would you like to please review this one? Kill tasks on a node if the free physical memory on that machine falls below a configured threshold --- Key: MAPREDUCE-1221 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.22.0 Reporter: dhruba borthakur Assignee: Scott Chen Fix For: 0.21.0 Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, MAPREDUCE-1221-v3.patch, MAPREDUCE-1221-v4.patch, MAPREDUCE-1221-v5.txt The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds a set of configured thresholds. I would like to extend this feature to enable killing tasks if the physical memory used by that task exceeds a certain threshold. On a certain operating system (guess?), if user space processes start using lots of memory, the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs from triggering this condition. From my understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was designed to address this problem. This works well when most map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the max virtual memory for each task. On the other hand, if each task forks off mappers/reducers written in other languages (python/php, etc), the total virtual memory usage of the process-subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits.
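The proposed check can be pictured with a small sketch. The class and method names below are hypothetical, not the TaskTracker's actual API, and real code would read each process's resident set size (RSS) from the OS (e.g. /proc on Linux):

```java
// Hypothetical sketch of killing tasks on physical-memory limits. Unlike
// a per-JVM -Xmx bound, the RSS of the whole process tree is summed, so
// forked python/php children are accounted for.
public class PhysicalMemoryMonitor {
    private final long maxRssBytes; // configured physical-memory threshold

    public PhysicalMemoryMonitor(long maxRssBytes) {
        this.maxRssBytes = maxRssBytes;
    }

    /** True if the summed RSS of a task's process tree exceeds the limit. */
    public boolean overLimit(long[] rssPerProcessInTree) {
        long total = 0;
        for (long rss : rssPerProcessInTree) {
            total += rss;
        }
        return total > maxRssBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        PhysicalMemoryMonitor monitor = new PhysicalMemoryMonitor(512 * mb);
        // Two forked children of 300 MB each: 600 MB total, over 512 MB.
        System.out.println(monitor.overLimit(new long[] {300 * mb, 300 * mb}));
    }
}
```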
[jira] Created: (MAPREDUCE-1711) Gridmix should provide an option to submit jobs to the same queues as specified in the trace.
Gridmix should provide an option to submit jobs to the same queues as specified in the trace. - Key: MAPREDUCE-1711 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1711 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/gridmix Reporter: Hong Tang Gridmix should provide an option to submit jobs to the same queues as specified in the trace.
[jira] Updated: (MAPREDUCE-1711) Gridmix should provide an option to submit jobs to the same queues as specified in the trace.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1711: - Attachment: mr-1711-yhadoop-20.1xx-20100416.patch Preliminary patch for the yhadoop 20.1xx branch. Gridmix should provide an option to submit jobs to the same queues as specified in the trace. - Key: MAPREDUCE-1711 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1711 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/gridmix Reporter: Hong Tang Attachments: mr-1711-yhadoop-20.1xx-20100416.patch Gridmix should provide an option to submit jobs to the same queues as specified in the trace.
[jira] Commented: (MAPREDUCE-1641) Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives
[ https://issues.apache.org/jira/browse/MAPREDUCE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858057#action_12858057 ] Dick King commented on MAPREDUCE-1641: -- We will _not_ allow this file duplication as proposed in 16/Apr/10 11:41AM. However, we will not throw an {{IllegalArgumentException}}; we will throw an {{InvalidJobConfException}} instead. A consequence of this is that the check cannot be performed as you add individual files or blocks of files to the cache; the interface is wrong. We perform the check for conflicts between {{mapred.cache.files}} and {{mapred.cache.archives}} when the user finally submits the offending {{JobConf}}. In particular, I plan to make a new class {{DistributedCache.DuplicatedURI extends InvalidJobConfException}} and throw _that_. Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives Key: MAPREDUCE-1641 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1641 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Reporter: Amareshwari Sriramadasu Assignee: Dick King Fix For: 0.22.0 The behavior of mapred.cache.files and mapred.cache.archives differs during localization in the following way: if a jar file is added to mapred.cache.files, it will be localized under a unique path on the TaskTracker. If a jar file is added to mapred.cache.archives, it will be localized under a unique path in a directory named after the jar file, and will be unarchived under the same directory. If the same jar file is passed for both configurations, the behavior is undefined, so the job submission should fail. Currently, since the distributed cache processes files before archives, the jar file will just be localized and not unarchived.
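The submit-time conflict check can be sketched as follows. The helper below is hypothetical; per the comment above, the real patch would raise the proposed {{DistributedCache.DuplicatedURI}} (a subclass of {{InvalidJobConfException}}) when the offending {{JobConf}} is submitted.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the conflict check between mapred.cache.files
// and mapred.cache.archives, run once at job submission.
public class CacheUriCheck {
    /** Returns the first URI listed in both configurations, or null. */
    public static URI findDuplicate(URI[] cacheFiles, URI[] cacheArchives) {
        Set<URI> files = new HashSet<URI>(Arrays.asList(cacheFiles));
        for (URI archive : cacheArchives) {
            if (files.contains(archive)) {
                return archive; // submission should fail on this URI
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        URI jar = new URI("hdfs:///lib/udf.jar");     // illustrative paths
        URI other = new URI("hdfs:///lib/other.jar");
        // The same jar in both lists: the job should be rejected.
        System.out.println(findDuplicate(new URI[] {jar}, new URI[] {jar, other}));
    }
}
```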
[jira] Created: (MAPREDUCE-1712) HAR sequence files throw errors in MR jobs
HAR sequence files throw errors in MR jobs -- Key: MAPREDUCE-1712 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1712 Project: Hadoop Map/Reduce Issue Type: Bug Components: harchive Affects Versions: 0.20.1 Reporter: Paul Yang When a HAR is specified as the input for a map reduce job and the file format is sequence file, an error similar to the following is thrown (this one is from Hive). {code} java.lang.IllegalArgumentException: Offset 0 is outside of file (0..-1) at org.apache.hadoop.mapred.FileInputFormat.getBlockIndex(FileInputFormat.java:299) at org.apache.hadoop.mapred.FileInputFormat.getSplitHosts(FileInputFormat.java:455) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:260) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:261) at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:827) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:798) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:747) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:663) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:631) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:504) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:382) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) {code} This is caused by 
the dummy block location returned by HarFileSystem.getFileBlockLocations().
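The failure mode is easy to reproduce in isolation. The sketch below is a simplified reconstruction of the index lookup (not the actual {{FileInputFormat}} source): when the only block location has length 0, as with the dummy location, no block covers any offset, and the reported file range degenerates to (0..-1).

```java
// Simplified reconstruction (not the real Hadoop code) of why a dummy
// zero-length block location makes the block-index lookup throw for offset 0.
public class BlockIndexSketch {
    public static int getBlockIndex(long[] offsets, long[] lengths, long offset) {
        for (int i = 0; i < offsets.length; i++) {
            // A zero-length block can never satisfy offset < start + length.
            if (offsets[i] <= offset && offset < offsets[i] + lengths[i]) {
                return i;
            }
        }
        int last = offsets.length - 1;
        long fileLength = offsets[last] + lengths[last] - 1;
        throw new IllegalArgumentException(
            "Offset " + offset + " is outside of file (0.." + fileLength + ")");
    }

    public static void main(String[] args) {
        // A real block of length 1024 covering offset 0: index 0 is found.
        System.out.println(getBlockIndex(new long[] {0}, new long[] {1024}, 0));
        try {
            // The dummy zero-length block: every offset is out of range.
            getBlockIndex(new long[] {0}, new long[] {0}, 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```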
[jira] Updated: (MAPREDUCE-1687) Stress submission policy does not always stress the cluster.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1687: - Attachment: mr-1687-yhadoop-20.1xx-20100416.patch Preliminary patch that caps the maximum number of incomplete map tasks for each job in the cluster overload calculation. Stress submission policy does not always stress the cluster. Key: MAPREDUCE-1687 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1687 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Hong Tang Attachments: mr-1687-yhadoop-20.1xx-20100416.patch Currently, the rough idea of the stress submission policy is to keep submitting jobs until the pending map tasks reach 2x the cluster capacity. This has proven inadequate; we saw that a large job could monopolize the whole cluster.
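The capping idea in the attached patch can be sketched as follows (names are hypothetical, not the actual Gridmix code): when summing the cluster's pending map load, each job contributes at most a fixed cap, so a single huge job cannot by itself convince the policy that the cluster is saturated.

```java
// Hypothetical sketch of capping per-job incomplete maps in the
// stress-policy overload calculation.
public class StressLoadSketch {
    /** Pending-map load with each job's contribution capped. */
    public static long cappedPendingMaps(long[] incompleteMapsPerJob, long capPerJob) {
        long total = 0;
        for (long maps : incompleteMapsPerJob) {
            total += Math.min(maps, capPerJob);
        }
        return total;
    }

    public static void main(String[] args) {
        // One 100000-map job plus two small jobs, cap of 200 per job:
        // the big job counts as 200, so the total is 270, not 100070,
        // and the policy keeps submitting until the cluster is truly busy.
        System.out.println(cappedPendingMaps(new long[] {100000, 50, 20}, 200));
    }
}
```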
[jira] Updated: (MAPREDUCE-1532) Delegation token is obtained as the superuser
[ https://issues.apache.org/jira/browse/MAPREDUCE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-1532: --- Attachment: 1532.2.patch A slightly updated patch. An equivalent patch for y20s has been manually tested. Delegation token is obtained as the superuser - Key: MAPREDUCE-1532 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1532 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission, security Affects Versions: 0.22.0 Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.22.0 Attachments: 1532-bp20.1.patch, 1532-bp20.2.patch, 1532-bp20.4.1.patch, 1532-bp20.4.2.patch, 1532-bp20.4.patch, 1532.1.patch, 1532.2.patch When the UserGroupInformation.doAs is invoked for proxy users, the delegation token is incorrectly obtained as the real user.
[jira] Updated: (MAPREDUCE-1532) Delegation token is obtained as the superuser
[ https://issues.apache.org/jira/browse/MAPREDUCE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated MAPREDUCE-1532:
-----------------------------------

    Status: Patch Available  (was: Open)

> Delegation token is obtained as the superuser
> ---------------------------------------------
>
>                 Key: MAPREDUCE-1532
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1532
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission, security
>    Affects Versions: 0.22.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.22.0
>         Attachments: 1532-bp20.1.patch, 1532-bp20.2.patch, 1532-bp20.4.1.patch, 1532-bp20.4.2.patch, 1532-bp20.4.patch, 1532.1.patch, 1532.2.patch
>
> When UserGroupInformation.doAs is invoked for proxy users, the delegation token is incorrectly obtained as the real user.
[jira] Commented: (MAPREDUCE-1532) Delegation token is obtained as the superuser
[ https://issues.apache.org/jira/browse/MAPREDUCE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858101#action_12858101 ]

Hadoop QA commented on MAPREDUCE-1532:
--------------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12442031/1532.2.patch
  against trunk revision 935090.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/117/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/117/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/117/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/117/console

This message is automatically generated.

> Delegation token is obtained as the superuser
> ---------------------------------------------
>
>                 Key: MAPREDUCE-1532
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1532
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission, security
>    Affects Versions: 0.22.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.22.0
>         Attachments: 1532-bp20.1.patch, 1532-bp20.2.patch, 1532-bp20.4.1.patch, 1532-bp20.4.2.patch, 1532-bp20.4.patch, 1532.1.patch, 1532.2.patch
>
> When UserGroupInformation.doAs is invoked for proxy users, the delegation token is incorrectly obtained as the real user.