[jira] Updated: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Yamijala updated MAPREDUCE-1435: Status: Open (was: Patch Available) Canceling patch to incorporate review comments. symlinks in cwd of the task are not handled properly after MAPREDUCE-896 Key: MAPREDUCE-1435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, MR-1435-y20s.patch With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete on each path listed. If a listed file is a symlink to a directory, it will delete the contents of the linked directory. This would delete files from the distributed cache and the jars directory, if mapred.create.symlink is true. Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change ownership/permissions of the underlying files. This was observed by Karam while running streaming jobs with DistributedCache and JVM reuse. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
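The deletion bug described in this issue generalizes beyond Hadoop: a cleanup routine that recursively deletes everything it lists will follow a symlink into the link's target. A minimal sketch of the safe behavior, written in Python rather than the TaskTracker's actual Java code (function name and layout are illustrative only):

```python
import os
import shutil

def clean_work_dir(work_dir):
    """Remove the contents of work_dir without following symlinks.

    Deleting a symlink to a directory must remove only the link;
    recursing through it would destroy the link target's contents
    (e.g. distributed-cache files, as reported in MAPREDUCE-1435).
    """
    for name in os.listdir(work_dir):
        path = os.path.join(work_dir, name)
        if os.path.islink(path) or os.path.isfile(path):
            os.remove(path)       # removes the link itself, not its target
        else:
            shutil.rmtree(path)   # a real directory: safe to recurse
```

The key check is the symlink test before any recursive delete; per the issue description, the post-MAPREDUCE-896 cleanup effectively lacked that distinction, so a symlinked distributed-cache directory was emptied along with the work directory.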
[jira] Updated: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Yamijala updated MAPREDUCE-1435: Status: Patch Available (was: Open) Running through Hudson. symlinks in cwd of the task are not handled properly after MAPREDUCE-896 Key: MAPREDUCE-1435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, 1435.v4.patch, MR-1435-y20s.patch With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete on each path listed. If a listed file is a symlink to a directory, it will delete the contents of the linked directory. This would delete files from the distributed cache and the jars directory, if mapred.create.symlink is true. Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change ownership/permissions of the underlying files. This was observed by Karam while running streaming jobs with DistributedCache and JVM reuse.
[jira] Commented: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841167#action_12841167 ] Hemanth Yamijala commented on MAPREDUCE-1435: - Output of test-patch: {noformat} [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 18 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. {noformat} symlinks in cwd of the task are not handled properly after MAPREDUCE-896 Key: MAPREDUCE-1435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, 1435.v4.patch, MR-1435-y20s.patch With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete on each path listed. If a listed file is a symlink to a directory, it will delete the contents of the linked directory. This would delete files from the distributed cache and the jars directory, if mapred.create.symlink is true. Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change ownership/permissions of the underlying files. This was observed by Karam while running streaming jobs with DistributedCache and JVM reuse.
[jira] Updated: (MAPREDUCE-1421) LinuxTaskController tests failing on trunk after the commit of MAPREDUCE-1385
[ https://issues.apache.org/jira/browse/MAPREDUCE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1421: --- Attachment: patch-1421-ydist.txt Patch for Yahoo! distribution LinuxTaskController tests failing on trunk after the commit of MAPREDUCE-1385 - Key: MAPREDUCE-1421 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1421 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller, tasktracker, test Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: patch-1421-1.txt, patch-1421-2.txt, patch-1421-ydist.txt, patch-1421.txt, TestJobExecutionAsDifferentUser.patch The following tests fail, in particular: - TestDebugScriptWithLinuxTaskController - TestJobExecutionAsDifferentUser - TestPipesAsDifferentUser - TestKillSubProcessesWithLinuxTaskController -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1542) Deprecate mapred.permissions.supergroup in favor of hadoop.cluster.administrators
[ https://issues.apache.org/jira/browse/MAPREDUCE-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841177#action_12841177 ] Ravi Gummadi commented on MAPREDUCE-1542: - OK. Planning to keep the config property mapred.permissions.supergroup working. This is done with code changes specifically for this config property's backward compatibility. This should be fine, as we need this config property only when the daemons are starting. Deprecate mapred.permissions.supergroup in favor of hadoop.cluster.administrators - Key: MAPREDUCE-1542 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1542 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Reporter: Vinod K V Assignee: Ravi Gummadi Fix For: 0.22.0 HADOOP-6568 added the configuration {{hadoop.cluster.administrators}} through which admins can configure who the superusers/supergroups for the cluster are. MAPREDUCE itself already has {{mapred.permissions.supergroup}} (which is just a single group). As agreed upon at HADOOP-6568, this should be deprecated in favor of {{hadoop.cluster.administrators}}.
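The compatibility shim Ravi describes follows a common deprecated-key pattern: read the new key first and fall back to the old one. A hedged sketch using a plain dict instead of Hadoop's Configuration class; the group-only ACL formatting here is an assumption, not the patch's actual behavior:

```python
def get_cluster_admins(conf):
    """Resolve the admin ACL, preferring hadoop.cluster.administrators
    but honoring the deprecated mapred.permissions.supergroup.

    Sketch only: `conf` is any mapping, and the leading-space
    convention for a group-only ACL is an assumption.
    """
    new_val = conf.get("hadoop.cluster.administrators")
    if new_val is not None:
        return new_val
    old_val = conf.get("mapred.permissions.supergroup")
    if old_val is not None:
        # Deprecated form names a single group; express it as a
        # group-only ACL ("no users, this group").
        return " " + old_val
    return None
```

The point of the pattern is that the fallback runs only once, at daemon startup, which is why the comment above argues the special-case code is acceptable.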
[jira] Commented: (MAPREDUCE-927) Cleanup of task-logs should happen in TaskTracker instead of the Child
[ https://issues.apache.org/jira/browse/MAPREDUCE-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841210#action_12841210 ] Hadoop QA commented on MAPREDUCE-927: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437694/patch-927-2.txt against trunk revision 918864. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 17 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/17/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/17/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/17/console This message is automatically generated. Cleanup of task-logs should happen in TaskTracker instead of the Child -- Key: MAPREDUCE-927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-927 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: security, tasktracker Affects Versions: 0.21.0 Reporter: Vinod K V Assignee: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.22.0 Attachments: patch-927-1.txt, patch-927-2.txt, patch-927.txt Task logs' cleanup is being done in Child now. 
This is undesirable for at least two reasons: 1) failures while cleaning up will affect the user's tasks, and 2) the task's wall time will be affected by operations that the TT actually should own.
[jira] Commented: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841214#action_12841214 ] Hadoop QA commented on MAPREDUCE-1523: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437218/mapreduce-1523--2010-02-25.patch against trunk revision 918864. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 13 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/498/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/498/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/498/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/498/console This message is automatically generated. Sometimes rumen trace generator fails to extract the job finish time. - Key: MAPREDUCE-1523 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hong Tang Assignee: Dick King Attachments: mapreduce-1523--2010-02-24.patch, mapreduce-1523--2010-02-25.patch We saw sometimes (not very often) that rumen may fail to extract the job finish time from Hadoop 0.20 history log. 
[jira] Commented: (MAPREDUCE-1553) mapred.userlog.retain.hours is improperly renamed in MAPREDUCE-849
[ https://issues.apache.org/jira/browse/MAPREDUCE-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841222#action_12841222 ] Hadoop QA commented on MAPREDUCE-1553: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437697/patch-1553.txt against trunk revision 918864. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h9.grid.sp2.yahoo.net/3/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h9.grid.sp2.yahoo.net/3/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h9.grid.sp2.yahoo.net/3/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h9.grid.sp2.yahoo.net/3/console This message is automatically generated. mapred.userlog.retain.hours is improperly renamed in MAPREDUCE-849 -- Key: MAPREDUCE-1553 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1553 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: patch-1553.txt mapred.userlog.retain.hours is renamed as mapred.task.userlog.retain.hours in JobContext. 
But in mapred-default, it is mapreduce.task.userlog.retain.hours.
[jira] Commented: (MAPREDUCE-890) After HADOOP-4491, the user who started mapred system is not able to run job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841226#action_12841226 ] Hadoop QA commented on MAPREDUCE-890: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437589/MR890.v1.1.patch against trunk revision 918864. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 24 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/342/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/342/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/342/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/342/console This message is automatically generated. After HADOOP-4491, the user who started mapred system is not able to run job. 
- Key: MAPREDUCE-890 URL: https://issues.apache.org/jira/browse/MAPREDUCE-890 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Reporter: Karam Singh Assignee: Ravi Gummadi Priority: Blocker Fix For: 0.21.0 Attachments: MAPREDUCE-890-20090904.txt, MAPREDUCE-890-20090909.txt, MR890.patch, MR890.v1.1.patch, MR890.v1.patch Even the setup and cleanup tasks of the job fail due to an exception: it fails to create the job and related directories under mapred.local.dir/taskTracker/jobcache. Directories are created as: [dr-xrws--- mapred hadoop ] job_200908190916_0002, and mapred is not able to write under them; even touching a file manually failed. mapred is the user who started the MR cluster.
[jira] Commented: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841341#action_12841341 ] Ravi Gummadi commented on MAPREDUCE-1435: - Patch looks good. +1 symlinks in cwd of the task are not handled properly after MAPREDUCE-896 Key: MAPREDUCE-1435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, 1435.v4.patch, MR-1435-y20s.patch With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete on each path listed. If a listed file is a symlink to a directory, it will delete the contents of the linked directory. This would delete files from the distributed cache and the jars directory, if mapred.create.symlink is true. Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change ownership/permissions of the underlying files. This was observed by Karam while running streaming jobs with DistributedCache and JVM reuse.
[jira] Commented: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841344#action_12841344 ] Hadoop QA commented on MAPREDUCE-1408: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437859/1408-4.patch against trunk revision 918864. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/18/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/18/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/18/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/18/console This message is automatically generated. Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch Currently, gridmix3 replays job submissions faithfully.
For evaluation purposes, it would be great if we could support other job submission policies, such as sequential job submission or stress job submission.
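The policies discussed for gridmix3 (replay, sequential, stress) differ mainly in when the next job is submitted. A sketch of how such pluggable policies might look; class and field names are hypothetical, not GridMix's real API:

```python
class SubmissionPolicy:
    """Base for pluggable job-submission policies (illustrative only)."""
    def delay_before(self, job):
        """Seconds to wait before submitting `job` (a dict with a
        'submit_ts' field taken from the original trace)."""
        raise NotImplementedError

class ReplayPolicy(SubmissionPolicy):
    """Faithfully reproduce the trace's inter-arrival gaps."""
    def __init__(self):
        self.prev_ts = None
    def delay_before(self, job):
        gap = 0 if self.prev_ts is None else job["submit_ts"] - self.prev_ts
        self.prev_ts = job["submit_ts"]
        return max(gap, 0)

class SerialPolicy(SubmissionPolicy):
    """Sequential submission: no timed delay; the caller instead
    waits for the previous job to finish before submitting.
    A stress policy would likewise return 0, but submit whenever
    the cluster still has capacity."""
    def delay_before(self, job):
        return 0
```

The design point is that the submitter loop stays unchanged and only the policy object varies, which is what "customization of job submission policies" asks for.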
[jira] Commented: (MAPREDUCE-1512) RAID could use HarFileSystem directly instead of FileSystem.get
[ https://issues.apache.org/jira/browse/MAPREDUCE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841358#action_12841358 ] Hadoop QA commented on MAPREDUCE-1512: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437733/MAPREDUCE-1512.1.patch against trunk revision 918864. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/499/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/499/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/499/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/499/console This message is automatically generated. 
RAID could use HarFileSystem directly instead of FileSystem.get --- Key: MAPREDUCE-1512 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1512 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Priority: Minor Attachments: MAPREDUCE-1512.1.patch, MAPREDUCE-1512.patch Makes the code run slightly faster and avoids possible problems in matching the right filesystem like the stale cache reported in HADOOP-6097. This is a minor improvement for trunk, but it is really helpful for people running RAID on earlier releases susceptible to HADOOP-6097, since RAID would crash on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1548) Hadoop archives should be able to preserve times and other properties from original files
[ https://issues.apache.org/jira/browse/MAPREDUCE-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841372#action_12841372 ] Mahadev konar commented on MAPREDUCE-1548: -- HADOOP-6591 has been created for this. We can fix that with Avro, or with URL encoding in the filenames (both leading to upping the version). Hadoop archives should be able to preserve times and other properties from original files - Key: MAPREDUCE-1548 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1548 Project: Hadoop Map/Reduce Issue Type: Improvement Components: harchive Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Files inside hadoop archives don't keep their original: - modification time - access time - permission - owner - group. All such properties are currently taken from the file storing the archive index, and not from the stored files. This doesn't look correct. It should be possible to preserve the original properties of the stored files.
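The five properties the issue lists map directly onto a POSIX stat call. A sketch of capturing them per file at archive-creation time, so they could later be restored from the index (the index format itself is left open by the issue, pending HADOOP-6591):

```python
import os
import stat

def capture_metadata(path):
    """Collect the per-file properties MAPREDUCE-1548 wants preserved
    (sketch only; how they would be serialized into the har index
    is not decided in the issue)."""
    st = os.stat(path)
    return {
        "mtime": int(st.st_mtime),            # modification time
        "atime": int(st.st_atime),            # access time
        "permission": stat.S_IMODE(st.st_mode),  # e.g. 0o640
        "owner": st.st_uid,
        "group": st.st_gid,
    }
```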
[jira] Commented: (MAPREDUCE-1512) RAID could use HarFileSystem directly instead of FileSystem.get
[ https://issues.apache.org/jira/browse/MAPREDUCE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841377#action_12841377 ] Rodrigo Schmidt commented on MAPREDUCE-1512: This patch is mostly refactoring the code. It simplifies some things, and optimizes others. I don't think we need new tests since there is no bug or new feature associated with it. RAID could use HarFileSystem directly instead of FileSystem.get --- Key: MAPREDUCE-1512 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1512 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Priority: Minor Attachments: MAPREDUCE-1512.1.patch, MAPREDUCE-1512.patch Makes the code run slightly faster and avoids possible problems in matching the right filesystem like the stale cache reported in HADOOP-6097. This is a minor improvement for trunk, but it is really helpful for people running RAID on earlier releases susceptible to HADOOP-6097, since RAID would crash on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing
[ https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841428#action_12841428 ] Hudson commented on MAPREDUCE-1501: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) . FileInputFormat supports multi-level, recursive directory listing. (Zheng Shao via dhruba) FileInputFormat to support multi-level/recursive directory listing -- Key: MAPREDUCE-1501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1501 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Zheng Shao Assignee: Zheng Shao Attachments: MAPREDUCE-1501.1.branch-0.20.patch, MAPREDUCE-1501.1.trunk.patch As we have seen multiple times in the mailing list, users want to have the capability of getting all files out of a multi-level directory structure. 4/1/2008: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200804.mbox/%3ce75c02ef0804011433x144813e6x2450da7883de3...@mail.gmail.com%3e 2/3/2009: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200902.mbox/%3c7f80089c-3e7f-4330-90ba-6f1c5b0b0...@nist.gov%3e 6/2/2009: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200906.mbox/%3c4a258a16.8050...@darose.net%3e One solution that our users had is to write a new FileInputFormat, but that means all existing FileInputFormat subclasses need to be changed in order to support this feature. We can easily provide a JobConf option (which defaults to false) to {{FileInputFormat.listStatus(...)}} to recursively go into directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
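The requested behavior, a flag defaulting to single-level listing, can be sketched against a local filesystem; this mirrors the idea of the JobConf option described above, not Hadoop's actual FileInputFormat API:

```python
import os

def list_status(input_dir, recursive=False):
    """List input files one level deep, or walk the whole tree when
    recursive=True (the default stays False so existing jobs keep
    their current behavior)."""
    if not recursive:
        return sorted(
            os.path.join(input_dir, name)
            for name in os.listdir(input_dir)
            if os.path.isfile(os.path.join(input_dir, name))
        )
    results = []
    for root, _dirs, files in os.walk(input_dir):
        results.extend(os.path.join(root, f) for f in files)
    return sorted(results)
```

Guarding the recursion behind a flag is what lets every existing FileInputFormat subclass pick up the feature without code changes, which is the point made in the description.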
[jira] Commented: (MAPREDUCE-1454) The servlets should quote server generated strings sent in the response
[ https://issues.apache.org/jira/browse/MAPREDUCE-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841426#action_12841426 ] Hudson commented on MAPREDUCE-1454: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) . Quote user supplied strings in Tracker servlets. The servlets should quote server generated strings sent in the response --- Key: MAPREDUCE-1454 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1454 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Reporter: Devaraj Das Assignee: Chris Douglas Fix For: 0.22.0 Attachments: M1454-0y20.patch, M1454-1.patch, M1454-1y20.patch, M1454-2.patch, mr-1454-trunk-v1.patch This is related to HADOOP-6151 but for output. We need to go through all the servlets/jsps and pass all the response strings that could be based on the incoming request or user's data through a filter (implemented in HADOOP-6151). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
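The fix is the standard output-encoding defense: any server-generated string that may embed request or user data is passed through an escaping filter before being written into the HTML response. A minimal Python illustration of such a filter (Hadoop's actual filter is the Java one from HADOOP-6151):

```python
import html

def quote_for_html(s):
    """Escape a string before embedding it in an HTML response, so
    user-controlled input cannot inject markup or script."""
    return html.escape(s, quote=True)
```

For example, a job name echoed back in a tracker page would be emitted as `quote_for_html(job_name)` rather than raw, turning `<script>` into inert text.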
[jira] Commented: (MAPREDUCE-1455) Authorization for servlets
[ https://issues.apache.org/jira/browse/MAPREDUCE-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841427#action_12841427 ] Hudson commented on MAPREDUCE-1455: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) Authorization for servlets -- Key: MAPREDUCE-1455 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1455 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: jobtracker, security, tasktracker Reporter: Devaraj Das Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1455.20S.2.fix.patch, 1455.20S.2.patch, 1455.patch, 1455.v1.patch, 1455.v2.patch, 1455.v3.patch, 1455.v4.1.patch, 1455.v4.2.patch, 1455.v4.patch This jira is about building the authorization for servlets (on top of MAPREDUCE-1307). That is, the JobTracker/TaskTracker runs authorization checks on web requests based on the configured job permissions. For e.g., if the job permission is 600, then no one except the authenticated user can look at the job details via the browser. The authenticated user in the servlet can be obtained using the HttpServletRequest method. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
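The rule described above (e.g. a job permission of 600 means only the authenticated owner may view job details) can be sketched as a UNIX-style permission check; function and parameter names here are hypothetical, not the JobTracker's real code:

```python
def can_view_job(job_owner, job_perm_bits, user, is_admin=False):
    """Decide whether `user` may view a job's web pages, given
    permission bits like 0o600. Simplified sketch: group
    membership is omitted, only owner / world bits are checked."""
    if is_admin or user == job_owner:
        return True
    # Non-owners need the world-readable bit.
    return bool(job_perm_bits & 0o004)
```

The servlet would obtain `user` from the authenticated HttpServletRequest, as the description notes, and return 403 when this check fails.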
[jira] Commented: (MAPREDUCE-1510) RAID should regenerate parity files if they get deleted
[ https://issues.apache.org/jira/browse/MAPREDUCE-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841429#action_12841429 ] Hudson commented on MAPREDUCE-1510: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) RAID should regenerate parity files if they get deleted --- Key: MAPREDUCE-1510 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1510 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Attachments: MAPREDUCE-1510.1.patch, MAPREDUCE-1510.2.patch, MAPREDUCE-1510.patch Currently, if a source file has a replication factor lower than or equal to that expected by RAID, the file is skipped and no parity file is generated. I don't think this is good behavior, since parity files can get wrongly deleted, leaving the source file with a low replication factor. In that case, RAID should be able to recreate the parity file.
[jira] Commented: (MAPREDUCE-1385) Make changes to MapReduce for the new UserGroupInformation APIs (HADOOP-6299)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841431#action_12841431 ] Hudson commented on MAPREDUCE-1385: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) Make changes to MapReduce for the new UserGroupInformation APIs (HADOOP-6299) - Key: MAPREDUCE-1385 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1385 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.22.0 Attachments: mr-6299.3.patch, mr-6299.7.patch, mr-6299.8.patch, mr-6299.patch This is about moving the MapReduce code to use the new UserGroupInformation API as described in HADOOP-6299. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1309) I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats
[ https://issues.apache.org/jira/browse/MAPREDUCE-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841435#action_12841435 ] Hudson commented on MAPREDUCE-1309: --- Integrated in Hadoop-Mapreduce-trunk #248 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/248/]) I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats - Key: MAPREDUCE-1309 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1309 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dick King Assignee: Dick King Fix For: 0.22.0 Attachments: demuxer-plus-concatenated-files--2009-12-21.patch, demuxer-plus-concatenated-files--2010-01-06.patch, demuxer-plus-concatenated-files--2010-01-08-b.patch, demuxer-plus-concatenated-files--2010-01-08-c.patch, demuxer-plus-concatenated-files--2010-01-08-d.patch, demuxer-plus-concatenated-files--2010-01-08.patch, demuxer-plus-concatenated-files--2010-01-11.patch, mapreduce-1309--2009-01-14-a.patch, mapreduce-1309--2009-01-14.patch, mapreduce-1309--2010-01-20.patch, mapreduce-1309--2010-02-03.patch, mapreduce-1309--2010-02-04.patch, mapreduce-1309--2010-02-10.patch, mapreduce-1309--2010-02-12.patch, mapreduce-1309--2010-02-16-a.patch, mapreduce-1309--2010-02-16.patch, mapreduce-1309--2010-02-17.patch There are two orthogonal questions to answer when processing a job tracker log: how will the logs and the xml configuration files be packaged, and in which release of hadoop map/reduce were the logs generated? The existing rumen only has a couple of answers to this question. The new engine will handle three answers to the version question: 0.18, 0.20 and current, and two answers to the packaging question: separate files with names derived from the job ID, and concatenated files with a header between sections [used for easier file interchange]. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841438#action_12841438 ] Dick King commented on MAPREDUCE-1523: -- I seem to have gotten zero test failures [ http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/498/testReport/ ] but got busted anyway. Huh? Sometimes rumen trace generator fails to extract the job finish time. - Key: MAPREDUCE-1523 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hong Tang Assignee: Dick King Attachments: mapreduce-1523--2010-02-24.patch, mapreduce-1523--2010-02-25.patch We saw sometimes (not very often) that rumen may fail to extract the job finish time from Hadoop 0.20 history log. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1270) Hadoop C++ Extension
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841468#action_12841468 ] Owen O'Malley commented on MAPREDUCE-1270: -- By the way, here is an archive of the message that I sent back in Nov 07 comparing the performance of Java, pipes, and streaming. http://www.mail-archive.com/hadoop-u...@lucene.apache.org/msg02961.html Especially by reimplementing the sort and shuffle, you should be able to get much faster than Java. *smile* Hadoop C++ Extension Key: MAPREDUCE-1270 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task Affects Versions: 0.20.1 Environment: hadoop linux Reporter: Wang Shouyan Hadoop C++ extension is an internal project at Baidu. We started it for these reasons: 1 To provide a C++ API. We mostly used Streaming before, and we also tried PIPES, but we did not find PIPES more efficient than Streaming. So we think a new C++ extension is needed. 2 Even using PIPES or Streaming, it is hard to control the memory of the Hadoop map/reduce child JVM. 3 It costs too much to read/write/sort TB/PB-scale data in Java. When using PIPES or Streaming, a pipe or socket is not efficient for carrying such huge data. What we want to do: 1 We do not use the map/reduce child JVM to do any data processing; it just prepares the environment, starts the C++ mapper, tells the mapper which split it should deal with, and reads reports from the mapper until it finishes. The mapper will read records, invoke the user-defined map, do the partition, write spills, and combine and merge into file.out. We think these operations can be done by C++ code. 2 The reducer is similar to the mapper: it is started after the sort finishes, reads from the sorted files, invokes the user-defined reduce, and writes to a user-defined record writer. 3 We also intend to rewrite shuffle and sort in C++, for efficiency and memory control. At first, 1 and 2; then 3.
What's the difference from PIPES: 1 Yes, we will reuse most of the PIPES code. 2 And we should do it more completely: nothing changes in scheduling and management, but everything in execution does. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841506#action_12841506 ] dhruba borthakur commented on MAPREDUCE-1538: - Code looks good. I will commit this patch. TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when their total size reaches a configured limit, but there is no such limit on the number of subdirectories. The number of subdirectories may therefore grow large and exceed the system limit, which prevents the TaskTracker from creating directories in getLocalCache and fails the tasks. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
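The shape of the fix can be illustrated with a small decision function: alongside the existing byte-size check, the cache manager also needs a check on the subdirectory count. This is a hypothetical sketch; the names and the idea of a single boolean gate are invented for illustration, not taken from the patch.

```java
// Illustrative model of the two cleanup triggers discussed in MAPREDUCE-1538:
// the pre-existing total-size limit, plus the proposed subdirectory-count limit.
public class CacheDirLimit {
    public static boolean needsCleanup(long cacheSizeBytes, long sizeLimitBytes,
                                       int subdirCount, int subdirLimit) {
        // Size alone is not enough: many small cache entries can exhaust the
        // filesystem's per-directory entry limit long before the byte limit.
        return cacheSizeBytes > sizeLimitBytes || subdirCount > subdirLimit;
    }
}
```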
[jira] Updated: (MAPREDUCE-1512) RAID could use HarFileSystem directly instead of FileSystem.get
[ https://issues.apache.org/jira/browse/MAPREDUCE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated MAPREDUCE-1512: Resolution: Fixed Fix Version/s: 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I just committed this. Thanks Rodrigo! RAID could use HarFileSystem directly instead of FileSystem.get --- Key: MAPREDUCE-1512 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1512 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1512.1.patch, MAPREDUCE-1512.patch Makes the code run slightly faster and avoids possible problems in matching the right filesystem like the stale cache reported in HADOOP-6097. This is a minor improvement for trunk, but it is really helpful for people running RAID on earlier releases susceptible to HADOOP-6097, since RAID would crash on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1512) RAID could use HarFileSystem directly instead of FileSystem.get
[ https://issues.apache.org/jira/browse/MAPREDUCE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841515#action_12841515 ] Rodrigo Schmidt commented on MAPREDUCE-1512: Thanks, Dhruba! Now I'll submit the patch for MAPREDUCE-1518. Cheers, Rodrigo RAID could use HarFileSystem directly instead of FileSystem.get --- Key: MAPREDUCE-1512 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1512 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1512.1.patch, MAPREDUCE-1512.patch Makes the code run slightly faster and avoids possible problems in matching the right filesystem like the stale cache reported in HADOOP-6097. This is a minor improvement for trunk, but it is really helpful for people running RAID on earlier releases susceptible to HADOOP-6097, since RAID would crash on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1512) RAID could use HarFileSystem directly instead of FileSystem.get
[ https://issues.apache.org/jira/browse/MAPREDUCE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841534#action_12841534 ] Hudson commented on MAPREDUCE-1512: --- Integrated in Hadoop-Mapreduce-trunk-Commit #260 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/260/]) . RAID uses HarFileSystem directly instead of FileSystem.get (Rodrigo Schmidt via dhruba) RAID could use HarFileSystem directly instead of FileSystem.get --- Key: MAPREDUCE-1512 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1512 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1512.1.patch, MAPREDUCE-1512.patch Makes the code run slightly faster and avoids possible problems in matching the right filesystem like the stale cache reported in HADOOP-6097. This is a minor improvement for trunk, but it is really helpful for people running RAID on earlier releases susceptible to HADOOP-6097, since RAID would crash on them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1120) JobClient poll intervals should be job configurations, not cluster configurations
[ https://issues.apache.org/jira/browse/MAPREDUCE-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841614#action_12841614 ] Todd Lipcon commented on MAPREDUCE-1120: bq. If polling intervals are job-level configuration parameters, Job.getCompletionPollInterval(conf) and Job.getProgressPollInterval(conf) should not be static methods and should not take configuration as the parameter. The methods should read the values from Job's conf directly. OK. Do we need to maintain compatibility on these static functions, since they're a public API? (e.g. mark the static ones deprecated, then make non-static ones that forward to the static ones for now) JobClient poll intervals should be job configurations, not cluster configurations - Key: MAPREDUCE-1120 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1120 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: mapreduce-1120.txt Job.waitForCompletion gets the poll interval from the Cluster object's configuration rather than its own Job configuration. This is counter-intuitive - Chris and I both made this same mistake working on MAPREDUCE-64, and Aaron agrees as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
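The compatibility pattern Todd suggests (deprecated statics kept for existing callers, new instance methods reading the job's own conf) can be sketched like this. The class, key, and defaults below are illustrative stand-ins, not the real Hadoop Job API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "mark the static ones deprecated, then make
// non-static ones that forward to the static ones for now".
public class PollIntervals {
    static final String KEY = "jobclient.completion.poll.interval"; // illustrative key
    static final int DEFAULT_MS = 5000;                             // illustrative default

    private final Map<String, Integer> conf = new HashMap<>();

    public PollIntervals(int pollIntervalMs) {
        conf.put(KEY, pollIntervalMs);
    }

    /** New instance method: reads the value from this job's own conf. */
    public int getCompletionPollIntervalMs() {
        return getCompletionPollInterval(conf); // forward to the static for now
    }

    /** @deprecated kept so existing public-API callers keep compiling. */
    @Deprecated
    public static int getCompletionPollInterval(Map<String, Integer> conf) {
        Integer v = conf.get(KEY);
        return v != null ? v : DEFAULT_MS;
    }
}
```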
[jira] Updated: (MAPREDUCE-1518) On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rodrigo Schmidt updated MAPREDUCE-1518: --- Attachment: MAPREDUCE-1518.0.patch On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't. - Key: MAPREDUCE-1518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Environment: On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It runs okay because the directory is not empty and trying to delete it non-recursively fails, but such failure messages only pollute the log file. My proposal is the following: If recursePurge is checking a directory, it should call itself recursively. If it's checking a file, it should do the deletion check. Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Attachments: MAPREDUCE-1518.0.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
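The proposed recursePurge control flow (recurse into directories, run the deletion check only on files) can be modeled on an in-memory tree. This is an illustrative sketch, not the RaidNode code; it collects the paths that would receive the deletion check instead of actually deleting.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of the recursePurge proposal from MAPREDUCE-1518.
public class PurgeWalk {
    public static class Node {
        final String path;
        final boolean isDir;
        final List<Node> children = new ArrayList<>();
        public Node(String path, boolean isDir) { this.path = path; this.isDir = isDir; }
        public Node add(Node child) { children.add(child); return this; }
    }

    /** Collects the file paths that the deletion check would be applied to. */
    public static List<String> recursePurge(Node node, List<String> checked) {
        if (node.isDir) {
            // Directory: recurse; never attempt to delete the directory itself,
            // so no spurious "cannot delete non-empty dir" log messages.
            for (Node child : node.children) {
                recursePurge(child, checked);
            }
        } else {
            checked.add(node.path); // file: this is where the deletion check runs
        }
        return checked;
    }
}
```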
[jira] Updated: (MAPREDUCE-1518) On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rodrigo Schmidt updated MAPREDUCE-1518: --- Status: Patch Available (was: Open) On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't. - Key: MAPREDUCE-1518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Environment: On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It runs okay because the directory is not empty and trying to delete it non-recursively fails, but such failure messages only pollute the log file. My proposal is the following: If recursePurge is checking a directory, it should call itself recursively. If it's checking a file, it should do the deletion check. Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Attachments: MAPREDUCE-1518.0.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1065) Modify the mapred tutorial documentation to use new mapreduce api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1065: - Attachment: MAPREDUCE-1065.2.patch Attaching patch that addresses the issues from the above review. Modify the mapred tutorial documentation to use new mapreduce api. -- Key: MAPREDUCE-1065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1065 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Aaron Kimball Priority: Blocker Fix For: 0.21.0 Attachments: MAPREDUCE-1065.2.patch, MAPREDUCE-1065.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1065) Modify the mapred tutorial documentation to use new mapreduce api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1065: - Status: Patch Available (was: Open) Modify the mapred tutorial documentation to use new mapreduce api. -- Key: MAPREDUCE-1065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1065 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Aaron Kimball Priority: Blocker Fix For: 0.21.0 Attachments: MAPREDUCE-1065.2.patch, MAPREDUCE-1065.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1065) Modify the mapred tutorial documentation to use new mapreduce api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1065: - Status: Open (was: Patch Available) Modify the mapred tutorial documentation to use new mapreduce api. -- Key: MAPREDUCE-1065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1065 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Aaron Kimball Priority: Blocker Fix For: 0.21.0 Attachments: MAPREDUCE-1065.2.patch, MAPREDUCE-1065.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1480) CombineFileRecordReader does not properly initialize child RecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841641#action_12841641 ] Aaron Kimball commented on MAPREDUCE-1480: -- Amareshwari, Thanks for looking over this patch. The previous progress calculator was strictly based on the number of sub-splits processed. The underlying RecordReader's getProgress() function was never called, which means that the granularity of progress was only based around the number of subsplits and did not take intra-split progress into account. A review from Dhruba is definitely welcome. I'll add another testcase as you suggest and post this in the next couple of days. CombineFileRecordReader does not properly initialize child RecordReader --- Key: MAPREDUCE-1480 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1480 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1480.2.patch, MAPREDUCE-1480.patch CombineFileRecordReader instantiates child RecordReader instances but never calls their initialize() method to give them the proper TaskAttemptContext. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
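The progress improvement Aaron describes (fold the current child reader's own getProgress() into the sub-split count instead of counting sub-splits alone) reduces to a small formula. This is an illustrative sketch of that calculation, not the CombineFileRecordReader code itself.

```java
// Combined progress over a multi-sub-split input: finished sub-splits count
// as whole units, and the active child reader contributes its fractional
// progress, so progress moves smoothly within each sub-split.
public class CombinedProgress {
    public static float progress(int finishedSubSplits, int totalSubSplits,
                                 float currentReaderProgress) {
        if (totalSubSplits == 0) {
            return 1.0f; // nothing to read: treat as complete
        }
        return (finishedSubSplits + currentReaderProgress) / totalSubSplits;
    }
}
```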
[jira] Commented: (MAPREDUCE-1221) Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
[ https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841653#action_12841653 ] Scott Chen commented on MAPREDUCE-1221: --- @Arun: Sorry for the very late reply. Dhruba and I have been trying to call you but it seems you are busy as well. I think I got your point. The problem is that the bad job will never fail: its tasks get killed and rescheduled again and again, which keeps hurting the cluster. So we should add a per-task RSS limit in this patch so that we can fail the bad job. This is just like what we currently do in the trunk for virtual memory. But here we offer RSS memory limiting as an option (a trade-off between memory utilization and stability). I will make the change and resubmit the patch soon. Thanks again for the help. Kill tasks on a node if the free physical memory on that machine falls below a configured threshold --- Key: MAPREDUCE-1221 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.22.0 Reporter: dhruba borthakur Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, MAPREDUCE-1221-v3.patch The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds a set of configured thresholds. I would like to extend this feature to enable killing tasks if the physical memory used by that task exceeds a certain threshold. On a certain operating system (guess?), if user space processes start using lots of memory, the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs from triggering this condition. From my understanding, the killing-based-on-virtual-memory-limits (HADOOP-5883) was designed to address this problem. 
This works well when most map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the max virtual memory for each task. On the other hand, if each task forks off mappers/reducers written in other languages (python/php, etc), the total virtual memory usage of the process-subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
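On Linux, a monitor in the spirit of this proposal could read a task's resident set size from the VmRSS line of /proc/&lt;pid&gt;/status. The VmRSS field is real procfs output; the class and threshold logic below are an illustrative sketch, not the TaskTracker's actual memory-management code.

```java
// Sketch of RSS-based limit checking: parse VmRSS from /proc/<pid>/status
// content and compare it against a configured physical-memory threshold.
public class RssCheck {
    /** Returns the RSS in kB parsed from /proc/<pid>/status text, or -1 if absent. */
    public static long parseVmRssKb(String procStatus) {
        for (String line : procStatus.split("\n")) {
            if (line.startsWith("VmRSS:")) {
                String digits = line.replaceAll("[^0-9]", "");
                return digits.isEmpty() ? -1L : Long.parseLong(digits);
            }
        }
        return -1L;
    }

    /** True when the task's RSS is known and exceeds the configured limit. */
    public static boolean overLimit(long rssKb, long limitKb) {
        return rssKb >= 0 && rssKb > limitKb;
    }
}
```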
[jira] Updated: (MAPREDUCE-1435) symlinks in cwd of the task are not handled properly after MAPREDUCE-896
[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Yamijala updated MAPREDUCE-1435: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed this patch to trunk. Thanks, Ravi! symlinks in cwd of the task are not handled properly after MAPREDUCE-896 Key: MAPREDUCE-1435 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Amareshwari Sriramadasu Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch, 1435.v4.patch, MR-1435-y20s.patch With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and does a fs.delete on each path listed. If the listed file is a symlink to a directory, it will delete the contents of those linked directories. This would delete files from the distributed cache and jars directory, if mapred.create.symlink is true. Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP would change ownership/permissions of the underlying files. This was observed by Karam while running streaming jobs with DistributedCache and JVM reuse. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
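The symlink-safe cleanup the bug calls for can be illustrated with java.nio.file: check for a symlink before recursing, and delete only the link entry so the DistributedCache target it points at survives. This is a hypothetical sketch, not the actual TaskRunner.setupWorkDir() fix.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative work-directory cleanup for JVM reuse: symlinks are removed
// as links, never followed, so linked cache/jars directories stay intact.
public class WorkDirCleanup {
    public static void clean(Path workDir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(workDir)) {
            for (Path entry : entries) {
                if (Files.isSymbolicLink(entry)) {
                    Files.delete(entry);      // removes only the link itself
                } else if (Files.isDirectory(entry)) {
                    clean(entry);             // recurse into real directories
                    Files.delete(entry);
                } else {
                    Files.delete(entry);      // plain file
                }
            }
        }
    }
}
```

Note the ordering: the symlink test must come before the directory test, because Files.isDirectory follows links by default.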
[jira] Commented: (MAPREDUCE-1501) FileInputFormat to support multi-level/recursive directory listing
[ https://issues.apache.org/jira/browse/MAPREDUCE-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841658#action_12841658 ] Chris Douglas commented on MAPREDUCE-1501: -- {noformat} +import com.sun.org.apache.commons.logging.Log; +import com.sun.org.apache.commons.logging.LogFactory; {noformat} Should these imports be {{org.apache.hadoop.commons.logging}}, not {{com.sun...}} ? Is there a reason this feature was only added to a deprecated class, instead of the {{FileInputFormat}} in the {{mapreduce}} package? FileInputFormat to support multi-level/recursive directory listing -- Key: MAPREDUCE-1501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1501 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Zheng Shao Assignee: Zheng Shao Attachments: MAPREDUCE-1501.1.branch-0.20.patch, MAPREDUCE-1501.1.trunk.patch As we have seen multiple times in the mailing list, users want to have the capability of getting all files out of a multi-level directory structure. 4/1/2008: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200804.mbox/%3ce75c02ef0804011433x144813e6x2450da7883de3...@mail.gmail.com%3e 2/3/2009: http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200902.mbox/%3c7f80089c-3e7f-4330-90ba-6f1c5b0b0...@nist.gov%3e 6/2/2009: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200906.mbox/%3c4a258a16.8050...@darose.net%3e One solution that our users had is to write a new FileInputFormat, but that means all existing FileInputFormat subclasses need to be changed in order to support this feature. We can easily provide a JobConf option (which defaults to false) to {{FileInputFormat.listStatus(...)}} to recursively go into directory structure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
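The proposed opt-in recursive listing can be modeled with a small helper. This sketch uses java.io.File in place of Hadoop's FileSystem/FileStatus, so it is only a hypothetical equivalent of the suggested FileInputFormat.listStatus(...) option, defaulting to the current non-recursive behavior.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Model of an opt-in recursive directory listing (MAPREDUCE-1501 proposal).
public class RecursiveList {
    public static List<File> listFiles(File root, boolean recursive) {
        List<File> out = new ArrayList<>();
        File[] entries = root.listFiles();
        if (entries == null) {
            return out; // missing or unreadable directory
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                if (recursive) {
                    out.addAll(listFiles(entry, true)); // dive in only when enabled
                }
                // recursive == false reproduces today's flat listing
            } else {
                out.add(entry);
            }
        }
        return out;
    }
}
```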
[jira] Commented: (MAPREDUCE-1538) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841660#action_12841660 ] Scott Chen commented on MAPREDUCE-1538: --- Thanks for the help, Dhruba :) TrackerDistributedCacheManager can fail because the number of subdirectories reaches system limit - Key: MAPREDUCE-1538 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1538 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.22.0 Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1538.patch TrackerDistributedCacheManager deletes the cached files when their total size reaches a configured limit, but there is no such limit on the number of subdirectories. The number of subdirectories may therefore grow large and exceed the system limit, which prevents the TaskTracker from creating directories in getLocalCache and fails the tasks. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rahul k singh updated MAPREDUCE-1408: - Status: Open (was: Patch Available) Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rahul k singh updated MAPREDUCE-1408: - Attachment: 1408-5.patch Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20-4.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch, 1408-5.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rahul k singh updated MAPREDUCE-1408: - Attachment: 1408-20-4.patch A very minor change in DebugJobProducer Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20-4.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch, 1408-5.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rahul k singh updated MAPREDUCE-1408: - Status: Patch Available (was: Open) Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20-4.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch, 1408-5.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1408: - Resolution: Fixed Fix Version/s: 0.22.0 Assignee: rahul k singh Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) This doesn't need to go through Hudson again; the only changes were to constants and the relevant test case passes. +1 I committed this. Thanks, Rahul! Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Assignee: rahul k singh Fix For: 0.22.0 Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20-4.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch, 1408-5.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission, or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1518) On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841677#action_12841677 ] Hadoop QA commented on MAPREDUCE-1518: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437948/MAPREDUCE-1518.0.patch against trunk revision 919173. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/20/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/20/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/20/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/20/console This message is automatically generated. On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't. - Key: MAPREDUCE-1518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Environment: On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. 
It runs okay because the directory is not empty and trying to delete it non-recursively fails, but such failure messages only pollute the log file. My proposal is the following: if recursePurge is checking a directory, it should call itself recursively; if it is checking a file, it should do the deletion check. Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Attachments: MAPREDUCE-1518.0.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
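The proposed traversal can be sketched roughly as follows. This is a minimal illustration, not the actual RaidNode code: the class name and the shouldDeleteParityFile() helper are hypothetical stand-ins for the real deletion check.

```java
import java.io.File;

public class RecursePurgeSketch {
  // Recurse into directories; run the deletion check only on plain files,
  // so we never attempt a non-recursive delete of a non-empty directory.
  static void recursePurge(File path) {
    if (path.isDirectory()) {
      File[] children = path.listFiles();
      if (children != null) {
        for (File child : children) {
          recursePurge(child);              // recurse instead of deleting the dir
        }
      }
    } else if (shouldDeleteParityFile(path)) {
      path.delete();                        // deletion check applies to files only
    }
  }

  // Hypothetical placeholder for the real check (e.g. source file gone).
  static boolean shouldDeleteParityFile(File f) {
    return false;
  }
}
```

This avoids the failed-delete log noise entirely, since directories are never passed to the deletion check.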
[jira] Updated: (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BitsOfInfo updated MAPREDUCE-1176: -- Status: Patch Available (was: Open) Retriggering Hudson on the latest patch; could not see output from the last run. Contribution: FixedLengthInputFormat and FixedLengthRecordReader Key: MAPREDUCE-1176 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 0.20.2, 0.20.1 Environment: Any Reporter: BitsOfInfo Priority: Minor Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch Hello, I would like to contribute the following two classes for incorporation into the mapreduce.lib.input package. These two classes can be used when you need to read data from files containing fixed-length (fixed-width) records. Such files have no CR/LF (or any combination thereof) and no delimiters; each record is a fixed length, and extra data is padded with spaces. The data is one gigantic line within a file. Provided are two classes: FixedLengthInputFormat and its corresponding FixedLengthRecordReader. When creating a job that specifies this input format, the job must have the mapreduce.input.fixedlengthinputformat.record.length property set, as follows: myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]); OR myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]); This input format overrides computeSplitSize() to ensure that InputSplits do not contain any partial records, since with fixed records there is no way to determine where a record begins if that were to occur. Each InputSplit passed to the FixedLengthRecordReader will start at the beginning of a record, and the last byte in the InputSplit will be the last byte of a record. 
The override of computeSplitSize() delegates to FileInputFormat's compute method and then adjusts the returned split size as follows: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) * fixedRecordLength). This suite of fixed-length input format classes does not support compressed files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
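The split-size adjustment described above amounts to rounding the computed split size down to a multiple of the record length, so no split ends mid-record. A small self-contained sketch (method and variable names are ours, not the patch's):

```java
public class SplitSizeSketch {
  // Round the split size computed by FileInputFormat down to the nearest
  // multiple of the fixed record length. Integer division performs the
  // floor for non-negative values, matching Math.floor in the description.
  static long adjustSplitSize(long computedSplitSize, int fixedRecordLength) {
    return (computedSplitSize / fixedRecordLength) * fixedRecordLength;
  }

  public static void main(String[] args) {
    // e.g. a 64 MB computed split with 100-byte records
    System.out.println(adjustSplitSize(67108864L, 100)); // prints 67108800
  }
}
```

With 100-byte records, a 64 MB (67108864-byte) computed split is trimmed to 67108800 bytes, exactly 671088 whole records.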
[jira] Created: (MAPREDUCE-1560) Better diagnostic message for tasks killed for going over vmem limit
Better diagnostic message for tasks killed for going over vmem limit Key: MAPREDUCE-1560 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1560 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Reporter: Arun C Murthy Assignee: Arun C Murthy Fix For: 0.22.0 Currently the user has no indication of his tasks getting killed due to the vmem limit; the only way to know is by looking at the TT logs. We should get the TT to insert a diagnostic string for the task to indicate this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BitsOfInfo updated MAPREDUCE-1176: -- Status: Open (was: Patch Available) Retriggering Hudson on the latest patch; could not see output from the last run. Contribution: FixedLengthInputFormat and FixedLengthRecordReader Key: MAPREDUCE-1176 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 0.20.2, 0.20.1 Environment: Any Reporter: BitsOfInfo Priority: Minor Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch Hello, I would like to contribute the following two classes for incorporation into the mapreduce.lib.input package. These two classes can be used when you need to read data from files containing fixed-length (fixed-width) records. Such files have no CR/LF (or any combination thereof) and no delimiters; each record is a fixed length, and extra data is padded with spaces. The data is one gigantic line within a file. Provided are two classes: FixedLengthInputFormat and its corresponding FixedLengthRecordReader. When creating a job that specifies this input format, the job must have the mapreduce.input.fixedlengthinputformat.record.length property set, as follows: myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]); OR myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]); This input format overrides computeSplitSize() to ensure that InputSplits do not contain any partial records, since with fixed records there is no way to determine where a record begins if that were to occur. Each InputSplit passed to the FixedLengthRecordReader will start at the beginning of a record, and the last byte in the InputSplit will be the last byte of a record. 
The override of computeSplitSize() delegates to FileInputFormat's compute method and then adjusts the returned split size as follows: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) * fixedRecordLength). This suite of fixed-length input format classes does not support compressed files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1408) Allow customization of job submission policies
[ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841688#action_12841688 ] Hudson commented on MAPREDUCE-1408: --- Integrated in Hadoop-Mapreduce-trunk-Commit #262 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/262/]) . Add customizable job submission policies to Gridmix. Contributed by Rahul Singh Allow customization of job submission policies -- Key: MAPREDUCE-1408 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: rahul k singh Assignee: rahul k singh Fix For: 0.22.0 Attachments: 1408-1.patch, 1408-2.patch, 1408-2.patch, 1408-20-2.patch, 1408-20-3.patch, 1408-20-4.patch, 1408-20.patch, 1408-3.patch, 1408-4.patch, 1408-5.patch Currently, gridmix3 replays job submissions faithfully. For evaluation purposes, it would be great if we could support other job submission policies such as sequential job submission or stress job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1493) Authorization for job-history pages
[ https://issues.apache.org/jira/browse/MAPREDUCE-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod K V updated MAPREDUCE-1493: - Status: Patch Available (was: Open) Authorization for job-history pages --- Key: MAPREDUCE-1493 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1493 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: jobtracker, security Reporter: Vinod K V Assignee: Vinod K V Fix For: 0.22.0 Attachments: MAPREDUCE-1493-20100222.1.txt, MAPREDUCE-1493-20100225.2.txt, MAPREDUCE-1493-20100226.1.txt, MAPREDUCE-1493-20100227.2-ydist.txt, MAPREDUCE-1493-20100227.3-ydist.txt, MAPREDUCE-1493-20100301.1.txt, MAPREDUCE-1493-20100304.txt MAPREDUCE-1455 introduces authorization for most of the Map/Reduce jsp pages and servlets, but left history pages. This JIRA will make sure that authorization checks are made while accessing job-history pages also. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1065) Modify the mapred tutorial documentation to use new mapreduce api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841716#action_12841716 ] Hadoop QA commented on MAPREDUCE-1065: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12437951/MAPREDUCE-1065.2.patch against trunk revision 919268. +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/501/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/501/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/501/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/501/console This message is automatically generated. Modify the mapred tutorial documentation to use new mapreduce api. -- Key: MAPREDUCE-1065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1065 Project: Hadoop Map/Reduce Issue Type: Bug Components: documentation Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Aaron Kimball Priority: Blocker Fix For: 0.21.0 Attachments: MAPREDUCE-1065.2.patch, MAPREDUCE-1065.patch -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1561) mapreduce patch tests hung with java.lang.OutOfMemoryError: Java heap space
mapreduce patch tests hung with java.lang.OutOfMemoryError: Java heap space - Key: MAPREDUCE-1561 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1561 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Giridharan Kesavan http://hudson.zones.apache.org/hudson/view/Mapreduce/job/Mapreduce-Patch-h9.grid.sp2.yahoo.net/4/console Error form the console: [exec] [junit] 10/03/05 04:08:29 INFO datanode.DataNode: PacketResponder 2 for block blk_-3280111748864197295_19758 terminating [exec] [junit] 10/03/05 04:08:29 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:46067 is added to blk_-3280111748864197295_19758{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[127.0.0.1:46067|RBW], ReplicaUnderConstruction[127.0.0.1:37626|RBW], ReplicaUnderConstruction[127.0.0.1:48886|RBW]]} size 0 [exec] [junit] 10/03/05 04:08:29 INFO hdfs.StateChange: DIR* NameSystem.completeFile: file /tmp/hadoop-hudson/mapred/system/job_20100304162726530_3751/job-info is closed by DFSClient_79157028 [exec] [junit] 10/03/05 04:08:29 INFO mapred.JobTracker: Job job_20100304162726530_3751 added successfully for user 'hudson' to queue 'default' [exec] [junit] 10/03/05 04:08:29 INFO mapred.JobTracker: Initializing job_20100304162726530_3751 [exec] [junit] 10/03/05 04:08:29 INFO mapred.JobInProgress: Initializing job_20100304162726530_3751 [exec] [junit] 10/03/05 04:08:29 INFO mapreduce.Job: Running job: job_20100304162726530_3751 [exec] [junit] 10/03/05 04:08:29 INFO jobhistory.JobHistory: SetupWriter, creating file file:/grid/0/hudson/hudson-slave/workspace/Mapreduce-Patch-h9.grid.sp2.yahoo.net/trunk/build/contrib/raid/test/logs/history/job_20100304162726530_3751_hudson [exec] [junit] 10/03/05 04:08:29 ERROR mapred.JobTracker: Job initialization failed: [exec] [junit] org.apache.avro.AvroRuntimeException: java.lang.NoSuchFieldException: _SCHEMA [exec] [junit] at 
org.apache.avro.specific.SpecificData.createSchema(SpecificData.java:50) [exec] [junit] at org.apache.avro.reflect.ReflectData.getSchema(ReflectData.java:210) [exec] [junit] at org.apache.avro.specific.SpecificDatumWriter.init(SpecificDatumWriter.java:28) [exec] [junit] at org.apache.hadoop.mapreduce.jobhistory.EventWriter.init(EventWriter.java:47) [exec] [junit] at org.apache.hadoop.mapreduce.jobhistory.JobHistory.setupEventWriter(JobHistory.java:252) [exec] [junit] at org.apache.hadoop.mapred.JobInProgress.logSubmissionToJobHistory(JobInProgress.java:710) [exec] [junit] at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:619) [exec] [junit] at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3256) [exec] [junit] at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79) [exec] [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [exec] [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [exec] [junit] at java.lang.Thread.run(Thread.java:619) [exec] [junit] Caused by: java.lang.NoSuchFieldException: _SCHEMA [exec] [junit] at java.lang.Class.getDeclaredField(Class.java:1882) [exec] [junit] at org.apache.avro.specific.SpecificData.createSchema(SpecificData.java:48) [exec] [junit] ... 
11 more [exec] [junit] [exec] [junit] Exception in thread pool-1-thread-3 java.lang.OutOfMemoryError: Java heap space [exec] [junit] at java.util.Arrays.copyOf(Arrays.java:2786) [exec] [junit] at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94) [exec] [junit] at java.io.PrintStream.write(PrintStream.java:430) [exec] [junit] at org.apache.tools.ant.util.TeeOutputStream.write(TeeOutputStream.java:81) [exec] [junit] at java.io.PrintStream.write(PrintStream.java:430) [exec] [junit] at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202) [exec] [junit] at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272) [exec] [junit] at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276) [exec] [junit] at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122) [exec] [junit] at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212) [exec] [junit] at
[jira] Commented: (MAPREDUCE-1556) upgrade to Avro 1.3.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841723#action_12841723 ] Giridharan Kesavan commented on MAPREDUCE-1556: --- The patch test on Hudson has been stuck for 17 hrs; I have to kill this patch test job. https://issues.apache.org/jira/browse/MAPREDUCE-1561 upgrade to Avro 1.3.0 - Key: MAPREDUCE-1556 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1556 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Doug Cutting Assignee: Doug Cutting Fix For: 0.22.0 Attachments: MAPREDUCE-1556.patch Avro 1.3.0 has now been released. HADOOP-6486 and HDFS-892 require it, and the version of Avro used by MapReduce should be synchronized with these projects. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1518) On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841729#action_12841729 ] dhruba borthakur commented on MAPREDUCE-1518: - Code looks good. +1 On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It would be better if it didn't. - Key: MAPREDUCE-1518 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1518 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid Environment: On contrib/raid, the RaidNode currently runs the deletion check for parity files on directories too. It runs okay because the directory is not empty and trying to delete it non-recursively fails, but such failure messages only pollute the log file. My proposal is the following: if recursePurge is checking a directory, it should call itself recursively; if it is checking a file, it should do the deletion check. Reporter: Rodrigo Schmidt Assignee: Rodrigo Schmidt Attachments: MAPREDUCE-1518.0.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1176) Contribution: FixedLengthInputFormat and FixedLengthRecordReader
[ https://issues.apache.org/jira/browse/MAPREDUCE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841730#action_12841730 ] Hadoop QA commented on MAPREDUCE-1176: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12434480/MAPREDUCE-1176-v4.patch against trunk revision 919277. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/22/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/22/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/22/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/22/console This message is automatically generated. 
Contribution: FixedLengthInputFormat and FixedLengthRecordReader Key: MAPREDUCE-1176 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1176 Project: Hadoop Map/Reduce Issue Type: New Feature Affects Versions: 0.20.1, 0.20.2 Environment: Any Reporter: BitsOfInfo Priority: Minor Attachments: MAPREDUCE-1176-v1.patch, MAPREDUCE-1176-v2.patch, MAPREDUCE-1176-v3.patch, MAPREDUCE-1176-v4.patch Hello, I would like to contribute the following two classes for incorporation into the mapreduce.lib.input package. These two classes can be used when you need to read data from files containing fixed-length (fixed-width) records. Such files have no CR/LF (or any combination thereof) and no delimiters; each record is a fixed length, and extra data is padded with spaces. The data is one gigantic line within a file. Provided are two classes: FixedLengthInputFormat and its corresponding FixedLengthRecordReader. When creating a job that specifies this input format, the job must have the mapreduce.input.fixedlengthinputformat.record.length property set, as follows: myJobConf.setInt("mapreduce.input.fixedlengthinputformat.record.length", [myFixedRecordLength]); OR myJobConf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, [myFixedRecordLength]); This input format overrides computeSplitSize() to ensure that InputSplits do not contain any partial records, since with fixed records there is no way to determine where a record begins if that were to occur. Each InputSplit passed to the FixedLengthRecordReader will start at the beginning of a record, and the last byte in the InputSplit will be the last byte of a record. The override of computeSplitSize() delegates to FileInputFormat's compute method and then adjusts the returned split size as follows: (Math.floor(fileInputFormatsComputedSplitSize / fixedRecordLength) * fixedRecordLength). This suite of fixed-length input format classes does not support compressed files. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1562) TestBadRecords fails sometimes
TestBadRecords fails sometimes -- Key: MAPREDUCE-1562 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1562 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Vinod K V Fix For: 0.22.0 TestBadRecords.testMapRed fails sometimes. One instance of this was seen by Hudson while testing MAPREDUCE-890: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/342/testReport/. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.