[jira] Created: (MAPREDUCE-1992) NPE in JobTracker's constructor
NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi
On my local machine, JobTracker is coming up with current trunk. Logs show the following NPE:
2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException
at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1383)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236)
2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894843#action_12894843 ] Ravi Gummadi commented on MAPREDUCE-1992: - Is this related to MAPREDUCE-1945 which added this call to UserGroupInformation.isLoginKeytabBased() recently ? NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi On my local machine, JobTracker is coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1992: Description: On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: was: On my local machine, JobTracker is coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi On my local machine, JobTracker is *not* coming up with current trunk. 
Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1834) TestSimulatorDeterministicReplay timesout on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894966#action_12894966 ] Mahadev konar commented on MAPREDUCE-1834: -- +1 ... I'll go ahead and commit the patch. TestSimulatorDeterministicReplay timesout on trunk -- Key: MAPREDUCE-1834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1834 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/mumak Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Hong Tang Attachments: MAPREDUCE-1834.patch, mr-1834-20100727.patch, mr-1834-20100729.patch, mr-1834-20100802.patch, TestSimulatorDeterministicReplay.log TestSimulatorDeterministicReplay timesout on trunk. See hudson patch build http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/216/testReport/org.apache.hadoop.mapred/TestSimulatorDeterministicReplay/testMain/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1991) taskcontroller allows stealing permissions on any local file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894967#action_12894967 ] Todd Lipcon commented on MAPREDUCE-1991: We also have to fix the checks on permissions - it currently uses argv[0] which is spoofable. Calling stat on /proc/self/exe is going to be more secure (and we've already used Linux-specific code elsewhere in the task controller) taskcontroller allows stealing permissions on any local file Key: MAPREDUCE-1991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1991 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker The linux task-controller setuid binary allows a malicious user to chmod any file on the system to 644 (and as a side effect appends some junk to the end) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1834) TestSimulatorDeterministicReplay timesout on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1834: - Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Fix Version/s: 0.22.0 Resolution: Fixed I just committed this. Thanks hong. TestSimulatorDeterministicReplay timesout on trunk -- Key: MAPREDUCE-1834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1834 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/mumak Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: Hong Tang Fix For: 0.22.0 Attachments: MAPREDUCE-1834.patch, mr-1834-20100727.patch, mr-1834-20100729.patch, mr-1834-20100802.patch, TestSimulatorDeterministicReplay.log TestSimulatorDeterministicReplay timesout on trunk. See hudson patch build http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/216/testReport/org.apache.hadoop.mapred/TestSimulatorDeterministicReplay/testMain/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1991) taskcontroller allows stealing permissions on any local file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894977#action_12894977 ] Allen Wittenauer commented on MAPREDUCE-1991: - Should this get renamed linux-task-controller or be a wrapper to sbin where linux-task-controller lives? This way we can make it pluggable for other OSes... taskcontroller allows stealing permissions on any local file Key: MAPREDUCE-1991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1991 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker The linux task-controller setuid binary allows a malicious user to chmod any file on the system to 644 (and as a side effect appends some junk to the end) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1991) taskcontroller allows stealing permissions on any local file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894985#action_12894985 ] Allen Wittenauer commented on MAPREDUCE-1991: - Or do we want to #ifdef this? [getexecname() under Solaris] taskcontroller allows stealing permissions on any local file Key: MAPREDUCE-1991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1991 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker The linux task-controller setuid binary allows a malicious user to chmod any file on the system to 644 (and as a side effect appends some junk to the end) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-220) Collecting cpu and memory usage for MapReduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894990#action_12894990 ] Scott Chen commented on MAPREDUCE-220: -- Hey Philip, We haven't tested this with JVM reuse, but I think you are right: we need to do some more work for this case. We can still get the correct PID when the JVM is reused, because we use {code} String pid = System.getenv().get(JVM_PID); {code} which is invoked from Task.updateCounters(). So we should be able to get the correct PID for the task whether or not the JVM is reused. The problem is the cumulative CPU time, because the process may have been used by another task for a while. One way to solve this is to send only the current value instead of the cumulative value. Does this sound correct to you? Scott Collecting cpu and memory usage for MapReduce tasks --- Key: MAPREDUCE-220 URL: https://issues.apache.org/jira/browse/MAPREDUCE-220 Project: Hadoop Map/Reduce Issue Type: New Feature Components: task, tasktracker Reporter: Hong Tang Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-220-20100616.txt, MAPREDUCE-220-v1.txt, MAPREDUCE-220.txt It would be nice for TaskTracker to collect cpu and memory usage for individual Map or Reduce tasks over time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1991) taskcontroller allows stealing permissions on any local file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895005#action_12895005 ] Todd Lipcon commented on MAPREDUCE-1991: The java side implementation is already called LinuxTaskController and is pluggable. I'm not against ifdeffing the different implementations, but I think there are some other Linux peculiarities in the code already (eg canonicalize_file_name is GNU specific). taskcontroller allows stealing permissions on any local file Key: MAPREDUCE-1991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1991 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker The linux task-controller setuid binary allows a malicious user to chmod any file on the system to 644 (and as a side effect appends some junk to the end) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1521) Protection against incorrectly configured reduces
[ https://issues.apache.org/jira/browse/MAPREDUCE-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1521: - Attachment: MAPREDUCE-1521-trunk.patch this patch is for trunk. Protection against incorrectly configured reduces - Key: MAPREDUCE-1521 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1521 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Arun C Murthy Assignee: Mahadev konar Priority: Critical Fix For: 0.22.0 Attachments: MAPREDUCE-1521-0.20-yahoo.patch, MAPREDUCE-1521-0.20-yahoo.patch, MAPREDUCE-1521-0.20-yahoo.patch, MAPREDUCE-1521-0.20-yahoo.patch, MAPREDUCE-1521-0.20-yahoo.patch, MAPREDUCE-1521-trunk.patch We've seen a fair number of instances where naive users process huge data-sets (10TB) with badly mis-configured #reduces e.g. 1 reduce. This is a significant problem on large clusters since it takes each attempt of the reduce a long time to shuffle and then run into problems such as local disk-space etc. Then it takes 4 such attempts. Proposal: Come up with heuristics/configs to fail such jobs early. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
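One possible shape for the kind of heuristic proposed above, as a minimal sketch only. The class name, method name, and the per-reduce byte limit are hypothetical illustrations; they are not taken from the attached patches and are not existing Hadoop configuration keys.

```java
/** Hypothetical early check for a badly under-provisioned number of reduces. */
class ReduceCountCheck {
  /**
   * Returns true when the average input per reduce exceeds maxBytesPerReduce.
   * All three parameters are assumptions made for this illustration.
   */
  static boolean looksMisconfigured(long totalInputBytes, int numReduces,
                                    long maxBytesPerReduce) {
    if (numReduces <= 0) {
      return false; // map-only job: there is no reduce to overload
    }
    long bytesPerReduce = totalInputBytes / numReduces;
    return bytesPerReduce > maxBytesPerReduce;
  }
}
```

With a 10 GB limit, a 10 TB input with 1 reduce would be flagged, while the same input spread over 2000 reduces would not. A real check would probably need to reason about map output size rather than raw input size, which this sketch ignores.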
[jira] Updated: (MAPREDUCE-1853) MultipleOutputs does not cache TaskAttemptContext
[ https://issues.apache.org/jira/browse/MAPREDUCE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated MAPREDUCE-1853: --- Fix Version/s: 0.22.0 Affects Version/s: 0.22.0 MultipleOutputs does not cache TaskAttemptContext - Key: MAPREDUCE-1853 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1853 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0, 0.22.0 Environment: OSX 10.6 java6 Reporter: Torsten Curdt Priority: Critical Fix For: 0.21.0, 0.22.0 Attachments: cache-task-attempts.diff
In MultipleOutputs there is
[code]
private TaskAttemptContext getContext(String nameOutput) throws IOException {
  // The following trick leverages the instantiation of a record writer via
  // the job thus supporting arbitrary output formats.
  Job job = new Job(context.getConfiguration());
  job.setOutputFormatClass(getNamedOutputFormatClass(context, nameOutput));
  job.setOutputKeyClass(getNamedOutputKeyClass(context, nameOutput));
  job.setOutputValueClass(getNamedOutputValueClass(context, nameOutput));
  TaskAttemptContext taskContext = new TaskAttemptContextImpl(job.getConfiguration(), context.getTaskAttemptID());
  return taskContext;
}
[code]
so for every reduce call it creates a new Job instance ...which creates a new LocalJobRunner. That does not sound like a good idea. You end up with a flood of
jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
This should probably also be added to 0.22.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
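The caching that the issue title asks for could look roughly like the following. This is a minimal sketch, not the contents of cache-task-attempts.diff: it uses no Hadoop types, and NamedOutputContextCache, its factory argument, and buildCount() are hypothetical names introduced for illustration. The idea is simply that the expensive object is built once per named output and reused on later calls.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Caches one expensive-to-build context object per named output. */
class NamedOutputContextCache<C> {
  private final Map<String, C> contexts = new HashMap<>();
  private final Function<String, C> factory;
  private int builds = 0; // how many times the factory actually ran

  NamedOutputContextCache(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the cached context for nameOutput, building it on first use only. */
  C getContext(String nameOutput) {
    C ctx = contexts.get(nameOutput);
    if (ctx == null) {
      ctx = factory.apply(nameOutput);
      builds++;
      contexts.put(nameOutput, ctx);
    }
    return ctx;
  }

  int buildCount() {
    return builds;
  }
}
```

In the MultipleOutputs case the factory would wrap the body of getContext() shown above, so that a new Job is created only the first time a given named output is seen. The sketch is not thread-safe; it assumes the calls come from a single task thread.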
[jira] Updated: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kan Zhang updated MAPREDUCE-1992: - Attachment: m1992-02.patch Attach a patch. Can't run tests on it since trunk is broken. NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi Attachments: m1992-02.patch On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895072#action_12895072 ] Kan Zhang commented on MAPREDUCE-1992: --
> Is this related to MAPREDUCE-1945
Yes. I manually verified the patch on a secure cluster, but forgot to check when security is turned off.
NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi Attachments: m1992-02.patch
On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE:
2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException
at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1383)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236)
2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895073#action_12895073 ] Matei Zaharia commented on MAPREDUCE-1881: -- Sounds good, I will add a composite class then. I used for loops because other listener systems in Hadoop, such as the JobTrackerListener, use them as well. Improve TaskTrackerInstrumentation -- Key: MAPREDUCE-1881 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1881 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Matei Zaharia Assignee: Matei Zaharia Priority: Minor Attachments: mapreduce-1881.patch The TaskTrackerInstrumentation class provides a useful way to capture key events at the TaskTracker for use in various reporting tools, but it is currently rather limited, because only one TaskTrackerInstrumentation can be added to a given TaskTracker and this objects receives minimal information about tasks (only their IDs). I propose enhancing the functionality through two changes: # Support a comma-separated list of TaskTrackerInstrumentation classes rather than just a single one in the JobConf, and report events to all of them. # Make the reportTaskLaunch and reportTaskEnd methods in TaskTrackerInstrumentation receive a reference to a whole Task object rather than just its TaskAttemptID. It might also be useful to make the latter receive the task's final state, i.e. failed, killed, or successful. I'm just posting this here to get a sense of whether this is a good idea. If people think it's okay, I will make a patch against trunk that implements these changes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
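The composite class mentioned in the comment above might be structured along these lines. This is a minimal sketch under stated assumptions: TaskEventListener is a hypothetical stand-in for TaskTrackerInstrumentation, the task is represented by a plain String id rather than a Task object, and none of these names come from mapreduce-1881.patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the instrumentation callbacks discussed above. */
interface TaskEventListener {
  void reportTaskLaunch(String taskId);
  void reportTaskEnd(String taskId);
}

/** Composite: forwards every event to each registered listener, in order. */
class CompositeTaskEventListener implements TaskEventListener {
  private final List<TaskEventListener> delegates;

  CompositeTaskEventListener(List<? extends TaskEventListener> delegates) {
    this.delegates = new ArrayList<>(delegates); // defensive copy
  }

  @Override
  public void reportTaskLaunch(String taskId) {
    for (TaskEventListener listener : delegates) {
      listener.reportTaskLaunch(taskId);
    }
  }

  @Override
  public void reportTaskEnd(String taskId) {
    for (TaskEventListener listener : delegates) {
      listener.reportTaskEnd(taskId);
    }
  }
}

/** Example listener that records the events it receives. */
class RecordingListener implements TaskEventListener {
  final List<String> events = new ArrayList<>();

  @Override
  public void reportTaskLaunch(String taskId) {
    events.add("launch:" + taskId);
  }

  @Override
  public void reportTaskEnd(String taskId) {
    events.add("end:" + taskId);
  }
}
```

The benefit of the composite is that the caller keeps a single listener reference and the for loops live in one place instead of at every call site.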
[jira] Commented: (MAPREDUCE-1207) Allow admins to set java options for map/reduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895079#action_12895079 ] Dick King commented on MAPREDUCE-1207: --
{{TaskRunner.java}}:
API nomenclature: {{MAPRED_MAP_ADMIN_JAVA_OPTS}} etc. is a property that contains options, not a set of options like {{DEFAULT_MAPRED_ADMIN_JAVA_OPTS}}, and should probably be named {{MAP_ADMIN_JAVA_OPTS_PROPNAME}}. There should probably be such a property for mapred as well as the two separate properties for map and reduce.
Funny return value: {{getVMEnvironment}} [and after the patch, {{setEnvFromInputString}}] only ever returns its first parameter, {{errorInfo}}, if it returns at all. The return value is certainly not pulling its weight, and the method should be {{void}}.
Missing functionality: Environment value substitution is extremely specialized. It can basically only handle expansions to path-like environment variables. The original code had this property, but with the new code it looks like we should be able to handle more cases. In particular, {{updateUserLoginEnv(...)}} sets several environment variables such as {{USER}}. The administrator is likely to want to be able to expand {{$USER}} in a new environment variable value [i.e., perhaps a path name].
{{TestTaskEnvironment.java}}:
Environment variable substitution is not tested, even the limited version that we support.
Allow admins to set java options for map/reduce tasks - Key: MAPREDUCE-1207 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1207 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client Reporter: Arun C Murthy Assignee: Krishna Ramachandran Attachments: mapred-1207.patch
It would be useful to allow cluster admins to set some java options for child map/reduce tasks. E.g. we've had to ask users to set -Djava.net.preferIPv4Stack=true in their jobs; it would be nice to do it for all users in such scenarios even when people override mapred.child.{map|reduce}.java.opts but forget to add this.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
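To make the request for more general substitution concrete, here is a minimal sketch of expanding $NAME references such as $USER inside an environment value. It is not the TaskRunner code and not part of mapred-1207.patch; EnvExpander and expand() are hypothetical names, and the choice to expand unknown variables to the empty string is an assumption made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that expands $NAME references in a value. */
class EnvExpander {
  // Matches $NAME where NAME is a conventional shell-style identifier.
  private static final Pattern VAR =
      Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

  /** Replaces each $NAME with env.get(NAME); unknown names expand to "". */
  static String expand(String value, Map<String, String> env) {
    Matcher m = VAR.matcher(value);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String replacement = env.get(m.group(1));
      if (replacement == null) {
        replacement = "";
      }
      // quoteReplacement keeps '$' and '\' in the value from being reinterpreted.
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

For example, with USER set to alice, expanding /scratch/$USER/tmp gives /scratch/alice/tmp. A test along these lines is the kind of coverage the comment finds missing from TestTaskEnvironment.java.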
[jira] Resolved: (MAPREDUCE-1958) using delegation token over hftp for long running clients (part of hdfs 1296)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das resolved MAPREDUCE-1958. Fix Version/s: 0.22.0 Resolution: Fixed I just committed this. Thanks Boris and Owen! using delegation token over hftp for long running clients (part of hdfs 1296) - Key: MAPREDUCE-1958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1958 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Boris Shkolnik Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: MAPREDUCE-1958-1.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kan Zhang updated MAPREDUCE-1992: - Status: Patch Available (was: Open) NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi Attachments: m1992-02.patch On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1921) IOExceptions should contain the filename of the broken input files
[ https://issues.apache.org/jira/browse/MAPREDUCE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895096#action_12895096 ] Dick King commented on MAPREDUCE-1921: -- {{MapTask.java}} When creating a new exception to re-throw, perhaps we should duplicate the original exception's type? I realize that there's a lot of code that doesn't do that, but swallowing this information feels wrong. Especially in the case of IO exceptions, I can envision a lot of code that wants to treat certain sub-exceptions a bit differently; some IO exceptions are the caller's fault but some represent oddities that happened in the file system. Perhaps something like {noformat} throw (IOException)(ioe.getClass().getConstructor(String.class, ioe.getClass()) .newInstance(IO error ... + ... , ioe)); {noformat} ? This cliche should occur in two places; perhaps we should pull it out and put it in one of the utilities classes? Perhaps IO utilities, since the main use case where I can see people caring is the large variety of {{IOException}} s. IOExceptions should contain the filename of the broken input files -- Key: MAPREDUCE-1921 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1921 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Krishna Ramachandran Assignee: Krishna Ramachandran Attachments: mapred-1921-1.patch, mapred-1921-3.patch, mapreduce-1921.patch If bzip or other decompression fails, the IOException does not contain the name of the broken file that caused the exception. It would be nice if such actions could be avoided in the future by having the name of the files that are broken spelled out in the exception. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
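A utility of the kind suggested above could be sketched as follows. This is an illustration only, not code from the attached patches; IOExceptionWrapper and withFileName() are hypothetical names. One deliberate difference from the snippet in the comment: many IOException subclasses (FileNotFoundException, for example) only offer a (String) constructor, so this sketch looks up that constructor and attaches the cause with initCause(), falling back to a plain IOException when the lookup or instantiation fails.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Hypothetical helper that preserves the exception type when adding a file name. */
class IOExceptionWrapper {
  /**
   * Builds an exception of the same class as ioe whose message names the file.
   * Falls back to a plain IOException when no usable constructor exists.
   */
  static IOException withFileName(IOException ioe, String fileName) {
    String message = "IO error in file " + fileName + ": " + ioe.getMessage();
    try {
      // Look up the public (String) constructor of the original exception class.
      Constructor<? extends IOException> ctor =
          ioe.getClass().getConstructor(String.class);
      IOException wrapped = ctor.newInstance(message);
      wrapped.initCause(ioe); // keep the original stack trace reachable
      return wrapped;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // No public (String) constructor, or the cause was already set.
      return new IOException(message, ioe);
    }
  }
}
```

A caller would then write `throw IOExceptionWrapper.withFileName(ioe, path)` in both places, and code that catches specific subclasses keeps working.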
[jira] Updated: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated MAPREDUCE-1992: --- Hadoop Flags: [Reviewed] +1. This should work for both secure and unsecure. NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi Attachments: m1992-02.patch On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE: 2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1383) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236) 2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG: -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1921) IOExceptions should contain the filename of the broken input files
[ https://issues.apache.org/jira/browse/MAPREDUCE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895116#action_12895116 ] Krishna Ramachandran commented on MAPREDUCE-1921: - Thx Dick I will take a look and revise suitably or explain Regards Krishna On 8/3/10 5:16 PM, Dick King (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/MAPREDUCE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12895096#action_12895096 ] Dick King commented on MAPREDUCE-1921: -- {{MapTask.java}} When creating a new exception to re-throw, perhaps we should duplicate the original exception's type? I realize that there's a lot of code that doesn't do that, but swallowing this information feels wrong. Especially in the case of IO exceptions, I can envision a lot of code that wants to treat certain sub-exceptions a bit differently; some IO exceptions are the caller's fault but some represent oddities that happened in the file system. Perhaps something like {noformat} throw (IOException)(ioe.getClass().getConstructor(String.class, ioe.getClass()) .newInstance(IO error ... + ... , ioe)); {noformat} ? This cliche should occur in two places; perhaps we should pull it out and put it in one of the utilities classes? Perhaps IO utilities, since the main use case where I can see people caring is the large variety of {{IOException}} s. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Commented: (MAPREDUCE-1992) NPE in JobTracker's constructor
[ https://issues.apache.org/jira/browse/MAPREDUCE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895121#action_12895121 ] Hadoop QA commented on MAPREDUCE-1992: --
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12451174/m1992-02.patch against trunk revision 982087.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/349/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/349/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/349/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/349/console
This message is automatically generated.
NPE in JobTracker's constructor --- Key: MAPREDUCE-1992 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1992 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ravi Gummadi Attachments: m1992-02.patch
On my local machine, JobTracker is *not* coming up with current trunk. Logs show the following NPE:
2010-08-03 14:01:41,449 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.NullPointerException
at org.apache.hadoop.security.UserGroupInformation.isLoginKeytabBased(UserGroupInformation.java:703)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1383)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4236)
2010-08-03 14:01:41,449 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-220) Collecting cpu and memory usage for MapReduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895129#action_12895129 ] Philip Zeyliger commented on MAPREDUCE-220: ---
Hi Scott,
You could also reset the counters to 0 when the new task is started (sort of like a tare button on a scale). If resourceCalculator.getProcCumulativeCpuTime() were instead resourceCalculator.getCumulativeCpuTimeDelta() [cumulative CPU time since last call], you could use counter.incr() for the CPU usage.
It's also worth mentioning that the memory usage here is the last-known memory usage value. It's not byte-seconds (which wouldn't be that useful), nor is it maximum memory. That seems useful, but it's a bit unintuitive.
{noformat}
+long cpuTime = resourceCalculator.getProcCumulativeCpuTime();
+long pMem = resourceCalculator.getProcPhysicalMemorySize();
+long vMem = resourceCalculator.getProcVirtualMemorySize();
+counters.findCounter(TaskCounter.CPU_MILLISECONDS).setValue(cpuTime);
+counters.findCounter(TaskCounter.PHYSICAL_MEMORY_BYTES).setValue(pMem);
+counters.findCounter(TaskCounter.VIRTUAL_MEMORY_BYTES).setValue(vMem);
{noformat}
Collecting cpu and memory usage for MapReduce tasks --- Key: MAPREDUCE-220 URL: https://issues.apache.org/jira/browse/MAPREDUCE-220 Project: Hadoop Map/Reduce Issue Type: New Feature Components: task, tasktracker Reporter: Hong Tang Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-220-20100616.txt, MAPREDUCE-220-v1.txt, MAPREDUCE-220.txt It would be nice for TaskTracker to collect cpu and memory usage for individual Map or Reduce tasks over time.
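The delta idea suggested in this comment might look roughly like the sketch below. It is an illustration only, not code from the patch: `CpuTimeDelta` and `nextDelta()` are hypothetical names, and a plain `LongSupplier` stands in for the resource calculator's cumulative CPU reading. The first reading is taken as the baseline (the "tare"), so CPU time accumulated before the task started is not charged to it.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: turns a cumulative CPU-time reading into per-call deltas. */
final class CpuTimeDelta {
    private final LongSupplier cumulativeMillis; // stand-in for a cumulative CPU time source
    private long last;

    CpuTimeDelta(LongSupplier cumulativeMillis) {
        this.cumulativeMillis = cumulativeMillis;
        this.last = cumulativeMillis.getAsLong(); // "tare": baseline taken at task start
    }

    /** Returns the CPU milliseconds consumed since the previous call (or since construction). */
    long nextDelta() {
        long now = cumulativeMillis.getAsLong();
        long delta = Math.max(0L, now - last); // never report a negative delta if the source resets
        last = now;
        return delta;
    }

    public static void main(String[] args) {
        long[] readings = {500L, 650L, 900L}; // simulated cumulative readings
        int[] next = {0};
        CpuTimeDelta tracker = new CpuTimeDelta(() -> readings[next[0]++]);

        long counter = 0; // plays the role of a counter that is incremented, not overwritten
        counter += tracker.nextDelta();
        counter += tracker.nextDelta();
        System.out.println(counter); // prints 400
    }
}
```

With deltas, the counter can be incremented on every update instead of being overwritten with `setValue`, which is the distinction the comment draws.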
[jira] Updated: (MAPREDUCE-1670) RAID should avoid policies that scan their own destination path
[ https://issues.apache.org/jira/browse/MAPREDUCE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-1670: --- Status: Open (was: Patch Available) RAID should avoid policies that scan their own destination path --- Key: MAPREDUCE-1670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1670 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.22.0 Reporter: Rodrigo Schmidt Assignee: Ramkumar Vadali Fix For: 0.22.0 Attachments: MAPREDUCE-1670.patch Raid currently allows policies that include the destination directory in the source directory and vice versa. Both situations can create cycles and should be avoided.
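A check of the kind this issue asks for could be sketched as follows. This is not the attached patch: `RaidPolicyPathCheck` and `overlaps` are hypothetical names, and `java.nio.file.Path` is used here only as a stand-in for Hadoop's own path handling. The idea is to reject a policy when the source contains the destination or the destination contains the source, comparing whole path components rather than raw string prefixes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical validator: flags a policy whose source and destination nest in either direction. */
final class RaidPolicyPathCheck {
    /** True if one path equals or contains the other, compared component by component. */
    static boolean overlaps(String source, String destination) {
        Path src = Paths.get(source).normalize();
        Path dst = Paths.get(destination).normalize();
        return src.startsWith(dst) || dst.startsWith(src);
    }

    public static void main(String[] args) {
        System.out.println(overlaps("/user/data", "/raid"));    // false: unrelated trees
        System.out.println(overlaps("/user", "/user/raid"));    // true: destination inside source
        System.out.println(overlaps("/raid/archive", "/raid")); // true: source inside destination
        System.out.println(overlaps("/raid2", "/raid"));        // false: shared string prefix only
    }
}
```

Comparing by component avoids treating `/raid2` as a child of `/raid`, which a plain `String.startsWith` test would do.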
[jira] Updated: (MAPREDUCE-1670) RAID should avoid policies that scan their own destination path
[ https://issues.apache.org/jira/browse/MAPREDUCE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-1670: --- Status: Patch Available (was: Open) RAID should avoid policies that scan their own destination path --- Key: MAPREDUCE-1670 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1670 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.22.0 Reporter: Rodrigo Schmidt Assignee: Ramkumar Vadali Fix For: 0.22.0 Attachments: MAPREDUCE-1670.patch Raid currently allows policies that include the destination directory in the source directory and vice versa. Both situations can create cycles and should be avoided.
[jira] Updated: (MAPREDUCE-1668) RaidNode should only Har a directory if all its parity files have been created
[ https://issues.apache.org/jira/browse/MAPREDUCE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-1668: --- Status: Open (was: Patch Available) RaidNode should only Har a directory if all its parity files have been created -- Key: MAPREDUCE-1668 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1668 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.22.0 Reporter: Rodrigo Schmidt Assignee: Ramkumar Vadali Fix For: 0.22.0 Attachments: MAPREDUCE-1668.patch In the current code, it can happen that a directory will be Archived (Har'ed) before all its parity files have been generated since parity file generation is not atomic. We should verify if all the parity files are present before Archiving a directory.
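The verification described in this issue can be illustrated with a small sketch. It makes several assumptions that are not in the original: `ParityCompletenessCheck` and `readyToArchive` are hypothetical names, paths are plain strings, and the mapping from a source file to its parity file is passed in as a function because the real RAID layout is not described here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical pre-archive check: every source file must already have its parity file. */
final class ParityCompletenessCheck {
    static boolean readyToArchive(List<String> sourceFiles,
                                  Set<String> existingParityFiles,
                                  Function<String, String> parityPathFor) {
        for (String source : sourceFiles) {
            if (!existingParityFiles.contains(parityPathFor.apply(source))) {
                return false; // at least one parity file is still missing
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Naming scheme assumed for illustration only: parity lives under "/parity" + source path.
        Function<String, String> scheme = src -> "/parity" + src;
        List<String> sources = Arrays.asList("/data/a", "/data/b");
        Set<String> parity = new HashSet<>(Arrays.asList("/parity/data/a"));

        System.out.println(readyToArchive(sources, parity, scheme)); // false: /data/b has no parity yet
        parity.add("/parity/data/b");
        System.out.println(readyToArchive(sources, parity, scheme)); // true
    }
}
```

Because parity generation is not atomic, a check like this is only meaningful if it runs immediately before the directory is archived.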
[jira] Updated: (MAPREDUCE-1668) RaidNode should only Har a directory if all its parity files have been created
[ https://issues.apache.org/jira/browse/MAPREDUCE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali updated MAPREDUCE-1668: --- Status: Patch Available (was: Open) RaidNode should only Har a directory if all its parity files have been created -- Key: MAPREDUCE-1668 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1668 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Affects Versions: 0.22.0 Reporter: Rodrigo Schmidt Assignee: Ramkumar Vadali Fix For: 0.22.0 Attachments: MAPREDUCE-1668.patch In the current code, it can happen that a directory will be Archived (Har'ed) before all its parity files have been generated since parity file generation is not atomic. We should verify if all the parity files are present before Archiving a directory.
[jira] Updated: (MAPREDUCE-1790) Automatic resolution of Lzo codecs is needed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1790: -- Attachment: build_change.patch Failed to get the build.xml changes with previous checkin. Automatic resolution of Lzo codecs is needed. - Key: MAPREDUCE-1790 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1790 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build Environment: Herriot system test case automation Reporter: Balaji Rajagopalan Assignee: Giridharan Kesavan Attachments: build_change.patch, ivy_lzcodec.patch, ivy_lzcodec_1.patch, lzcodec_fix.txt The test cases are failing due to non-availability of the jar hadoop-gpl-compression-0.1.0-1005060043.jar; changes to the aop xml are needed to fix this.
[jira] Commented: (MAPREDUCE-1991) taskcontroller allows stealing permissions on any local file
[ https://issues.apache.org/jira/browse/MAPREDUCE-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895136#action_12895136 ] Vinod K V commented on MAPREDUCE-1991: --
bq. The -l option is to enable logging in the taskcontroller. AFAIK, we have never really used this. Should we knock it out?
A big +1. Note that we do depend on the fact that the output is piped into TaskTracker JVM and eventually into the TaskTracker's logs. So we should retain the logging to stdout/stderr but just knock out the command line option.
bq. We also have to fix the checks on permissions - it currently uses argv[0] which is spoofable. Calling stat on /proc/self/exe is going to be more secure (and we've already used Linux-specific code elsewhere in the task controller)
Can you please file a new ticket?
taskcontroller allows stealing permissions on any local file Key: MAPREDUCE-1991 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1991 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker The linux task-controller setuid binary allows a malicious user to chmod any file on the system to 644 (and as a side effect appends some junk to the end)