[jira] [Updated] (MAPREDUCE-3725) Hadoop 22 hadoop job -list returns user name as NULL
[ https://issues.apache.org/jira/browse/MAPREDUCE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3725: --- Fix Version/s: (was: 0.22.0) Target Version/s: 0.22.1 Status: Open (was: Patch Available) Mayank, could you please update the jira on the running of the test target. +1 The patch looks good Hadoop 22 hadoop job -list returns user name as NULL Key: MAPREDUCE-3725 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3725 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.22.1 Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: patch-MAPREDUCE-3725.patch Hadoop 22 hadoop job -list returns user name as NULL -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3593) MAPREDUCE Impersonation is not working in 22
[ https://issues.apache.org/jira/browse/MAPREDUCE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3593: --- Resolution: Fixed Target Version/s: 0.22.1 (was: 0.22.1, 0.22.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this. Thank you Mayank. MAPREDUCE Impersonation is not working in 22 Key: MAPREDUCE-3593 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3593 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 0.22.0 Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 0.22.1 Attachments: MAPREDUCE-3593.patch
[jira] [Updated] (MAPREDUCE-3594) Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-3594:
---
Description:
The TestUlimit test works as follows: The testcase sets the upper limit for virtual memory to 768 MB in the jobconf and starts a map-only job. The task gets the applicable ulimit from the shell and writes it as the output. The testcase waits for the job to complete and compares the job output with the ulimit originally set in the jobconf.

But this testcase fails because all the task attempts fail with the following exception:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:225)
Caused by: java.io.IOException: Task process exit with nonzero status of 134.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:212)

So there is no job output. The test passes on my developer machine, but fails on the build machine, which is a VM. The build machine OS is Red Hat Enterprise Linux Server release 6.1 (Santiago).

Target Version/s: 0.22.1

Contrib/Streaming - Test org.apache.hadoop.streaming.TestUlimit fails on VM
Key: MAPREDUCE-3594 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3594 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Affects Versions: 0.22.0 Environment: Red Hat Enterprise Linux Server release 6.1 (Santiago) Reporter: Benoy Antony Priority: Minor
[jira] [Updated] (MAPREDUCE-2669) Some new examples and test cases for them.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2669: --- Fix Version/s: 0.24.0 Some new examples and test cases for them. -- Key: MAPREDUCE-2669 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2669 Project: Hadoop Map/Reduce Issue Type: Test Components: examples Affects Versions: 0.22.0 Reporter: Plamen Jeliazkov Priority: Minor Fix For: 0.24.0 Attachments: MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, MAPREDUCE-2669.patch, mapreduce-new-examples-0.22.patch Original Estimate: 48h Remaining Estimate: 48h Looking to add some more examples such as Mean, Median, and Standard Deviation to the examples. I have some generic JUnit testcases as well.
[jira] [Updated] (MAPREDUCE-2103) task-controller shouldn't require o-r permissions
[ https://issues.apache.org/jira/browse/MAPREDUCE-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-2103:
---
Fix Version/s: 0.22.0

task-controller shouldn't require o-r permissions
Key: MAPREDUCE-2103 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2103 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task-controller Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 0.22.0 Attachments: mapreduce-2103-20x.patch, mapreduce-2103.txt, mapreduce-2103.txt

The task-controller currently checks that other users don't have read permissions. This is unnecessary - we just need to make sure it's not executable. The Debian policy manual explains it well:

{quote}
Setuid and setgid executables should be mode 4755 or 2755 respectively, and owned by the appropriate user or group. They should not be made unreadable (modes like 4711 or 2711 or even 4111); doing so achieves no extra security, because anyone can find the binary in the freely available Debian package; it is merely inconvenient. For the same reason you should not restrict read or execute permissions on non-set-id executables.

Some setuid programs need to be restricted to particular sets of users, using file permissions. In this case they should be owned by the uid to which they are set-id, and by the group which should be allowed to execute them. They should have mode 4754; again there is no point in making them unreadable to those users who must not be allowed to execute them.
{quote}
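The difference between the current check and the proposed one comes down to which "other" permission bits are rejected. The following is a minimal sketch in Java, not the task-controller's actual code (which is not shown in this message); the class and method names are hypothetical and the mode values are the ones quoted from the Debian policy manual.

```java
// Hypothetical illustration of the two permission checks discussed above.
class SetuidModeCheck {
    // POSIX permission bits for "other" users.
    static final int OTHER_READ = 0004;
    static final int OTHER_EXEC = 0001;

    /** Strict rule (the behavior the issue describes): others may neither read nor execute. */
    static boolean strictCheck(int mode) {
        return (mode & (OTHER_READ | OTHER_EXEC)) == 0;
    }

    /** Relaxed rule (the behavior the issue asks for): others may read, but must not execute. */
    static boolean relaxedCheck(int mode) {
        return (mode & OTHER_EXEC) == 0;
    }

    public static void main(String[] args) {
        int[] modes = {04750, 04754, 04755};
        for (int mode : modes) {
            System.out.println(Integer.toOctalString(mode)
                + " strict=" + strictCheck(mode)
                + " relaxed=" + relaxedCheck(mode));
        }
    }
}
```

Under these assumptions, mode 4754 is rejected by the strict check only because of the read bit, which is the case the issue calls unnecessary.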
[jira] [Updated] (MAPREDUCE-1778) CompletedJobStatusStore initialization should fail if {mapred.job.tracker.persist.jobstatus.dir} is unwritable
[ https://issues.apache.org/jira/browse/MAPREDUCE-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1778: --- Fix Version/s: 0.22.0 CompletedJobStatusStore initialization should fail if {mapred.job.tracker.persist.jobstatus.dir} is unwritable -- Key: MAPREDUCE-1778 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1778 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Amar Kamat Assignee: Krishna Ramachandran Fix For: 0.22.0 Attachments: mapred-1778-1.patch, mapred-1778-2.patch, mapred-1778-3.patch, mapred-1778-4.patch, mapred-1778.20S-1.patch, mapred-1778.20S.patch, mapred-1778.patch If {mapred.job.tracker.persist.jobstatus.dir} points to an unwritable location or mkdir of {mapred.job.tracker.persist.jobstatus.dir} fails, then CompletedJobStatusStore silently ignores the failure and disables CompletedJobStatusStore. Ideally the JobTracker should bail out early indicating a misconfiguration.
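The fail-fast behavior requested here can be sketched with plain JDK file APIs. This is only an illustration under assumptions: the real CompletedJobStatusStore works against a Hadoop FileSystem, which is not used below, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the directory check at store initialization.
class StatusStoreDirCheck {
    /**
     * Creates the directory if it is missing and throws instead of silently
     * disabling the store when it cannot be created or is not writable.
     */
    static Path requireWritableDir(Path dir) throws IOException {
        // Throws an IOException if the directory cannot be created,
        // for example when the path already exists as a regular file.
        Files.createDirectories(dir);
        if (!Files.isWritable(dir)) {
            throw new IOException("Job status directory is not writable: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("jobstatus");
        System.out.println("Using " + requireWritableDir(base.resolve("store")));
    }
}
```

Letting the exception propagate out of initialization is the design choice that makes a misconfiguration visible at startup rather than later, when job status is unexpectedly missing.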
[jira] [Updated] (MAPREDUCE-2093) Herriot JT and TT clients should vend statistics
[ https://issues.apache.org/jira/browse/MAPREDUCE-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2093: --- Fix Version/s: 0.22.0 Herriot JT and TT clients should vend statistics Key: MAPREDUCE-2093 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2093 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 0.22.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.22.0 Attachments: MAPREDUCE-2093.patch, MAPREDUCE-2093.patch, MAPREDUCE-2093.patch Mapreduce counterpart of HDFS-1408
[jira] [Updated] (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1100: --- Priority: Major (was: Blocker) I agree. Unblocking this. Target for 0.22.1 User's task-logs filling up local disks on the TaskTrackers --- Key: MAPREDUCE-1100 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.20.1, 0.20.2, 0.21.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 0.21.1, 0.22.0 Attachments: MAPREDUCE-1100-20091102.txt, MAPREDUCE-1100-20091106.txt, MAPREDUCE-1100-20091216.2.txt, patch-1100-fix-ydist.2.txt, reducetask-log-level.patch Some user's jobs are filling up TT disks by outrageous logging. mapreduce.task.userlog.limit.kb is not enabled on the cluster. Disks are getting filled up before task-log cleanup via mapred.task.userlog.retain.hours can kick in.
[jira] [Updated] (MAPREDUCE-1716) Truncate logs of finished tasks to prevent node thrash due to excessive logging
[ https://issues.apache.org/jira/browse/MAPREDUCE-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1716: --- Priority: Major (was: Blocker) Unblocking this. Target for 0.22.1 Truncate logs of finished tasks to prevent node thrash due to excessive logging --- Key: MAPREDUCE-1716 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1716 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task, tasktracker Affects Versions: 0.20.3, 0.21.0, 0.21.1, 0.22.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: 0.22.0 Attachments: MAPREDUCE-1100-20091216.2.txt, mapreduce-1716-testcase-race.txt, patch-1100-fix-ydist.2.txt, patch-log-truncation-bugs-20100514.txt
[jira] [Updated] (MAPREDUCE-3026) When user adds hierarchical queues to the cluster, mapred queue -list returns NULL Pointer Exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3026: --- Description: When a user adds hierarchical queues and tries to list them from the command line using mapred queue -list, it returns a Null Pointer Exception. Assignee: Mayank Bansal When user adds hierarchical queues to the cluster, mapred queue -list returns NULL Pointer Exception Key: MAPREDUCE-3026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3026 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Mayank Bansal Assignee: Mayank Bansal Labels: patch Fix For: 0.22.0 Attachments: patch-22, patch-22.patch
[jira] [Updated] (MAPREDUCE-2939) Ant setup on hadoop7 jenkins host
[ https://issues.apache.org/jira/browse/MAPREDUCE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2939: --- Assignee: Joep Rottinghuis Issue Type: Task (was: Bug) Ant setup on hadoop7 jenkins host - Key: MAPREDUCE-2939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2939 Project: Hadoop Map/Reduce Issue Type: Task Components: build Affects Versions: 0.22.0 Environment: Jenkins: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-Mapreduce-22-branch/65/console Reporter: Joep Rottinghuis Assignee: Joep Rottinghuis Fix For: 0.22.0 From the build error it looks like a) ant is not set up on the machine b) $ANT_HOME points to the wrong spot
[jira] [Updated] (MAPREDUCE-1853) MultipleOutputs does not cache TaskAttemptContext
[ https://issues.apache.org/jira/browse/MAPREDUCE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-1853:
---
Component/s: task
Description:
In MultipleOutputs there is

{code}
private TaskAttemptContext getContext(String nameOutput) throws IOException {
  // The following trick leverages the instantiation of a record writer via
  // the job thus supporting arbitrary output formats.
  Job job = new Job(context.getConfiguration());
  job.setOutputFormatClass(getNamedOutputFormatClass(context, nameOutput));
  job.setOutputKeyClass(getNamedOutputKeyClass(context, nameOutput));
  job.setOutputValueClass(getNamedOutputValueClass(context, nameOutput));
  TaskAttemptContext taskContext = new TaskAttemptContextImpl(job.getConfiguration(), context.getTaskAttemptID());
  return taskContext;
}
{code}

so for every reduce call it creates a new Job instance ...which creates a new LocalJobRunner. That does not sound like a good idea. You end up with a flood of

jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

This should probably also be added to 0.22.

was:
In MultipleOutputs there is

[code]
private TaskAttemptContext getContext(String nameOutput) throws IOException {
  // The following trick leverages the instantiation of a record writer via
  // the job thus supporting arbitrary output formats.
  Job job = new Job(context.getConfiguration());
  job.setOutputFormatClass(getNamedOutputFormatClass(context, nameOutput));
  job.setOutputKeyClass(getNamedOutputKeyClass(context, nameOutput));
  job.setOutputValueClass(getNamedOutputValueClass(context, nameOutput));
  TaskAttemptContext taskContext = new TaskAttemptContextImpl(job.getConfiguration(), context.getTaskAttemptID());
  return taskContext;
}
[code]

so for every reduce call it creates a new Job instance ...which creates a new LocalJobRunner. That does not sound like a good idea. You end up with a flood of

jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized

This should probably also be added to 0.22.

Assignee: Torsten Curdt

MultipleOutputs does not cache TaskAttemptContext
Key: MAPREDUCE-1853 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1853 Project: Hadoop Map/Reduce Issue Type: Bug Components: task Affects Versions: 0.21.0, 0.22.0 Environment: OSX 10.6 java6 Reporter: Torsten Curdt Assignee: Torsten Curdt Priority: Critical Fix For: 0.21.0, 0.22.0 Attachments: cache-task-attempts.diff
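The issue title suggests caching one context per named output instead of building a new Job on every call. The attached cache-task-attempts.diff is not reproduced in this message, so the following is only a generic sketch of that caching idea using plain JDK types; the class name NamedOutputCache and the factory parameter are hypothetical, and the expensive context construction is represented by an arbitrary function.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-name cache: the factory runs at most once per named output.
class NamedOutputCache<C> {
    private final Map<String, C> cache = new HashMap<>();
    private final Function<String, C> factory;

    NamedOutputCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the cached context for this name, creating it on first use. */
    C get(String nameOutput) {
        return cache.computeIfAbsent(nameOutput, factory);
    }

    public static void main(String[] args) {
        int[] created = {0};
        NamedOutputCache<String> contexts = new NamedOutputCache<>(name -> {
            created[0]++;                 // stands in for the costly Job construction
            return "context-for-" + name;
        });
        for (int i = 0; i < 1000; i++) {
            contexts.get("text");         // repeated calls, as from every reduce() invocation
        }
        System.out.println("contexts created: " + created[0]);
    }
}
```

The sketch uses an unsynchronized HashMap on the assumption that a task writes its outputs from a single thread; a concurrent map would be needed otherwise.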
[jira] [Updated] (MAPREDUCE-1517) streaming should support running on background
[ https://issues.apache.org/jira/browse/MAPREDUCE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1517: --- Affects Version/s: 0.22.0 Assignee: Bochun Bai streaming should support running on background -- Key: MAPREDUCE-1517 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1517 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/streaming Affects Versions: 0.22.0 Reporter: Bochun Bai Assignee: Bochun Bai Fix For: 0.22.0 Attachments: contrib-streaming-background-with-test-with-license.patch, contrib-streaming-background-with-test-with-license2.patch, contrib-streaming-background-with-test.patch, contrib-streaming-background.patch, contrib-streaming-background.patch, hadoop-mapred-patch-logs.tar.gz StreamJob submits the job and uses a while loop to monitor the progress. I would prefer it to run in the background. Appending & to the end of the command is an alternative solution, but it keeps a Java process running on the client machine. When hundreds of jobs are submitted at the same time, the client machine is overloaded. I propose adding a -background option to StreamJob that tells it to only submit the job and not monitor the progress.
[jira] [Updated] (MAPREDUCE-1248) Redundant memory copying in StreamKeyValUtil
[ https://issues.apache.org/jira/browse/MAPREDUCE-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1248: --- Affects Version/s: 0.22.0 Assignee: Ruibang He Redundant memory copying in StreamKeyValUtil Key: MAPREDUCE-1248 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1248 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/streaming Affects Versions: 0.22.0 Reporter: Ruibang He Assignee: Ruibang He Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1248-v1.0.patch I found that when MROutputThread collects the output of the Reducer, it calls StreamKeyValUtil.splitKeyVal(), and two local byte arrays are allocated there for each line of output. Later these two byte arrays are passed to the variables key and val. The memory is copied twice here: once by the System.arraycopy() method, and again inside key.set() / val.set(). This doubles the memory copying for the whole output (which may lead to higher CPU consumption) and causes frequent temporary object allocation.
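One way to avoid the intermediate arrays is to copy each half of the line straight from the source buffer into the reusable key and value holders. The sketch below is an assumption about the shape of such a fix, not the content of MAPREDUCE-1248-v1.0.patch; Buf is a hypothetical stand-in for a Text-like class with a set(bytes, start, length) method, and treating a line without a separator as "whole line is the key" is a choice made for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical single-copy split: no temporary byte arrays per line.
class SplitWithoutCopy {
    /** Minimal reusable byte holder standing in for a Text-like key/value object. */
    static final class Buf {
        private byte[] bytes = new byte[0];
        private int length;

        void set(byte[] src, int start, int len) {
            if (bytes.length < len) {
                bytes = new byte[len];    // grow only when needed, then reuse
            }
            System.arraycopy(src, start, bytes, 0, len);
            length = len;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Splits line[0..length) at the first separator, copying each half exactly once. */
    static void splitKeyVal(byte[] line, int length, byte separator, Buf key, Buf val) {
        int pos = -1;
        for (int i = 0; i < length; i++) {
            if (line[i] == separator) {
                pos = i;
                break;
            }
        }
        if (pos < 0) {                    // no separator: whole line is the key
            key.set(line, 0, length);
            val.set(line, 0, 0);
            return;
        }
        key.set(line, 0, pos);
        val.set(line, pos + 1, length - pos - 1);
    }

    public static void main(String[] args) {
        Buf key = new Buf();
        Buf val = new Buf();
        byte[] line = "word\t42".getBytes(StandardCharsets.UTF_8);
        splitKeyVal(line, line.length, (byte) '\t', key, val);
        System.out.println(key + " -> " + val);
    }
}
```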
[jira] [Updated] (MAPREDUCE-1118) Capacity Scheduler scheduling information is hard to read / should be tabular format
[ https://issues.apache.org/jira/browse/MAPREDUCE-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1118: --- Component/s: contrib/capacity-sched Assignee: Milind Bhandarkar Capacity Scheduler scheduling information is hard to read / should be tabular format Key: MAPREDUCE-1118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1118 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Affects Versions: 0.20.2 Reporter: Allen Wittenauer Assignee: Milind Bhandarkar Fix For: 0.20.203.0, 0.22.0 Attachments: MR-1118-22.patch, mapred-1118-1.patch, mapred-1118-2.patch, mapred-1118-3.patch, mapred-1118.20S.patch, mapred-1118.patch The scheduling information provided by the capacity scheduler is extremely hard to read on the job tracker web page. Instead of just flat text, it should be presenting the information in a tabular format, similar to what the fair share scheduler provides. This makes it much easier to compare what different queues are doing.
[jira] [Updated] (MAPREDUCE-2059) RecoveryManager attempts to add jobtracker.info
[ https://issues.apache.org/jira/browse/MAPREDUCE-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-2059:
---
Attachment: MAPREDUCE-2059.patch

I was impatient. It runs for about 5 minutes. But the new test was failing, because the previous test case testJobTrackerInfoCreation() was not closing MiniDFSCluster. I added the shutdown statement, and cleaned up some deprecations in the new test. Also changed the job completion threshold from 50% to 20%, which reduced running time from 290 sec to 150.

RecoveryManager attempts to add jobtracker.info
Key: MAPREDUCE-2059 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2059 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.203.0, 0.22.0 Environment: https://svn.apache.org/repos/asf/hadoop/mapreduce/trunk@994941 Reporter: Dan Adkins Labels: hadoop Fix For: 0.22.0 Attachments: MAPREDUCE-2059.patch, MAPREDUCE-2059.patch

The jobtracker is treating the file 'jobtracker.info' in the system data directory as a job to be recovered, resulting in the following:

10/09/09 18:06:02 WARN mapred.JobTracker: Failed to add the job jobtracker.info
java.lang.IllegalArgumentException: JobId string : jobtracker.info is not properly formed
    at org.apache.hadoop.mapreduce.JobID.forName(JobID.java:158)
    at org.apache.hadoop.mapred.JobID.forName(JobID.java:84)
    at org.apache.hadoop.mapred.JobTracker$RecoveryManager.addJobForRecovery(JobTracker.java:1057)
    at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1565)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4256)
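The warning in the stack trace comes from passing a file name that is not a job ID to JobID.forName. A recovery scan could skip such files before parsing. The sketch below is an illustration only, not the attached MAPREDUCE-2059.patch; the class name is hypothetical, the regular expression is an assumed approximation of the job ID format (job_<identifier>_<number>), and the sample job ID is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical filter applied to file names found in the system directory.
class RecoveryFileFilter {
    // Assumed job ID shape: "job_<identifier>_<sequence number>".
    private static final Pattern JOB_ID = Pattern.compile("job_[^_]+_\\d+");

    static boolean looksLikeJobId(String fileName) {
        return JOB_ID.matcher(fileName).matches();
    }

    /** Keeps only the entries that can be parsed as job IDs, preserving order. */
    static List<String> jobsToRecover(List<String> fileNames) {
        List<String> jobs = new ArrayList<>();
        for (String name : fileNames) {
            if (looksLikeJobId(name)) {
                jobs.add(name);
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("jobtracker.info", "job_201009091806_0001");
        System.out.println(jobsToRecover(files));
    }
}
```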
[jira] [Updated] (MAPREDUCE-2059) RecoveryManager attempts to add jobtracker.info
[ https://issues.apache.org/jira/browse/MAPREDUCE-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-2059:
---
Affects Version/s: (was: 0.23.0) 0.22.0 0.20.203.0
Fix Version/s: 0.22.0

RecoveryManager attempts to add jobtracker.info
Key: MAPREDUCE-2059 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2059 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.203.0, 0.22.0 Environment: https://svn.apache.org/repos/asf/hadoop/mapreduce/trunk@994941 Reporter: Dan Adkins Labels: hadoop Fix For: 0.22.0 Attachments: MAPREDUCE-2059.patch

The jobtracker is treating the file 'jobtracker.info' in the system data directory as a job to be recovered, resulting in the following:

10/09/09 18:06:02 WARN mapred.JobTracker: Failed to add the job jobtracker.info
java.lang.IllegalArgumentException: JobId string : jobtracker.info is not properly formed
    at org.apache.hadoop.mapreduce.JobID.forName(JobID.java:158)
    at org.apache.hadoop.mapred.JobID.forName(JobID.java:84)
    at org.apache.hadoop.mapred.JobTracker$RecoveryManager.addJobForRecovery(JobTracker.java:1057)
    at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1565)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:275)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:267)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:262)
    at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4256)
[jira] [Updated] (MAPREDUCE-1118) Capacity Scheduler scheduling information is hard to read / should be tabular format
[ https://issues.apache.org/jira/browse/MAPREDUCE-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1118: --- Status: Open (was: Patch Available) Did you forget to include sorttable.js? No need to make patch available for .22. Jenkins kick off works only for trunk. Capacity Scheduler scheduling information is hard to read / should be tabular format Key: MAPREDUCE-1118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1118 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Allen Wittenauer Fix For: 0.22.0, 0.20.203.0 Attachments: MR-1118-22.patch, mapred-1118-1.patch, mapred-1118-2.patch, mapred-1118-3.patch, mapred-1118.20S.patch, mapred-1118.patch The scheduling information provided by the capacity scheduler is extremely hard to read on the job tracker web page. Instead of just flat text, it should be presenting the information in a tabular format, similar to what the fair share scheduler provides. This makes it much easier to compare what different queues are doing.
[jira] [Updated] (MAPREDUCE-1118) Capacity Scheduler scheduling information is hard to read / should be tabular format
[ https://issues.apache.org/jira/browse/MAPREDUCE-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-1118: --- Fix Version/s: 0.22.0 Capacity Scheduler scheduling information is hard to read / should be tabular format Key: MAPREDUCE-1118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1118 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Allen Wittenauer Fix For: 0.20.203.0, 0.22.0 Attachments: mapred-1118-1.patch, mapred-1118-2.patch, mapred-1118-3.patch, mapred-1118.20S.patch, mapred-1118.patch The scheduling information provided by the capacity scheduler is extremely hard to read on the job tracker web page. Instead of just flat text, it should be presenting the information in a tabular format, similar to what the fair share scheduler provides. This makes it much easier to compare what different queues are doing.
[jira] [Updated] (MAPREDUCE-2373) When tasks exit with a nonzero exit status, task runner should log the stderr as well as stdout
[ https://issues.apache.org/jira/browse/MAPREDUCE-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2373: --- Fix Version/s: (was: 0.22.0) When tasks exit with a nonzero exit status, task runner should log the stderr as well as stdout --- Key: MAPREDUCE-2373 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2373 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-2373-on-20sec.txt, mr-2373-amendment.txt Currently, if the taskjvm.sh script fails to exec java for some reason, it prints its error message to stderr. This doesn't make it to the logs anywhere. Logging the stderr is very useful to understand why taskjvm.sh failed to start the Child jvm.
[jira] [Updated] (MAPREDUCE-2574) Force entropy to come from non-true random for tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2574: Fix Version/s: (was: 0.22.0) Force entropy to come from non-true random for tests Key: MAPREDUCE-2574 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2574 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: mapreduce-2574.txt Same as HADOOP-7335 but for MR
[jira] [Updated] (MAPREDUCE-3138) Allow for applications to deal with MAPREDUCE-954
[ https://issues.apache.org/jira/browse/MAPREDUCE-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3138: Fix Version/s: 0.22.0 Committed to 0.22 branch. Allow for applications to deal with MAPREDUCE-954 Key: MAPREDUCE-3138 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3138 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, mrv2 Affects Versions: 0.23.0 Reporter: Arun C Murthy Assignee: Owen O'Malley Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: MAPREDUCE-3138.0.22.patch, MAPREDUCE-3138.patch, MAPREDUCE-3138.patch MAPREDUCE-954 changed the context-objs api to interfaces. This breaks Pig. We need a bridge for them to move to 0.23.
[jira] [Updated] (MAPREDUCE-2531) org.apache.hadoop.mapred.jobcontrol.getAssignedJobID throw class cast exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2531: Priority: Blocker (was: Major) Fix Version/s: 0.22.0 org.apache.hadoop.mapred.jobcontrol.getAssignedJobID throw class cast exception Key: MAPREDUCE-2531 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2531 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.22.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: MR-2531-V1-trunk.patch, MR-2531-yarn-v1.patch When using a combination of the mapred and mapreduce APIs (PIG) it is possible to have the following exception Caused by: java.lang.ClassCastException: org.apache.hadoop.mapreduce.JobID cannot be cast to org.apache.hadoop.mapred.JobID at org.apache.hadoop.mapred.jobcontrol.Job.getAssignedJobID(Job.java:71) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:239) at org.apache.pig.PigServer.launchPlan(PigServer.java:1325) ... 29 more This is because the JobID is just downcast. It should be calling JobID.downgrade
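The difference between a plain downcast and a downgrade-style conversion can be illustrated with a small self-contained sketch. The classes NewApiJobId and OldApiJobId below are hypothetical stand-ins invented for this example; they are not the Hadoop JobID classes, and the sketch is not the attached patch. It only assumes, as the report implies, that the old-API ID type is a subclass of the new-API one.

```java
public class JobIdDowngradeSketch {

    /** Hypothetical stand-in for a new-API job ID. */
    public static class NewApiJobId {
        public final String jtIdentifier;
        public final int id;

        public NewApiJobId(String jtIdentifier, int id) {
            this.jtIdentifier = jtIdentifier;
            this.id = id;
        }
    }

    /** Hypothetical stand-in for an old-API job ID that extends the new one. */
    public static class OldApiJobId extends NewApiJobId {
        public OldApiJobId(String jtIdentifier, int id) {
            super(jtIdentifier, id);
        }

        /** Reuses the instance if it already has the old type, otherwise copies its fields. */
        public static OldApiJobId downgrade(NewApiJobId other) {
            if (other instanceof OldApiJobId) {
                return (OldApiJobId) other;
            }
            return new OldApiJobId(other.jtIdentifier, other.id);
        }
    }

    /** Returns true when a plain downcast of the given ID throws ClassCastException. */
    public static boolean plainCastFails(NewApiJobId candidate) {
        try {
            OldApiJobId ignored = (OldApiJobId) candidate;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NewApiJobId fromNewApi = new NewApiJobId("jt", 7);
        System.out.println("plain cast fails: " + plainCastFails(fromNewApi));
        OldApiJobId converted = OldApiJobId.downgrade(fromNewApi);
        System.out.println("downgraded id: " + converted.jtIdentifier + "_" + converted.id);
    }
}
```

The conversion succeeds for both kinds of input because it never assumes the runtime type; it checks it and falls back to copying the fields.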
[jira] [Updated] (MAPREDUCE-3139) SlivePartitioner generates negative partitions
[ https://issues.apache.org/jira/browse/MAPREDUCE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3139: Summary: SlivePartitioner generates negative partitions (was: SlivePartitioner generates negtive partitions) SlivePartitioner generates negative partitions Key: MAPREDUCE-3139 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3139 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Fix For: 0.22.0, 0.23.0, 0.24.0 {{SlivePartitioner.getPartition()}} returns negative partition numbers on some occasions, which is illegal.
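The report does not say how the negative values arise. One common cause in partitioners generally, assumed here purely for illustration, is applying Java's remainder operator to a negative hash code. The sketch below uses hypothetical method names and is not the SlivePartitioner code; it shows the pitfall and one conventional way to keep the result in the range [0, numPartitions).

```java
public class PartitionSketch {

    /** Naive form: Java's % keeps the sign of the dividend, so a negative hash gives a negative result. */
    public static int naivePartition(int hash, int numPartitions) {
        return hash % numPartitions;
    }

    /** Clears the sign bit before taking the remainder, so the result is never negative. */
    public static int safePartition(int hash, int numPartitions) {
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int hash = "some key with a negative hash".hashCode();
        System.out.println("naive: " + naivePartition(hash, 4));
        System.out.println("safe:  " + safePartition(hash, 4));
    }
}
```

Masking the sign bit is used here instead of Math.abs because Math.abs(Integer.MIN_VALUE) is itself negative.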
[jira] [Updated] (MAPREDUCE-2779) JobSplitWriter.java can't handle large job.split file
[ https://issues.apache.org/jira/browse/MAPREDUCE-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2779: Resolution: Fixed Fix Version/s: 0.24.0 0.23.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this to 0.22, 0.23, and trunk. Thank you Ming. JobSplitWriter.java can't handle large job.split file Key: MAPREDUCE-2779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 0.20.205.0, 0.22.0, 0.23.0 Reporter: Ming Ma Assignee: Ming Ma Fix For: 0.22.0, 0.23.0, 0.24.0 Attachments: MAPREDUCE-2779-0.22.patch, MAPREDUCE-2779-trunk.patch, MAPREDUCE-2779-trunk.patch We use cascading MultiInputFormat. MultiInputFormat sometimes generates big job.split used internally by hadoop, sometimes it can go beyond 2GB. In JobSplitWriter.java, the function that generates such file uses 32bit signed integer to compute offset into job.split. writeNewSplits ... int prevCount = out.size(); ... int currCount = out.size(); writeOldSplits ... long offset = out.size(); ... int currLen = out.size();
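The underlying arithmetic problem can be shown without any Hadoop classes. The sketch below is not the committed patch and uses hypothetical names; it only illustrates why a 32-bit signed int cannot represent an offset past 2 GB, and how tracking offsets as long avoids the problem.

```java
public class SplitOffsetSketch {

    /** Narrows a byte position to int, as a 32-bit offset variable would; wraps once past Integer.MAX_VALUE. */
    public static int offsetAsInt(long bytesWritten) {
        return (int) bytesWritten;
    }

    /** Computes the starting offset of each split as a running sum held in a long. */
    public static long[] splitOffsets(long[] splitLengths) {
        long[] offsets = new long[splitLengths.length];
        long position = 0L;
        for (int i = 0; i < splitLengths.length; i++) {
            offsets[i] = position;
            position += splitLengths[i];
        }
        return offsets;
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;
        System.out.println("as int:  " + offsetAsInt(threeGiB));
        long[] offsets = splitOffsets(new long[] {threeGiB, 1024L, 2048L});
        System.out.println("as long: " + offsets[1] + ", " + offsets[2]);
    }
}
```

With long offsets, the entries after the first 3 GiB of data keep their true positions instead of becoming negative.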
[jira] [Updated] (MAPREDUCE-2779) JobSplitWriter.java can't handle large job.split file
[ https://issues.apache.org/jira/browse/MAPREDUCE-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-2779: Attachment: MAPREDUCE-2779-trunk.patch JobSplitWriter.java can't handle large job.split file Key: MAPREDUCE-2779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779 Project: Hadoop Map/Reduce Issue Type: Bug Components: job submission Affects Versions: 0.20.205.0, 0.22.0, 0.23.0 Reporter: Ming Ma Assignee: Ming Ma Fix For: 0.22.0 Attachments: MAPREDUCE-2779-0.22.patch, MAPREDUCE-2779-trunk.patch, MAPREDUCE-2779-trunk.patch We use cascading MultiInputFormat. MultiInputFormat sometimes generates big job.split used internally by hadoop, sometimes it can go beyond 2GB. In JobSplitWriter.java, the function that generates such file uses 32bit signed integer to compute offset into job.split. writeNewSplits ... int prevCount = out.size(); ... int currCount = out.size(); writeOldSplits ... long offset = out.size(); ... int currLen = out.size();
[jira] [Updated] (MAPREDUCE-3039) Make mapreduce use same version of avro as HBase
[ https://issues.apache.org/jira/browse/MAPREDUCE-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3039: Attachment: MAPREDUCE-3039-branch-0.22-shv.patch A slight modification to avoid avro dependency on packaged jars in Hadoop. Make mapreduce use same version of avro as HBase Key: MAPREDUCE-3039 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3039 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, contrib/fair-share, contrib/gridmix, contrib/mrunit, contrib/mumak, contrib/raid, contrib/streaming, jobhistoryserver Affects Versions: 0.22.0 Reporter: Joep Rottinghuis Assignee: Joep Rottinghuis Fix For: 0.22.0 Attachments: MAPREDUCE-3039-branch-0.22-shv.patch, MAPREDUCE-3039-branch-0.22.patch HBase depends on avro 1.5.3 whereas hadoop-common depends on 1.3.2. When building HBase on top of hadoop, this should be consistent. Moreover, this should be consistent between common, hdfs, and mapreduce. Contribs seem to have declared a dependency on avro but are not in fact depending on it.
[jira] [Updated] (MAPREDUCE-3039) Make mapreduce use same version of avro as HBase
[ https://issues.apache.org/jira/browse/MAPREDUCE-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3039: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this. Thank you Joep. Make mapreduce use same version of avro as HBase Key: MAPREDUCE-3039 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3039 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, contrib/fair-share, contrib/gridmix, contrib/mrunit, contrib/mumak, contrib/raid, contrib/streaming, jobhistoryserver Affects Versions: 0.22.0 Reporter: Joep Rottinghuis Assignee: Joep Rottinghuis Fix For: 0.22.0 Attachments: MAPREDUCE-3039-branch-0.22-shv.patch, MAPREDUCE-3039-branch-0.22.patch HBase depends on avro 1.5.3 whereas hadoop-common depends on 1.3.2. When building HBase on top of hadoop, this should be consistent. Moreover, this should be consistent between common, hdfs, and mapreduce. Contribs seem to have declared a dependency on avro but are not in fact depending on it.
[jira] [Updated] (MAPREDUCE-3088) Clover 2.4.3 breaks build for 0.22 branch
[ https://issues.apache.org/jira/browse/MAPREDUCE-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated MAPREDUCE-3088: Component/s: build Clover 2.4.3 breaks build for 0.22 branch Key: MAPREDUCE-3088 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3088 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.22.0 Attachments: nightly.patch Due to known bug in Clover 2.4.3 build for 0.22 branch is broken.