[jira] [Updated] (MAPREDUCE-3674) If invoked with no queueName request param, jobqueue_details.jsp injects a null queue name into schedulers.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-3674: --- Resolution: Fixed Fix Version/s: 1.1.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the review Tom and Todd, committed for 1.1.0. If invoked with no queueName request param, jobqueue_details.jsp injects a null queue name into schedulers. --- Key: MAPREDUCE-3674 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3674 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, hadoop-findbugs-report.html When you access /jobqueue_details.jsp manually, instead of via a link, it has queueName set to null internally and this goes for a lookup into the scheduling info maps as well. As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to 'default'. Otherwise, this brings down the JobTracker completely. FairScheduler must also add a check to not create a pool with 'null' name. 
The following is the stack trace that ensues:
{code}
ERROR org.mortbay.log: /jobqueue_details.jsp
java.lang.NullPointerException
    at org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:71)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001, call heartbeat from XYZ:MNOP: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:95)
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:68)
    at java.util.Arrays.mergeSort(Unknown Source)
    at java.util.Arrays.sort(Unknown Source)
    at java.util.Collections.sort(Unknown Source)
    at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:435)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3226)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428
[jira] [Updated] (MAPREDUCE-3674) If invoked with no queueName request param, jobqueue_details.jsp injects a null queue name into schedulers.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-3674: --- Target Version/s: (was: 1.1.0) If invoked with no queueName request param, jobqueue_details.jsp injects a null queue name into schedulers. --- Key: MAPREDUCE-3674 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3674 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, MAPREDUCE-3674.patch, hadoop-findbugs-report.html When you access /jobqueue_details.jsp manually, instead of via a link, it has queueName set to null internally and this goes for a lookup into the scheduling info maps as well. As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to 'default'. Otherwise, this brings down the JobTracker completely. FairScheduler must also add a check to not create a pool with 'null' name. 
The following is the stack trace that ensues:
{code}
ERROR org.mortbay.log: /jobqueue_details.jsp
java.lang.NullPointerException
    at org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:71)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001, call heartbeat from XYZ:MNOP: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:95)
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:68)
    at java.util.Arrays.mergeSort(Unknown Source)
    at java.util.Arrays.sort(Unknown Source)
    at java.util.Collections.sort(Unknown Source)
    at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:435)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3226)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org
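The fallback proposed in MAPREDUCE-3674 above could be sketched roughly as follows. This is only an illustration and not the committed patch: the class `QueueNameGuard`, its methods, and the in-memory pool set are hypothetical stand-ins for the JSP and FairScheduler code. Only the `queueName` request parameter, the 'default' fallback, and the null-name check come from the report.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the JSP and scheduler logic; not Hadoop code.
final class QueueNameGuard {
    static final String DEFAULT_QUEUE = "default";

    // Names of pools created so far (illustrative only).
    private final Set<String> pools = new HashSet<>();

    // JSP side: fall back to "default" when the request param is absent or blank.
    static String resolveQueueName(String requestParam) {
        if (requestParam == null || requestParam.trim().isEmpty()) {
            return DEFAULT_QUEUE;
        }
        return requestParam.trim();
    }

    // Scheduler side: refuse to create a pool with a null name.
    void ensurePool(String name) {
        if (name == null) {
            throw new IllegalArgumentException("pool name must not be null");
        }
        pools.add(name);
    }

    int poolCount() {
        return pools.size();
    }
}
```

Guarding in both places means a missed check in the page cannot push a null name into the scheduler's data structures.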
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Attachment: MAPREDUCE-4026.patch Patch that lowers the min-alloc MB to 128 in both existing FIFO and CS schedulers. Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
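To illustrate why a 1024 MB minimum can waste capacity, here is a small sketch. It assumes that each container request is rounded up to the next multiple of the minimum allocation; that is an assumption about scheduler behaviour and is not stated in the issue. The class and method names are hypothetical.

```java
// Hypothetical illustration; assumes requests are rounded up to a multiple
// of the scheduler's minimum allocation. Not taken from the Hadoop source.
final class MinAllocationSketch {
    // Round a request (in MB) up to the next multiple of minAllocMb.
    static int normalize(int requestedMb, int minAllocMb) {
        if (requestedMb <= 0 || minAllocMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int multiples = (requestedMb + minAllocMb - 1) / minAllocMb;
        return multiples * minAllocMb;
    }

    // Number of containers of the given request that fit on a node with nodeMb of memory.
    static int containersPerNode(int nodeMb, int requestedMb, int minAllocMb) {
        return nodeMb / normalize(requestedMb, minAllocMb);
    }
}
```

Under that assumption, a 200 MB request on an 8192 MB node yields 8 containers with a 1024 MB minimum and 32 containers with a 128 MB minimum.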
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Target Version/s: 2.0.0 (was: trunk, 2.0.0) Status: Patch Available (was: Open) Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 0.23.3 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting.
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Affects Version/s: (was: 0.23.3) 2.0.0 Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting.
[jira] [Updated] (MAPREDUCE-4027) Document the minimum-allocation-mb configurations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4027: --- Attachment: MAPREDUCE-4027.patch Docs in line with the MAPREDUCE-4026 changes. Document the minimum-allocation-mb configurations - Key: MAPREDUCE-4027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4027 Project: Hadoop Map/Reduce Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.3 Reporter: Harsh J Assignee: Harsh J Priority: Minor Attachments: MAPREDUCE-4027.patch Neither yarn.scheduler.fifo.minimum-allocation-mb nor yarn.scheduler.capacity.minimum-allocation-mb is currently documented anywhere. Without knowledge of these params, one can't change the default allocations. And the default allocations are pretty high, btw (MAPREDUCE-4026). We should document these in the Cluster Setup page at least.
[jira] [Updated] (MAPREDUCE-4027) Document the minimum-allocation-mb configurations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4027: --- Target Version/s: 2.0.0, trunk (was: trunk, 2.0.0) Status: Patch Available (was: Open) Document the minimum-allocation-mb configurations - Key: MAPREDUCE-4027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4027 Project: Hadoop Map/Reduce Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.3 Reporter: Harsh J Assignee: Harsh J Priority: Minor Attachments: MAPREDUCE-4027.patch Neither yarn.scheduler.fifo.minimum-allocation-mb nor yarn.scheduler.capacity.minimum-allocation-mb is currently documented anywhere. Without knowledge of these params, one can't change the default allocations. And the default allocations are pretty high, btw (MAPREDUCE-4026). We should document these in the Cluster Setup page at least.
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Attachment: MAPREDUCE-4026.patch Didn't realize there was a hardcoded value in a TestFifoScheduler test. Fixed the test to reuse a constant rather than hardcode the value. Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch, MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting.
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Attachment: MAPREDUCE-4026.patch Ah, the earlier patch did not include all the sub-commits. Uploading a new one. Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch, MAPREDUCE-4026.patch, MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting.
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-4026: --- Attachment: MAPREDUCE-4026.patch Only the TestRMWebServices failure seems related to this. Fixed that test to not use hardcoded values either; the fixed test passes now. The other failures stem from port binding issues (mis-closed mini-YARN clusters?), and a couple of others from Kerberos-related issues. Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Attachments: MAPREDUCE-4026.patch, MAPREDUCE-4026.patch, MAPREDUCE-4026.patch, MAPREDUCE-4026.patch The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO scheduler's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs, and I also do not see why CS has to deviate from the FIFO default for this setting.
[jira] [Updated] (MAPREDUCE-4154) streaming MR job succeeds even if the streaming command fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated MAPREDUCE-4154: - Target Version/s: 1.1.0, 1.0.3 (was: 1.1.0) I think we should put this into 1.0.3. Thoughts? streaming MR job succeeds even if the streaming command fails - Key: MAPREDUCE-4154 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4154 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.0.2 Reporter: Thejas M Nair Assignee: Devaraj Das Fix For: 1.1.0 Attachments: streaming.patch Hadoop 1.0.1 behaves as expected: the task of a streaming MR job fails if the streaming command fails. In Hadoop 1.0.2, however, it succeeds.
[jira] [Updated] (MAPREDUCE-4169) Container Logs appear in unsorted order
[ https://issues.apache.org/jira/browse/MAPREDUCE-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated MAPREDUCE-4169: --- Attachment: MAPREDUCE-4169.patch Container Logs appear in unsorted order --- Key: MAPREDUCE-4169 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4169 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Minor Attachments: MAPREDUCE-4169.patch Container logs (stdout, stderr, syslog) in the NodeManager UI and JobHistory UI appear in unsorted order: the order displayed depends on which file was created first. This JIRA will make the logs display in a consistent order.
[jira] [Updated] (MAPREDUCE-4169) Container Logs appear in unsorted order
[ https://issues.apache.org/jira/browse/MAPREDUCE-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated MAPREDUCE-4169: --- Target Version/s: 0.23.3, 2.0.0 (was: 2.0.0, 0.23.3) Status: Patch Available (was: Open) Container Logs appear in unsorted order --- Key: MAPREDUCE-4169 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4169 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Minor Attachments: MAPREDUCE-4169.patch Container logs (stdout, stderr, syslog) in the NodeManager UI and JobHistory UI appear in unsorted order: the order displayed depends on which file was created first. This JIRA will make the logs display in a consistent order.
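One simple way to get a consistent order for the container logs in MAPREDUCE-4169 is to sort the file names before rendering them. The issue does not say which ordering the patch uses, so the alphabetical order below is an assumption, and `ContainerLogOrder` is a hypothetical helper rather than NodeManager code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; alphabetical ordering is an assumption, not the patch.
final class ContainerLogOrder {
    // Return the log file names in alphabetical order, regardless of the
    // order in which the files were created. The input list is not modified.
    static List<String> sorted(List<String> logNames) {
        List<String> copy = new ArrayList<>(logNames);
        Collections.sort(copy);
        return copy;
    }
}
```

With the three log types named in the report, this always lists stderr, then stdout, then syslog.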
[jira] [Updated] (MAPREDUCE-4074) Client continuously retries to RM When RM goes down before launching Application Master
[ https://issues.apache.org/jira/browse/MAPREDUCE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4074: - Resolution: Fixed Fix Version/s: 0.23.3 Target Version/s: 0.23.3 Status: Resolved (was: Patch Available) +1. Thanks xieguiming! I've committed this to trunk, branch-2, and branch-0.23. Client continuously retries to RM When RM goes down before launching Application Master --- Key: MAPREDUCE-4074 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4074 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Devaraj K Fix For: 0.23.3 Attachments: MAPREDUCE-4074-1.patch, MAPREDUCE-4074-2.patch, MAPREDUCE-4074-3.patch, MAPREDUCE-4074.patch The client continuously retries connecting to the RM and logs the messages below when the RM goes down before launching the App Master. I feel an exception should be thrown, or the loop broken, after a finite number of retries.
{code:xml}
28/03/12 07:15:03 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
28/03/12 07:15:04 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
28/03/12 07:15:05 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
28/03/12 07:15:06 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
28/03/12 07:15:07 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
28/03/12 07:15:08 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 5 time(s).
28/03/12 07:15:09 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 6 time(s).
28/03/12 07:15:10 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 7 time(s).
28/03/12 07:15:11 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 8 time(s).
28/03/12 07:15:12 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 9 time(s).
28/03/12 07:15:13 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
28/03/12 07:15:14 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
28/03/12 07:15:15 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
28/03/12 07:15:16 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
28/03/12 07:15:17 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
{code}
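The behaviour requested in MAPREDUCE-4074 (give up after a finite number of retries instead of looping forever) might look roughly like the sketch below. It is a generic illustration, not the committed patch and not the Hadoop `ipc.Client` code; `BoundedRetry` and its parameters are hypothetical names.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper showing a bounded retry loop; not Hadoop's ipc.Client.
final class BoundedRetry {
    // Call `attempt` up to maxAttempts times, sleeping between failed tries.
    // After the last failure, throw instead of retrying again.
    static <T> T call(Callable<T> attempt, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int tried = 0; tried < maxAttempts; tried++) {
            try {
                return attempt.call();
            } catch (IOException e) {
                last = e; // remember the failure and retry if attempts remain
                if (tried + 1 < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw new IOException("giving up after " + maxAttempts + " attempt(s)", last);
    }
}
```

Only `IOException` is treated as retryable here; any other exception propagates immediately, which keeps programming errors from being hidden by the retry loop.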
[jira] [Updated] (MAPREDUCE-4133) MR over viewfs is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-4133: --- Status: Open (was: Patch Available) MR over viewfs is broken Key: MAPREDUCE-4133 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4133 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3, 2.0.0 Reporter: John George Assignee: John George Attachments: MR-4133.patch, MR-4133.patch, MR-4133.patch After the changes in HADOOP-8014 went in, MR programs using viewfs broke. This is because viewfs now expects getDefaultBlockSize, getDefaultReplication, and getServerDefaults to be called with a {{path}} argument. In the existing MR source, these are called with no arguments.
[jira] [Updated] (MAPREDUCE-4133) MR over viewfs is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-4133: --- Status: Patch Available (was: Open) trying again... MR over viewfs is broken Key: MAPREDUCE-4133 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4133 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3, 2.0.0 Reporter: John George Assignee: John George Attachments: MR-4133.patch, MR-4133.patch, MR-4133.patch After the changes in HADOOP-8014 went in, MR programs using viewfs broke. This is because viewfs now expects getDefaultBlockSize, getDefaultReplication, and getServerDefaults to be called with a {{path}} argument. In the existing MR source, these are called with no arguments.
[jira] [Updated] (MAPREDUCE-4133) MR over viewfs is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated MAPREDUCE-4133: --- Attachment: MR-4133.patch MR over viewfs is broken Key: MAPREDUCE-4133 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4133 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.3, 2.0.0 Reporter: John George Assignee: John George Attachments: MR-4133.patch, MR-4133.patch, MR-4133.patch After the changes in HADOOP-8014 went in, MR programs using viewfs broke. This is because viewfs now expects getDefaultBlockSize, getDefaultReplication, and getServerDefaults to be called with a {{path}} argument. In the existing MR source, these are called with no arguments.
[jira] [Updated] (MAPREDUCE-3867) MiniMRYarn/MiniYarn use fixed ports
[ https://issues.apache.org/jira/browse/MAPREDUCE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-3867: -- Attachment: MAPREDUCE-3867.patch Nailed the TestDistributedShell failure. It needed some special classpath setting and yarn-site.xml for the minicluster. Also, this testcase was using Java assert instead of JUnit Assert, so when run from an IDE the test was wrongly passing, as the IDE does not have assertions enabled. Now all tests using the minicluster pass, though the patch won't apply because it is multiproject (mapreduce/tools). For the reviewer, the steps to test it are:
* from trunk/ : mvn install -DskipTests -Dmaven.javadoc.skip=true
* from trunk/hadoop-mapreduce-project/ : mvn test -offline
* from trunk/hadoop-tools/ : mvn test -offline
MiniMRYarn/MiniYarn use fixed ports --- Key: MAPREDUCE-3867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3867 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: MAPREDUCE-3867.patch, MAPREDUCE-3867.patch, MAPREDUCE-3867.patch This presents issues if there are other processes using those ports. Also, when multitasking among dev environments that use Mini*, things start to fail.
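A common way to avoid the fixed-port collisions described in MAPREDUCE-3867 is to let the operating system pick a free port by binding to port 0. The issue does not say how the patch selects ports, so this is only a sketch of one approach; `EphemeralPort` is a hypothetical helper, while `java.net.ServerSocket` is the standard JDK class.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper; shows one way to avoid hardcoded test ports.
final class EphemeralPort {
    // Bind to port 0 so the OS assigns a free port, then report which one it chose.
    // The socket is closed on return, so the port is only probably still free.
    static int pickFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }
}
```

There is a small race between closing the probe socket and reusing the port, so where possible it is more robust for the test server itself to bind to port 0 and publish the port it actually got.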
[jira] [Updated] (MAPREDUCE-4159) Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4159: --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.0 0.23.3 Status: Resolved (was: Patch Available) Thanks Devaraj, I just pulled this into trunk, branch-2, and branch-0.23. Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero - Key: MAPREDUCE-4159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4159 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0, 3.0.0 Reporter: Nishan Shetty Assignee: Devaraj K Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: MAPREDUCE-4159-1.patch, MAPREDUCE-4159.patch
1. Configure mapreduce.job.ubertask.enable to true
2. Configure mapreduce.job.ubertask.maxreduces to 0 (zero)
3. Run a job such that it has one reducer (more than the mapreduce.job.ubertask.maxreduces value)
Observe that the job runs in Uber mode instead of normal (non-uber) mode.
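The behaviour expected in the steps above can be expressed as a small predicate. This is a simplified sketch, not the MR application master code: the real uber decision also depends on other limits (for example the number of maps), which are not modelled here, and `UberDecisionSketch` is a hypothetical name.

```java
// Hypothetical, simplified model of the reducer part of the uber-mode check.
final class UberDecisionSketch {
    // A job may run in uber mode only if uber mode is enabled
    // (mapreduce.job.ubertask.enable) and its reducer count does not exceed
    // the configured maximum (mapreduce.job.ubertask.maxreduces).
    static boolean isUber(boolean uberEnabled, int numReduces, int maxReduces) {
        return uberEnabled && numReduces <= maxReduces;
    }
}
```

With maxreduces set to 0, a job with one reducer should therefore not be uberized, which is the case the report says was handled incorrectly.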
[jira] [Updated] (MAPREDUCE-4163) consistently set the bind address
[ https://issues.apache.org/jira/browse/MAPREDUCE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4163: --- Attachment: HADOOP-3659.patch consistently set the bind address - Key: MAPREDUCE-4163 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4163 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Use {{NetUtils.getConnectAddress}} for determining the bind address used for setting a token's service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4163) consistently set the bind address
[ https://issues.apache.org/jira/browse/MAPREDUCE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4163: --- Attachment: (was: HADOOP-3659.patch) consistently set the bind address - Key: MAPREDUCE-4163 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4163 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Use {{NetUtils.getConnectAddress}} for determining the bind address used for setting a token's service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4163) consistently set the bind address
[ https://issues.apache.org/jira/browse/MAPREDUCE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4163: --- Attachment: MAPREDUCE-3659.patch consistently set the bind address - Key: MAPREDUCE-4163 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4163 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3659.patch Use {{NetUtils.getConnectAddress}} for determining the bind address used for setting a token's service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4163) consistently set the bind address
[ https://issues.apache.org/jira/browse/MAPREDUCE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4163: --- Status: Patch Available (was: Open) consistently set the bind address - Key: MAPREDUCE-4163 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4163 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3659.patch Use {{NetUtils.getConnectAddress}} for determining the bind address used for setting a token's service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4165) Committing is misspelled as commiting in task logs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4165: --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.0 0.23.3 Target Version/s: 0.23.3, 2.0.0 (was: 2.0.0, 0.23.3) Status: Resolved (was: Patch Available) Thanks Jon, +1 I put this in trunk, branch-2, and branch-0.23 Committing is misspelled as commiting in task logs -- Key: MAPREDUCE-4165 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4165 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Trivial Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: MAPREDUCE-4165.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4079:
---
Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 3.0.0, 2.0.0, 0.23.3)
Status: Patch Available (was: Open)

Allow MR AppMaster to limit ephemeral port range.
-
Key: MAPREDUCE-4079
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mr-am, mrv2
Affects Versions: 0.23.2, 2.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Blocker
Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt

Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to set up ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs, especially for tools like Pig that are not aware of mapreduce.job.am-access-disabled and may not deal with it properly. We should add a config option that allows someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us, but we can make that conditional on this config so users who do not set it do not see any performance degradation.
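The probing approach proposed above can be sketched with plain java.net sockets. This is a minimal illustration under stated assumptions, not the code in the attached patches: the class PortRangeProbe and its methods are hypothetical names, and the sequential scan is just one possible probing strategy.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch of probing a restricted port range; not the MR-AM code.
final class PortRangeProbe {
    private PortRangeProbe() {}

    /**
     * Tries each port in [low, high] in order and returns a socket bound to
     * the first one that is free. Throws IOException if none can be bound.
     */
    static ServerSocket bindInRange(int low, int high) throws IOException {
        if (low < 1 || high > 65535 || low > high) {
            throw new IllegalArgumentException("invalid port range: " + low + "-" + high);
        }
        IOException last = null;
        for (int port = low; port <= high; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port is in use or not permitted; try the next one
            }
        }
        throw new IOException("no free port in range " + low + "-" + high, last);
    }

    /** With no range configured, port 0 asks the OS to pick an ephemeral port. */
    static ServerSocket bindAnyPort() throws IOException {
        return new ServerSocket(0);
    }
}
```

Keeping `bindAnyPort` as the path for the unconfigured case reflects the point made in the description: only users who set a range pay the cost of probing.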
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Attachment: MR-4079-trunk.txt Kicking Jenkins again. Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to setup ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs. Especially for tools like pig where they are not aware of mapreduce.job.am-access-disabled and may deal with it properly. We should add in a config option that would allow someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config so users who do not set this config do not see any performance degradation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 3.0.0, 2.0.0, 0.23.3) Status: Open (was: Patch Available) Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to setup ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs. Especially for tools like pig where they are not aware of mapreduce.job.am-access-disabled and may deal with it properly. We should add in a config option that would allow someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config so users who do not set this config do not see any performance degradation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4093) Improve RM WebApp start up when proxy address is not set
[ https://issues.apache.org/jira/browse/MAPREDUCE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4093:
---
Resolution: Fixed
Fix Version/s: 3.0.0, 2.0.0
Status: Resolved (was: Patch Available)

Thanks Devaraj, I put this into branch-2 and trunk.

Improve RM WebApp start up when proxy address is not set

Key: MAPREDUCE-4093
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4093
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 2.0.0, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.0.0, 3.0.0
Attachments: MAPREDUCE-4093.patch

{code:title=ResourceManager.java|borderStyle=solid}
protected void startWepApp() {
  Builder<ApplicationMasterService> builder =
      WebApps.$for("cluster", ApplicationMasterService.class, masterService, "ws").at(
          this.conf.get(YarnConfiguration.RM_WEBAPP_ADDRESS,
              YarnConfiguration.DEFAULT_RM_WEBAPP_ADDRESS));
  if(YarnConfiguration.getRMWebAppHostAndPort(conf).
      equals(YarnConfiguration.getProxyHostAndPort(conf))) {
    AppReportFetcher fetcher = new AppReportFetcher(conf, getClientRMService());
    builder.withServlet(ProxyUriUtils.PROXY_SERVLET_NAME,
        ProxyUriUtils.PROXY_PATH_SPEC, WebAppProxyServlet.class);
    builder.withAttribute(WebAppProxy.FETCHER_ATTRIBUTE, fetcher);
    String proxy = YarnConfiguration.getProxyHostAndPort(conf);
    String[] proxyParts = proxy.split(":");
    builder.withAttribute(WebAppProxy.PROXY_HOST_ATTRIBUTE, proxyParts[0]);
  }
  webApp = builder.start(new RMWebApp(this));
}
{code}

In the above code, YarnConfiguration.getProxyHostAndPort(conf) is invoked twice. getProxyHostAndPort() internally invokes getRMWebAppHostAndPort(), which resolves the RM web app address when the proxy address is not set.
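One way to avoid the repeated lookup described in this issue is to resolve the proxy address once and reuse the value. The sketch below is an assumption about the general shape of such a change, not the content of MAPREDUCE-4093.patch; the two Supplier parameters are hypothetical stand-ins for the YarnConfiguration lookups so that the example runs without Hadoop on the classpath.

```java
import java.util.function.Supplier;

// Hypothetical sketch; the suppliers stand in for the YarnConfiguration calls.
final class ProxyAddressCheck {
    private ProxyAddressCheck() {}

    /**
     * Returns the proxy host when the RM web app and the proxy share an
     * address, or null otherwise. The proxy address is resolved only once.
     */
    static String sharedProxyHost(Supplier<String> rmWebAppHostAndPort,
                                  Supplier<String> proxyHostAndPort) {
        // Resolve once and keep the result in a local variable for reuse.
        String proxy = proxyHostAndPort.get();
        if (!rmWebAppHostAndPort.get().equals(proxy)) {
            return null;
        }
        return proxy.split(":")[0];
    }
}
```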
[jira] [Updated] (MAPREDUCE-4048) NullPointerException exception while accessing the Application Master UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4048: --- Status: Open (was: Patch Available) NullPointerException exception while accessing the Application Master UI Key: MAPREDUCE-4048 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4048 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4048.patch {code:xml} 2012-03-21 10:21:31,838 ERROR [2145015588@qtp-957250718-801] org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /mapreduce/attempts/job_1332261815858_2_8/m/KILLED java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) ... 
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: java.lang.NullPointerException at com.google.common.base.Joiner.toString(Joiner.java:317) at com.google.common.base.Joiner.appendTo(Joiner.java:97) at com.google.common.base.Joiner.appendTo(Joiner.java:127) at com.google.common.base.Joiner.join(Joiner.java:158) at com.google.common.base.Joiner.join(Joiner.java:166) at org.apache.hadoop.yarn.util.StringHelper.join(StringHelper.java:102) at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.badRequest(AppController.java:319) at org.apache.hadoop.mapreduce.v2.app.webapp.AppController.attempts(AppController.java:286) ... 36 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4129) Lots of unneeded counters log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4129:
---
Resolution: Fixed
Fix Version/s: (was: 0.23.2) 3.0.0 2.0.0 0.23.3
Status: Resolved (was: Patch Available)

Thanks Ahmed, +1 the patch looks good. I don't think we really need any new tests because we are just changing log messages. I put this into trunk, branch-2, and branch-0.23.

Lots of unneeded counters log messages
--
Key: MAPREDUCE-4129
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4129
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Fix For: 0.23.3, 2.0.0, 3.0.0
Attachments: MAPREDUCE-4129.patch, MAPREDUCE-4129_rev2.patch

A huge number of identical WARN messages are written. We only need to write each distinct message once. The messages are of the form:
{code}
2012-04-05 03:55:04,166 WARN mapreduce.Counters: Group {oldGroup} is deprecated. Use {newGroup} instead
{code}
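Writing each distinct message only once, as the description asks, can be sketched with a thread-safe set of messages already seen. This is an illustration of the general technique only; the class WarnOnce is a hypothetical name, it prints to System.err rather than using Hadoop's logging, and it is not the change made in the attached patches.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of log-once behavior; not the MAPREDUCE-4129 patch.
final class WarnOnce {
    // Thread-safe set of the messages that were already written.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /** Writes the message and returns true only the first time it is seen. */
    boolean warn(String message) {
        if (seen.add(message)) {
            System.err.println("WARN " + message);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        WarnOnce log = new WarnOnce();
        log.warn("Group oldGroup is deprecated. Use newGroup instead");
        log.warn("Group oldGroup is deprecated. Use newGroup instead"); // suppressed
    }
}
```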
[jira] [Updated] (MAPREDUCE-3958) RM: Remove RMNodeState and replace it with NodeState
[ https://issues.apache.org/jira/browse/MAPREDUCE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3958: --- Status: Open (was: Patch Available) Could you please upmerge, the patch is out of date. RM: Remove RMNodeState and replace it with NodeState Key: MAPREDUCE-3958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3958 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3958-1.patch, MAPREDUCE-3958.patch RMNodeState is being sent over the wire after MAPREDUCE-3353. This has been done by cloning the enum into NodeState in yarn protocol records. That makes RMNodeState redundant and it should be replaced with NodeState. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3867) MiniMRYarn/MiniYarn should uses fixed ports
[ https://issues.apache.org/jira/browse/MAPREDUCE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-3867: -- Summary: MiniMRYarn/MiniYarn should uses fixed ports (was: MiniMRYarn/MiniYarn use fixed ports) MiniMRYarn/MiniYarn should uses fixed ports --- Key: MAPREDUCE-3867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3867 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: MAPREDUCE-3867.patch, MAPREDUCE-3867.patch, MAPREDUCE-3867.patch This presents issues if there are other processes using those ports. Also, if multitasking among dev environments using Mini* things start to fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3867) MiniMRYarn/MiniYarn uses fixed ports
[ https://issues.apache.org/jira/browse/MAPREDUCE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-3867: -- Summary: MiniMRYarn/MiniYarn uses fixed ports (was: MiniMRYarn/MiniYarn should uses fixed ports) MiniMRYarn/MiniYarn uses fixed ports Key: MAPREDUCE-3867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3867 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: MAPREDUCE-3867.patch, MAPREDUCE-3867.patch, MAPREDUCE-3867.patch This presents issues if there are other processes using those ports. Also, if multitasking among dev environments using Mini* things start to fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3867) MiniMRYarn/MiniYarn uses fixed ports
[ https://issues.apache.org/jira/browse/MAPREDUCE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-3867: -- Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) committed to trunk and branch-2 MiniMRYarn/MiniYarn uses fixed ports Key: MAPREDUCE-3867 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3867 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.0.0 Attachments: MAPREDUCE-3867.patch, MAPREDUCE-3867.patch, MAPREDUCE-3867.patch This presents issues if there are other processes using those ports. Also, if multitasking among dev environments using Mini* things start to fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3947) yarn.app.mapreduce.am.resource.mb not documented
[ https://issues.apache.org/jira/browse/MAPREDUCE-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3947: --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.0 0.23.3 Assignee: Devaraj K Status: Resolved (was: Patch Available) Thanks Devaraj, I put this into trunk, branch-2, and branch-0.23 yarn.app.mapreduce.am.resource.mb not documented Key: MAPREDUCE-3947 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3947 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon Assignee: Devaraj K Priority: Minor Labels: mrv2 Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: MAPREDUCE-3947.patch This configuration is useful but doesn't appear to be documented anywhere. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4163) consistently set the bind address
[ https://issues.apache.org/jira/browse/MAPREDUCE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4163: --- Attachment: MAPREDUCE-3659-1.patch Remove now defunct code. consistently set the bind address - Key: MAPREDUCE-4163 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4163 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0, 0.24.0, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3659-1.patch, MAPREDUCE-3659.patch Use {{NetUtils.getConnectAddress}} for determining the bind address used for setting a token's service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3958) RM: Remove RMNodeState and replace it with NodeState
[ https://issues.apache.org/jira/browse/MAPREDUCE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3958: -- Attachment: MAPREDUCE-3958-2.patch Done RM: Remove RMNodeState and replace it with NodeState Key: MAPREDUCE-3958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3958 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.2 Attachments: MAPREDUCE-3958-1.patch, MAPREDUCE-3958-2.patch, MAPREDUCE-3958.patch RMNodeState is being sent over the wire after MAPREDUCE-3353. This has been done by cloning the enum into NodeState in yarn protocol records. That makes RMNodeState redundant and it should be replaced with NodeState. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3451) Port Fair Scheduler to MR2
[ https://issues.apache.org/jira/browse/MAPREDUCE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-3451: - Hadoop Flags: Reviewed Status: Patch Available (was: Open) Port Fair Scheduler to MR2 -- Key: MAPREDUCE-3451 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3451 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2, scheduler Reporter: Patrick Wendell Assignee: Patrick Wendell Attachments: MAPREDUCE-3451.v1.patch.txt, MAPREDUCE-3451.v2.patch.txt, MAPREDUCE-3451.v3.patch.txt The Fair Scheduler is in widespread use today in MR1 clusters, but not yet ported to MR2. This is to track the porting of the Fair Scheduler to MR2 and will be updated to include design considerations and progress. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-3193:
-
Affects Version/s: 0.23.2

FileInputFormat doesn't read files recursively in the input path dir

Key: MAPREDUCE-3193
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1, mrv2
Affects Versions: 0.23.2, 1.0.2, 2.0.0, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch

A java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: the input file is /r1/r2/input.txt
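The recursive behavior requested here can be illustrated on a local filesystem with java.nio.file. This sketch deliberately does not use the Hadoop FileSystem or FileInputFormat APIs, and the class RecursiveInputLister is a hypothetical name; it only shows what listing an input directory recursively means for a layout such as /r1/r2/input.txt.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem sketch; not the FileInputFormat implementation.
final class RecursiveInputLister {
    private RecursiveInputLister() {}

    /** Returns every regular file under inputDir, at any depth, in sorted order. */
    static List<Path> listFiles(Path inputDir) throws IOException {
        // Files.walk visits the directory tree depth-first; close the stream when done.
        try (Stream<Path> paths = Files.walk(inputDir)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }
}
```

Directories such as r2 are traversed but filtered out of the result, so only files end up in the list of inputs.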
[jira] [Updated] (MAPREDUCE-2720) MR-279: Write a simple Java application
[ https://issues.apache.org/jira/browse/MAPREDUCE-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-2720: - Attachment: MAPREDUCE-2720.patch MR-279: Write a simple Java application --- Key: MAPREDUCE-2720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2720 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Reporter: Sharad Agarwal Assignee: Devaraj K Attachments: MAPREDUCE-2720.patch Currently for isolation purposes, many simple java applications run in cluster with 1 map only job. (eg. Oozie). This is not really required with nextgen hadoop (mrv2) and *non-MR* apps are first class and easy to write. A simple hadoop java app can be written which runs in the cluster in the user space. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2720) MR-279: Write a simple Java application
[ https://issues.apache.org/jira/browse/MAPREDUCE-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-2720: - Affects Version/s: 3.0.0 2.0.0 Status: Patch Available (was: Open) I have attached first level patch for review. Please give your comments/suggestions on the patch. MR-279: Write a simple Java application --- Key: MAPREDUCE-2720 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2720 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 2.0.0, 3.0.0 Reporter: Sharad Agarwal Assignee: Devaraj K Attachments: MAPREDUCE-2720.patch Currently for isolation purposes, many simple java applications run in cluster with 1 map only job. (eg. Oozie). This is not really required with nextgen hadoop (mrv2) and *non-MR* apps are first class and easy to write. A simple hadoop java app can be written which runs in the cluster in the user space. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4159) Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4159: - Assignee: Devaraj K Status: Patch Available (was: Open) I have attached patch to address this issue. Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero - Key: MAPREDUCE-4159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4159 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0 Reporter: Nishan Shetty Assignee: Devaraj K Attachments: MAPREDUCE-4159.patch 1.Configure mapreduce.job.ubertask.enable to true 2.Configure mapreduce.job.ubertask.maxreduces to 0(zero) 3.Run job such that it has one reducer(more than mapreduce.job.ubertask.maxreduces value) Observe that job is running in Uber mode instead of normal mode(non uber mode) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4159) Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4159: - Attachment: MAPREDUCE-4159.patch Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero - Key: MAPREDUCE-4159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4159 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0 Reporter: Nishan Shetty Attachments: MAPREDUCE-4159.patch 1.Configure mapreduce.job.ubertask.enable to true 2.Configure mapreduce.job.ubertask.maxreduces to 0(zero) 3.Run job such that it has one reducer(more than mapreduce.job.ubertask.maxreduces value) Observe that job is running in Uber mode instead of normal mode(non uber mode) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4161) create sockets consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4161: --- Attachment: (was: MAPREDUCE-4161-1.patch) create sockets consistently --- Key: MAPREDUCE-4161 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4161 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client, mrv2 Affects Versions: 0.24.0, 0.23.3, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-4161.patch Use getSocketAddr from HADOOP-8286 to ensure sockets are created consistently and compatible for host-based service generation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Attachment: MR-4079-trunk.txt Kicking Jenkins Again. Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to set up ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs, especially for tools like Pig that are not aware of mapreduce.job.am-access-disabled and may not deal with it properly. We should add a config option that would allow someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config so users who do not set this config do not see any performance degradation.
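The probing described above might look roughly like the following. This is only a sketch under assumptions: the class name PortRangeBinder and the convention that a lower bound of 0 means "no range configured" are hypothetical and are not taken from the attached patches.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical helper, not part of Hadoop: binds within a configured port range. */
final class PortRangeBinder {
  private PortRangeBinder() {}

  /**
   * Returns a ServerSocket bound to the first free port in [low, high].
   * When low is 0 (assumed here to mean "range not configured") the OS picks
   * an ephemeral port, so users who set no range pay no probing cost.
   */
  static ServerSocket bind(int low, int high) throws IOException {
    if (low == 0) {
      return new ServerSocket(0);
    }
    if (low < 0 || high > 65535 || low > high) {
      throw new IllegalArgumentException("Bad port range " + low + "-" + high);
    }
    IOException last = null;
    for (int port = low; port <= high; port++) {
      ServerSocket socket = new ServerSocket();
      try {
        socket.bind(new InetSocketAddress(port));
        return socket;
      } catch (IOException e) {
        // Port is taken (or otherwise unusable): release the socket and try the next one.
        socket.close();
        last = e;
      }
    }
    throw new IOException("No free port in range " + low + "-" + high, last);
  }
}
```

The linear probe is the simplest choice; a real implementation might randomize the starting port to reduce collisions between AMs started at the same time on one host.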
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Target Version/s: 0.23.2, 2.0.0 (was: 2.0.0, 0.23.2) Status: Open (was: Patch Available) Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to set up ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs, especially for tools like Pig that are not aware of mapreduce.job.am-access-disabled and may not deal with it properly. We should add a config option that would allow someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config so users who do not set this config do not see any performance degradation.
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Target Version/s: 0.23.2, 2.0.0 (was: 2.0.0, 0.23.2) Status: Patch Available (was: Open) Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to set up ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs, especially for tools like Pig that are not aware of mapreduce.job.am-access-disabled and may not deal with it properly. We should add a config option that would allow someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config so users who do not set this config do not see any performance degradation.
[jira] [Updated] (MAPREDUCE-4161) create sockets consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-4161: --- Attachment: MAPREDUCE-4161-2.patch Paranoia is good. I reverted the two instances of the removal. create sockets consistently --- Key: MAPREDUCE-4161 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4161 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client, mrv2 Affects Versions: 0.24.0, 0.23.3, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-4161-1.patch, MAPREDUCE-4161-2.patch, MAPREDUCE-4161.patch Use getSocketAddr from HADOOP-8286 to ensure sockets are created consistently and compatible for host-based service generation.
[jira] [Updated] (MAPREDUCE-4161) create sockets consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4161: --- Status: Open (was: Patch Available) create sockets consistently --- Key: MAPREDUCE-4161 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4161 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client, mrv2 Affects Versions: 0.24.0, 0.23.3, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-4161-1.patch, MAPREDUCE-4161-2.patch, MAPREDUCE-4161.patch Use getSocketAddr from HADOOP-8286 to ensure sockets are created consistently and compatible for host-based service generation.
[jira] [Updated] (MAPREDUCE-4161) create sockets consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4161: --- Status: Patch Available (was: Open) create sockets consistently --- Key: MAPREDUCE-4161 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4161 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client, mrv2 Affects Versions: 0.24.0, 0.23.3, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-4161-1.patch, MAPREDUCE-4161-2.patch, MAPREDUCE-4161.patch Use getSocketAddr from HADOOP-8286 to ensure sockets are created consistently and compatible for host-based service generation.
[jira] [Updated] (MAPREDUCE-4074) Client continuously retries to RM When RM goes down before launching Application Master
[ https://issues.apache.org/jira/browse/MAPREDUCE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4074: - Status: Open (was: Patch Available) Client continuously retries to RM When RM goes down before launching Application Master --- Key: MAPREDUCE-4074 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4074 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.1 Reporter: Devaraj K Attachments: MAPREDUCE-4074-1.patch, MAPREDUCE-4074-2.patch, MAPREDUCE-4074.patch The client continuously retries connecting to the RM and logs the messages below when the RM goes down before launching the App Master. I feel an exception should be thrown, or the loop broken, after a finite number of retries. {code:xml} 28/03/12 07:15:03 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s). 28/03/12 07:15:04 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s). 28/03/12 07:15:05 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s). 28/03/12 07:15:06 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s). 28/03/12 07:15:07 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s). 28/03/12 07:15:08 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 5 time(s). 28/03/12 07:15:09 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 6 time(s). 28/03/12 07:15:10 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 7 time(s). 28/03/12 07:15:11 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 8 time(s). 28/03/12 07:15:12 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 9 time(s). 
28/03/12 07:15:13 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s). 28/03/12 07:15:14 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s). 28/03/12 07:15:15 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s). 28/03/12 07:15:16 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s). 28/03/12 07:15:17 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s). {code}
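A bounded retry of the kind requested above could be sketched as follows. The names BoundedRetry and IoCall are hypothetical and this is not the Hadoop ipc.Client retry policy; it only shows the "give up after a finite number of attempts" behaviour.

```java
import java.io.IOException;

/** Hypothetical sketch of a retry loop that gives up after a fixed number of attempts. */
final class BoundedRetry {
  /** An operation that may fail with an IOException, e.g. connecting to the RM. */
  interface IoCall<T> {
    T call() throws IOException;
  }

  private BoundedRetry() {}

  static <T> T call(IoCall<T> action, int maxAttempts, long sleepMillis)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    IOException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (IOException e) {
        last = e;
        // Sleep between attempts, but not after the final failure.
        if (attempt + 1 < maxAttempts) {
          Thread.sleep(sleepMillis);
        }
      }
    }
    // Surface the failure to the caller instead of looping forever.
    throw new IOException("Giving up after " + maxAttempts + " attempts", last);
  }
}
```

With this shape the caller sees an exception carrying the last connection failure as its cause, rather than an endless stream of "Retrying connect" log lines.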
[jira] [Updated] (MAPREDUCE-4161) create sockets consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4161: --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.0 0.23.3 Status: Resolved (was: Patch Available) Thanks Daryn, I put this into trunk, branch-2, and branch-0.23 create sockets consistently --- Key: MAPREDUCE-4161 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4161 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client, mrv2 Affects Versions: 0.24.0, 0.23.3, 2.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.3, 2.0.0, 3.0.0 Attachments: MAPREDUCE-4161-1.patch, MAPREDUCE-4161-2.patch, MAPREDUCE-4161.patch Use getSocketAddr from HADOOP-8286 to ensure sockets are created consistently and compatible for host-based service generation.
[jira] [Updated] (MAPREDUCE-4159) Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero
[ https://issues.apache.org/jira/browse/MAPREDUCE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4159: --- Status: Open (was: Patch Available) Job is running in Uber mode after setting mapreduce.job.ubertask.maxreduces to zero - Key: MAPREDUCE-4159 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4159 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0 Reporter: Nishan Shetty Assignee: Devaraj K Attachments: MAPREDUCE-4159.patch 1. Configure mapreduce.job.ubertask.enable to true 2. Configure mapreduce.job.ubertask.maxreduces to 0 (zero) 3. Run a job such that it has one reducer (more than the mapreduce.job.ubertask.maxreduces value). Observe that the job runs in Uber mode instead of normal (non-Uber) mode.
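The behaviour the reporter expects can be written down as a small predicate. This is a sketch of the expected outcome only: UberDecision and its parameters are hypothetical, and it does not reproduce the actual check in the MR AppMaster or the attached patch.

```java
/** Hypothetical sketch of the expected uber-mode decision; not the AppMaster code. */
final class UberDecision {
  private UberDecision() {}

  /**
   * A job is "small enough" for uber mode only if uber mode is enabled and
   * both task counts are within their configured maximums. With maxReduces
   * set to 0, any job that has a reducer must therefore run in normal mode.
   */
  static boolean isUber(boolean enabled, int numMaps, int numReduces,
                        int maxMaps, int maxReduces) {
    return enabled && numMaps <= maxMaps && numReduces <= maxReduces;
  }
}
```

For the reproduction steps above, isUber(true, 1, 1, 9, 0) evaluates to false, which is the non-uber outcome the report asks for (the maxMaps value 9 is just an illustrative number).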
[jira] [Updated] (MAPREDUCE-4056) Remove MR1 src/test/system
[ https://issues.apache.org/jira/browse/MAPREDUCE-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4056: --- Target Version/s: 2.0.0 (was: 0.23.3) Remove MR1 src/test/system -- Key: MAPREDUCE-4056 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4056 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Reporter: Eli Collins Assignee: Eli Collins Priority: Minor hadoop-mapreduce-project/src/test/system is MR1 specific (e.g. built against the JT/TT) and is already maintained in branch-1, so it can be removed from trunk/23.
[jira] [Updated] (MAPREDUCE-3994) Port TestEmptyJob to maven, possibly using the MiniMRYarnCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3994: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) Port TestEmptyJob to maven, possibly using the MiniMRYarnCluster Key: MAPREDUCE-3994 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3994 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv2, test Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans TestEmptyJob started failing recently. MAPREDUCE-3982 fixes that failure, but it would have been good to have the test as part of maven so that the failure would have been caught long before it was checked in. The test currently uses the MRMiniCluster, but it does some things with synchronization that are not currently available in the MiniMRYarnCluster. Before just moving the test over, we need to better understand what the intention of the test is, and if it should just be deleted, if there are other tests that cover it.
[jira] [Updated] (MAPREDUCE-3957) yarn daemonlog usage wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3957: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) yarn daemonlog usage wrong --- Key: MAPREDUCE-3957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3957 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Priority: Minor $ yarn daemonlog USAGES: java org.apache.hadoop.log.LogLevel -getlevel host:port name java org.apache.hadoop.log.LogLevel -setlevel host:port name level The usage shouldn't print java org.apache.hadoop.log.LogLevel
[jira] [Updated] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3685: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) There are some bugs in implementation of MergeManager - Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: anty.rao Assignee: anty Priority: Minor Attachments: MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685.patch
[jira] [Updated] (MAPREDUCE-3971) Job History web services need to have limits on the number of items they can return.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3971: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) Job History web services need to have limits on the number of items they can return. Key: MAPREDUCE-3971 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3971 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans The Job History web services can put a very large load on the job history server. We should put in a limit on the number of entries that can be returned by the web service, and also add the ability to modify the starting location in the list, so that all entries can still be downloaded, just not all at once.
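The limit plus starting-location idea above amounts to offset/limit paging with a server-side cap. A minimal sketch, assuming hypothetical names (PagedResult, offset, limit, maxLimit) that do not come from the Job History web service API:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of capped offset/limit paging; not the Job History API. */
final class PagedResult {
  private PagedResult() {}

  /**
   * Returns at most min(limit, maxLimit) entries starting at offset.
   * maxLimit is the server-side cap; a client can still walk the whole
   * list by advancing offset, just not fetch everything in one request.
   */
  static <T> List<T> page(List<T> all, int offset, int limit, int maxLimit) {
    if (offset < 0 || limit < 0 || maxLimit < 0) {
      throw new IllegalArgumentException("offset, limit and maxLimit must be non-negative");
    }
    int capped = Math.min(limit, maxLimit);
    int from = Math.min(offset, all.size());
    // Use long arithmetic so a huge offset plus limit cannot overflow.
    int to = (int) Math.min((long) from + capped, (long) all.size());
    return new ArrayList<>(all.subList(from, to));
  }
}
```

Copying the sublist keeps the returned page independent of later changes to the backing list.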
[jira] [Updated] (MAPREDUCE-3967) DefaultContainerExecutor cannot launch container under windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3967: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) DefaultContainerExecutor cannot launch container under windows -- Key: MAPREDUCE-3967 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3967 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.3 Environment: Apache Maven 3.0.4 (r1232337; 2012-01-17 16:44:56+0800) Java version: 1.7.0_02, vendor: Oracle Corporation Java home: C:\Program Files (x86)\Java\jdk1.7.0_02\jre Default locale: zh_CN, platform encoding: GBK OS name: windows 7, version: 6.1, arch: x86, family: windows Reporter: Changming Sun Original Estimate: 72h Remaining Estimate: 72h DefaultContainerExecutor cannot launch a container under Windows, because bash cannot find the WRAPPER_LAUNCH_SCRIPT. Path wrapperScriptDst = new Path(containerWorkDir, WRAPPER_LAUNCH_SCRIPT); String[] command = {"bash", "-c", wrapperScriptDst.toUri().getPath().toString()}; LOG.info("launchContainer: " + Arrays.toString(command)); Suppose that the value of 'wrapperScriptDst' is C:\hadoop\default_container_executor.sh. Then wrapperScriptDst.toUri().getPath().toString() will be /C:/hadoop/default_container_executor.sh, which is a wrong path.
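The leading-slash effect described above can be reproduced with java.net.URI alone, without Hadoop's Path class. The helper below is a hypothetical illustration of one possible workaround, not the fix adopted for this issue, and whether bash accepts the resulting C:/ form depends on the bash distribution in use.

```java
import java.net.URI;

/** Hypothetical illustration of the "/C:/..." URI path problem on Windows. */
final class DrivePathSketch {
  private DrivePathSketch() {}

  /** Returns the path component of a file URI, e.g. "/C:/hadoop/x.sh". */
  static String uriPath(String fileUri) {
    return URI.create(fileUri).getPath();
  }

  /**
   * Drops the leading slash when the path starts with "/X:" (a drive letter),
   * leaving ordinary Unix-style paths untouched.
   */
  static String stripDriveSlash(String uriPath) {
    boolean hasDrive = uriPath.length() >= 3
        && uriPath.charAt(0) == '/'
        && Character.isLetter(uriPath.charAt(1))
        && uriPath.charAt(2) == ':';
    return hasDrive ? uriPath.substring(1) : uriPath;
  }
}
```

Calling uriPath("file:///C:/hadoop/default_container_executor.sh") yields the /C:/... form quoted in the report, and stripDriveSlash turns it into C:/hadoop/default_container_executor.sh.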
[jira] [Updated] (MAPREDUCE-4027) Document the minimum-allocation-mb configurations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4027: --- Target Version/s: 2.0.0, trunk (was: 0.23.3, 0.24.0) Document the minimum-allocation-mb configurations - Key: MAPREDUCE-4027 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4027 Project: Hadoop Map/Reduce Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.3 Reporter: Harsh J Assignee: Harsh J Priority: Minor Neither yarn.scheduler.fifo.minimum-allocation-mb nor yarn.scheduler.capacity.minimum-allocation-mb is currently documented anywhere. Without knowledge of these params, one can't change the default allocations. And the default allocations are pretty high btw (MAPREDUCE-4026). We should document these in the Cluster Setup page at least.
[jira] [Updated] (MAPREDUCE-3939) Enhance YARN service model
[ https://issues.apache.org/jira/browse/MAPREDUCE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3939: --- Target Version/s: 2.0.0, trunk (was: 0.23.3, 0.24.0) Enhance YARN service model -- Key: MAPREDUCE-3939 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3939 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.24.0, 0.23.2 Reporter: Steve Loughran Assignee: Steve Loughran Having played with the YARN service model, there are some issues that I've identified based on past work and initial use. This JIRA issue is an overall one to cover the issues, with solutions pushed out to separate JIRAs. h2. state model prevents stopped state being entered if you could not successfully start the service. In the current lifecycle you cannot stop a service unless it was successfully started, but * {{init()}} may acquire resources that need to be explicitly released * if the {{start()}} operation fails partway through, the {{stop()}} operation may be needed to release resources. *Fix:* make {{stop()}} a valid state transition from all states and require the implementations to be able to stop safely without requiring all fields to be non null. Before anyone points out that the {{stop()}} operations assume that all fields are valid; and if called before a {{start()}} they will NPE; MAPREDUCE-3431 shows that this problem arises today, MAPREDUCE-3502 is a fix for this. It is independent of the rest of the issues in this doc but it will aid making {{stop()}} execute from all states other than stopped. MAPREDUCE-3502 is too big a patch and needs to be broken down for easier review and take up; this can be done with issues linked to this one. h2. AbstractService doesn't prevent duplicate state change requests. 
The {{ensureState()}} checks to verify whether or not a state transition is allowed from the current state are performed in the base {{AbstractService}} class -yet subclasses tend to call this *after* their own {{init()}}, {{start()}}, {{stop()}} operations. This means that these operations can be performed out of order, and even if the outcome of the call is an exception, all actions performed by the subclasses will have taken place. MAPREDUCE-3877 demonstrates this. This is a tricky one to address. In HADOOP-3128 I used a base class instead of an interface and made the {{init()}}, {{start()}}, {{stop()}} methods {{final}}. These methods would do the checks, and then invoke protected inner methods, {{innerStart()}}, {{innerStop()}}, etc. It should be possible to retrofit the same behaviour to everything that extends {{AbstractService}} -something that must be done before the class is considered stable (because once the lifecycle methods are declared final, all subclasses that are out of the source tree will need fixing by the respective developers). h2. AbstractService state change doesn't defend against race conditions. There are no concurrency locks on the state transitions. Whatever fix for wrong state calls is added should correct this to prevent re-entrancy, such as {{stop()}} being called from two threads. h2. Static methods to choreograph lifecycle operations Helper methods to move things through lifecycles. init-start is common, stop-if-service!=null another. Some static methods can execute these, and even call {{stop()}} if {{init()}} raises an exception. These could go into a class {{ServiceOps}} in the same package. These can be used by those services that wrap other services, and help manage more robust shutdowns. h2. state transition failures are something that registered service listeners may wish to be informed of. 
When a state transition fails a {{RuntimeException}} can be thrown -and the service listeners are not informed as the notification point isn't reached. They may wish to know this, especially for management and diagnostics. *Fix:* extend {{ServiceStateChangeListener}} with a callback such as {{stateChangeFailed(Service service,Service.State targeted-state, RuntimeException e)}} that is invoked from the (final) state change methods in the {{AbstractService}} class (once they delegate to their inner {{innerStart()}}, {{innerStop()}} methods); make a no-op on the existing implementations of the interface. h2. Service listener failures not handled Is this an error or not? Log and ignore may not be what is desired. *Proposed:* during {{stop()}} any exception by a listener is caught and discarded, to increase the likelihood of a better shutdown, but do not add try-catch clauses to the other state changes. h2. Support static listeners for all AbstractServices Add support to {{AbstractService}} that allow callers
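The lifecycle proposals in this issue (final state-change methods that run the checks and then delegate to protected inner methods, locking on transitions, and stop() being legal from every state) can be sketched as below. This is a hypothetical stand-alone class written for illustration; SketchService and its state names are assumptions and this is not the YARN AbstractService code.

```java
/** Hypothetical sketch of a lifecycle base class with final state-change methods. */
abstract class SketchService {
  enum State { NOTINITED, INITED, STARTED, STOPPED }

  private State state = State.NOTINITED;

  /** Checks the state first, then delegates; subclasses cannot reorder this. */
  public final synchronized void init() {
    ensureState(State.NOTINITED);
    innerInit();
    state = State.INITED;
  }

  public final synchronized void start() {
    ensureState(State.INITED);
    innerStart();
    state = State.STARTED;
  }

  /** Valid from every state; a second call is a no-op rather than an error. */
  public final synchronized void stop() {
    if (state == State.STOPPED) {
      return;
    }
    try {
      innerStop();
    } finally {
      // Record the stop even if cleanup threw, so it is not attempted twice.
      state = State.STOPPED;
    }
  }

  public final synchronized State getState() {
    return state;
  }

  private void ensureState(State expected) {
    if (state != expected) {
      throw new IllegalStateException("Expected " + expected + " but was " + state);
    }
  }

  // Subclasses override these; innerStop() must tolerate partially initialised fields.
  protected void innerInit() {}
  protected void innerStart() {}
  protected void innerStop() {}
}
```

Because the public methods are synchronized and final, two threads calling stop() concurrently run innerStop() only once, and a subclass cannot perform its work before the state check.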
[jira] [Updated] (MAPREDUCE-3990) MRBench allows Long-sized input-lines value but parses CLI argument as an Integer
[ https://issues.apache.org/jira/browse/MAPREDUCE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3990: --- Target Version/s: 2.0.0, trunk (was: 0.23.3, 0.24.0) MRBench allows Long-sized input-lines value but parses CLI argument as an Integer - Key: MAPREDUCE-3990 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3990 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial Attachments: MAPREDUCE-3990.patch MRBench has the following method: {code} public void generateTextFile(FileSystem fs, Path inputFile, long numLines, Order sortOrder) { ... } {code} The method is already set to accept a long datatype for numLines, for generating very large amounts of data. However, in {{MRBench#run(...)}}, the inputLines CLI parameter is parsed via an Integer.parseInt, causing numbers past Integer.MAX_VALUE to throw NumberFormatExceptions as a result. The parsing should be Long.parseLong and the inputLines datatype should be switched to the same type as passed to the method (long).
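The difference between the two parse calls is easy to demonstrate in isolation. The class and method names below are hypothetical and are not the MRBench code; only Integer.parseInt and Long.parseLong are the real JDK calls discussed above.

```java
/** Hypothetical demonstration of int versus long parsing of a CLI argument. */
final class InputLinesParsing {
  private InputLinesParsing() {}

  /** Parses the argument as a long, matching a long numLines parameter. */
  static long parseInputLines(String arg) {
    return Long.parseLong(arg.trim());
  }

  /** Reports whether Integer.parseInt would accept the same argument. */
  static boolean fitsInInt(String arg) {
    try {
      Integer.parseInt(arg.trim());
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
}
```

For the argument "3000000000", fitsInInt returns false because the value exceeds Integer.MAX_VALUE (2147483647), while parseInputLines returns it unchanged as a long.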
[jira] [Updated] (MAPREDUCE-3937) docs need updating on how to start proxyserver
[ https://issues.apache.org/jira/browse/MAPREDUCE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3937: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) docs need updating on how to start proxyserver -- Key: MAPREDUCE-3937 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3937 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Thomas Graves Priority: Minor The docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html) on how to start the proxyserver are wrong. They say to use yarn start, but should say to use yarn_daemon.sh start proxyserver. Also, just running yarn doesn't show the proxyserver as an option.
[jira] [Updated] (MAPREDUCE-4026) Lower minimum-allocation-mb to sensible defaults
[ https://issues.apache.org/jira/browse/MAPREDUCE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4026: --- Target Version/s: 2.0.0, trunk (was: 0.23.3) Lower minimum-allocation-mb to sensible defaults Key: MAPREDUCE-4026 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4026 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, scheduler Affects Versions: 0.23.3 Reporter: Harsh J Assignee: Harsh J The CapacityScheduler's minimum-allocation-mb is set to 1024. The FIFO's minimum-allocation-mb, meanwhile, is 128. I propose changing the former's minimum to that amount as well. 1024 is way too much as a default and wastes slots on NMs - and I also do not see why CS has to deviate in that setting from the FIFO default.
[jira] [Updated] (MAPREDUCE-4029) NodeManager status web page should express 'last update' times as seconds ago
[ https://issues.apache.org/jira/browse/MAPREDUCE-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4029: --- Target Version/s: 2.0.0, trunk (was: 0.23.3, 0.24.0) NodeManager status web page should express 'last update' times as seconds ago - Key: MAPREDUCE-4029 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4029 Project: Hadoop Map/Reduce Issue Type: Improvement Components: webapps Affects Versions: 0.23.1 Reporter: Harsh J The 'Last health update' field on the MR2 apps' nodes page (at http://host:8088/cluster/nodes) is a timestamp right now, which isn't really informative for what the field means. It ought to be in seconds-ago from now(), as was the case in JobTracker for heartbeats.
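Converting the timestamp to a seconds-ago value is a small calculation. A minimal sketch, assuming a hypothetical helper SecondsAgo and an output format ("N sec ago") chosen here purely for illustration:

```java
/** Hypothetical sketch of rendering a last-update timestamp as seconds ago. */
final class SecondsAgo {
  private SecondsAgo() {}

  /**
   * Both arguments are epoch milliseconds. A timestamp slightly in the
   * future (clock skew between nodes) is clamped to 0 instead of going negative.
   */
  static String format(long lastUpdateMillis, long nowMillis) {
    long seconds = Math.max(0L, (nowMillis - lastUpdateMillis) / 1000L);
    return seconds + " sec ago";
  }
}
```

In a page renderer the second argument would typically be System.currentTimeMillis(); passing it in explicitly keeps the helper easy to test.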
[jira] [Updated] (MAPREDUCE-3993) reduce fetch catch clause should catch RTEs as well
[ https://issues.apache.org/jira/browse/MAPREDUCE-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3993: --- Target Version/s: 1.0.3, 2.0.0 (was: 1.0.3, 0.23.3) reduce fetch catch clause should catch RTEs as well --- Key: MAPREDUCE-3993 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3993 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 0.23.1, 1.0.2 Reporter: Todd Lipcon When using a compression codec for intermediate compression, some cases of corrupt data can cause the codec to throw exceptions other than IOException (eg java.lang.InternalError). This will currently cause the whole reduce task to fail, instead of simply treating it like another case of a failed fetch.
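A minimal sketch of the behaviour asked for above, assuming a hypothetical Fetch callback and FetchGuard wrapper (neither is the actual reduce-side Fetcher code). It treats IOException, any RuntimeException, and java.lang.InternalError the same way: as a failed fetch rather than a failed task.

```java
import java.io.IOException;

/** Hypothetical wrapper, not the real Fetcher: maps codec failures to a failed fetch. */
final class FetchGuard {
    private FetchGuard() {}

    enum Result { SUCCESS, FAILED_FETCH }

    /** Stand-in for the code that copies and decompresses one map output. */
    interface Fetch {
        void run() throws IOException;
    }

    static Result tryFetch(Fetch fetch) {
        try {
            fetch.run();
            return Result.SUCCESS;
        } catch (IOException | RuntimeException | InternalError e) {
            // Corrupt data may surface as something other than IOException;
            // report it as a failed fetch so the scheduler can retry elsewhere.
            return Result.FAILED_FETCH;
        }
    }
}
```

The sketch deliberately does not catch Throwable, so errors such as OutOfMemoryError still propagate; whether that is the right boundary is a design choice the issue leaves open.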
[jira] [Updated] (MAPREDUCE-3751) Simplify job submission in gridmix
[ https://issues.apache.org/jira/browse/MAPREDUCE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3751: --- Target Version/s: 1.1.0, 2.0.0 (was: 1.1.0, 0.23.1) Simplify job submission in gridmix -- Key: MAPREDUCE-3751 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3751 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/gridmix, mrv2 Affects Versions: 0.23.0, 1.0.0 Reporter: Arun C Murthy Currently gridmix tries to gauge cluster load etc. and throttles job submission. This makes it unpredictable and is also hard to support across MR1 and MR2. I propose we simplify it to two modes:
# Replay mode - Just submit jobs at the intervals recorded in the original trace.
# Stress mode - Compress the intervals by a given factor for all jobs.
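The two proposed modes can be sketched as a pure function over trace submit times. Everything here is an assumption for illustration: the SubmissionSchedule class is hypothetical, times are in milliseconds, and a factor greater than 1 is taken to mean "submit faster than the trace".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the two modes; not gridmix code. */
final class SubmissionSchedule {
    private SubmissionSchedule() {}

    /** Replay mode: offsets from the first job, exactly as recorded in the trace. */
    static List<Long> replay(List<Long> traceSubmitMillis) {
        return stress(traceSubmitMillis, 1.0);
    }

    /** Stress mode: the same offsets, compressed by the given factor for all jobs. */
    static List<Long> stress(List<Long> traceSubmitMillis, double factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("factor must be positive");
        }
        List<Long> offsets = new ArrayList<>();
        if (traceSubmitMillis.isEmpty()) {
            return offsets;
        }
        long start = traceSubmitMillis.get(0);
        for (long submit : traceSubmitMillis) {
            offsets.add((long) ((submit - start) / factor));
        }
        return offsets;
    }
}
```

Because neither mode looks at cluster load, the resulting schedule is the same on MR1 and MR2, which is the predictability the proposal is after.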
[jira] [Updated] (MAPREDUCE-4079) Allow MR AppMaster to limit ephemeral port range.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4079: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 2.0.0, 0.23.2) Allow MR AppMaster to limit ephemeral port range. - Key: MAPREDUCE-4079 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4079 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 0.23.2, 2.0.0 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-full-branch-0.23.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt, MR-4079-trunk.txt Having the MapReduce Application Masters bind to any ephemeral port makes it very difficult to set up ACLs. mapreduce.job.am-access-disabled from MAPREDUCE-3251 is not a practical permanent solution for all jobs, especially for tools like pig that are not aware of mapreduce.job.am-access-disabled and may not deal with it properly. We should add a config option that allows someone to restrict the range of ports that the MR-AM can bind to. It will slow down startup in some cases, because we will have to probe for open ports instead of just asking the OS to find one for us. But we can make that conditional on this config, so users who do not set it do not see any performance degradation.
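The probing described above might look roughly like the following. This is a sketch under assumptions, not the attached patch: PortRangeBinder is a hypothetical name, and a range of (0, 0) is used here to stand for "config not set", in which case the OS picks the port and no probing happens.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical sketch of binding within a restricted port range; not the MR-AM code. */
final class PortRangeBinder {
    private PortRangeBinder() {}

    /**
     * Binds a listening socket to the first free port in [low, high].
     * (0, 0) means no restriction: ask the OS for any ephemeral port.
     */
    static ServerSocket bindInRange(int low, int high) throws IOException {
        if (low < 0 || high > 65535 || low > high) {
            throw new IllegalArgumentException("invalid port range " + low + "-" + high);
        }
        if (low == 0 && high == 0) {
            return new ServerSocket(0); // fast path, no probing
        }
        IOException last = null;
        for (int port = low; port <= high; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port in use or not permitted, try the next one
            }
        }
        throw new IOException("no free port in range " + low + "-" + high, last);
    }
}
```

The linear probe is where the startup cost mentioned in the issue comes from; users who leave the range unset take the fast path and see no change.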
[jira] [Updated] (MAPREDUCE-3550) RM web proxy should handle redirect of web services urls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3550: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) RM web proxy should handle redirect of web services urls Key: MAPREDUCE-3550 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3550 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Critical The RM web proxy should handle the web services urls added in MAPREDUCE-2863. The proxy does handle passing the web service urls to the AM; it just doesn't handle redirecting them after the AM goes away.
[jira] [Updated] (MAPREDUCE-3905) Allow per job log aggregation configuration
[ https://issues.apache.org/jira/browse/MAPREDUCE-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3905: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) Allow per job log aggregation configuration --- Key: MAPREDUCE-3905 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3905 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.0 Reporter: Siddharth Seth Assignee: Siddharth Seth Currently, if log aggregation is enabled for a cluster, logs for all jobs are aggregated, leading to a large number of files on hdfs which users may not want. Users should be able to control this, along with the aggregation policy (failed only, all, etc.).
[jira] [Updated] (MAPREDUCE-4030) If the nodemanager on which the maptask is executed is going down before the mapoutput is consumed by the reducer,then the job is failing with shuffle error
[ https://issues.apache.org/jira/browse/MAPREDUCE-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4030: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) If the nodemanager on which the maptask is executed is going down before the mapoutput is consumed by the reducer,then the job is failing with shuffle error Key: MAPREDUCE-4030 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4030 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Nishan Shetty Assignee: Devaraj K My cluster has 2 NMs. The value of mapreduce.job.reduce.slowstart.completedmaps is set to 1. While the job was executing and the mappers were about 99% complete, one of the NMs went down. The job failed with the following trace: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253) at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:240) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152)
[jira] [Updated] (MAPREDUCE-3350) Per-app RM page should have the list of application-attempts like on the app JHS page
[ https://issues.apache.org/jira/browse/MAPREDUCE-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3350: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1) Per-app RM page should have the list of application-attempts like on the app JHS page - Key: MAPREDUCE-3350 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3350 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, webapps Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Jonathan Eagles Priority: Critical Fix For: 0.24.0
[jira] [Updated] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3850: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1) Avoid redundant calls for tokens in TokenCache -- Key: MAPREDUCE-3850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3850 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3850-1.patch, MAPREDUCE-3850.patch, MAPREDUCE-3850.patch The {{TokenCache}} will repeatedly call the same filesystem for tokens. This is inefficient and can easily be changed to only call each filesystem once.
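One way to call each filesystem only once is to deduplicate paths by filesystem before asking for tokens. The sketch below is an assumption-laden illustration, not the attached patch: OncePerFileSystem is hypothetical, a filesystem is identified simply by scheme and authority, and the token request is modelled as an opaque callback.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch, not the real TokenCache: one token request per distinct filesystem. */
final class OncePerFileSystem {
    private OncePerFileSystem() {}

    /** Distinct filesystems, keyed by scheme://authority, in first-seen order. */
    static Set<String> distinctFileSystems(List<URI> paths) {
        Set<String> seen = new LinkedHashSet<>();
        for (URI path : paths) {
            seen.add(path.getScheme() + "://" + path.getAuthority());
        }
        return seen;
    }

    /** Invokes the fetcher once per filesystem and returns how many calls were made. */
    static int fetchTokens(List<URI> paths, Consumer<String> tokenFetcher) {
        Set<String> fileSystems = distinctFileSystems(paths);
        for (String fs : fileSystems) {
            tokenFetcher.accept(fs);
        }
        return fileSystems.size();
    }
}
```

With three input paths on two namenodes, the fetcher runs twice instead of three times.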
[jira] [Updated] (MAPREDUCE-3838) MapReduce job submission time has increased in 0.23 when compared to 0.20.206
[ https://issues.apache.org/jira/browse/MAPREDUCE-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3838: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) MapReduce job submission time has increased in 0.23 when compared to 0.20.206 - Key: MAPREDUCE-3838 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3838 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: client Affects Versions: 0.23.0 Reporter: Amar Kamat Labels: gridmix, job-submit-time, yarn Fix For: 0.23.2 While running Gridmix on 0.23, we found that the job submission time has increased when compared to 0.20.206. Here are some stats:
||Submit-Time||Total number of jobs in YARN||Total number of jobs in FRED||
|25secs|3|1|
|20secs|6|2|
|15secs|14|4|
|10secs|24|4|
|5secs|67|28|
Note that Gridmix was run using the same trace.
[jira] [Updated] (MAPREDUCE-3893) allow capacity scheduler configs maximum-applications and maximum-am-resource-percent configurable on a per queue basis
[ https://issues.apache.org/jira/browse/MAPREDUCE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3893: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) allow capacity scheduler configs maximum-applications and maximum-am-resource-percent configurable on a per queue basis --- Key: MAPREDUCE-3893 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3893 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Eric Payne Priority: Critical Attachments: MAPREDUCE-3893-1.txt The capacity scheduler configs for maximum-applications and maximum-am-resource-percent are currently configured globally and then made proportional to each queue based on its capacity. There are times when this may not work well. Some examples: if a queue runs only uberAM jobs, its jobs always have a small number of containers; conversely, in a queue with very small capacity, you may want to limit the AM resources even further so you don't end up deadlocked with all your capacity being used for app masters. I think we should make those configurable on a per queue basis.
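The lookup order being proposed (per-queue value first, otherwise the global value scaled by queue capacity) can be sketched as follows. The class QueueLimits and both "example." keys are hypothetical and chosen only for illustration; they are not the CapacityScheduler's real property names.

```java
import java.util.Map;

/** Hypothetical sketch of a per-queue override with a proportional global fallback. */
final class QueueLimits {
    private QueueLimits() {}

    // Illustrative key names only; not actual CapacityScheduler properties.
    static final String GLOBAL_KEY = "example.maximum-applications";
    static final String QUEUE_KEY_FORMAT = "example.queue.%s.maximum-applications";

    /**
     * Returns the per-queue limit if one is configured; otherwise the global
     * limit made proportional to the queue's capacity (a fraction in [0, 1]).
     */
    static int maxApplications(Map<String, String> conf, String queue, double capacityFraction) {
        String override = conf.get(String.format(QUEUE_KEY_FORMAT, queue));
        if (override != null) {
            return Integer.parseInt(override.trim());
        }
        int global = Integer.parseInt(conf.get(GLOBAL_KEY).trim());
        return (int) (global * capacityFraction);
    }
}
```

Queues without an override keep today's proportional behaviour, so the change would be backwards compatible for existing configurations.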
[jira] [Updated] (MAPREDUCE-3825) MR should not be getting duplicate tokens for a MR Job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3825: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1) MR should not be getting duplicate tokens for a MR Job. --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch, TokenCache.pdf, solution4.patch This is the counterpart to HADOOP-7967. MR gets tokens for all input, output and the default filesystem when a MR job is submitted. The APIs in FileSystem make it challenging to avoid duplicate tokens when there are file systems that have embedded filesystems. Here is the original description that Daryn wrote: The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens). The descriop
[jira] [Updated] (MAPREDUCE-3835) RM capacity scheduler web UI doesn't show active users
[ https://issues.apache.org/jira/browse/MAPREDUCE-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3835: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.1) RM capacity scheduler web UI doesn't show active users -- Key: MAPREDUCE-3835 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3835 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Priority: Minor On the jobtracker, the web ui showed the active users for each queue and how many resources each of those users was using. That currently isn't being displayed on the RM capacity scheduler web ui.
[jira] [Updated] (MAPREDUCE-3889) job client tries to use /tasklog interface, but that doesn't exist anymore
[ https://issues.apache.org/jira/browse/MAPREDUCE-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3889: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) job client tries to use /tasklog interface, but that doesn't exist anymore -- Key: MAPREDUCE-3889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3889 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Thomas Graves Priority: Critical If you specify the -Dmapreduce.client.output.filter=SUCCEEDED option when running a job, the client tries to fetch task logs to print on the client side from a url like: http://nodemanager:8080/tasklog?plaintext=true&attemptid=attempt_1329857083014_0003_r_00_0&filter=stdout The request always fails with: Required param job, map and reduce We saw this error when using distcp, and the distcp failed. I'm not sure whether the log fetch is mandatory for distcp or just for informational purposes. I'm guessing the latter.
[jira] [Updated] (MAPREDUCE-3267) MR2 reduce tasks showing >100% complete
[ https://issues.apache.org/jira/browse/MAPREDUCE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3267: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.1) MR2 reduce tasks showing >100% complete --- Key: MAPREDUCE-3267 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3267 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, task Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Ravi Prakash My job is currently showing >100% reduce completion. Some reduce tasks are much higher than 100% complete. They appear to be in the last merge pass stage.
[jira] [Updated] (MAPREDUCE-3755) Add the equivalent of JobStatus to end of JobHistory file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3755: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) Add the equivalent of JobStatus to end of JobHistory file -- Key: MAPREDUCE-3755 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3755 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Arun C Murthy Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 In MR1 we have the notion of a CompletedJobStatus store to aid fast responses to job.getStatus. We need the equivalent for MR2. One option is to add the jobStatus to the end of the JobHistory file, so that the JHS can easily jump ahead to it and serve the query. The JHS should also cache this for a fair number of recently completed jobs.
[jira] [Updated] (MAPREDUCE-3707) mapreduce/yarn source jars not included in dist tarball
[ https://issues.apache.org/jira/browse/MAPREDUCE-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3707: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.1) mapreduce/yarn source jars not included in dist tarball --- Key: MAPREDUCE-3707 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3707 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves The mapreduce and yarn source jars don't get included in the distribution tarball. It seems they get built by default but just aren't assembled.
[jira] [Updated] (MAPREDUCE-3917) Use java.net.preferIPv4Stack to force IPv4 in yarn
[ https://issues.apache.org/jira/browse/MAPREDUCE-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3917: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) Use java.net.preferIPv4Stack to force IPv4 in yarn -- Key: MAPREDUCE-3917 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3917 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves HADOOP-6056 made the changes for hadoop cli to use java.net.preferIPv4Stack to force IPv4. We should do the same things for the yarn commands.
[jira] [Updated] (MAPREDUCE-3659) Host-based token support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3659: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1) Host-based token support Key: MAPREDUCE-3659 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3659 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to port the 205 host-based token support into MR and yarn.
[jira] [Updated] (MAPREDUCE-3120) JobHistory is not providing correct count failed,killed task
[ https://issues.apache.org/jira/browse/MAPREDUCE-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3120: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.1) JobHistory is not providing correct count failed,killed task Key: MAPREDUCE-3120 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3120 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Subroto Sanyal Assignee: Subroto Sanyal Priority: Critical Fix For: 0.24.0 Attachments: JobFail.PNG Please refer to the attachment JobFail.PNG. Here the job (WordCount) failed because all map attempts were killed (intentionally), yet the table in the UI still shows 0 killed attempts, and no reason for the failure is available either.
[jira] [Updated] (MAPREDUCE-3623) Authorization of NM = RM with simple authentication mistakenly attempts kerberos when yarn.nodemanager.principal is defined
[ https://issues.apache.org/jira/browse/MAPREDUCE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3623: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1) Authorization of NM = RM with simple authentication mistakenly attempts kerberos when yarn.nodemanager.principal is defined - Key: MAPREDUCE-3623 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3623 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.1, 0.24.0 Reporter: Jonathan Eagles MAPREDUCE-3617 addresses the default values of yarn.nodemanager.principal and yarn.resourcemanager.principal. I have enabled authorization with simple authentication. NM = RM still attempts kerberos authentication. If simple authentication is enabled, the yarn.nodemanager.principal and yarn.resourcemanager.principal values should be ignored and simple authentication should be used.
{code:xml|title=core-site.xml snippet}
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value>
  <description></description>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
  <description></description>
</property>
{code}
{code:xml|title=yarn-site.xml snippet}
<property>
  <description>The Kerberos principal for the resource manager.</description>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/sightbusy-lx@LOCALHOST</value>
</property>
<property>
  <description>The kerberos principal for the node manager.</description>
  <name>yarn.nodemanager.principal</name>
  <value>nm/sightbusy-lx@LOCALHOST</value>
</property>
{code}
{noformat:title=nodemanager.out snippet}
2012-01-03 16:40:00,793 INFO nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:registerWithRM(176)) - Connected to ResourceManager at machine.example.com:8025
2012-01-03 16:40:00,845 ERROR service.CompositeService (CompositeService.java:start(72)) - Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager org.apache.avro.AvroRuntimeException:
java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242) Caused by: java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182) at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145) ... 3 more Caused by: com.google.protobuf.ServiceException: org.apache.hadoop.security.authorize.AuthorizationException: User user (auth:SIMPLE) is not authorized for protocol interface org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface, expected client Kerberos principal is nm/sightbusy-lx@LOCALHOST at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139) at $Proxy24.registerNodeManager(Unknown Source) at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59) ... 5 more Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User user (auth:SIMPLE) is not authorized for protocol interface org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface, expected client Kerberos principal is nm/sightbusy-lx@LOCALHOST at org.apache.hadoop.ipc.Client.call(Client.java:1085) at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136) ... 
7 more 2012-01-03 16:40:00,846 WARN event.AsyncDispatcher (AsyncDispatcher.java:run(78)) - AsyncDispatcher thread interrupted java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76) at java.lang.Thread.run
[jira] [Updated] (MAPREDUCE-3908) jobhistory server trying to load job conf file from wrong location
[ https://issues.apache.org/jira/browse/MAPREDUCE-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3908: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.2) jobhistory server trying to load job conf file from wrong location -- Key: MAPREDUCE-3908 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3908 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.23.0 Reporter: Thomas Graves I have seen a few instances where clicking the job configuration link in the job history server web ui gives a 500 message. The job history server log file shows an exception like: 2012-02-23 22:16:32,519 ERROR org.apache.hadoop.yarn.webapp.View: Error while reading hdfs://host.com:9000/home/hadoop/mapred/history/done_intermediate/user/job_1330033607650_0001_conf.xml java.io.FileNotFoundException: File does not exist: /home/hadoop/mapred/history/done_intermediate/user/job_1330033607650_0001_conf.xml at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:746) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:709) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:681) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:302) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:2 If I look in hdfs, the file no longer exists in the done_intermediate directory; it exists in the done directory structure: hdfs://host.com:9000/home/hadoop/mapred/history/done/2012/02/23/00/job_1330033607650_0001_conf.xml I'm not exactly sure how to reproduce this, but I definitely see it every once in a while.
[jira] [Updated] (MAPREDUCE-3498) yarn rmadmin help message contains reference to hadoop cli and JT
[ https://issues.apache.org/jira/browse/MAPREDUCE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3498: --- Target Version/s: 0.23.3, 2.0.0, 3.0.0 (was: 0.23.1) yarn rmadmin help message contains reference to hadoop cli and JT - Key: MAPREDUCE-3498 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3498 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves The yarn rmadmin help message has an option to specify a job tracker, and its last line gives the general command line syntax as bin/hadoop command [genericOptions] [commandOptions]. Ran yarn rmadmin to get usage:
RMAdmin Usage: java RMAdmin [-refreshQueues] [-refreshNodes] [-refreshUserToGroupsMappings] [-refreshSuperUserGroupsConfiguration] [-refreshAdminAcls] [-refreshServiceAcl] [-help [cmd]]
Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[jira] [Updated] (MAPREDUCE-3302) Remove the last dependency call from org.apache.hadoop.record package in MR.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3302:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

Remove the last dependency call from org.apache.hadoop.record package in MR.

Key: MAPREDUCE-3302
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3302
Project: Hadoop Map/Reduce
Issue Type: Task
Components: client
Affects Versions: 0.24.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor
Attachments: MAPREDUCE-3302.patch

SecureShuffleUtils provides the following helper:

{code}
/**
 * verify that hash equals to HMacHash(msg)
 * @param newHash
 * @return true if is the same
 */
private static boolean verifyHash(byte[] hash, byte[] msg, SecretKey key) {
  byte[] msg_hash = generateByteHash(msg, key);
  return Utils.compareBytes(msg_hash, 0, msg_hash.length, hash, 0, hash.length) == 0;
}
{code}

The {{Utils}} class used there is {{org.apache.hadoop.record.Utils}}. With the {{record}} common package going away via HADOOP-7781, the internal (and also deprecated on the whole) {{compareBytes}} utility must be moved elsewhere.

The {{Utils#compareBytes}} contains:

{code}
/** Lexicographic order of binary data. */
public static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
  return WritableComparator.compareBytes(b1, s1, l1, b2, s2, l2);
}
{code}

Which looks like it can be replaced inline, as it appears to be a dummy wrapper call. I'll put up a patch with this inline replacement shortly for review.
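The inline replacement described in this issue can be sketched without any Hadoop dependency. The block below is only an illustration under stated assumptions: the class name `VerifyHashSketch` is invented, `compareBytes` is a plain-Java stand-in for the unsigned lexicographic comparison that `WritableComparator.compareBytes` is expected to perform, and HmacSHA1 is an assumed choice for `generateByteHash`. It is not the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class VerifyHashSketch {

    /** Stand-in for WritableComparator.compareBytes: unsigned lexicographic order. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int end1 = s1 + l1;
        int end2 = s2 + l2;
        for (int i = s1, j = s2; i < end1 && j < end2; i++, j++) {
            int a = b1[i] & 0xff;
            int b = b2[j] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal, so the shorter range sorts first.
        return l1 - l2;
    }

    /** Assumed hash function for the sketch: HMAC-SHA1 of msg under key. */
    static byte[] generateByteHash(byte[] msg, SecretKey key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(key);
            return mac.doFinal(msg);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Same shape as the helper in the issue, calling the comparison directly. */
    static boolean verifyHash(byte[] hash, byte[] msg, SecretKey key) {
        byte[] msgHash = generateByteHash(msg, key);
        return compareBytes(msgHash, 0, msgHash.length, hash, 0, hash.length) == 0;
    }

    public static void main(String[] args) {
        SecretKey key = new SecretKeySpec("secret".getBytes(StandardCharsets.UTF_8), "HmacSHA1");
        byte[] msg = "map-output-url".getBytes(StandardCharsets.UTF_8);
        byte[] hash = generateByteHash(msg, key);
        System.out.println(verifyHash(hash, msg, key)); // matching hash
        hash[0] ^= 1;                                   // corrupt one bit
        System.out.println(verifyHash(hash, msg, key)); // mismatching hash
    }
}
```

In the real code base the stand-in `compareBytes` would simply be the existing `WritableComparator.compareBytes` call, so no new utility is needed.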
[jira] [Updated] (MAPREDUCE-3223) Remove MR1 configs from mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3223:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

Remove MR1 configs from mapred-default.xml
--
Key: MAPREDUCE-3223
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3223
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: documentation, mrv2
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Attachments: mr-3223.txt

All of the MRv1 configs are still in mapred-default.xml. This is confusing when trying to make config changes. Since a lot of the input/output format tests still depend on MR1, I'd like to move these to src/test/mapred-site.xml for now, and once that dependency is broken, we can remove them entirely.
[jira] [Updated] (MAPREDUCE-3461) Move hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site to hadoop-site
[ https://issues.apache.org/jira/browse/MAPREDUCE-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3461:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

Move hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site to hadoop-site
-
Key: MAPREDUCE-3461
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3461
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: documentation
Affects Versions: 0.23.0
Reporter: Arun C Murthy

Currently hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site has both HDFS and MR docs; we should move it to the top-level hadoop-site.
[jira] [Updated] (MAPREDUCE-4039) Sort Avoidance
[ https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4039:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.2)

Sort Avoidance
--
Key: MAPREDUCE-4039
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
Fix For: 0.23.2
Attachments: MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch

Inspired by [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf], section 5.1, MapReduce Enhancements:

{quote}*Sort Avoidance*. Certain operators such as hash join and hash aggregation require shuffling, but not sorting. The MapReduce API was enhanced to automatically turn off sorting for these operations. When sorting is turned off, the mapper feeds data to the reducer which directly passes the data to the Reduce() function bypassing the intermediate sorting step. This makes many SQL operators significantly more efficient.{quote}

Many applications need only aggregation, not sorting. Using sorting to achieve aggregation is costly and inefficient. Without sorting, the application can use a hash table or hash map to aggregate efficiently. The application must bear in mind that reduce memory is limited: it is responsible for managing reduce-side memory itself and for guarding against running out of memory. The map-side combiner is not supported; as a workaround, you can do hash aggregation on the map side.

The main points of the sort avoidance implementation are:
# Add a configuration parameter ??mapreduce.sort.avoidance??, of boolean type, to turn the sort avoidance workflow on or off. The two workflows coexist.
# Key/value pairs emitted by the map function are sorted by partition only, using a more efficient sorting algorithm: counting sort.
# Map-side merge uses a kind of byte merge, which just concatenates bytes from the generated spills (read in bytes, write out bytes) without the overhead of key/value serialization/deserialization and comparison that the current version incurs.
# Reduce can start as soon as any map output is available, in contrast to the sort workflow, which must wait until all map outputs are fetched and merged.
# Map output in memory can be consumed directly by reduce. When reduce can't keep up with the incoming map outputs, an in-memory merge thread kicks in and merges in-memory map outputs onto disk.
# On-disk files are read sequentially to feed reduce, in contrast to the current implementation, which reads multiple files concurrently and causes many disk seeks. Map output in memory takes precedence over on-disk files in feeding the reduce function.

I have already implemented this feature on Hadoop CDH3U3 and done some performance evaluation; see [https://github.com/hanborq/hadoop] for details. I am now willing to port it to YARN. Comments are welcome.
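The "sorted by partition only, using counting sort" step from the list above can be illustrated with a small standalone sketch. Everything here is hypothetical (the class name `PartitionCountingSort`, the method `sortByPartition`, and the idea of returning record indices); it shows the general counting-sort technique, not the code in the attached patch.

```java
import java.util.Arrays;

public class PartitionCountingSort {

    /**
     * Groups record indices by partition number in O(n + numPartitions) time.
     * partitions[i] is the partition of record i. Records within one partition
     * keep their arrival order, and no key comparison is performed.
     */
    static int[] sortByPartition(int[] partitions, int numPartitions) {
        // start[p] will hold the first output slot of partition p.
        int[] start = new int[numPartitions + 1];
        for (int p : partitions) {
            start[p + 1]++;
        }
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p];
        }
        // Place each record index into the next free slot of its partition.
        int[] next = Arrays.copyOf(start, numPartitions);
        int[] order = new int[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            order[next[partitions[i]]++] = i;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2, 1};
        // Prints [1, 3, 2, 5, 0, 4]: partition 0 first, then 1, then 2.
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

Because the number of partitions is small and known up front, this avoids the O(n log n) key comparisons of a full sort while still laying records out so each reducer's data is contiguous.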
[jira] [Updated] (MAPREDUCE-3768) MR-2450 introduced a significant performance regression (Hive)
[ https://issues.apache.org/jira/browse/MAPREDUCE-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3768:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

MR-2450 introduced a significant performance regression (Hive)
--
Key: MAPREDUCE-3768
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3768
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: Eli Collins
Attachments: stopcommunicatorpatch.txt

MAPREDUCE-2450 introduced, or at least triggers, a significant performance regression in Hive. With MR-2450 the execution time of TestCliDriver.skewjoin goes from 2 minutes to 15 minutes. Reverting this change from the build fixes the issue. Here's the relevant query:

{noformat}
FROM src src1 JOIN src src2 ON (src1.key = src2.key)
INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
{noformat}

You can reproduce this by running the following from Hive 8.0 against Hadoop built from branch-23.

{noformat}
ant very-clean package test -Dtestcase=TestCliDriver -Dqfile=skewjoin.q
{noformat}
[jira] [Updated] (MAPREDUCE-3871) Allow symlinking in LocalJobRunner DistributedCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3871:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.2)

Allow symlinking in LocalJobRunner DistributedCache
---
Key: MAPREDUCE-3871
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3871
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: distributed-cache
Reporter: Tom White
Assignee: Tom White
Attachments: MAPREDUCE-3871.patch

Currently the LocalJobRunner doesn't create symlinks for files in the DistributedCache. It is safe to create symlinks if files of the same name don't exist. LocalJobRunner should also delete the symlinks when the job has completed.
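The behaviour requested here (link only when the name is free, and remove the links after the job) might look roughly like the following. This is a hedged sketch using `java.nio.file` rather than Hadoop's own file utilities; the class `LocalCacheSymlinks` and its methods are hypothetical names, and the platform is assumed to permit symbolic links.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class LocalCacheSymlinks {
    // Only links created by this instance are tracked, so cleanup never
    // touches files that already existed in the working directory.
    private final List<Path> created = new ArrayList<>();

    /** Creates link -> target unless something already exists at link. */
    boolean linkIfAbsent(Path link, Path target) {
        try {
            if (Files.exists(link, LinkOption.NOFOLLOW_LINKS)) {
                return false; // a file of the same name exists; leave it alone
            }
            Files.createSymbolicLink(link, target);
            created.add(link);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the symlinks created above; intended to run when the job completes. */
    void cleanup() {
        try {
            for (Path link : created) {
                Files.deleteIfExists(link);
            }
            created.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Small self-contained demonstration in a temporary directory. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("ljr-cache");
            Path target = Files.createFile(dir.resolve("cached.txt"));
            Path link = dir.resolve("link.txt");
            LocalCacheSymlinks links = new LocalCacheSymlinks();
            boolean first = links.linkIfAbsent(link, target);  // created
            boolean second = links.linkIfAbsent(link, target); // name taken, skipped
            links.cleanup();
            boolean gone = !Files.exists(link, LinkOption.NOFOLLOW_LINKS);
            Files.delete(target);
            Files.delete(dir);
            return first + " " + second + " " + gone;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```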
[jira] [Updated] (MAPREDUCE-3289) Make use of fadvise in the NM's shuffle handler
[ https://issues.apache.org/jira/browse/MAPREDUCE-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3289:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

Make use of fadvise in the NM's shuffle handler
---
Key: MAPREDUCE-3289
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3289
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2, nodemanager, performance
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Attachments: mr-3289.txt

Using the new NativeIO fadvise functions, we can make the NodeManager prefetch map output before it's sent over the socket, and drop it out of the fs cache once it's been sent (since it's very rare for an output to have to be re-sent). This improves IO efficiency and reduces cache pollution.
[jira] [Updated] (MAPREDUCE-3739) Document all missing default configuration values in the relevant *default.xml files.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3739:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1)

Document all missing default configuration values in the relevant *default.xml files.
-
Key: MAPREDUCE-3739
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3739
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Hitesh Shah

There seem to be quite a few configuration settings that are used in the code but missing from the *default.xml files.
[jira] [Updated] (MAPREDUCE-3502) Review all Service.stop() operations and make sure that they work before a service is started
[ https://issues.apache.org/jira/browse/MAPREDUCE-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3502:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1)

Review all Service.stop() operations and make sure that they work before a service is started
-
Key: MAPREDUCE-3502
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3502
Project: Hadoop Map/Reduce
Issue Type: Task
Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Attachments: MAPREDUCE-3502.patch, MAPREDUCE-3502.patch
Original Estimate: 24h
Time Spent: 2.5h
Remaining Estimate: 21.5h

MAPREDUCE-3431 has shown that the shutdown operations of some key services are not robust against being invoked before the service is started. They need to be, by
# not calling other things if the other things are null
# not being re-entrant (i.e. make synchronized if possible)

Maybe
# have a StopService operation that only stops a service if it is live
# factor out the is-running test from the base service class and make it a pre-check for all the child services, so they bail out sooner rather than later. This would be the best, as it would be the one guaranteed to work consistently across all instances, so only one or two would need testing.

My first iteration will skip the sync, though it's something to consider.

Testing: try to create each instance; call stop() straight after construction.
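The options listed in this issue (null checks, a synchronized stop, and an is-running pre-check) can be combined in one small sketch. The class below is hypothetical and does not use Hadoop's actual Service classes; the state names and the `stopCalls` counter exist only for illustration.

```java
public class StoppableServiceSketch {
    enum State { NOTINITED, STARTED, STOPPED }

    private State state = State.NOTINITED;
    private Thread worker;     // stays null until start() is called
    private int stopCalls = 0; // counts how often real shutdown work ran

    synchronized void start() {
        worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // placeholder for real work
            } catch (InterruptedException e) {
                // interrupted by stop(): fall through and exit
            }
        });
        worker.setDaemon(true);
        worker.start();
        state = State.STARTED;
    }

    /** Safe to call before start(), after start(), or repeatedly. */
    synchronized void stop() {
        if (state != State.STARTED) {
            return; // is-running pre-check: bail out sooner rather than later
        }
        if (worker != null) { // don't call other things if they are null
            worker.interrupt();
        }
        state = State.STOPPED;
        stopCalls++;
    }

    synchronized State getState() { return state; }

    synchronized int getStopCalls() { return stopCalls; }

    public static void main(String[] args) {
        StoppableServiceSketch service = new StoppableServiceSketch();
        service.stop(); // the test from the issue: stop() straight after construction
        System.out.println(service.getState());
        service.start();
        service.stop();
        service.stop(); // second call is a no-op
        System.out.println(service.getState() + " " + service.getStopCalls());
    }
}
```

Putting the pre-check in one shared place corresponds to the last option in the list: child services that only override the shutdown work inherit the guard and do not each need their own null handling.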
[jira] [Updated] (MAPREDUCE-2584) Check for serializers early, and give out more information regarding missing serializers
[ https://issues.apache.org/jira/browse/MAPREDUCE-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-2584:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.24.0, 0.23.1)

Check for serializers early, and give out more information regarding missing serializers

Key: MAPREDUCE-2584
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2584
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: task
Affects Versions: 0.20.2
Reporter: Harsh J
Assignee: Harsh J
Labels: serializers, tasks
Attachments: 0.20-security-MAPREDUCE-2584.r5.diff, MAPREDUCE-2584.r2.diff, MAPREDUCE-2584.r3.diff, MAPREDUCE-2584.r4.diff, MAPREDUCE-2584.r5.diff

As discussed on HADOOP-7328, MapReduce can handle serializers in a much better way in case of bad configuration, improper imports (some odd Text class instead of the Writable Text set as key), etc. This issue covers the MapReduce parts of the improvements (made to IFile, MapOutputBuffer, etc., and a possible early check of serializer availability pre-submit) that provide more information than just an NPE, as is currently the case.
[jira] [Updated] (MAPREDUCE-3320) Error conditions in web apps should stop pages from rendering.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3320:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1, 0.24.0)

Error conditions in web apps should stop pages from rendering.
--
Key: MAPREDUCE-3320
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3320
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Fix For: 0.24.0

There are several places in the web apps where an error condition should short circuit the page from rendering, but it does not. Ideally the web app framework should be extended to support exceptions similar to Jersey that can have an HTTP return code associated with them. Then all of the places that produce custom error pages can just throw these exceptions instead.
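The idea of an exception that carries an HTTP return code and short-circuits rendering can be sketched as follows. All names here (`WebAppErrorSketch`, `WebAppException`, `renderJobPage`, `dispatch`) are hypothetical and do not come from the YARN web app framework or from Jersey; the block only illustrates the pattern the issue proposes.

```java
public class WebAppErrorSketch {

    /** Hypothetical exception type with an associated HTTP status code. */
    static class WebAppException extends RuntimeException {
        private final int status;

        WebAppException(int status, String message) {
            super(message);
            this.status = status;
        }

        int getStatus() {
            return status;
        }
    }

    /** A page body: throws instead of rendering a custom error page itself. */
    static String renderJobPage(String jobId) {
        if (jobId == null || jobId.isEmpty()) {
            throw new WebAppException(400, "missing job id");
        }
        return "<h1>Job " + jobId + "</h1>";
    }

    /** Framework side: one place turns the exception into a status and message. */
    static String dispatch(String jobId) {
        try {
            return "200 " + renderJobPage(jobId);
        } catch (WebAppException e) {
            // Nothing from the page was emitted, so rendering is short-circuited.
            return e.getStatus() + " " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("job_1"));
        System.out.println(dispatch(null));
    }
}
```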
[jira] [Updated] (MAPREDUCE-3658) Improvements to CapacityScheduler documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3658:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

Improvements to CapacityScheduler documentation
---
Key: MAPREDUCE-3658
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3658
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.23.0
Reporter: Yoram Arnon
Assignee: Yoram Arnon
Priority: Minor
Labels: documentation
Fix For: 0.24.0
Attachments: MAPREDUCE-3658, MAPREDUCE-3658
Original Estimate: 3h
Remaining Estimate: 3h

There are some typos and some cases of incorrect English. Also, the descriptions of yarn.scheduler.capacity.<queue-path>.capacity, yarn.scheduler.capacity.<queue-path>.maximum-capacity, yarn.scheduler.capacity.<queue-path>.user-limit-factor, yarn.scheduler.capacity.maximum-applications are not very clear to the uninitiated.
[jira] [Updated] (MAPREDUCE-3079) usercache/user/appcache/appid directory not removed when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-3079:
---
Target Version/s: 2.0.0, 3.0.0 (was: 0.23.1)

usercache/user/appcache/appid directory not removed when using DefaultContainerExecutor
---
Key: MAPREDUCE-3079
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3079
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Fix For: 0.24.0

Running with the DefaultContainerExecutor it appears that the usercache/user/appcache/appid directory itself is not removed when the app finishes. All the directories under it are properly removed though. The nodemanager log file indicates that it tries to delete it:

11/09/23 15:17:56 INFO nodemanager.DefaultContainerExecutor: Deleting absolute path : /home/hadoop/mapred/tmp/mapred-local/usercache/tgraves/appcache/application_1316722920862_0003

This doesn't appear to happen with the LinuxContainerExecutor.