[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-2722: Attachment: 2722.v1.patch Attaching new patch with unit test. Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used -- Key: MAPREDUCE-2722 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: 2722.v1.patch, MR2722.patch When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3829) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
[ https://issues.apache.org/jira/browse/MAPREDUCE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-3829: Status: Open (was: Patch Available) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given - Key: MAPREDUCE-3829 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3829 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Reporter: Ravi Gummadi Assignee: Ravi Gummadi Attachments: 3829.v0.patch, 3829.v1.patch Instead of throwing exception messages on to the console, Gridmix should give better error message when input-data directory already exists and -generate option is given. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219935#comment-13219935 ]

Hadoop QA commented on MAPREDUCE-2722:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12516656/2722.v1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1975//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1975//console

This message is automatically generated.

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Key: MAPREDUCE-2722
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 2722.v1.patch, MR2722.patch

When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix.
[jira] [Commented] (MAPREDUCE-3885) Apply the fix similar to HADOOP-8084
[ https://issues.apache.org/jira/browse/MAPREDUCE-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1321#comment-1321 ]

Hudson commented on MAPREDUCE-3885:

Integrated in Hadoop-Hdfs-trunk #971 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/971/])
MAPREDUCE-3885. Avoid an unnecessary copy for all requests/responses in MRs ProtoOverHadoopRpcEngine. (Contributed by Devaraj Das) (Revision 1295362)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295362
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/ProtoOverHadoopRpcEngine.java

Apply the fix similar to HADOOP-8084

Key: MAPREDUCE-3885
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3885
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.24.0, 0.23.3
Attachments: mr-no-copy.patch

Apply the fix similar to HADOOP-8084
[jira] [Commented] (MAPREDUCE-3687) If AM dies before it returns new tracking URL, proxy redirects to http://N/A/ and doesn't return error code
[ https://issues.apache.org/jira/browse/MAPREDUCE-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220040#comment-13220040 ]

Hudson commented on MAPREDUCE-3687:

Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/])
MAPREDUCE-3687. If AM dies before it returns new tracking URL, proxy redirects to http://N/A/ and doesn't return error code (Ravi Prakash via bobby) (Revision 1295147)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295147
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java

If AM dies before it returns new tracking URL, proxy redirects to http://N/A/ and doesn't return error code

Key: MAPREDUCE-3687
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3687
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: David Capwell
Assignee: Ravi Prakash
Fix For: 0.23.2
Attachments: MAPREDUCE-3687.patch

I tried to turn on Uber AM and put 9223372036854775807l (last char is an L) for maxbytes. This caused a NumberFormatException in the AM and killed it. When I try to go to the RM proxy, it redirects me to http://N/A/

    curl -i http://resource.manager.example.com:$port/proxy/application_1326504761991_0001/
    HTTP/1.1 302 Found
    Content-Type: text/plain; charset=utf-8
    Location: http://N/A/
    Content-Length: 0
    Server: Jetty(6.1.26)

Since the AM has no tracker URL, I would expect the return code to be 400~ or 500~ and return an error.
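The behaviour requested in MAPREDUCE-3687 can be illustrated with a minimal sketch. This is not the code from MAPREDUCE-3687.patch; the class `TrackingUrlCheck`, its method names, and the choice of 404 as the error status are all assumptions made for illustration (the report only asks for a code in the 400 or 500 range). The idea is simply to refuse to redirect when the tracking URL is missing or is the literal placeholder `N/A`.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop: decides whether a tracking URL
// reported for an application is safe to redirect to.
final class TrackingUrlCheck {

    // Returns true only when the value looks like a real host, not a placeholder.
    static boolean isUsable(String trackingUrl) {
        if (trackingUrl == null) {
            return false;
        }
        String trimmed = trackingUrl.trim();
        // "N/A" must be rejected explicitly: "http://N/A/" parses as host "N".
        if (trimmed.isEmpty() || "N/A".equals(trimmed)) {
            return false;
        }
        try {
            String withScheme = trimmed.contains("://") ? trimmed : "http://" + trimmed;
            return new URI(withScheme).getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // 302 when a redirect is possible, otherwise an error status (404 is an assumed choice).
    static int statusFor(String trackingUrl) {
        return isUsable(trackingUrl) ? 302 : 404;
    }
}
```

With a check of this kind, the curl request shown in the report would receive an error status instead of `302 Found` with `Location: http://N/A/`.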
[jira] [Commented] (MAPREDUCE-3920) Revise yarn default port number selection
[ https://issues.apache.org/jira/browse/MAPREDUCE-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220039#comment-13220039 ] Hudson commented on MAPREDUCE-3920: --- Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/]) MAPREDUCE-3920. Revise yarn default port number selection (Dave Thompson via tgraves) (Revision 1295162) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1295162 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/local/LocalContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/TestYarnClientProtocolProvider.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/TestNodeId.java 
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDefaultContainerExecutor.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/api/protocolrecords/impl/pb/TestPBLocalizerRPC.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServicesContainers.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/TestAMRestart.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/HistoryServerRest.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/MapredAppMasterRest.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRest.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WebServicesIntro.apt.vm Revise yarn default port number selection - Key: MAPREDUCE-3920 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3920 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 0.23.1 Reporter: Dave Thompson Assignee: Dave Thompson Fix For: 0.23.2 Attachments: MAPREDUCE-3920-branch-0.23.1.patch, MAPREDUCE-3920-branch-0.23.1.patch, MAPREDUCE-3920-branch-0.23.1.patch The default port numbers chosen for nodemanager and resourcemanager are random and widely spread out creating unnecessary overhead in deployments where site operators care, and deploy many clusters. Current and proposed new default ports are as follows: Current
[jira] [Commented] (MAPREDUCE-3728) ShuffleHandler can't access results when configured in a secure mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220041#comment-13220041 ]

Hudson commented on MAPREDUCE-3728:

Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/])
MAPREDUCE-3728. ShuffleHandler can't access results when configured in a secure mode (ahmed via tucu) (Revision 1295245)

Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295245
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestContainerLocalizer.java

ShuffleHandler can't access results when configured in a secure mode

Key: MAPREDUCE-3728
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3728
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, nodemanager
Affects Versions: 0.23.0
Reporter: Roman Shaposhnik
Assignee: Ahmed Radwan
Priority: Critical
Fix For: 0.23.3
Attachments: MAPREDUCE-3728.patch

While running the simplest of jobs (Pi) on MR2 in a fully secure configuration I have noticed that the job was failing on the reduce side with the following messages littering the nodemanager logs:
{noformat}
2012-01-19 08:35:32,544 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find usercache/rvs/appcache/application_1326928483038_0001/output/attempt_1326928483038_0001_m_03_0/file.out.index in any of the configured local directories
{noformat}
While digging further I found out that the permissions on the files/dirs were preventing the nodemanager (running under the user yarn) from accessing these files:
{noformat}
$ ls -l /data/3/yarn/usercache/testuser/appcache/application_1327102703969_0001/output/attempt_1327102703969_0001_m_01_0
-rw-r- 1 testuser testuser 28 Jan 20 15:41 file.out
-rw-r- 1 testuser testuser 32 Jan 20 15:41 file.out.index
{noformat}
Digging even further revealed that the group-sticky bit that was faithfully put on all the subdirectories between testuser and application_1327102703969_0001 was gone from output and attempt_1327102703969_0001_m_01_0.

Looking into how these subdirectories are created (org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs()):
{noformat}
// $x/usercache/$user/appcache/$appId/filecache
Path appFileCacheDir = new Path(appBase, FILECACHE);
appsFileCacheDirs[i] = appFileCacheDir.toString();
lfs.mkdir(appFileCacheDir, null, false);
// $x/usercache/$user/appcache/$appId/output
lfs.mkdir(new Path(appBase, OUTPUTDIR), null, false);
{noformat}
reveals that lfs.mkdir ends up manipulating permissions and thus clears the sticky bit from output and filecache.

At this point I'm at a loss about how this is supposed to work. My understanding was that the whole sequence of events here was predicated on a sticky bit set so that daemons running under the user yarn (default group yarn) can have access to the resulting files and subdirectories down at output and below. Please let me know if I'm missing something or whether this is just a bug that needs to be fixed.

On a related note, when the shuffle side of the Pi job failed the job itself didn't. It went into an endless loop and only exited when it exhausted all the local storage for the log files (at which point the nodemanager died and thus the job ended). Perhaps this is an even more serious side effect of this issue that needs to be investigated separately.
[jira] [Commented] (MAPREDUCE-3903) no admin override to view jobs on mr app master and job history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220049#comment-13220049 ]

Hudson commented on MAPREDUCE-3903:

Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/])
MAPREDUCE-3903. Add support for mapreduce admin users. (Contributed by Thomas Graves) (Revision 1295262)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295262
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobACLsManager.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobAclsManager.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryClientService.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm

no admin override to view jobs on mr app master and job history server

Key: MAPREDUCE-3903
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3903
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
Fix For: 0.23.2
Attachments: MAPREDUCE-3903.patch, MAPREDUCE-3903.patch, MAPREDUCE-3903.patch

In 1.0 there was a config mapreduce.cluster.administrators that allowed administrators to view anyone's job. That no longer works on yarn. Yarn has the new config yarn.admin.acl, but it appears the mr app master and job history server don't use that.
[jira] [Commented] (MAPREDUCE-3706) HTTP Circular redirect error on the job attempts page
[ https://issues.apache.org/jira/browse/MAPREDUCE-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220046#comment-13220046 ]

Hudson commented on MAPREDUCE-3706:

Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/])
MAPREDUCE-3706. Fix circular redirect error in job-attempts page. Contributed by Robert Evans. (Revision 1295314)

Result = SUCCESS
acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295314
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java

HTTP Circular redirect error on the job attempts page

Key: MAPREDUCE-3706
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3706
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Robert Joseph Evans
Priority: Critical
Fix For: 0.23.2
Attachments: MR-3706.txt

submitted job and tried to go to following url:
http://rmhost.domain.com:8088/proxy/application_1326992308313_0004/mapreduce/attempts/job_1326992308313_4_4/m/NEW

This resulted in the following HTTP ERROR:

HTTP ERROR 500
Problem accessing /proxy/application_1326992308313_0004/mapreduce/attempts/job_1326992308313_4_4/m/NEW.
Reason:
    Circular redirect to 'http://amhost.domain.com:44869/mapreduce/attempts/job_1326992308313_4_4/m/NEW'
Caused by:
org.apache.commons.httpclient.CircularRedirectException: Circular redirect to 'http://amhost.domain.com:44869/mapreduce/attempts/job_1326992308313_4_4/m/NEW'
    at org.apache.commons.httpclient.HttpMethodDirector.processRedirectResponse(HttpMethodDirector.java:638)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:179)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.proxyLink(WebAppProxyServlet.java:148)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:269)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)

Note that if you first go to the proxy at: http://rmhost.domain.com:8088/proxy/application_1326992308313_0004/ and then click the links to get here you don't get the error.
[jira] [Commented] (MAPREDUCE-3933) Failures because MALLOC_ARENA_MAX is not set
[ https://issues.apache.org/jira/browse/MAPREDUCE-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220050#comment-13220050 ]

Hudson commented on MAPREDUCE-3933:

Integrated in Hadoop-Mapreduce-trunk #1006 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1006/])
MAPREDUCE-3933. Failures because MALLOC_ARENA_MAX is not set (ahmed via tucu) (Revision 1295178)

Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295178
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* /hadoop/common/trunk/hadoop-project/pom.xml

Failures because MALLOC_ARENA_MAX is not set

Key: MAPREDUCE-3933
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3933
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, test
Affects Versions: 0.23.0
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Fix For: 0.23.3
Attachments: MAPREDUCE-3933.patch, MAPREDUCE-3933_rev2.patch

We have noticed a bunch of MapReduce test failures on CentOS 6 due to running beyond virtual memory limits. These tests fail with messages of the form:
{code}
[Node Status Updater] nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:getNodeStatus(254)) - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 1, cluster_timestamp: 1330401645767, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: Container [pid=16750,containerID=container_1330401645767_0001_01_01] is running beyond virtual memory limits. Current usage: 220.5mb of 2.0gb physical memory used; 7.1gb of 4.2gb virtual memory used. Killing container
{code}
The failing tests are:
{code}
TestJobCounters
TestJobSysDirWithDFS
TestLazyOutput
TestMiniMRChildTask
TestMiniMRClientCluster
TestReduceFetchFromPartialMem
TestChild
TestMapReduceLazyOutput
TestJobOutputCommitter
TestMRAppWithCombiner
TestMRJobs
TestMRJobsWithHistoryService
TestMROldApiJobs
TestSpeculativeExecution
TestUberAM
{code}
I'll upload a patch momentarily.
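The numbers in the log message above can be worked through with a small sketch. It assumes, as the figures in the message suggest (4.2gb virtual against 2.0gb physical), that the virtual memory limit is the physical limit multiplied by a ratio of 2.1. The class `VmemCheck` and its methods are hypothetical and only model the comparison, not the nodemanager's actual monitoring code.

```java
// Hypothetical model of the check reported in the log message; not Hadoop code.
final class VmemCheck {

    // Virtual memory limit derived from the physical limit and an assumed ratio.
    static double vmemLimitGb(double pmemLimitGb, double vmemToPmemRatio) {
        return pmemLimitGb * vmemToPmemRatio;
    }

    // A container is flagged when its virtual memory use exceeds the limit.
    static boolean isOverLimit(double usedGb, double limitGb) {
        return usedGb > limitGb;
    }

    public static void main(String[] args) {
        double limit = vmemLimitGb(2.0, 2.1);          // limit from the log: 4.2gb
        boolean killed = isOverLimit(7.1, limit);      // usage from the log: 7.1gb
        System.out.println("limit=" + limit + "gb, over limit=" + killed);
    }
}
```

Physical usage in the report is only 220.5mb, so the container is killed purely on virtual address space, which is why the summary points at MALLOC_ARENA_MAX rather than at real memory consumption.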
[jira] [Updated] (MAPREDUCE-3550) RM web proxy should handle redirect of web services urls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3550: - Description: the RM web proxy should handle the web services urls added in MAPREDUCE-2863. The proxy does handle passing the web service urls to the AM, it just doesn't handle redirecting it after the AM goes away. (was: the RM web proxy should handle the web services urls added in MAPREDUCE-2863. ) Summary: RM web proxy should handle redirect of web services urls (was: RM web proxy should handle web services urls) RM web proxy should handle redirect of web services urls Key: MAPREDUCE-3550 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3550 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Critical the RM web proxy should handle the web services urls added in MAPREDUCE-2863. The proxy does handle passing the web service urls to the AM, it just doesn't handle redirecting it after the AM goes away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3550) RM web proxy should handle redirect of web services urls
[ https://issues.apache.org/jira/browse/MAPREDUCE-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3550: - Target Version/s: 0.23.2 (was: 0.23.0) RM web proxy should handle redirect of web services urls Key: MAPREDUCE-3550 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3550 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Critical the RM web proxy should handle the web services urls added in MAPREDUCE-2863. The proxy does handle passing the web service urls to the AM, it just doesn't handle redirecting it after the AM goes away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3954) Clean up passing HEAPSIZE to yarn and mapred commands.
Clean up passing HEAPSIZE to yarn and mapred commands.

Key: MAPREDUCE-3954
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3954
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.2
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Blocker

Currently the heap size for all of these is set in yarn-env.sh. JAVA_HEAP_MAX is set to -Xmx1000m unless YARN_HEAPSIZE is set. If it is set it will override JAVA_HEAP_MAX. However, we do not always want to have the RM, NM, and HistoryServer with the exact same heap size. It would be logical for the yarn and mapred scripts to set JAVA_HEAP_MAX when YARN_RESOURCEMANAGER_HEAPSIZE, YARN_NODEMANAGER_HEAPSIZE or HADOOP_JOB_HISTORYSERVER_HEAPSIZE is set, respectively.

This is a bug because it is easy to configure the history server to store more entries than the heap can hold. It is also a performance issue if we do not allow the history server to cache many entries on a large cluster.
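The proposed lookup order can be sketched as follows. The real change is to shell scripts, so this Java model is only an illustration: the class `HeapSizeResolver` is hypothetical, and the assumption that a per-daemon variable takes precedence over YARN_HEAPSIZE, and that values are plain megabyte counts, is mine rather than something the issue states.

```java
import java.util.Map;

// Hypothetical model of the heap-size selection described in the issue; not Hadoop code.
final class HeapSizeResolver {

    static final String DEFAULT_HEAP_MAX = "-Xmx1000m";

    // daemonVar is e.g. "YARN_RESOURCEMANAGER_HEAPSIZE"; values are assumed to be megabytes.
    static String resolve(Map<String, String> env, String daemonVar) {
        String value = env.get(daemonVar);
        if (value == null || value.isEmpty()) {
            // Fall back to the shared setting when no per-daemon value exists.
            value = env.get("YARN_HEAPSIZE");
        }
        if (value == null || value.isEmpty()) {
            return DEFAULT_HEAP_MAX;
        }
        return "-Xmx" + value + "m";
    }
}
```

Under this model the history server could be given a larger heap through HADOOP_JOB_HISTORYSERVER_HEAPSIZE while the RM and NM keep whatever YARN_HEAPSIZE or the default provides.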
[jira] [Commented] (MAPREDUCE-3761) AM info in job -list does not reflect the actual AM hostname
[ https://issues.apache.org/jira/browse/MAPREDUCE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220197#comment-13220197 ]

Robert Joseph Evans commented on MAPREDUCE-3761:

The issue is that the AM is arbitrary code. It does not have to be something that we wrote. The only thing that we can do to restrict it is what the OS allows us to do. If there is a good way for us to make the OS not allow any incoming connections to the HTTP port of the AM then I am all for that. Sadly the HTTP port is an ephemeral port and so is the RPC port that needs to be open to allow connections from other locations.

I would prefer a different column that can show the host that the AM is running on. It should come from the RM indicating where it was scheduled, because it is the AM that reports back both the web app URL and the RPC URL.

AM info in job -list does not reflect the actual AM hostname

Key: MAPREDUCE-3761
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3761
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Affects Versions: 0.23.1
Reporter: Ramya Sunil
Assignee: Vinod Kumar Vavilapalli
Fix For: 0.23.1
Attachments: MAPREDUCE-3761-20120202.txt, MAPREDUCE-3761-20120214.1.txt

The AM info field on bin/mapred job -list currently has a value resourcemanager hostname:8088/proxy/appID. This info is irrelevant unless it shows the real information of where the AM was launched. This needs to be fixed to show the AM host details.
[jira] [Commented] (MAPREDUCE-3761) AM info in job -list does not reflect the actual AM hostname
[ https://issues.apache.org/jira/browse/MAPREDUCE-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220200#comment-13220200 ] Robert Joseph Evans commented on MAPREDUCE-3761: Also the whole point of the proxy is to mitigate the opportunities for an AM to steal cookies from an end user. It does not do a very good job of it, and I really would prefer to see a different architecture for how it does this, but this is what we have. If you want to put in the change as is I am not going to block it, but I suspect that the security people here will force me to change it before we can deploy into production, so I would prefer to just have a separate field that is not a URL. AM info in job -list does not reflect the actual AM hostname Key: MAPREDUCE-3761 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3761 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Vinod Kumar Vavilapalli Fix For: 0.23.1 Attachments: MAPREDUCE-3761-20120202.txt, MAPREDUCE-3761-20120214.1.txt The AM info field on bin/mapred job -list currently has a value resourcemanager hostname:8088/proxy/appID. This info is irrelevant unless it shows the real information of where the AM was launched. This needs to be fixed to show the AM host details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3954) Clean up passing HEAPSIZE to yarn and mapred commands.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3954: Attachment: MR-3954.txt There are no tests in this patch as it is just script changes, but I manually tested on a single node cluster. Clean up passing HEAPSIZE to yarn and mapred commands. Key: MAPREDUCE-3954 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3954 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-3954.txt Currently the heap size for all of these is set in yarn-env.sh. JAVA_HEAP_MAX is set to -Xmx1000m unless YARN_HEAPSIZE is set. If it is set it will override JAVA_HEAP_MAX. However, we do not always want to have the RM, NM, and HistoryServer with the exact same heap size. It would be logical for the yarn and mapred scripts to set JAVA_HEAP_MAX if YARN_RESOURCEMANAGER_HEAPSIZE, YARN_NODEMANAGER_HEAPSIZE or HADOOP_JOB_HISTORYSERVER_HEAPSIZE are set, respectively. This is a bug because it is easy to configure the history server to store more entries than the heap can hold. It is also a performance issue if we do not allow the history server to cache many entries on a large cluster.
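The precedence described in this issue can be sketched roughly as follows. This is a minimal illustration in Python, not the shell changes in the attached patch: the function name `resolve_java_heap_max` and the daemon keys are hypothetical, and the sketch assumes each HEAPSIZE variable holds a number of megabytes, as the -Xmx1000m default suggests.

```python
DEFAULT_JAVA_HEAP_MAX = "-Xmx1000m"

# Per-daemon override variables named in the issue description.
# The dictionary keys are hypothetical labels chosen for this sketch.
DAEMON_HEAPSIZE_VARS = {
    "resourcemanager": "YARN_RESOURCEMANAGER_HEAPSIZE",
    "nodemanager": "YARN_NODEMANAGER_HEAPSIZE",
    "historyserver": "HADOOP_JOB_HISTORYSERVER_HEAPSIZE",
}


def resolve_java_heap_max(daemon, env):
    """Return the -Xmx flag a daemon would get (hypothetical helper).

    Precedence, lowest to highest: built-in default, YARN_HEAPSIZE,
    then the daemon-specific variable if one is set.
    """
    heap_max = DEFAULT_JAVA_HEAP_MAX
    if env.get("YARN_HEAPSIZE"):
        # Assumption: the value is a size in megabytes.
        heap_max = "-Xmx{}m".format(env["YARN_HEAPSIZE"])
    per_daemon_var = DAEMON_HEAPSIZE_VARS.get(daemon)
    if per_daemon_var and env.get(per_daemon_var):
        heap_max = "-Xmx{}m".format(env[per_daemon_var])
    return heap_max


example_env = {"YARN_HEAPSIZE": "2000", "HADOOP_JOB_HISTORYSERVER_HEAPSIZE": "4000"}
rm_flag = resolve_java_heap_max("resourcemanager", example_env)
hs_flag = resolve_java_heap_max("historyserver", example_env)
```

With this ordering the history server can be given a larger heap than the RM and NM without changing the cluster-wide YARN_HEAPSIZE.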
[jira] [Commented] (MAPREDUCE-3897) capacity scheduler - maxActiveApplicationsPerUser calculation can be wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220240#comment-13220240 ] Thomas Graves commented on MAPREDUCE-3897: I just thought of a case where this won't work well for utilization. If you have a queue with a small capacity, say 1%, but a max capacity of say 100%, then even if we had a per-queue configuration for the AM percentage and you set it really high, the queue might only be allowed a couple of AMs, when in reality, if no one else is running on the cluster, it should be allowed more so that it could use its 100% max capacity. We might be better off leaving the maxActiveApplications computation using maxCapacity but changing maxActiveApplicationsPerUser to use capacity and then allowing the user limit factor to apply. Need to think about it some more. capacity scheduler - maxActiveApplicationsPerUser calculation can be wrong Key: MAPREDUCE-3897 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3897 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Eric Payne Priority: Critical Attachments: MAPREDUCE-3897-1.txt, MAPREDUCE-3897-1.txt The capacity scheduler calculates the maxActiveApplications and the maxActiveApplicationsPerUser based on the config yarn.scheduler.capacity.maximum-applications or default 1. MaxActiveApplications = max(ceil(clusterMemory/minAllocation * maxAMResource% * absoluteMaxCapacity), 1) MaxActiveAppsPerUser = max(ceil(maxActiveApplicationsComputedAbove * (userLimit%/100) * userLimitFactor), 1) maxActiveApplications is already multiplied by the queue absolute MAXIMUM capacity, so if max capacity > capacity and if you have user limit factor 1 (which is the default) and only 1 user is running, that user will not be allowed to use over the queue capacity, so having it relative to MAX capacity doesn't make sense.
That user could easily end up in a deadlock with all its space used by application masters.
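To make the arithmetic in the two formulas concrete, here is a small worked sketch in Python. It is an illustration only, not the scheduler's code: the function names, the cluster sizes, and the assumption that each AM occupies exactly one minimum allocation are all hypothetical. Exact fractions are used so that rounding in `ceil` is not affected by floating point.

```python
from fractions import Fraction
from math import ceil


def max_active_applications(cluster_memory_mb, min_allocation_mb,
                            max_am_resource_fraction, absolute_capacity_fraction):
    """max(ceil(clusterMemory/minAllocation * maxAMResource% * capacity), 1)."""
    slots = Fraction(cluster_memory_mb, min_allocation_mb)
    return max(ceil(slots * max_am_resource_fraction * absolute_capacity_fraction), 1)


def max_active_apps_per_user(max_active_apps, user_limit_percent, user_limit_factor):
    """max(ceil(maxActiveApplications * (userLimit%/100) * userLimitFactor), 1)."""
    return max(ceil(max_active_apps * Fraction(user_limit_percent, 100) * user_limit_factor), 1)


# Hypothetical cluster: 100 minimum-size containers, 10% of resources for AMs,
# one queue with capacity 10% and max capacity 100%.
CLUSTER_MB, MIN_ALLOC_MB = 102400, 1024
AM_FRACTION = Fraction(1, 10)
QUEUE_CAPACITY = Fraction(1, 10)
QUEUE_MAX_CAPACITY = Fraction(1, 1)

# Formulas as quoted in the issue: both limits derive from MAX capacity.
apps_from_max = max_active_applications(CLUSTER_MB, MIN_ALLOC_MB, AM_FRACTION, QUEUE_MAX_CAPACITY)
per_user_current = max_active_apps_per_user(apps_from_max, 100, 1)

# Variant discussed in the comment: per-user limit derived from capacity instead.
apps_from_capacity = max_active_applications(CLUSTER_MB, MIN_ALLOC_MB, AM_FRACTION, QUEUE_CAPACITY)
per_user_proposed = max_active_apps_per_user(apps_from_capacity, 100, 1)

# Containers a single user can actually hold with user limit factor 1.
queue_slots = int(Fraction(CLUSTER_MB, MIN_ALLOC_MB) * QUEUE_CAPACITY)
```

In this example the quoted formulas allow one user 10 active applications while that user can hold only 10 containers, so every container could be an application master with nothing left for tasks. Deriving the per-user limit from capacity gives 1 in the same setup.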
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3353: Attachment: MAPREDUCE-3353-branch-0.23.patch Attaching patch. This passes all tests locally and has a functional test for the main changes: 1) providing all unusable nodes on app attempt registration; 2) providing delta updates of nodes thereafter to a running app. I will scan through the changes for final cleanups and look at adding some more tests if necessary. Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes Key: MAPREDUCE-3353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 Attachments: MAPREDUCE-3353-branch-0.23.patch When a node gets lost or turns faulty, the AM needs to know about that event so that it can take some action, e.g. re-executing map tasks whose intermediate output lives on that faulty node.
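The two-step scheme in this update (a full list of unusable nodes at registration, then delta updates) might look like this on the AM side. This is a minimal sketch in Python, not the API in the attached patch: the class, its method names, and the idea that a delta can also report a node as usable again are assumptions made for illustration.

```python
class UnusableNodeTracker:
    """Hypothetical AM-side view of nodes the RM has reported as unusable."""

    def __init__(self):
        self.unusable = set()

    def on_registration(self, unusable_nodes):
        # Step 1: the full snapshot received when the app attempt registers.
        self.unusable = set(unusable_nodes)

    def on_delta(self, became_unusable=(), became_usable=()):
        # Step 2: incremental updates received while the app is running.
        # Assumption: a delta may also report that a node has recovered.
        self.unusable |= set(became_unusable)
        self.unusable -= set(became_usable)

    def maps_to_rerun(self, completed_map_hosts):
        """Completed map tasks whose intermediate output sits on an unusable node."""
        return sorted(task for task, host in completed_map_hosts.items()
                      if host in self.unusable)


tracker = UnusableNodeTracker()
tracker.on_registration(["node3"])
tracker.on_delta(became_unusable=["node1"], became_usable=["node3"])
rerun = tracker.maps_to_rerun({"map_0": "node1", "map_1": "node2", "map_2": "node3"})
```

Sending only deltas after the initial snapshot keeps each update small, while the snapshot at registration lets a restarted AM attempt start from a complete view.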
[jira] [Created] (MAPREDUCE-3955) Replace ProtoOverHadoopRpcEngine with ProtobufRpcEngine.
Replace ProtoOverHadoopRpcEngine with ProtobufRpcEngine. Key: MAPREDUCE-3955 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3955 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey We shouldn't have two rpc engines based on protocol buffers. ProtoOverHadoopRpcEngine in hadoop-yarn-common should be replaced by ProtobufRpcEngine in hadoop-common.
[jira] [Updated] (MAPREDUCE-3955) Replace ProtoOverHadoopRpcEngine with ProtobufRpcEngine.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated MAPREDUCE-3955: Target Version/s: 0.24.0, 0.23.3 Replace ProtoOverHadoopRpcEngine with ProtobufRpcEngine. Key: MAPREDUCE-3955 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3955 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey We shouldn't have two rpc engines based on protocol buffers. ProtoOverHadoopRpcEngine in hadoop-yarn-common should be replaced by ProtobufRpcEngine in hadoop-common.
[jira] [Created] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE This is a part of HADOOP-8124.
[jira] [Commented] (MAPREDUCE-3954) Clean up passing HEAPSIZE to yarn and mapred commands.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220333#comment-13220333 ] Robert Joseph Evans commented on MAPREDUCE-3954: The above patch has no documentation either. I will update the docs and resubmit. Clean up passing HEAPSIZE to yarn and mapred commands. Key: MAPREDUCE-3954 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3954 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-3954.txt Currently the heap size for all of these is set in yarn-env.sh. JAVA_HEAP_MAX is set to -Xmx1000m unless YARN_HEAPSIZE is set. If it is set it will override JAVA_HEAP_MAX. However, we do not always want to have the RM, NM, and HistoryServer with the exact same heap size. It would be logical for the yarn and mapred scripts to set JAVA_HEAP_MAX if YARN_RESOURCEMANAGER_HEAPSIZE, YARN_NODEMANAGER_HEAPSIZE or HADOOP_JOB_HISTORYSERVER_HEAPSIZE are set, respectively. This is a bug because it is easy to configure the history server to store more entries than the heap can hold. It is also a performance issue if we do not allow the history server to cache many entries on a large cluster.
[jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
[ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220342#comment-13220342 ] Siddharth Seth commented on MAPREDUCE-3944: "There are two things that I can think of that can help make this faster. The first one is to move as much filtering for the web service up to use PartialJob where possible. This will allow us to not load as many full jobs. The next one is that we need to fix the locking on getJob, so that we can be loading more than one job at a time. We need to somehow lock on the JobID and not on the HistoryServer." +1 for these two changes. JobHistory used to have a method which did filtering on some parameters. That was removed a while ago, since it was never used. You may want to pull out the history for JobHistory.java. Considering that the webservice ends up building CompletedJob objects for each job that it has to return, it can grow much bigger than the configured CompletedJobCache size and can cause the JobHistory server to go OOM. MAPREDUCE-3755 will help towards avoiding this, and will also make this webservice call faster. For now, I'd propose having a hard limit on the number of entries this webservice can return. Also, I'd prefer having a webservice to return jobIds (based on the specified filters), instead of returning completed job info. The jobs/{jobId} webservice can then be used to pull job info for each job. The order of results returned can be problematic as well. The webservice doesn't have an order parameter, and ends up depending on whatever order the history server returns, which can change in the future.
JobHistory web services are slower then the UI and can easly overload the JH Key: MAPREDUCE-3944 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker When our first customer started using the Job History web services today the History Server ground to a halt. We found 250 Jetty threads stuck on the following stack trace. {noformat} java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898) - waiting to lock 0x2aaab364ba60 (a org.apache.hadoop.mapreduce.v2.hs.JobHistory) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188) {noformat} HsWebServices.java:188 corresponds to the /mapreduce/jobs service. Looking at the code there are a number of optimizations that need to be done to improve its performance.
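The locking change discussed in this comment, locking on the job ID rather than on the whole history server object, can be illustrated with a small sketch. It is written in Python for brevity and is not the JobHistory code: the class name `PerJobLoader`, the `load_fn` callback, and the unbounded cache are assumptions made for the example.

```python
import threading


class PerJobLoader:
    """Hypothetical cache that serializes loads per job ID, not globally."""

    def __init__(self, load_fn):
        self._load_fn = load_fn          # slow function that parses one job
        self._guard = threading.Lock()   # protects the dicts; held only briefly
        self._locks = {}                 # job_id -> lock for an in-flight load
        self._cache = {}                 # job_id -> loaded job (unbounded here)

    def get_job(self, job_id):
        with self._guard:
            if job_id in self._cache:
                return self._cache[job_id]
            lock = self._locks.setdefault(job_id, threading.Lock())
        # Only callers asking for the same job ID wait on this lock.
        with lock:
            with self._guard:
                if job_id in self._cache:
                    return self._cache[job_id]
            job = self._load_fn(job_id)  # the global guard is NOT held here
            with self._guard:
                self._cache[job_id] = job
                self._locks.pop(job_id, None)
            return job
```

Requests for different jobs can then load in parallel, while concurrent requests for the same job still trigger a single load. A real server would also need a bounded cache, which this sketch leaves out.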
[jira] [Updated] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated MAPREDUCE-3956: Attachment: m3956_20120301.patch m3956_20120301.patch: It turns out that only TeraOutputFormat uses it. Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: m3956_20120301.patch This is a part of HADOOP-8124.
[jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
[ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220361#comment-13220361 ] Thomas Graves commented on MAPREDUCE-3944: I think it would be better for the webservices to return the partial job instead of just the job id. From my understanding, if you are returning the ids you have all the partial job information also. That will be equivalent to the job history web page and could at least give a user some useful information. If they need the complete job, they can then do the next query for the specific job. JobHistory web services are slower then the UI and can easly overload the JH Key: MAPREDUCE-3944 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker When our first customer started using the Job History web services today the History Server ground to a halt. We found 250 Jetty threads stuck on the following stack trace. {noformat} java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898) - waiting to lock 0x2aaab364ba60 (a org.apache.hadoop.mapreduce.v2.hs.JobHistory) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188) {noformat} HsWebServices.java:188 corresponds to the /mapreduce/jobs service. Looking at the code there are a number of optimizations that need to be done to improve its performance.
[jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
[ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220378#comment-13220378 ] Robert Joseph Evans commented on MAPREDUCE-3944: I agree with Tom on this. I would also like to be able to limit the total number of entries that can be returned, but we have the issue with ordering here like Sid said. I am going to try to set up a stress test for this so I can try out several fixes to see what happens. There are really two different goals to this JIRA. The most important one is that even if someone is hitting the web service a lot, the history server should still be usable. The second part is to optimize the web service calls so that they reduce the load on the system. I think that the locking is the primary thing for fixing the first part, and I will put in the filtering changes too. Once I have those working I will do some profiling and see what else looks like hot spots. JobHistory web services are slower then the UI and can easly overload the JH Key: MAPREDUCE-3944 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker When our first customer started using the Job History web services today the History Server ground to a halt. We found 250 Jetty threads stuck on the following stack trace. {noformat} java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898) - waiting to lock 0x2aaab364ba60 (a org.apache.hadoop.mapreduce.v2.hs.JobHistory) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188) {noformat} HsWebServices.java:188 corresponds to the /mapreduce/jobs service. Looking at the code there are a number of optimizations that need to be done to improve its performance.
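The combination discussed in this thread (filter on partial job information, return results in a defined order, and cap the number of entries) could be sketched as below. This is an illustration in Python rather than the web service code: the record fields, the `list_jobs` function, the default limit of 100, and the newest-first ordering are all assumptions chosen for the example.

```python
from collections import namedtuple

# Hypothetical lightweight record; the real PartialJob fields may differ.
PartialJob = namedtuple("PartialJob", ["job_id", "user", "queue", "finish_time"])


def list_jobs(partial_jobs, user=None, queue=None, limit=100):
    """Filter on partial info only, then apply an explicit order and a hard cap."""
    matches = [job for job in partial_jobs
               if (user is None or job.user == user)
               and (queue is None or job.queue == queue)]
    # Explicit order: newest first, ties broken by job ID, so the result does
    # not depend on whatever order the underlying store happens to return.
    matches.sort(key=lambda job: (-job.finish_time, job.job_id))
    return matches[:limit]


sample_jobs = [
    PartialJob("job_1", "alice", "default", 100),
    PartialJob("job_2", "bob", "default", 300),
    PartialJob("job_3", "alice", "default", 200),
    PartialJob("job_4", "alice", "research", 200),
]
newest_two_for_alice = list_jobs(sample_jobs, user="alice", limit=2)
```

Because no full job is loaded to answer the listing, the cost per request stays proportional to the number of partial records, and the cap bounds the size of the response.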
[jira] [Updated] (MAPREDUCE-3954) Clean up passing HEAPSIZE to yarn and mapred commands.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-3954: Attachment: MR-3954.txt Docs added. Clean up passing HEAPSIZE to yarn and mapred commands. Key: MAPREDUCE-3954 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3954 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker Attachments: MR-3954.txt, MR-3954.txt Currently the heap size for all of these is set in yarn-env.sh. JAVA_HEAP_MAX is set to -Xmx1000m unless YARN_HEAPSIZE is set. If it is set it will override JAVA_HEAP_MAX. However, we do not always want to have the RM, NM, and HistoryServer with the exact same heap size. It would be logical for the yarn and mapred scripts to set JAVA_HEAP_MAX if YARN_RESOURCEMANAGER_HEAPSIZE, YARN_NODEMANAGER_HEAPSIZE or HADOOP_JOB_HISTORYSERVER_HEAPSIZE are set, respectively. This is a bug because it is easy to configure the history server to store more entries than the heap can hold. It is also a performance issue if we do not allow the history server to cache many entries on a large cluster.
[jira] [Commented] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220434#comment-13220434 ] Suresh Srinivas commented on MAPREDUCE-3956: +1 for the patch. Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: m3956_20120301.patch This is a part of HADOOP-8124.
[jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
[ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220441#comment-13220441 ] Siddharth Seth commented on MAPREDUCE-3944: "I think it would be better for the webservices to return the partial job instead of just the job id. From my understanding, if you are returning the ids you have all the partial job information also. That will be equivalent to the job history web page and could at least give a user some useful information. If they need the complete job, they can then do the next query for the specific job." That works well. Instead of adding another webservice to return jobIds, the current one can return a smaller set of fields for now, and can go back to returning what it does right now once CompleteJobStatusStore is implemented or there's a more efficient way of getting additional job info. Bobby, the stack trace you had posted earlier: that's from multiple parallel calls, right? Are you planning some kind of rate limiting as well, or restricting the number of worker threads based on the source of the request (UI / webservice / RPC)? JobHistory web services are slower then the UI and can easly overload the JH Key: MAPREDUCE-3944 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 0.23.2 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Blocker When our first customer started using the Job History web services today the History Server ground to a halt. We found 250 Jetty threads stuck on the following stack trace.
{noformat} java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898) - waiting to lock 0x2aaab364ba60 (a org.apache.hadoop.mapreduce.v2.hs.JobHistory) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188) {noformat} HsWebServices.java:188 corresponds to the /mapreduce/jobs service. Looking at the code there are a number of optimizations that need to be done to improve its performance.
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3353: Attachment: MAPREDUCE-3353-branch-0.23.patch Done with planned changes. Patch for review. Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes Key: MAPREDUCE-3353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 Attachments: MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch When a node gets lost or turns faulty, the AM needs to know about that event so that it can take some action, e.g. re-executing map tasks whose intermediate output lives on that faulty node.
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3353: Status: Patch Available (was: Open) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes Key: MAPREDUCE-3353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 Attachments: MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch When a node gets lost or turns faulty, the AM needs to know about that event so that it can take some action, e.g. re-executing map tasks whose intermediate output lives on that faulty node.
[jira] [Updated] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated MAPREDUCE-3956: Component/s: examples Priority: Minor (was: Major) Hadoop Flags: Reviewed The patch has to be committed with HADOOP-8124. Otherwise, the code cannot be compiled. For the same reason, Jenkins won't be able to test the patch. Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: m3956_20120301.patch This is a part of HADOOP-8124.
[jira] [Updated] (MAPREDUCE-3497) missing documentation for yarn cli and subcommands - similar to commands_manual.html
[ https://issues.apache.org/jira/browse/MAPREDUCE-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3497: Target Version/s: 0.23.2 (was: 0.23.1) Status: Patch Available (was: Open) missing documentation for yarn cli and subcommands - similar to commands_manual.html Key: MAPREDUCE-3497 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3497 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-3497.patch The yarn cli and sub-commands aren't documented anywhere. They should have documentation similar to commands_manual.html.
[jira] [Resolved] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved MAPREDUCE-3956. Resolution: Fixed Fix Version/s: 0.24.0 I have committed this. Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.24.0 Attachments: m3956_20120301.patch This is a part of HADOOP-8124.
[jira] [Commented] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220508#comment-13220508 ] Hudson commented on MAPREDUCE-3956: Integrated in Hadoop-Hdfs-trunk-Commit #1892 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1892/]) HADOOP-8124, HDFS-3034, MAPREDUCE-3956. Remove the deprecated Syncable.sync(). (Revision 1295999) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295999 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataOutputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Syncable.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraOutputFormat.java Remove the use of the deprecated Syncable.sync() method Key: MAPREDUCE-3956 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956 Project: Hadoop Map/Reduce Issue Type: Improvement Components: examples Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.24.0 Attachments: m3956_20120301.patch This is a part of HADOOP-8124.
[jira] [Updated] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3792: Attachment: MAPREDUCE-3792.patch Updated to address the review comments. job -list displays only the jobs submitted by a particular user Key: MAPREDUCE-3792 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x.
[jira] [Updated] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3792: -- Status: Patch Available (was: Open) job -list displays only the jobs submitted by a particular user --- Key: MAPREDUCE-3792 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3956) Remove the use of the deprecated Syncable.sync() method
[ https://issues.apache.org/jira/browse/MAPREDUCE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220544#comment-13220544 ]

Hudson commented on MAPREDUCE-3956:

Integrated in Hadoop-Mapreduce-trunk-Commit #1825 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1825/])
HADOOP-8124, HDFS-3034, MAPREDUCE-3956. Remove the deprecated Syncable.sync(). (Revision 1295999)

Result = ABORTED
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1295999
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataOutputStream.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Syncable.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraOutputFormat.java

Remove the use of the deprecated Syncable.sync() method
Key: MAPREDUCE-3956
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3956
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: examples
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor
Fix For: 0.24.0
Attachments: m3956_20120301.patch

This is a part of HADOOP-8124.
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3353: -- Status: Open (was: Patch Available) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes - Key: MAPREDUCE-3353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 Attachments: MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch When a node gets lost or turns faulty, AM needs to know about that event so that it can take some action like for e.g. re-executing map tasks whose intermediate output live on that faulty node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3353: -- Status: Patch Available (was: Open) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes - Key: MAPREDUCE-3353 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2, resourcemanager Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Bikas Saha Priority: Critical Fix For: 0.23.2 Attachments: MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch When a node gets lost or turns faulty, AM needs to know about that event so that it can take some action like for e.g. re-executing map tasks whose intermediate output live on that faulty node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3353) Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated MAPREDUCE-3353:

Attachment: MAPREDUCE-3353-branch-0.23.patch

1) Fixed the javadoc warning.
2) The javac warnings are because of event handlers being called in NodesListManager.java and are similar to pre-existing warnings.

===
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java:[142,19] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java:[155,21] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java:[244,35] [unchecked] unchecked call to handle(T) as a member of the raw type org.apache.hadoop.yarn.event.EventHandler
===

Need a RM-AM channel to inform AMs about faulty/unhealthy/lost nodes
Key: MAPREDUCE-3353
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3353
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, mrv2, resourcemanager
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Bikas Saha
Priority: Critical
Fix For: 0.23.2
Attachments: MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch, MAPREDUCE-3353-branch-0.23.patch

When a node gets lost or turns faulty, AM needs to know about that event so that it can take some action like for e.g. re-executing map tasks whose intermediate output live on that faulty node.
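For readers unfamiliar with the javac warnings quoted in this update, the following is a minimal sketch of how an "unchecked call to handle(T) as a member of the raw type" warning arises and how a parameterized reference avoids it. The `Event`, `EventHandler` and `NodeEvent` types here are hypothetical stand-ins written for this illustration; they are not the YARN classes, and the sketch makes no claim about how the patch itself is structured.

```java
public class RawHandlerSketch {
    // Hypothetical minimal stand-ins for illustration; not the YARN types.
    interface Event { String name(); }
    interface EventHandler<T extends Event> { void handle(T event); }

    static final class NodeEvent implements Event {
        @Override public String name() { return "NODE_UNUSABLE"; }
    }

    // Dispatch through a raw EventHandler reference. Without the annotation,
    // javac -Xlint:unchecked reports an "unchecked call to handle(T) as a
    // member of the raw type" warning on the call below.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void dispatchRaw(EventHandler handler, Event event) {
        handler.handle(event);
    }

    // Dispatch through a parameterized reference: no unchecked warning,
    // because the compiler can verify the event type against T.
    static <T extends Event> void dispatchTyped(EventHandler<T> handler, T event) {
        handler.handle(event);
    }

    // Runs both paths and returns the event names the handler observed.
    static String demo() {
        StringBuilder seen = new StringBuilder();
        EventHandler<NodeEvent> handler = e -> seen.append(e.name()).append(';');
        dispatchRaw(handler, new NodeEvent());
        dispatchTyped(handler, new NodeEvent());
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both calls behave the same at run time; the difference is only in what the compiler can check, which is why such warnings are often tolerated when they match pre-existing ones.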
[jira] [Commented] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220559#comment-13220559 ]

Hadoop QA commented on MAPREDUCE-3792:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12516761/MAPREDUCE-3792.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.mapred.TestClientServiceDelegate
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1979//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1979//console

This message is automatically generated.

job -list displays only the jobs submitted by a particular user
Key: MAPREDUCE-3792
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.1
Reporter: Ramya Sunil
Assignee: Jason Lowe
Priority: Critical
Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch

mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x.
[jira] [Updated] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3792: -- Status: Open (was: Patch Available) Canceling patch to investigate test failure. job -list displays only the jobs submitted by a particular user --- Key: MAPREDUCE-3792 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3958) Remove RMNodeState and replace it with NodeState
Remove RMNodeState and replace it with NodeState Key: MAPREDUCE-3958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3958 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Bikas Saha Fix For: 0.23.2 RMNodeState is being sent over the wire after MAPREDUCE-3353. This has been done by cloning the enum into NodeState in yarn protocol records. That makes RMNodeState redundant and it should be replaced with NodeState. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
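To illustrate why a cloned enum is redundant, here is a small sketch. The two enums below are hypothetical stand-ins defined only for this example (their constants are assumed for illustration and are not copied from the patch). When two enums carry the same constants, every boundary crossing needs a name-based conversion that adds nothing, which is the duplication this issue proposes to remove.

```java
public class NodeStateSketch {
    // Hypothetical stand-in for the RM-internal enum (constants assumed).
    enum RMNodeState { NEW, RUNNING, UNHEALTHY, DECOMMISSIONED, LOST, REBOOTED }

    // Hypothetical stand-in for the protocol-record enum cloned from it.
    enum NodeState { NEW, RUNNING, UNHEALTHY, DECOMMISSIONED, LOST, REBOOTED }

    // With two identical enums, code at the boundary has to convert by name.
    // Using a single enum everywhere makes this method unnecessary.
    static NodeState toWire(RMNodeState state) {
        return NodeState.valueOf(state.name());
    }

    public static void main(String[] args) {
        for (RMNodeState s : RMNodeState.values()) {
            System.out.println(s + " -> " + toWire(s));
        }
    }
}
```

A name-based conversion like `toWire` also fails at run time if the two enums ever drift apart, which is a further argument for keeping only one.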
[jira] [Updated] (MAPREDUCE-3958) RM: Remove RMNodeState and replace it with NodeState
[ https://issues.apache.org/jira/browse/MAPREDUCE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-3958: -- Summary: RM: Remove RMNodeState and replace it with NodeState (was: Remove RMNodeState and replace it with NodeState) RM: Remove RMNodeState and replace it with NodeState Key: MAPREDUCE-3958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3958 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Bikas Saha Fix For: 0.23.2 RMNodeState is being sent over the wire after MAPREDUCE-3353. This has been done by cloning the enum into NodeState in yarn protocol records. That makes RMNodeState redundant and it should be replaced with NodeState. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3792: -- Attachment: MAPREDUCE-3792.patch Test failure was from an unrelated change trying to fix an issue I ran into while testing. Will track that under a separate JIRA. job -list displays only the jobs submitted by a particular user --- Key: MAPREDUCE-3792 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3792) job -list displays only the jobs submitted by a particular user
[ https://issues.apache.org/jira/browse/MAPREDUCE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3792: -- Status: Patch Available (was: Open) job -list displays only the jobs submitted by a particular user --- Key: MAPREDUCE-3792 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3792 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: Ramya Sunil Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch, MAPREDUCE-3792.patch mapred job -list lists only the jobs submitted by the user who ran the command. This behavior is different from 1.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-3958) RM: Remove RMNodeState and replace it with NodeState
[ https://issues.apache.org/jira/browse/MAPREDUCE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned MAPREDUCE-3958: -- Assignee: Bikas Saha +1. It's all yours ;) RM: Remove RMNodeState and replace it with NodeState Key: MAPREDUCE-3958 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3958 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 0.23.2 RMNodeState is being sent over the wire after MAPREDUCE-3353. This has been done by cloning the enum into NodeState in yarn protocol records. That makes RMNodeState redundant and it should be replaced with NodeState. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3896) pig job through oozie hangs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3896: --- Status: Open (was: Patch Available) pig job through oozie hangs Key: MAPREDUCE-3896 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3896 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.1, 0.24.0, 0.23.2 Reporter: John George Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 0.23.2 Attachments: MAPREDUCE-3896-20120228.txt running pig job on oozie hangs due to race condition -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3896) pig job through oozie hangs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3896: --- Attachment: MAPREDUCE-3896-20120301.txt Patch with a test. I couldn't exactly reproduce the error in the test case, but without the code fix, the test does fail, albeit with a different exception which still indicates the same underlying problem of missing user-name. The test passes after the code-fix. pig job through oozie hangs Key: MAPREDUCE-3896 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3896 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.1 Reporter: John George Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 0.23.2 Attachments: MAPREDUCE-3896-20120228.txt, MAPREDUCE-3896-20120301.txt running pig job on oozie hangs due to race condition -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3896) pig job through oozie hangs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-3896: --- Affects Version/s: (was: 0.23.2) (was: 0.24.0) Status: Patch Available (was: Open) pig job through oozie hangs Key: MAPREDUCE-3896 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3896 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.1 Reporter: John George Assignee: Vinod Kumar Vavilapalli Priority: Blocker Fix For: 0.23.2 Attachments: MAPREDUCE-3896-20120228.txt, MAPREDUCE-3896-20120301.txt running pig job on oozie hangs due to race condition -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3896) pig job through oozie hangs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220670#comment-13220670 ]

Hadoop QA commented on MAPREDUCE-3896:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12516783/MAPREDUCE-3896-20120301.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1982//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1982//console

This message is automatically generated.

pig job through oozie hangs
Key: MAPREDUCE-3896
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3896
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver, mrv2
Affects Versions: 0.23.1
Reporter: John George
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
Fix For: 0.23.2
Attachments: MAPREDUCE-3896-20120228.txt, MAPREDUCE-3896-20120301.txt

running pig job on oozie hangs due to race condition
[jira] [Commented] (MAPREDUCE-3829) [Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
[ https://issues.apache.org/jira/browse/MAPREDUCE-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220692#comment-13220692 ]

Amar Kamat commented on MAPREDUCE-3829:

Thanks Ravi for the patch. A few comments:
1. It would be nice to move the FileSystem, size and path checks to the writeInputData() API. This way you can test this API and the current fix via JUnit tests.
2. Add JUnit tests to test
   a. writeInputData() w.r.t. zero-data size, missing input dir
   b. 777 permissions on io-path.

[Gridmix] Gridmix should give better error message when input-data directory already exists and -generate option is given
Key: MAPREDUCE-3829
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3829
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 3829.v0.patch, 3829.v1.patch

Instead of throwing exception messages on to the console, Gridmix should give better error message when input-data directory already exists and -generate option is given.
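The review suggestion of pulling the checks into one testable method can be sketched roughly as follows. This is an illustration only: `checkGenerateTarget` is a hypothetical helper, it uses `java.nio.file` rather than the Hadoop FileSystem API so that it runs standalone, and the message wording is invented for the example. It is not Gridmix's writeInputData().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class GenerateCheckSketch {
    // Hypothetical helper: returns a user-facing message if the target
    // directory already exists, or null if generation could proceed.
    static String checkGenerateTarget(Path inputDir) {
        if (Files.exists(inputDir)) {
            return "Input-data directory " + inputDir
                + " already exists; cannot generate input data into it.";
        }
        return null;
    }

    // Exercises both outcomes the way a unit test might.
    static boolean demo() {
        try {
            Path existing = Files.createTempDirectory("gridmix-sketch");
            Path missing = existing.resolve("not-created");
            boolean ok = checkGenerateTarget(existing) != null
                && checkGenerateTarget(missing) == null;
            Files.delete(existing); // clean up the temporary directory
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "checks behave as expected" : "unexpected result");
    }
}
```

Keeping the check in a method that returns a message, instead of letting an exception reach the console, is what makes it easy to cover with JUnit tests.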
[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220701#comment-13220701 ]

Amar Kamat commented on MAPREDUCE-2722:

Ravi, compression emulation is a feature with 3 parts:
1. Input compression emulation
2. Intermediate compression emulation
3. Output compression emulation

Intermediate and output compression emulation happen only when the compression-emulation feature is turned on and the job's config has those parameters set. For input compression, Gridmix relies on 'mapred.input.dir'. Input compression emulation is attempted only if there are compressed input files. Scale the input-data-size field only if input-compression-emulation is desired.

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
Key: MAPREDUCE-2722
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/gridmix
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
Attachments: 2722.v1.patch, MR2722.patch

When compressed input was used by original job's map task, then the simulated job's map task's hdfsBytesRead counter is wrong if compression emulation is enabled. This issue is because hdfsBytesRead of map task of original job is considered as uncompressed map input size by Gridmix.
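The scaling rule in the comment above can be sketched as a small function. This is an assumption-laden illustration, not the patch: the method name, the `emulateInputCompression` flag and the `ratio` parameter (taken here to mean compressed bytes divided by uncompressed bytes) are all hypothetical. The idea is that hdfsBytesRead from the original job reflects compressed bytes, so it is scaled up to an uncompressed size only when input compression emulation applies.

```java
public class InputSizeSketch {
    /**
     * Hypothetical helper. Returns the uncompressed map input size to
     * emulate. ratio is assumed to be compressedBytes / uncompressedBytes
     * and must lie in (0, 1].
     */
    static long uncompressedInputBytes(long hdfsBytesRead,
                                       boolean emulateInputCompression,
                                       double ratio) {
        if (!emulateInputCompression) {
            // No input compression emulation: use the counter unchanged.
            return hdfsBytesRead;
        }
        if (ratio <= 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in (0, 1]: " + ratio);
        }
        // The counter holds compressed bytes, so scale up by the ratio.
        return Math.round(hdfsBytesRead / ratio);
    }

    public static void main(String[] args) {
        System.out.println(uncompressedInputBytes(1000L, false, 0.5));
        System.out.println(uncompressedInputBytes(1000L, true, 0.5));
    }
}
```

With a ratio of 0.5, a counter value of 1000 bytes corresponds to 2000 uncompressed bytes when emulation applies, and stays at 1000 otherwise.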