[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yi updated MAPREDUCE-4957:
--------------------------
    Attachment: MAPREDUCE-4957.patch

> Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
> -----------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: yi
> Assignee: yi
> Priority: Minor
> Attachments: MAPREDUCE-4957.patch, MAPREDUCE-4957.patch
>
> Run in single node and "mapreduce.framework.name" is local, and get following error:
> java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>     at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>     at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>     at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>     at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
>     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
>     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
>     at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561449#comment-13561449 ]

Sandy Ryza commented on MAPREDUCE-4951:
---------------------------------------

It doesn't seem to me that either approach would conflict with this patch at the moment. While this code might get rewritten in the future, under the current preemption mechanism, when MR is explicitly told that a container was preempted, it should not count it as failed. Does anybody disagree?

> Container preemption interpreted as task failure
> ------------------------------------------------
>
> Key: MAPREDUCE-4951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster, mr-am, mrv2
> Affects Versions: 2.0.2-alpha
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, MAPREDUCE-4951.patch
>
> When YARN reports a completed container to the MR AM, it always interprets it as a failure. This can lead to a job failing because too many of its tasks failed, when in fact they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as a container being killed instead of a container failure.
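The behaviour requested in the issue description can be sketched as below. This is a minimal, self-contained illustration and not the attached patch: the class, enum, and method names are hypothetical, and the only value taken from the issue text is the exit code -100.

```java
/** Hypothetical sketch: decide how a completed container should be counted. */
public class ContainerExitClassifier {

    /** Exit code that the issue text says marks a preempted container. */
    public static final int PREEMPTED_EXIT_STATUS = -100;

    /** Hypothetical outcomes; a KILLED attempt should not count toward the failure limit. */
    public enum Outcome { KILLED, FAILED }

    /**
     * Mirrors the issue description: a completed container is treated as a
     * failure unless its exit code is the special preemption value.
     */
    public static Outcome classify(int exitStatus) {
        if (exitStatus == PREEMPTED_EXIT_STATUS) {
            return Outcome.KILLED;
        }
        return Outcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println("exit -100: " + classify(-100));
        System.out.println("exit 1: " + classify(1));
    }
}
```

Keeping the decision in one small method means the failure counter only has to check the returned outcome, whatever the real event-handling code looks like.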
[jira] [Created] (MAPREDUCE-4958) close method of RawKeyValueIterator is not called after after finish using.
Jerry Chen created MAPREDUCE-4958:
-------------------------------------

Summary: close method of RawKeyValueIterator is not called after after finish using.
Key: MAPREDUCE-4958
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4958
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: trunk
Reporter: Jerry Chen

I observed that the close method of the RawKeyValueIterator returned from MergeManager is not called. This causes resource leaks in any RawKeyValueIterator implementation that depends on RawKeyValueIterator.close for cleanup when finished. Some other places in MapTask also do not follow the convention of calling RawKeyValueIterator.close after using the iterator.
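The convention the report refers to can be sketched with a try/finally block. The interface below is a simplified, hypothetical stand-in for RawKeyValueIterator (only next() and close() are modelled), and RecordingIterator exists purely to show that close() runs; none of this is the actual Hadoop code.

```java
import java.io.IOException;

/** Hypothetical sketch of always closing an iterator after use. */
public class IteratorCloseSketch {

    /** Simplified stand-in for RawKeyValueIterator (assumption: only next/close are shown). */
    public interface ClosableKvIterator {
        boolean next() throws IOException;
        void close() throws IOException;
    }

    /** Illustrative implementation that records whether close() was called. */
    public static class RecordingIterator implements ClosableKvIterator {
        private int remaining;
        public boolean closed = false;

        public RecordingIterator(int records) {
            this.remaining = records;
        }

        @Override
        public boolean next() {
            return remaining-- > 0;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Consumes every record and closes the iterator even if next() throws. */
    public static int drain(ClosableKvIterator iter) throws IOException {
        int count = 0;
        try {
            while (iter.next()) {
                count++;
            }
        } finally {
            iter.close(); // cleanup runs on both the normal and the exceptional path
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        RecordingIterator iter = new RecordingIterator(3);
        System.out.println(drain(iter) + " records, closed=" + iter.closed);
    }
}
```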
[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561407#comment-13561407 ]

Hadoop QA commented on MAPREDUCE-4957:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12566240/MAPREDUCE-4957.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 1 new or modified test files.
    -1 javac. The applied patch generated 2024 javac compiler warnings (more than the trunk's current 2021 warnings).
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//console

This message is automatically generated.
[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yi updated MAPREDUCE-4957: -- Description: Run in single node and "mapreduce.framework.name" is local, and get following error: java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)' was: java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Job Submission failed with exception 'java.io.FileN
[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yi updated MAPREDUCE-4957:
--------------------------
    Status: Patch Available (was: Open)

Patch has been attached, please help to review.
[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yi updated MAPREDUCE-4957:
--------------------------
    Attachment: MAPREDUCE-4957.patch
[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
[ https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561367#comment-13561367 ]

yi commented on MAPREDUCE-4957:
-------------------------------

The file path added by "DistributedCache.addFileToClassPath" should be qualified by correct filesystem.
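What "qualified by the correct filesystem" means can be illustrated with plain java.net.URI. In the trace, a scheme-less local path is looked up through DistributedFileSystem, which suggests it was resolved against the wrong filesystem. The sketch below does not reproduce the attached patch or Hadoop's own Path handling; the class name, the method name, and the example authority namenode:8020 used in the test are hypothetical, and it assumes absolute paths.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch: attach a filesystem's scheme and authority to a bare path. */
public class PathQualifierSketch {

    /**
     * Returns the path unchanged if it already carries a scheme; otherwise
     * prefixes it with the scheme and authority of the given filesystem URI.
     * Assumes the path is absolute (starts with '/').
     */
    public static URI qualify(String path, URI fsUri) {
        URI raw = URI.create(path);
        if (raw.getScheme() != null) {
            return raw; // already qualified, leave as-is
        }
        try {
            return new URI(fsUri.getScheme(), fsUri.getAuthority(), raw.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot qualify " + path, e);
        }
    }

    public static void main(String[] args) {
        String jar = "/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar";
        // Qualified against the local filesystem, the path can no longer be
        // mistaken for an HDFS path later in job submission.
        System.out.println(qualify(jar, URI.create("file:///")));
    }
}
```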
[jira] [Created] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
yi created MAPREDUCE-4957:
-----------------------------

Summary: Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local
Key: MAPREDUCE-4957
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor

java.io.FileNotFoundException: File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'
[jira] [Updated] (MAPREDUCE-4838) Add extra info to JH files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated MAPREDUCE-4838:
-----------------------------------
    Attachment: MAPREDUCE-4838_3.patch

Hi Sid, your new comments are addressed. Please have a look at the latest attached patch file.

Thanks,
Zhijie

> Add extra info to JH files
> --------------------------
>
> Key: MAPREDUCE-4838
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Arun C Murthy
> Assignee: Zhijie Shen
> Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, MAPREDUCE-4838_3.patch, MAPREDUCE-4838.patch
>
> It will be useful to add more task-info to JH for analytics.
[jira] [Created] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed
Zhijie Shen created MAPREDUCE-4956:
--------------------------------------

Summary: The Additional JH Info Should Be Exposed
Key: MAPREDUCE-4956
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4956
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

In MAPREDUCE-4838, the additional info has been added to JH. It would be useful to expose this info, at least via the UI.
[jira] [Commented] (MAPREDUCE-4838) Add extra info to JH files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561105#comment-13561105 ]

Siddharth Seth commented on MAPREDUCE-4838:
-------------------------------------------

Took a quick look. This patch looks better but still needs some fixes.
- Unit tests should use the new properties defined in MRJobConfig.*
- In case of Reduce tasks, the container host doesn't need to be resolved (or in the case where dataLocalHosts is empty)
- The null check is still required in the history events, since these values don't need to be set.
- TaskAttemptImpl has a repeated dataLocalHosts assignment
- RMContainerAllocator has an unused import
- Formatting fix in TestJobImpl (patch line 377)
- Since resolveHosts has been changed to work with a HashSet, TaskAttemptImpl itself could store this, instead of iterating over an array to match the container host.

Most of the test changes look good as well. Need to take a better look at some of them though.

Thanks.
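The last review point (store the resolved hosts as a set rather than scanning an array) might look like the sketch below. It is only an illustration under assumptions: the class and method names are hypothetical, and the field name dataLocalHosts is borrowed from the review comment without reproducing TaskAttemptImpl.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: keep resolved hosts in a set for direct membership checks. */
public class DataLocalitySketch {

    /** Resolved hosts holding the task's input data (name taken from the review comment). */
    private final Set<String> dataLocalHosts;

    public DataLocalitySketch(String[] resolvedHosts) {
        this.dataLocalHosts = new HashSet<>(Arrays.asList(resolvedHosts));
    }

    /** Membership test replaces a loop over an array of host names. */
    public boolean isDataLocal(String containerHost) {
        return dataLocalHosts.contains(containerHost);
    }

    public static void main(String[] args) {
        // Host names here are made up for the example.
        DataLocalitySketch attempt = new DataLocalitySketch(new String[] {"host-a", "host-b"});
        System.out.println("host-a local: " + attempt.isDataLocal("host-a"));
        System.out.println("host-c local: " + attempt.isDataLocal("host-c"));
    }
}
```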
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560916#comment-13560916 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Avner, sounds good. We can do another review iteration on your updated patch, it will be easier looking at concrete code. Thx > plugin for generic shuffle service > -- > > Key: MAPREDUCE-4049 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: performance, task, tasktracker >Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 >Reporter: Avner BenHanoch >Assignee: Avner BenHanoch > Labels: merge, plugin, rdma, shuffle > Fix For: 3.0.0 > > Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, > MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch > > > Support generic shuffle service as set of two plugins: ShuffleProvider & > ShuffleConsumer. > This will satisfy the following needs: > # Better shuffle and merge performance. For example: we are working on > shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, > or Infiniband) instead of using the current HTTP shuffle. Based on the fast > RDMA shuffle, the plugin can also utilize a suitable merge approach during > the intermediate merges. Hence, getting much better performance. > # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden > dependency of NodeManager with a specific version of mapreduce shuffle > (currently targeted to 0.24.0). > References: > # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu > from Auburn University with others, > [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] > # I am attaching 2 documents with suggested Top Level Design for both plugins > (currently, based on 1.0 branch) > # I am providing link for downloading UDA - Mellanox's open source plugin > that implements generic shuffle service using RDMA and levitated merge. 
> Note: At this phase, the code is in C++ through JNI and you should consider > it as beta only. Still, it can serve anyone that wants to implement or > contribute to levitated merge. (Please be advised that levitated merge is > mostly suited to very fast networks) - > [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560903#comment-13560903 ] Bikas Saha commented on MAPREDUCE-4951: --- We might be digressing from this jira here. But I really don't think the 2-step approach is worth its complexity. The main scenario where it makes sense is when the task has an ability to checkpoint its work before getting preempted. I haven't seen this capability outside of basic research prototypes. It's much simpler to have the preemption be an RM-only action. We do need to fix the action and information loop so that AMs can get correct information about the infrastructure's actions. > Container preemption interpreted as task failure > > > Key: MAPREDUCE-4951 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mr-am, mrv2 >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, > MAPREDUCE-4951.patch > > > When YARN reports a completed container to the MR AM, it always interprets it > as a failure. This can lead to a job failing because too many of its tasks > failed, when in fact they only failed because the scheduler preempted them. > MR needs to recognize the special exit code value of -100 and interpret it as > a container being killed instead of a container failure.
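The behaviour requested in the MAPREDUCE-4951 description could be sketched roughly as below. This is only an illustration and not the attached patch: the class, enum and method names are hypothetical, and the only value taken from the issue is the exit code -100.

```java
// Hypothetical sketch; none of these names come from the MAPREDUCE-4951 patch.
final class ContainerExitClassifier {
    // Exit code that, per the issue description, marks a preempted container.
    static final int PREEMPTED_EXIT_CODE = -100;

    enum Outcome { KILLED, FAILED }

    // Classify a completed container that did not finish its task normally.
    static Outcome classify(int exitCode) {
        return exitCode == PREEMPTED_EXIT_CODE ? Outcome.KILLED : Outcome.FAILED;
    }

    // Only genuine failures should count toward the job's failed-task limit.
    static boolean countsTowardFailureLimit(int exitCode) {
        return classify(exitCode) == Outcome.FAILED;
    }
}
```

With a check like this, a preempted task attempt would be rescheduled without pushing the job toward its failure threshold.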
[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560736#comment-13560736 ] Jason Lowe commented on MAPREDUCE-4951: --- bq. having the RM ask the AM to kill the container in case of preemption would likely not work as the AM cannot be trusted. Agreed, I was thinking of exactly the alternative you propose where preemption has potentially two phases, a "please AM, preempt that container you have" with a watchdog timer to have the RM kill it forcefully if the AM does not comply in a reasonable amount of time. This eliminates the race where the container can fail because of the preemption and provides a way for the AM to potentially checkpoint the state of the container for faster recovery. However it does mean the meantime latency for container availability would be higher since the AM will have a grace period before relinquishing the resources. > Container preemption interpreted as task failure > > > Key: MAPREDUCE-4951 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mr-am, mrv2 >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, > MAPREDUCE-4951.patch > > > When YARN reports a completed container to the MR AM, it always interprets it > as a failure. This can lead to a job failing because too many of its tasks > failed, when in fact they only failed because the scheduler preempted them. > MR needs to recognize the special exit code value of -100 and interpret it as > a container being killed instead of a container failure.
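The two-phase idea in the comment above (ask the AM first, then let a watchdog force the kill) might look something like the following on the RM side. This is a hypothetical sketch, not YARN code: the class and method names are invented, and time is passed in explicitly so that the grace-period logic can be followed without threads or timers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a preemption watchdog; not part of YARN.
final class PreemptionWatchdog {
    private final long graceMillis;
    // Container id -> time at which the RM stops waiting for the AM.
    private final Map<String, Long> deadlines = new HashMap<>();

    PreemptionWatchdog(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    // Phase 1: the RM has asked the AM to give up this container.
    void requestPreemption(String containerId, long nowMillis) {
        deadlines.putIfAbsent(containerId, nowMillis + graceMillis);
    }

    // The AM complied and released the container, so the watchdog is cancelled.
    void containerReleased(String containerId) {
        deadlines.remove(containerId);
    }

    // Phase 2: containers whose grace period has run out; the RM kills these itself.
    List<String> expired(long nowMillis) {
        List<String> toKill = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis >= entry.getValue()) {
                toKill.add(entry.getKey());
                it.remove();
            }
        }
        return toKill;
    }
}
```

The grace period is exactly the latency cost mentioned in the comment: resources held by a non-compliant AM only become available once `expired` reports the container.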
[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560708#comment-13560708 ] Hudson commented on MAPREDUCE-4949: --- Integrated in Hadoop-Mapreduce-trunk #1322 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/]) MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via tucu) (Revision 1437029) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java > Enable multiple pi jobs to run in parallel > -- > > Key: MAPREDUCE-4949 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: examples >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: MAPREDUCE-4949.patch > > > Currently the hadoop pi example uses a hardcoded temporary directory to store > its inputs and outputs. This makes it so that only one pi job can run at a > time, and that if it is cancelled, the temporary directory must be manually > deleted. > I propose using a temporary directory based on a timestamp and random number > to avoid these conflicts
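The proposal in the MAPREDUCE-4949 description, a temporary directory named from a timestamp and a random number, could be sketched as below. The helper class, its method and the naming pattern are assumptions made for illustration; the actual change to QuasiMonteCarlo.java may build the name differently.

```java
import java.util.Random;

// Hypothetical helper; the naming scheme is illustrative only.
final class PiTempDirNames {
    // Build a per-job directory name such as "<prefix>_<timestamp>_<random>".
    static String uniqueTempDir(String prefix, long timestampMillis, Random random) {
        // nextInt(bound) yields a non-negative value, which keeps the name free of '-'.
        int suffix = random.nextInt(Integer.MAX_VALUE);
        return prefix + "_" + timestampMillis + "_" + suffix;
    }
}
```

A caller would typically pass `System.currentTimeMillis()` and a fresh `Random`, so two jobs started in the same millisecond would still be very unlikely to collide.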
[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560703#comment-13560703 ] Hudson commented on MAPREDUCE-4946: --- Integrated in Hadoop-Mapreduce-trunk #1322 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/]) MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the number of map completion event type conversions. Contributed by Jason Lowe. (Revision 1437103) Result = SUCCESS sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java > Type conversion of map completion events leads to performance problems with > large jobs > -- > > Key: MAPREDUCE-4946 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.2-alpha, 0.23.5 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 2.0.3-alpha, 0.23.7 > > Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch > > > We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where > reducers fail to connect back to the AM after being launched due to > connection timeout. Looking at stack traces of the AM during this time we > see a lot of IPC servers stuck waiting for a lock to get the application ID > while type converting the map completion events. What's odd is that normally > getting the application ID should be very cheap, but in this case we're > type-converting thousands of map completion events for *each* reducer > connecting. That means we end up type-converting the map completion events > over 45 million times during the lifetime of the example job (13,000 * 3,500). > We either need to make the type conversion much cheaper (i.e.: lockless or at > least read-write locked) or, even better, store the completion events in a > form that does not require type conversion when serving them up to reducers.
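The second option in the MAPREDUCE-4946 description, storing completion events in the form that reducers are served, can be illustrated with a small sketch. For the example job it would mean 13,000 conversions in total instead of 13,000 * 3,500 = 45,500,000. The class below is hypothetical and generic; it is not the committed change to JobImpl or TaskAttemptListenerImpl.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: convert each event once on arrival, then serve the converted form.
final class ConvertedEventLog<S, T> {
    private final Function<S, T> converter;
    private final List<T> converted = new ArrayList<>();
    private int conversions = 0;

    ConvertedEventLog(Function<S, T> converter) {
        this.converter = converter;
    }

    // Called once per map completion event; this is the only place conversion happens.
    synchronized void add(S event) {
        converted.add(converter.apply(event));
        conversions++;
    }

    // Called for every reducer poll; returns a copy of the requested slice, no conversion.
    synchronized List<T> get(int fromIndex, int maxEvents) {
        if (fromIndex < 0 || maxEvents <= 0 || fromIndex >= converted.size()) {
            return Collections.emptyList();
        }
        // Use long arithmetic so a very large maxEvents cannot overflow.
        int toIndex = (int) Math.min((long) converted.size(), (long) fromIndex + maxEvents);
        return new ArrayList<>(converted.subList(fromIndex, toIndex));
    }

    synchronized int conversionCount() {
        return conversions;
    }
}
```

The sketch still takes a lock per poll, but the work done under that lock is a list copy and no longer depends on any per-event conversion cost.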
[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560704#comment-13560704 ] Hudson commented on MAPREDUCE-4808: --- Integrated in Hadoop-Mapreduce-trunk #1322 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/]) MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations. (masokan via tucu) (Revision 1436936) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java > Refactor MapOutput and MergeManager to facilitate reuse by Shuffle > implementations > -- > > Key: MAPREDUCE-4808 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Arun C Murthy >Assignee: Mariappan Asokan > Fix For: 3.0.0 > > Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, > M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, > mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, > mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, > MergeManagerPlugin.pdf, MR-4808.patch > > > Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for > alternate implementations to be able to reuse portions of the default > implementation. > This would come with the strong caveat that these classes are LimitedPrivate > and Unstable.
[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560660#comment-13560660 ] Hudson commented on MAPREDUCE-4949: --- Integrated in Hadoop-Hdfs-trunk #1294 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/]) MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via tucu) (Revision 1437029) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java > Enable multiple pi jobs to run in parallel > -- > > Key: MAPREDUCE-4949 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: examples >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: MAPREDUCE-4949.patch > > > Currently the hadoop pi example uses a hardcoded temporary directory to store > its inputs and outputs. This makes it so that only one pi job can run at a > time, and that if it is cancelled, the temporary directory must be manually > deleted. > I propose using a temporary directory based on a timestamp and random number > to avoid these conflicts
[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560656#comment-13560656 ] Hudson commented on MAPREDUCE-4808: --- Integrated in Hadoop-Hdfs-trunk #1294 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/]) MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations. (masokan via tucu) (Revision 1436936) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java > Refactor MapOutput and MergeManager to facilitate reuse by Shuffle > implementations > -- > > Key: MAPREDUCE-4808 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Arun C Murthy >Assignee: Mariappan Asokan > Fix For: 3.0.0 > > Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, > M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, > mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, > mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, > MergeManagerPlugin.pdf, MR-4808.patch > > > Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for > alternate implementations to be able to reuse portions of the default > implementation. > This would come with the strong caveat that these classes are LimitedPrivate > and Unstable.
[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560655#comment-13560655 ] Hudson commented on MAPREDUCE-4946: --- Integrated in Hadoop-Hdfs-trunk #1294 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/]) MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the number of map completion event type conversions. Contributed by Jason Lowe. (Revision 1437103) Result = FAILURE sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java > Type conversion of map completion events leads to performance problems with > large jobs > -- > > Key: MAPREDUCE-4946 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.2-alpha, 0.23.5 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 2.0.3-alpha, 0.23.7 > > Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch > > > We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where > reducers fail to connect back to the AM after being launched due to > connection timeout. Looking at stack traces of the AM during this time we > see a lot of IPC servers stuck waiting for a lock to get the application ID > while type converting the map completion events. What's odd is that normally > getting the application ID should be very cheap, but in this case we're > type-converting thousands of map completion events for *each* reducer > connecting. That means we end up type-converting the map completion events > over 45 million times during the lifetime of the example job (13,000 * 3,500). > We either need to make the type conversion much cheaper (i.e.: lockless or at > least read-write locked) or, even better, store the completion events in a > form that does not require type conversion when serving them up to reducers.
[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560645#comment-13560645 ] Hudson commented on MAPREDUCE-4946: --- Integrated in Hadoop-Hdfs-0.23-Build #503 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/503/]) MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the number of map completion event type conversions. Contributed by Jason Lowe. (Revision 1437105) Result = FAILURE sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437105 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java * 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java > Type conversion of map completion events leads to performance problems with > large jobs > -- > > Key: MAPREDUCE-4946 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.2-alpha, 0.23.5 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 2.0.3-alpha, 0.23.7 > > Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch > > > We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where > reducers fail to connect back to the AM after being launched due to > connection timeout. Looking at stack traces of the AM during this time we > see a lot of IPC servers stuck waiting for a lock to get the application ID > while type converting the map completion events. What's odd is that normally > getting the application ID should be very cheap, but in this case we're > type-converting thousands of map completion events for *each* reducer > connecting. That means we end up type-converting the map completion events > over 45 million times during the lifetime of the example job (13,000 * 3,500). > We either need to make the type conversion much cheaper (i.e.: lockless or at > least read-write locked) or, even better, store the completion events in a > form that does not require type conversion when serving them up to reducers.
[jira] [Updated] (MAPREDUCE-4929) mapreduce.task.timeout is ignored
[ https://issues.apache.org/jira/browse/MAPREDUCE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-4929: - Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: (was: 1.1.2) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this. Thanks, Sandy! > mapreduce.task.timeout is ignored > - > > Key: MAPREDUCE-4929 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4929 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1 >Affects Versions: 1.1.1 >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 1.2.0 > > Attachments: MAPREDUCE-4929-branch-1.patch > > > In MR1, only mapred.task.timeout works. Both should be made to work.
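Making both property names work, as the MAPREDUCE-4929 description asks, amounts to a lookup with a fallback. The sketch below uses a plain map in place of a Hadoop configuration object so that it stays self-contained. The class name is hypothetical, and the precedence (new name first, then the old one) is an assumption, not something stated in the issue.

```java
import java.util.Map;

// Hypothetical sketch; the precedence between the two keys is assumed.
final class TaskTimeoutResolver {
    static final String NEW_KEY = "mapreduce.task.timeout";
    static final String OLD_KEY = "mapred.task.timeout";

    // Return the configured timeout in milliseconds, or the caller's default if neither key is set.
    static long resolve(Map<String, String> conf, long defaultMillis) {
        String value = conf.get(NEW_KEY);
        if (value == null) {
            value = conf.get(OLD_KEY);
        }
        return value == null ? defaultMillis : Long.parseLong(value.trim());
    }
}
```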
[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560552#comment-13560552 ] Hudson commented on MAPREDUCE-4949: --- Integrated in Hadoop-Yarn-trunk #105 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/105/]) MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via tucu) (Revision 1437029) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java > Enable multiple pi jobs to run in parallel > -- > > Key: MAPREDUCE-4949 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: examples >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: MAPREDUCE-4949.patch > > > Currently the hadoop pi example uses a hardcoded temporary directory to store > its inputs and outputs. This makes it so that only one pi job can run at a > time, and that if it is cancelled, the temporary directory must be manually > deleted. > I propose using a temporary directory based on a timestamp and random number > to avoid these conflicts
[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560547#comment-13560547 ]

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Yarn-trunk #105 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the number of map completion event type conversions. Contributed by Jason Lowe. (Revision 1437103)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java

> Type conversion of map completion events leads to performance problems with large jobs
> --
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha, 0.23.5
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.7
> Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where
> reducers fail to connect back to the AM after being launched due to
> connection timeout. Looking at stack traces of the AM during this time we
> see a lot of IPC servers stuck waiting for a lock to get the application ID
> while type converting the map completion events. What's odd is that normally
> getting the application ID should be very cheap, but in this case we're
> type-converting thousands of map completion events for *each* reducer
> connecting. That means we end up type-converting the map completion events
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at
> least read-write locked) or, even better, store the completion events in a
> form that does not require type conversion when serving them up to reducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
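The cost described in MAPREDUCE-4946 can be illustrated with a small counting sketch. This is not the committed patch and does not use Hadoop classes: InternalEvent, WireEvent, convert(), and both strategy methods are hypothetical stand-ins, assumed only for illustration. It compares converting every map completion event for each reducer request against converting each event once and serving the stored form.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class CompletionEventCacheSketch {
    /** Hypothetical internal record kept by the application master. */
    static final class InternalEvent {
        final int mapId;
        InternalEvent(int mapId) { this.mapId = mapId; }
    }

    /** Hypothetical form that is handed back to reducers. */
    static final class WireEvent {
        final int mapId;
        WireEvent(int mapId) { this.mapId = mapId; }
    }

    /** Counts how many type conversions were performed. */
    static final AtomicLong conversions = new AtomicLong();

    static WireEvent convert(InternalEvent e) {
        conversions.incrementAndGet();
        return new WireEvent(e.mapId);
    }

    /** Strategy from the bug report: convert all events for each reducer request. */
    static long convertPerRequest(int maps, int reducers) {
        conversions.set(0);
        List<InternalEvent> events = new ArrayList<>(maps);
        for (int m = 0; m < maps; m++) {
            events.add(new InternalEvent(m));
        }
        for (int r = 0; r < reducers; r++) {
            List<WireEvent> reply = new ArrayList<>(maps);
            for (InternalEvent e : events) {
                reply.add(convert(e));
            }
        }
        return conversions.get();
    }

    /** Alternative: convert each event once when recorded, then serve the stored list. */
    static long convertOnce(int maps, int reducers) {
        conversions.set(0);
        List<WireEvent> stored = new ArrayList<>(maps);
        for (int m = 0; m < maps; m++) {
            stored.add(convert(new InternalEvent(m)));
        }
        List<WireEvent> readOnly = Collections.unmodifiableList(stored);
        for (int r = 0; r < reducers; r++) {
            // Each reducer request reads the already-converted events.
            if (readOnly.size() != maps) {
                throw new IllegalStateException("unexpected event count");
            }
        }
        return conversions.get();
    }

    public static void main(String[] args) {
        // Job size taken from the issue description: 13,000 maps, 3,500 reduces.
        System.out.println("per request: " + convertPerRequest(13_000, 3_500));
        System.out.println("convert once: " + convertOnce(13_000, 3_500));
    }
}
```

With the job size from the report, the per-request strategy performs 13,000 * 3,500 = 45,500,000 conversions, while the convert-once strategy performs 13,000. The sketch ignores locking and incremental event arrival, both of which a real application master would have to handle.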
[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560548#comment-13560548 ]

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Yarn-trunk #105 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations. (masokan via tucu) (Revision 1436936)

Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java

> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
> --
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Arun C Murthy
> Assignee: Mariappan Asokan
> Fix For: 3.0.0
> Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch,
> M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> MergeManagerPlugin.pdf, MR-4808.patch
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for
> alternate implementations to be able to reuse portions of the default
> implementation.
> This would come with the strong caveat that these classes are LimitedPrivate
> and Unstable.