[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Attachment: MAPREDUCE-4957.patch

> Throw FileNotFoundException when running in single node and 
> "mapreduce.framework.name" is local
> ---
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: yi
>Assignee: yi
>Priority: Minor
> Attachments: MAPREDUCE-4957.patch, MAPREDUCE-4957.patch
>
>
> Running on a single node with "mapreduce.framework.name" set to local 
> produces the following error:
> java.io.FileNotFoundException: File does not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>  
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> Job Submission failed with exception 'java.io.FileNotFoundException(File does 
> not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561449#comment-13561449
 ] 

Sandy Ryza commented on MAPREDUCE-4951:
---

It doesn't seem to me that either approach would conflict with this patch at 
the moment.  While this code might get rewritten in the future, under the 
current preemption mechanism, when MR is explicitly told that a container was 
preempted, it should not count it as failed.  Does anybody disagree?

> Container preemption interpreted as task failure
> 
>
> Key: MAPREDUCE-4951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mr-am, mrv2
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
> MAPREDUCE-4951.patch
>
>
> When YARN reports a completed container to the MR AM, it always interprets it 
> as a failure.  This can lead to a job failing because too many of its tasks 
> failed, when in fact they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as 
> a container being killed instead of a container failure.
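
The decision described above can be sketched as follows. The class, method, and enum names here are illustrative, and -100 is simply the value the issue text cites; the real constant belongs to YARN's ContainerExitStatus and the real logic lives in the MR ApplicationMaster's container-completion handling:

```java
public class PreemptionCheck {
    // Exit code the issue text cites for a preempted container; illustrative
    // only -- the authoritative constant is defined by YARN.
    static final int PREEMPTED_EXIT_CODE = -100;

    enum Outcome { TASK_FAILED, TASK_KILLED }

    // A preempted container should count as killed, not failed, so it does
    // not push the job toward its failed-task limit.
    static Outcome classify(int containerExitStatus) {
        return containerExitStatus == PREEMPTED_EXIT_CODE
                ? Outcome.TASK_KILLED
                : Outcome.TASK_FAILED;
    }
}
```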



[jira] [Created] (MAPREDUCE-4958) close method of RawKeyValueIterator is not called after use.

2013-01-23 Thread Jerry Chen (JIRA)
Jerry Chen created MAPREDUCE-4958:
-

 Summary: close method of RawKeyValueIterator is not called after 
use.
 Key: MAPREDUCE-4958
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4958
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: trunk
Reporter: Jerry Chen


I observed that the close method of the RawKeyValueIterator returned from 
MergeManager is not called.

This causes resource leaks for RawKeyValueIterator implementations that depend 
on RawKeyValueIterator.close for cleanup when finished.

Some other places in MapTask also do not follow the convention of calling 
RawKeyValueIterator.close after using the iterator. 
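
A minimal sketch of the convention the report asks for: close the iterator in a finally block so implementations that release resources in close are cleaned up even if consumption throws. The KVIterator interface below is a stand-in for RawKeyValueIterator, not the real Hadoop type:

```java
import java.io.IOException;

public class CloseConvention {
    // Stand-in for RawKeyValueIterator: only the parts relevant to the
    // close contract are modeled here.
    interface KVIterator {
        boolean next() throws IOException;
        void close() throws IOException;
    }

    // Consume all records, then close unconditionally -- even if next() throws.
    static int consume(KVIterator it) throws IOException {
        int records = 0;
        try {
            while (it.next()) {
                records++;
            }
        } finally {
            it.close();   // the call the report found missing after the merge
        }
        return records;
    }
}
```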





[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561407#comment-13561407
 ] 

Hadoop QA commented on MAPREDUCE-4957:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566240/MAPREDUCE-4957.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 2024 javac 
compiler warnings (more than the trunk's current 2021 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3265//console

This message is automatically generated.

> Throw FileNotFoundException when running in single node and 
> "mapreduce.framework.name" is local
> ---
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: yi
>Assignee: yi
>Priority: Minor
> Attachments: MAPREDUCE-4957.patch
>
>
> Running on a single node with "mapreduce.framework.name" set to local 
> produces the following error:
> java.io.FileNotFoundException: File does not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>  
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> Job Submission failed with exception 'java.io.FileNotFoundException(File does 
> not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'


[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Description: 
Running on a single node with "mapreduce.framework.name" set to local produces 
the following error:

java.io.FileNotFoundException: File does not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
Job Submission failed with exception 'java.io.FileNotFoundException(File does 
not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'

  was:
java.io.FileNotFoundException: File does not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
Job Submission failed with exception 'java.io.FileN

[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Status: Patch Available  (was: Open)

A patch has been attached; please help review it.

> Throw FileNotFoundException when running in single node and 
> "mapreduce.framework.name" is local
> ---
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: yi
>Assignee: yi
>Priority: Minor
> Attachments: MAPREDUCE-4957.patch
>
>
> Running on a single node with "mapreduce.framework.name" set to local 
> produces the following error:
> java.io.FileNotFoundException: File does not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>  
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> Job Submission failed with exception 'java.io.FileNotFoundException(File does 
> not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



[jira] [Updated] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yi updated MAPREDUCE-4957:
--

Attachment: MAPREDUCE-4957.patch

> Throw FileNotFoundException when running in single node and 
> "mapreduce.framework.name" is local
> ---
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: yi
>Assignee: yi
>Priority: Minor
> Attachments: MAPREDUCE-4957.patch
>
>
> java.io.FileNotFoundException: File does not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>  
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> Job Submission failed with exception 'java.io.FileNotFoundException(File does 
> not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



[jira] [Commented] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561367#comment-13561367
 ] 

yi commented on MAPREDUCE-4957:
---

The file path added by "DistributedCache.addFileToClassPath" should be 
qualified against the correct filesystem.
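
A sketch of why qualification matters: a scheme-less path inherits the job's default filesystem (HDFS in the reported setup), so a jar that only exists locally is looked up on HDFS and fails. The qualify helper below is a simplified stand-in for Hadoop's FileSystem.makeQualified, using plain java.net.URI rather than the real Path/FileSystem classes:

```java
import java.net.URI;

public class QualifyPathDemo {
    // Simplified stand-in for FileSystem.makeQualified(Path): a path with no
    // scheme inherits the scheme and authority of the given filesystem URI.
    static URI qualify(URI fsUri, String rawPath) {
        URI p = URI.create(rawPath);
        return p.getScheme() != null ? p : fsUri.resolve(rawPath);
    }

    public static void main(String[] args) {
        String jar = "/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar";
        // Unqualified: resolves against the default FS, so HDFS is consulted
        // and the FileNotFoundException from the report follows.
        System.out.println(qualify(URI.create("hdfs://namenode:8020/"), jar));
        // Qualified against the local filesystem, as the patch intends:
        System.out.println(qualify(URI.create("file:///"), jar));
    }
}
```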

> Throw FileNotFoundException when running in single node and 
> "mapreduce.framework.name" is local
> ---
>
> Key: MAPREDUCE-4957
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: yi
>Assignee: yi
>Priority: Minor
>
> java.io.FileNotFoundException: File does not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
>  
> at 
> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
>  
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
>  
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
>  
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
> at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>  
> at java.lang.reflect.Method.invoke(Method.java:597) 
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> Job Submission failed with exception 'java.io.FileNotFoundException(File does 
> not exist: 
> /root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



[jira] [Created] (MAPREDUCE-4957) Throw FileNotFoundException when running in single node and "mapreduce.framework.name" is local

2013-01-23 Thread yi (JIRA)
yi created MAPREDUCE-4957:
-

 Summary: Throw FileNotFoundException when running in single node 
and "mapreduce.framework.name" is local
 Key: MAPREDUCE-4957
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4957
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: yi
Assignee: yi
Priority: Minor


java.io.FileNotFoundException: File does not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar 
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:772)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
 
at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:254)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:292)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:365)
 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1450)
 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:446) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:683) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 
at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
Job Submission failed with exception 'java.io.FileNotFoundException(File does 
not exist: 
/root/proj/hive-trunk/build/dist/lib/hive-builtins-0.11.0-SNAPSHOT.jar)'



[jira] [Updated] (MAPREDUCE-4838) Add extra info to JH files

2013-01-23 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated MAPREDUCE-4838:
---

Attachment: MAPREDUCE-4838_3.patch

Hi Sid,

Your new comments have been addressed. Please take a look at the latest 
attached patch.

Thanks,
Zhijie

> Add extra info to JH files
> --
>
> Key: MAPREDUCE-4838
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, 
> MAPREDUCE-4838_3.patch, MAPREDUCE-4838.patch
>
>
> It will be useful to add more task-info to JH for analytics.



[jira] [Created] (MAPREDUCE-4956) The Additional JH Info Should Be Exposed

2013-01-23 Thread Zhijie Shen (JIRA)
Zhijie Shen created MAPREDUCE-4956:
--

 Summary: The Additional JH Info Should Be Exposed
 Key: MAPREDUCE-4956
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4956
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen


In MAPREDUCE-4838, additional info was added to JH. It would be useful to 
expose this info, at least via the UI.



[jira] [Commented] (MAPREDUCE-4838) Add extra info to JH files

2013-01-23 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561105#comment-13561105
 ] 

Siddharth Seth commented on MAPREDUCE-4838:
---

Took a quick look. This patch looks better but still needs some fixes.
- Unit tests should use the new properties defined in MRJobConfig.*
- For reduce tasks, the container host doesn't need to be resolved (nor when 
dataLocalHosts is empty)
- The null check is still required in the history events, since these values 
may not be set.
- TaskAttemptImpl has a repeated dataLocalHosts assignment
- RMContainerAllocator has an unused import
- Formatting fix in TestJobImpl (patch line 377)
- Since resolveHosts has been changed to work with a HashSet, TaskAttemptImpl 
itself could store this instead of iterating over an array to match the 
container host.

Most of the test changes look good as well. Need to take a better look at some 
of them though. Thanks.

> Add extra info to JH files
> --
>
> Key: MAPREDUCE-4838
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun C Murthy
>Assignee: Zhijie Shen
> Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838_2.patch, 
> MAPREDUCE-4838.patch
>
>
> It will be useful to add more task-info to JH for analytics.



[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

2013-01-23 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560916#comment-13560916
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4049:
---

Avner, sounds good. We can do another review iteration on your updated patch, 
it will be easier looking at concrete code. Thx

> plugin for generic shuffle service
> --
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: performance, task, tasktracker
>Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>Reporter: Avner BenHanoch
>Assignee: Avner BenHanoch
>  Labels: merge, plugin, rdma, shuffle
> Fix For: 3.0.0
>
> Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, 
> MAPREDUCE-4049--branch-1.patch, mapreduce-4049.patch
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider 
> & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example, we are working on a 
> shuffle plugin that performs the shuffle over RDMA in fast networks (10GbE, 
> 40GbE, or InfiniBand) instead of the current HTTP shuffle. Based on the fast 
> RDMA shuffle, the plugin can also use a suitable merge approach during the 
> intermediate merges, yielding much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service that avoids a hidden 
> dependency of the NodeManager on a specific version of the MapReduce 
> shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan 
> Yu of Auburn University and others: 
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching two documents with a suggested top-level design for both 
> plugins (currently based on the 1.0 branch).
> # I am providing a link for downloading UDA - Mellanox's open-source plugin 
> that implements a generic shuffle service using RDMA and levitated merge. 
> Note: at this phase, the code is in C++ via JNI and should be considered 
> beta only. Still, it can serve anyone who wants to implement or contribute 
> to levitated merge. (Please be advised that levitated merge is mostly 
> suited to very fast networks.) - 
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]



[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560903#comment-13560903
 ] 

Bikas Saha commented on MAPREDUCE-4951:
---

We might be digressing from this jira here, but I really don't think the 
2-step approach is worth its complexity. The main scenario where it makes 
sense is when the task has the ability to checkpoint its work before getting 
preempted, and I haven't seen this capability outside of basic research 
prototypes. It's much simpler to have preemption be an RM-only action. We do 
need to fix the action and information loop so that AMs can get correct 
information about the infrastructure's actions.

> Container preemption interpreted as task failure
> 
>
> Key: MAPREDUCE-4951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mr-am, mrv2
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
> MAPREDUCE-4951.patch
>
>
> When YARN reports a completed container to the MR AM, it always interprets it 
> as a failure.  This can lead to a job failing because too many of its tasks 
> failed, when in fact they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as 
> a container being killed instead of a container failure.
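A minimal sketch of the fix described above, assuming illustrative names
(ContainerExitHandler and classify are not the actual MR AM classes; the
-100 constant is inlined here rather than referenced from the YARN API):

```java
// Illustrative only: treat the special exit status -100 as "container
// preempted/killed" rather than a task failure, so preempted attempts do
// not count against the job's allowed task failures.
public class ContainerExitHandler {
    static final int PREEMPTED = -100; // special YARN exit status

    enum Outcome { TASK_FAILED, TASK_KILLED }

    public static Outcome classify(int exitStatus) {
        return exitStatus == PREEMPTED ? Outcome.TASK_KILLED
                                       : Outcome.TASK_FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(-100)); // prints: TASK_KILLED
        System.out.println(classify(1));    // prints: TASK_FAILED
    }
}
```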



[jira] [Commented] (MAPREDUCE-4951) Container preemption interpreted as task failure

2013-01-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560736#comment-13560736
 ] 

Jason Lowe commented on MAPREDUCE-4951:
---

bq. having the RM ask the AM to kill the container in case of preemption would 
likely not work as the AM cannot be trusted.

Agreed. I was thinking of exactly the alternative you propose, where 
preemption has potentially two phases: a "please, AM, preempt that container 
you have" request, with a watchdog timer so the RM can kill the container 
forcefully if the AM does not comply in a reasonable amount of time. This 
eliminates the race where the container can fail because of the preemption, 
and it gives the AM a chance to checkpoint the container's state for faster 
recovery. However, it does mean the latency for container availability would 
be higher, since the AM gets a grace period before relinquishing the 
resources.
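The two-phase idea with a watchdog timer can be sketched roughly as follows;
all names (PreemptionWatchdog, requestPreemption) are hypothetical, not RM
code:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative only: phase 1 asks the AM politely and arms a watchdog;
// phase 2 (the forceful kill) fires only if the AM has not released the
// container within the grace period.
public class PreemptionWatchdog {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    public ScheduledFuture<?> requestPreemption(String containerId,
                                                long graceMillis,
                                                Runnable forceKill) {
        System.out.println("asking AM to preempt " + containerId);
        return timer.schedule(forceKill, graceMillis, TimeUnit.MILLISECONDS);
    }

    public void onContainerReleased(ScheduledFuture<?> watchdog) {
        watchdog.cancel(false); // AM complied in time; disarm the kill
    }

    public void shutdown() { timer.shutdownNow(); }

    public static void main(String[] args) {
        PreemptionWatchdog w = new PreemptionWatchdog();
        ScheduledFuture<?> f = w.requestPreemption("c1", 60_000, () -> {});
        w.onContainerReleased(f); // AM released before the grace expired
        System.out.println(f.isCancelled()); // prints: true
        w.shutdown();
    }
}
```

The grace period is the latency cost Jason mentions: a preempted resource is
not reusable until either the AM releases it or the watchdog fires.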

> Container preemption interpreted as task failure
> 
>
> Key: MAPREDUCE-4951
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4951
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mr-am, mrv2
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: MAPREDUCE-4951-1.patch, MAPREDUCE-4951-2.patch, 
> MAPREDUCE-4951.patch
>
>
> When YARN reports a completed container to the MR AM, it always interprets it 
> as a failure.  This can lead to a job failing because too many of its tasks 
> failed, when in fact they only failed because the scheduler preempted them.
> MR needs to recognize the special exit code value of -100 and interpret it as 
> a container being killed instead of a container failure.



[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560708#comment-13560708
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


> Enable multiple pi jobs to run in parallel
> --
>
> Key: MAPREDUCE-4949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4949.patch
>
>
> Currently the hadoop pi example uses a hardcoded temporary directory to store 
> its inputs and outputs.  This makes it so that only one pi job can run at a 
> time, and that if it is cancelled, the temporary directory must be manually 
> deleted.
> I propose using a temporary directory based on a timestamp and a random 
> number to avoid these conflicts.
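The proposed naming scheme could look roughly like this sketch; the class
and prefix are illustrative, not the actual QuasiMonteCarlo change:

```java
import java.util.Random;

// Illustrative only: derive a per-job temporary directory name from a
// timestamp plus a random number so concurrent pi jobs do not collide on
// one hardcoded path, and an abandoned directory is never reused.
public class TmpDirNamer {
    public static String uniqueTmpDir(String prefix, long nowMillis, int rand) {
        return prefix + "-" + nowMillis + "-" + rand;
    }

    public static void main(String[] args) {
        String dir = uniqueTmpDir("QuasiMonteCarlo",
                System.currentTimeMillis(),
                new Random().nextInt(Integer.MAX_VALUE));
        System.out.println(dir); // e.g. QuasiMonteCarlo-1358900000000-12345
    }
}
```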



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560703#comment-13560703
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


> Type conversion of map completion events leads to performance problems with 
> large jobs
> --
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.7
>
> Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch
>
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
> reducers fail to connect back to the AM after being launched due to 
> connection timeout.  Looking at stack traces of the AM during this time we 
> see a lot of IPC servers stuck waiting for a lock to get the application ID 
> while type converting the map completion events.  What's odd is that normally 
> getting the application ID should be very cheap, but in this case we're 
> type-converting thousands of map completion events for *each* reducer 
> connecting.  That means we end up type-converting the map completion events 
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at 
> least read-write locked) or, even better, store the completion events in a 
> form that does not require type conversion when serving them up to reducers.
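The "convert once, serve many" option can be sketched as below; the types
and names are illustrative stand-ins, not the actual JobImpl or event
classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: convert each map completion event exactly once when it
// arrives, cache the converted form, and serve slices of the cache to each
// reducer with lock-free reads, instead of re-converting per request.
public class CompletionEventCache {
    private final List<String> converted = new CopyOnWriteArrayList<>();

    public void onMapCompleted(String rawEvent) {
        converted.add(convert(rawEvent)); // the only conversion per event
    }

    public List<String> getEvents(int fromIndex, int max) {
        int to = Math.min(converted.size(), fromIndex + max);
        return new ArrayList<>(converted.subList(fromIndex, to));
    }

    private String convert(String raw) {
        return "converted:" + raw; // stand-in for the real type conversion
    }

    public static void main(String[] args) {
        CompletionEventCache c = new CompletionEventCache();
        c.onMapCompleted("m0");
        c.onMapCompleted("m1");
        System.out.println(c.getEvents(0, 10)); // [converted:m0, converted:m1]
    }
}
```

With 13,000 maps and 3,500 reducers this caps conversions at 13,000 instead
of the ~45 million the description computes.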



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560704#comment-13560704
 ] 

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Mapreduce-trunk #1322 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1322/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
> implementations
> --
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Mariappan Asokan
> Fix For: 3.0.0
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
> M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> MergeManagerPlugin.pdf, MR-4808.patch
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
> alternate implementations to be able to reuse portions of the default 
> implementation. 
> This would come with the strong caveat that these classes are LimitedPrivate 
> and Unstable.



[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560660#comment-13560660
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


> Enable multiple pi jobs to run in parallel
> --
>
> Key: MAPREDUCE-4949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4949.patch
>
>
> Currently the hadoop pi example uses a hardcoded temporary directory to store 
> its inputs and outputs.  This makes it so that only one pi job can run at a 
> time, and that if it is cancelled, the temporary directory must be manually 
> deleted.
> I propose using a temporary directory based on a timestamp and a random 
> number to avoid these conflicts.



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560656#comment-13560656
 ] 

Hudson commented on MAPREDUCE-4808:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
> implementations
> --
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Mariappan Asokan
> Fix For: 3.0.0
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
> M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> MergeManagerPlugin.pdf, MR-4808.patch
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
> alternate implementations to be able to reuse portions of the default 
> implementation. 
> This would come with the strong caveat that these classes are LimitedPrivate 
> and Unstable.



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560655#comment-13560655
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Hdfs-trunk #1294 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1294/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


> Type conversion of map completion events leads to performance problems with 
> large jobs
> --
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.7
>
> Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch
>
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
> reducers fail to connect back to the AM after being launched due to 
> connection timeout.  Looking at stack traces of the AM during this time we 
> see a lot of IPC servers stuck waiting for a lock to get the application ID 
> while type converting the map completion events.  What's odd is that normally 
> getting the application ID should be very cheap, but in this case we're 
> type-converting thousands of map completion events for *each* reducer 
> connecting.  That means we end up type-converting the map completion events 
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at 
> least read-write locked) or, even better, store the completion events in a 
> form that does not require type conversion when serving them up to reducers.



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560645#comment-13560645
 ] 

Hudson commented on MAPREDUCE-4946:
---

Integrated in Hadoop-Hdfs-0.23-Build #503 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/503/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437105)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437105
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


> Type conversion of map completion events leads to performance problems with 
> large jobs
> --
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.7
>
> Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch
>
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
> reducers fail to connect back to the AM after being launched due to 
> connection timeout.  Looking at stack traces of the AM during this time we 
> see a lot of IPC servers stuck waiting for a lock to get the application ID 
> while type converting the map completion events.  What's odd is that normally 
> getting the application ID should be very cheap, but in this case we're 
> type-converting thousands of map completion events for *each* reducer 
> connecting.  That means we end up type-converting the map completion events 
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at 
> least read-write locked) or, even better, store the completion events in a 
> form that does not require type conversion when serving them up to reducers.



[jira] [Updated] (MAPREDUCE-4929) mapreduce.task.timeout is ignored

2013-01-23 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-4929:
-

  Resolution: Fixed
   Fix Version/s: 1.2.0
Target Version/s:   (was: 1.1.2)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Sandy!

> mapreduce.task.timeout is ignored
> -
>
> Key: MAPREDUCE-4929
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4929
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 1.1.1
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 1.2.0
>
> Attachments: MAPREDUCE-4929-branch-1.patch
>
>
> In MR1, only mapred.task.timeout works.  Both should be made to work.



[jira] [Commented] (MAPREDUCE-4949) Enable multiple pi jobs to run in parallel

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560552#comment-13560552
 ] 

Hudson commented on MAPREDUCE-4949:
---

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4949. Enable multiple pi jobs to run in parallel. (sandyr via 
tucu) (Revision 1437029)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437029
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/QuasiMonteCarlo.java


> Enable multiple pi jobs to run in parallel
> --
>
> Key: MAPREDUCE-4949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4949
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: MAPREDUCE-4949.patch
>
>
> Currently the hadoop pi example uses a hardcoded temporary directory to store 
> its inputs and outputs. As a result, only one pi job can run at a time, and 
> if a job is cancelled, the temporary directory must be deleted manually.
> I propose using a temporary directory based on a timestamp and a random 
> number to avoid these conflicts.
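A minimal sketch of the proposed naming scheme, assuming the fix goes into `QuasiMonteCarlo.java` as the commit message indicates (the helper name and base-directory string below are hypothetical): derive a per-job temporary directory from a timestamp and a random number so concurrent pi jobs no longer collide on one hardcoded path.

```java
import java.util.Random;

public class TmpDirName {
    // Hypothetical helper: build a unique per-job temp directory name from
    // a base prefix, the job submission timestamp, and a random number.
    static String uniqueTmpDir(String base, long timestamp, int rand) {
        return base + "_" + timestamp + "_" + rand;
    }

    public static void main(String[] args) {
        Random r = new Random();
        // e.g. QuasiMonteCarlo_TMP_3_141592654_1358900000000_123456789
        System.out.println(uniqueTmpDir("QuasiMonteCarlo_TMP_3_141592654",
                System.currentTimeMillis(), r.nextInt(Integer.MAX_VALUE)));
    }
}
```

Because each run gets a fresh directory, a cancelled job's leftovers can no longer block the next run, though they still need eventual cleanup.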



[jira] [Commented] (MAPREDUCE-4946) Type conversion of map completion events leads to performance problems with large jobs

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560547#comment-13560547
 ] 

Hudson commented on MAPREDUCE-4946:
-----------------------------------

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4946. Fix a performance problem for large jobs by reducing the 
number of map completion event type conversions. Contributed by Jason Lowe. 
(Revision 1437103)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1437103
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/Job.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MockJobs.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFetchFailure.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRuntimeEstimators.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/CompletedJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/PartialJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java


> Type conversion of map completion events leads to performance problems with 
> large jobs
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Critical
> Fix For: 2.0.3-alpha, 0.23.7
>
> Attachments: MAPREDUCE-4946-branch-0.23.patch, MAPREDUCE-4946.patch
>
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where 
> reducers fail to connect back to the AM after being launched due to 
> connection timeout.  Looking at stack traces of the AM during this time we 
> see a lot of IPC servers stuck waiting for a lock to get the application ID 
> while type converting the map completion events.  What's odd is that normally 
> getting the application ID should be very cheap, but in this case we're 
> type-converting thousands of map completion events for *each* reducer 
> connecting.  That means we end up type-converting the map completion events 
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at 
> least read-write locked) or, even better, store the completion events in a 
> form that does not require type conversion when serving them up to reducers.



[jira] [Commented] (MAPREDUCE-4808) Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560548#comment-13560548
 ] 

Hudson commented on MAPREDUCE-4808:
-----------------------------------

Integrated in Hadoop-Yarn-trunk #105 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/105/])
MAPREDUCE-4808. Refactor MapOutput and MergeManager to facilitate reuse by 
Shuffle implementations. (masokan via tucu) (Revision 1436936)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436936
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/dev-support/findbugs-exclude.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/InMemoryReader.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeThread.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestFetcher.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java


> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
> implementations
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Arun C Murthy
>Assignee: Mariappan Asokan
> Fix For: 3.0.0
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch, M4808-0.patch, 
> M4808-1.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> MergeManagerPlugin.pdf, MR-4808.patch
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
> alternate implementations to be able to reuse portions of the default 
> implementation. 
> This would come with the strong caveat that these classes are LimitedPrivate 
> and Unstable.
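A minimal sketch of the interface-extraction pattern this refactoring applies, with hypothetical names (the real split is `MergeManager` / `MergeManagerImpl`, plus `MapOutput` subclasses such as `InMemoryMapOutput` and `OnDiskMapOutput`): behavior is described by an interface so an alternate Shuffle implementation can either reuse the default implementation or substitute its own.

```java
public class PluggableShuffleSketch {
    // Hypothetical stand-in for the extracted interface.
    interface Merger {
        String merge(String a, String b);
    }

    // Hypothetical stand-in for the default implementation, which custom
    // Shuffle plugins may reuse as-is.
    static class DefaultMerger implements Merger {
        public String merge(String a, String b) {
            return a + "+" + b;
        }
    }

    // The shuffle is written against the interface, not the concrete class,
    // so any Merger implementation can be plugged in.
    static String runShuffle(Merger m) {
        return m.merge("map1.out", "map2.out");
    }

    public static void main(String[] args) {
        System.out.println(runShuffle(new DefaultMerger()));
    }
}
```

The LimitedPrivate/Unstable caveat in the description applies to exactly this seam: plugin authors can code against it, but its shape may change between releases.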
