[jira] [Commented] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964967#comment-13964967 ] Jian He commented on MAPREDUCE-5655: Please refer to MAPREDUCE-4052 for the fix; the patch uploaded here is dead.
> Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
> -
>
> Key: MAPREDUCE-5655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, job submission
> Affects Versions: 2.2.0, 2.3.0
> Environment: Client machine is a Windows 7 box, with Eclipse. Remote: a multi-node Hadoop cluster installed on Ubuntu boxes (any Linux)
> Reporter: Attila Pados
> Assignee: Joyoung Zhang
> Attachments: MRApps.patch, YARNRunner.patch
>
> I was trying to run a Java class on my client, a Windows 7 developer environment, which submits a job to the remote Hadoop cluster, runs a MapReduce there, and then downloads the results back to the local machine. The general use case is to use Hadoop services from a web application installed on a non-cluster computer, or as part of a developer environment.
> The problem was that the ApplicationMaster's startup shell script (launch_container.sh) was generated with wrong CLASSPATH entries. Together with the java process call at the bottom of the file, these entries were generated in Windows style, using % as the shell-variable marker and ; as the CLASSPATH delimiter.
> I tracked down the root cause and found that the MRApps.java and YARNRunner.java classes create these entries, which are then passed forward to the ApplicationMaster, assuming that the OS running these classes matches the one running the ApplicationMaster. But that is not the case: they run in two different JVMs, possibly on different OSes, and the strings are generated based on the client/submitter side's OS.
> I made some workaround changes to these two files so I could launch my job; however, there may be more problems ahead.
> update
> error message:
> 13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with state FAILED due to: Application application_1386170530016_0001 failed 2 times due to AM Container for appattempt_1386170530016_0001_02 exited with exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> update2:
> It also requires adding the following property to mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job launcher knows the remote job runner is Linux:
> <property>
>   <name>mapred.remote.os</name>
>   <value>Linux</value>
>   <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
> </property>
> Without this entry the patched jar does the same as the unpatched one, so it is required for this to work!
-- This message was sent by Atlassian JIRA (v6.2#6252)
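The OS mismatch this issue describes can be sketched without any Hadoop code. The class and method below are illustrative stand-ins, not the actual MRApps/YARNRunner logic: the point is that the classpath string must follow the conventions of the OS that will *run* the container, not the OS that submits the job.

```java
import java.util.List;

// Illustrative sketch only (hypothetical helper, not actual Hadoop code):
// the CLASSPATH delimiter and env-var syntax must match the target OS.
public class CrossPlatformClasspath {
    static String buildClasspath(List<String> entries, boolean targetIsWindows) {
        String sep = targetIsWindows ? ";" : ":";                    // CLASSPATH delimiter
        String var = targetIsWindows ? "%CLASSPATH%" : "$CLASSPATH"; // env-var marker
        return var + sep + String.join(sep, entries);
    }
}
```

A Windows client that hard-codes its own conventions would write `%CLASSPATH%;a.jar` into launch_container.sh, which a Linux /bin/bash cannot expand — the failure this issue reports.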
[jira] [Assigned] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joyoung Zhang reassigned MAPREDUCE-5655: Assignee: Joyoung Zhang
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
[ https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964931#comment-13964931 ] Joyoung Zhang commented on MAPREDUCE-5655: I've tested this patch on 2.2.0; it works fine. It should be pointed out that there are two aspects to the problem:
1. The line should be vargs.add("Linux".equals(remoteOs) ? "$JAVA_HOME/bin/java" : Environment.JAVA_HOME.$() + "/bin/java"); but YARNRunner.patch has vargs.add("Linux".equals(remoteOs) ? "$JAVA_HOME/bin/java" : Environment.JAVA_HOME.$()); — the "/bin/java" suffix is missing in the non-Linux branch.
2. You should add "mapreduce.application.classpath" to the Configuration on the Windows box.
-- This message was sent by Atlassian JIRA (v6.2#6252)
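The conditional discussed in this thread can be sketched as a self-contained toy. CLIENT_JAVA_HOME_REF is a stand-in for what Environment.JAVA_HOME.$() would yield on a Windows client; this is illustrative, not the actual patch code.

```java
// Hedged sketch of the remote-OS-conditional java command (illustrative only).
public class JavaCmdSketch {
    // Stand-in for Environment.JAVA_HOME.$() on a Windows submitter.
    static final String CLIENT_JAVA_HOME_REF = "%JAVA_HOME%";

    static String javaCommand(String remoteOs) {
        // On a Linux remote, emit POSIX-style $JAVA_HOME; either way,
        // the "/bin/java" suffix must be appended.
        return "Linux".equals(remoteOs)
            ? "$JAVA_HOME/bin/java"
            : CLIENT_JAVA_HOME_REF + "/bin/java";
    }
}
```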
[jira] [Commented] (MAPREDUCE-5815) Fix NPE in TestMRAppMaster
[ https://issues.apache.org/jira/browse/MAPREDUCE-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964916#comment-13964916 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-5815: Sorry, didn't see this before. Reviewing now...
> Fix NPE in TestMRAppMaster
> --
>
> Key: MAPREDUCE-5815
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5815
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, mrv2
> Affects Versions: 2.4.0
> Reporter: Gera Shegalov
> Assignee: Akira AJISAKA
> Priority: Blocker
> Attachments: MAPREDUCE-5815.2.patch, MAPREDUCE-5815.v01.patch
>
> While working on MAPREDUCE-5813 I stumbled on NPEs in TestMRAppMaster. They seem to have been introduced by MAPREDUCE-5805.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5465) Container killed before hprof dumps profile.out
[ https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964753#comment-13964753 ] Ming Ma commented on MAPREDUCE-5465: Thanks, Jason, for the review. I will upload the updated patch soon. I want to comment on the couple of points you mentioned.
1. Yes, putting finishTaskMonitor under TaskAttemptListenerImpl isn't clean, given that TaskAttemptListenerImpl should only deal with TaskUmbilicalProtocol-related work. I will move it out to the AppContext layer.
2. Handling of the TA_FAILMSG event. TA_FAILMSG can be triggered by the task JVM as well as by a user via the "hadoop job -fail-task" command. For the case where the task JVM reports failure, yes, it can wait for the container to exit. For the case where an end user sends the command, it will need to clean up the container right away. I skipped that for simplicity. If we want to support it, it seems we will need a new event like TA_FAILMSG_BY_USER.
3. Why are we transitioning from FINISHING_CONTAINER to SUCCESS_CONTAINER_CLEANUP rather than to SUCCEEDED when we receive a container-completed event? It was done for simplicity, so that all successful states go through SUCCESS_CONTAINER_CLEANUP first. But I agree it can go directly to SUCCEEDED when we receive a container-completed event.
> Container killed before hprof dumps profile.out
> ---
>
> Key: MAPREDUCE-5465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Affects Versions: trunk, 2.0.3-alpha
> Reporter: Radim Kolar
> Assignee: Ming Ma
> Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, MAPREDUCE-5465.patch
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at process exit. It is dumped after the task has signaled to the AM that its work is finished.
> The AM kills a container whose work is finished without waiting for hprof to finish its dumps. If hprof is dumping larger output (such as with depth=4, while depth=3 works), it cannot finish the dump before being killed, making the entire dump unusable because the CPU and heap stats are missing.
> There needs to be a better delay before the container is killed when profiling is enabled.
-- This message was sent by Atlassian JIRA (v6.2#6252)
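The transition question in point 3 of the review discussion above can be sketched as a toy state machine. The names mirror the states named in the thread, but this is an illustrative reduction, not the real TaskAttemptImpl/StateMachineFactory code.

```java
// Toy sketch of the design choice: on a container-completed event, does
// FINISHING_CONTAINER pass through SUCCESS_CONTAINER_CLEANUP (uniform path,
// as in the current patch) or jump straight to SUCCEEDED (reviewer's option)?
public class TransitionSketch {
    enum State { FINISHING_CONTAINER, SUCCESS_CONTAINER_CLEANUP, SUCCEEDED }

    static State onContainerCompleted(State s, boolean goDirect) {
        if (s != State.FINISHING_CONTAINER) return s; // only this state reacts here
        return goDirect ? State.SUCCEEDED
                        : State.SUCCESS_CONTAINER_CLEANUP;
    }
}
```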
[jira] [Commented] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964611#comment-13964611 ] Hadoop QA commented on MAPREDUCE-5824:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12639441/MAPREDUCE-5824.1.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4494//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4494//console
This message is automatically generated.
> TestPipesNonJavaInputFormat.testFormat fails in windows
> ---
>
> Key: MAPREDUCE-5824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.4.0
> Reporter: Xuan Gong
> Assignee: Xuan Gong
> Attachments: MAPREDUCE-5824.1.patch
>
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-5824: Status: Patch Available (was: Open)
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated MAPREDUCE-5824: Attachment: MAPREDUCE-5824.1.patch
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows
Xuan Gong created MAPREDUCE-5824:
Summary: TestPipesNonJavaInputFormat.testFormat fails in windows
Key: MAPREDUCE-5824
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Xuan Gong
Assignee: Xuan Gong
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964382#comment-13964382 ] Xuan Gong commented on MAPREDUCE-5824: The reason is that on Windows the file path looks like "C:\hadoop\..\", which cannot pass the StringUtils.unEscapeString() check. We should use StringUtils.escapeString to escape the "\" in the path, just as FileInputFormat.setInputPaths() does; that fixes the test failure.
-- This message was sent by Atlassian JIRA (v6.2#6252)
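The failure mode can be sketched without Hadoop on the classpath. escapeBackslashes below is a hypothetical stand-in for the relevant part of StringUtils.escapeString, not the actual implementation: a Windows path contains '\', which an unescape pass treats as an escape character and rejects, so each '\' must be doubled first.

```java
// Illustrative stand-in for the escaping fix (not Hadoop's StringUtils code):
// doubling each backslash lets a later unescape pass round-trip cleanly.
public class EscapeSketch {
    static String escapeBackslashes(String s) {
        return s.replace("\\", "\\\\"); // C:\hadoop -> C:\\hadoop
    }
}
```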
[jira] [Commented] (MAPREDUCE-5823) TestTaskAttempt fails in trunk and branch-2 with NPE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964324#comment-13964324 ] Mit Desai commented on MAPREDUCE-5823: The console output here is for TestTaskAttempt#testTooManyFetchFailureAfterKill. I found something similar with TestTaskAttempt#testFetchFailureAttemptFinishTime; it seems to be the same issue.
{noformat}
java.lang.NullPointerException: null
at org.apache.hadoop.security.token.Token.write(Token.java:221)
at org.apache.hadoop.mapred.ShuffleHandler.serializeServiceData(ShuffleHandler.java:272)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:715)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:801)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1516)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1493)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testFetchFailureAttemptFinishTime(TestTaskAttempt.java:771)
{noformat}
> TestTaskAttempt fails in trunk and branch-2 with NPE
>
> Key: MAPREDUCE-5823
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5823
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.5.0
> Reporter: Mit Desai
>
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5823) TestTaskAttempt fails in trunk and branch-2 with NPE
Mit Desai created MAPREDUCE-5823:
Summary: TestTaskAttempt fails in trunk and branch-2 with NPE
Key: MAPREDUCE-5823
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5823
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Mit Desai

Here is the console output I got:
{noformat}
java.lang.NullPointerException: null
at org.apache.hadoop.security.token.Token.write(Token.java:221)
at org.apache.hadoop.mapred.ShuffleHandler.serializeServiceData(ShuffleHandler.java:272)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:715)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:801)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1516)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1493)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTooManyFetchFailureAfterKill(TestTaskAttempt.java:660)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-5820) Unable to process mongodb gridfs collection data in Hadoop Mapreduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-5820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-5820.
Resolution: Invalid
Closing as invalid:
# this isn't a bug, it's a support issue you should raise on a mailing list
# you're asking support questions related to MongoDB and Spark APIs
https://wiki.apache.org/hadoop/InvalidJiraIssues
> Unable to process mongodb gridfs collection data in Hadoop Mapreduce
>
> Key: MAPREDUCE-5820
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5820
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task
> Affects Versions: 2.2.0
> Environment: Hadoop, Mongodb
> Reporter: sivaram
> Priority: Critical
>
> I saved a 2 GB PDF file into MongoDB using GridFS. Now I want to process that GridFS collection data using Java Spark MapReduce. Previously I successfully processed MongoDB collections with Hadoop MapReduce using the Mongo-Hadoop connector. Now I'm unable to handle the binary data coming from the input GridFS collections.
> MongoConfigUtil.setInputURI(config, "mongodb://localhost:27017/pdfbooks.fs.chunks");
> MongoConfigUtil.setOutputURI(config, "mongodb://localhost:27017/" + output);
> JavaPairRDD<Object, BSONObject> mongoRDD = sc.newAPIHadoopRDD(config,
>     com.mongodb.hadoop.MongoInputFormat.class, Object.class, BSONObject.class);
> JavaRDD<String> words = mongoRDD.flatMap(
>     new FlatMapFunction<Tuple2<Object, BSONObject>, String>() {
>       @Override
>       public Iterable<String> call(Tuple2<Object, BSONObject> arg) {
>         System.out.println(arg._2.toString());
>         ...
> In the above code I'm accessing the fs.chunks collection as input to my mapper, so the mapper receives it as a BSONObject. The problem is that the input BSONObject data is in an unreadable binary format.
> For example, the "System.out.println(arg._2.toString());" statement above gives the following result:
> { "_id" : { "$oid" : "533e53048f0c8bcb0b3a7ff7"} , "files_id" : { "$oid" : "533e5303fac7a2e2c4afea08"} , "n" : 0 , "data" : }
> How do I print/access that data in a readable format? Can I use the GridFS API to do that? If so, please suggest how to convert the input BSONObject to a GridFS object, and other good ways to do it. Thank you in advance!
-- This message was sent by Atlassian JIRA (v6.2#6252)