[jira] [Commented] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath

2014-04-09 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964967#comment-13964967
 ] 

Jian He commented on MAPREDUCE-5655:


Please refer to MAPREDUCE-4052 for the fix; the patch uploaded here is stale.

> Remote job submit from windows to a linux hadoop cluster fails due to wrong 
> classpath
> -
>
> Key: MAPREDUCE-5655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, job submission
>Affects Versions: 2.2.0, 2.3.0
> Environment: Client machine is a Windows 7 box, with Eclipse
> Remote: there is a multi node hadoop cluster, installed on Ubuntu boxes (any 
> linux)
>Reporter: Attila Pados
>Assignee: Joyoung Zhang
> Attachments: MRApps.patch, YARNRunner.patch
>
>
> I was trying to run a Java class on my client, a Windows 7 developer 
> environment, which submits a job to the remote Hadoop cluster, runs a 
> MapReduce job there, and then downloads the results back to the local machine.
> The general use case is to use Hadoop services from a web application installed 
> on a non-cluster computer, or as part of a developer environment.
> The problem was that the ApplicationMaster's startup shell script 
> (launch_container.sh) was generated with a wrong CLASSPATH entry. Together with 
> the java process invocation at the bottom of the file, these entries were 
> generated in Windows style, using % as the shell variable marker and ; as the 
> CLASSPATH delimiter.
> I tracked down the root cause and found that the MRApps.java and 
> YARNRunner.java classes create these entries, which are then passed forward to 
> the ApplicationMaster, assuming that the OS that runs these classes matches the 
> one running the ApplicationMaster. That is not the case: they run in two 
> different JVMs, possibly on different operating systems, yet the strings are 
> generated based on the client/submitter side's OS.
> I made some workaround changes to these two files so I could launch my job; 
> however, there may be more problems ahead.
> Update:
>  Error message:
> 13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with 
> state FAILED due to: Application application_1386170530016_0001 failed 2 
> times due to AM Container for appattempt_1386170530016_0001_02 exited 
> with  exitCode: 1 due to: Exception from container-launch: 
> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job 
> control
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>   at org.apache.hadoop.util.Shell.run(Shell.java:379)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> Update 2: 
>  It also requires adding the following property to 
>  mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job 
> launcher knows that the job will run on Linux:
>   <property>
>     <name>mapred.remote.os</name>
>     <value>Linux</value>
>     <description>Remote MapReduce framework's OS, can be either Linux or 
>     Windows</description>
>   </property>
> Without this entry the patched jar behaves the same as the unpatched one, so 
> the property is required for the fix to work.
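
For illustration only, a minimal sketch of the remote-OS-aware string building 
the description asks for; this is not the actual MRApps/YARNRunner code, and 
mapred.remote.os is the property proposed by this patch, not a stock Hadoop key:

{noformat}
import org.apache.hadoop.conf.Configuration;

public class RemoteOsClasspathSketch {
  // Choose separators for the OS that will *run* the container (the AM side),
  // not the OS that submits the job.
  public static String classPathSeparator(Configuration conf) {
    String remoteOs = conf.get("mapred.remote.os", System.getProperty("os.name"));
    return "Linux".equals(remoteOs) ? ":" : ";";
  }

  public static String javaHomeExpansion(Configuration conf) {
    String remoteOs = conf.get("mapred.remote.os", System.getProperty("os.name"));
    return "Linux".equals(remoteOs) ? "$JAVA_HOME" : "%JAVA_HOME%";
  }
}
{noformat}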



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath

2014-04-09 Thread Joyoung Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joyoung Zhang reassigned MAPREDUCE-5655:


Assignee: Joyoung Zhang

> Remote job submit from windows to a linux hadoop cluster fails due to wrong 
> classpath
> -
>
> Key: MAPREDUCE-5655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, job submission
>Affects Versions: 2.2.0, 2.3.0
> Environment: Client machine is a Windows 7 box, with Eclipse
> Remote: there is a multi node hadoop cluster, installed on Ubuntu boxes (any 
> linux)
>Reporter: Attila Pados
>Assignee: Joyoung Zhang
> Attachments: MRApps.patch, YARNRunner.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath

2014-04-09 Thread JoyoungZhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964931#comment-13964931
 ] 

JoyoungZhang commented on MAPREDUCE-5655:
-

I've tested this patch on 2.2.0 and it works fine. It should be pointed out that 
there are two aspects to the problem:
1. The command line should be built as
vargs.add("Linux".equals(remoteOs) ? "$JAVA_HOME/bin/java" : Environment.JAVA_HOME.$() + "/bin/java");
but YARNRunner.patch has
vargs.add("Linux".equals(remoteOs) ? "$JAVA_HOME/bin/java" : Environment.JAVA_HOME.$());
i.e. the non-Linux branch is missing the trailing "/bin/java".
2. You should add "mapreduce.application.classpath" to the Configuration on the 
Windows box; a sketch of both points follows.
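
A minimal client-side sketch of the two points above, assuming the patched 
YARNRunner reads the proposed mapred.remote.os property; the classpath value is 
only an example and must match the remote cluster's actual layout:

{noformat}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.ApplicationConstants.Environment;

public class WindowsClientSubmitSketch {
  // Point 1: expand JAVA_HOME for the OS that runs the AM and keep /bin/java
  // in both branches of the ternary.
  public static List<String> buildAmCommand(Configuration conf) {
    String remoteOs = conf.get("mapred.remote.os", "Linux");
    List<String> vargs = new ArrayList<String>();
    vargs.add("Linux".equals(remoteOs)
        ? "$JAVA_HOME/bin/java"
        : Environment.JAVA_HOME.$() + "/bin/java");
    return vargs;
  }

  // Point 2: on the Windows client, state the remote OS and give the remote
  // cluster's MR classpath explicitly (example value only).
  public static void configureClient(Configuration conf) {
    conf.set("mapred.remote.os", "Linux");
    conf.set("mapreduce.application.classpath",
        "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,"
            + "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*");
  }
}
{noformat}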

> Remote job submit from windows to a linux hadoop cluster fails due to wrong 
> classpath
> -
>
> Key: MAPREDUCE-5655
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, job submission
>Affects Versions: 2.2.0, 2.3.0
> Environment: Client machine is a Windows 7 box, with Eclipse
> Remote: there is a multi node hadoop cluster, installed on Ubuntu boxes (any 
> linux)
>Reporter: Attila Pados
> Attachments: MRApps.patch, YARNRunner.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5815) Fix NPE in TestMRAppMaster

2014-04-09 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964916#comment-13964916
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5815:


Sorry, didn't see this before. Reviewing now...

> Fix NPE in TestMRAppMaster
> --
>
> Key: MAPREDUCE-5815
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5815
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mrv2
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>Assignee: Akira AJISAKA
>Priority: Blocker
> Attachments: MAPREDUCE-5815.2.patch, MAPREDUCE-5815.v01.patch
>
>
> Working MAPREDUCE-5813 I stumbled on NPE's in TestMRAppMaster. They seem to 
> be introduced by MAPREDUCE-5805.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5465) Container killed before hprof dumps profile.out

2014-04-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964753#comment-13964753
 ] 

Ming Ma commented on MAPREDUCE-5465:


Thanks, Jason, for the review. I will upload the updated patch soon. I want to 
comment on the couple of points you mentioned.

1. Yes, putting finishTaskMonitor under TaskAttemptListenerImpl isn't clean, 
given that TaskAttemptListenerImpl should only deal with 
TaskUmbilicalProtocol-related work. I will move it out to the AppContext layer.
2. Handling of the TA_FAILMSG event: TA_FAILMSG can be triggered by the task JVM 
as well as by the user via the "hadoop job -fail-task" command. For the case 
where the task JVM reports failure, yes, it can wait for the container to exit. 
For the case where an end user sends the command, it will need to clean up the 
container right away. I skipped that for simplicity. If we want to support that, 
it seems we will need a new event like TA_FAILMSG_BY_USER.
3. Why are we transitioning from FINISHING_CONTAINER to 
SUCCESS_CONTAINER_CLEANUP rather than to SUCCEEDED when we receive a container 
completed event? It was done for simplicity, so that all successful states go to 
SUCCESS_CONTAINER_CLEANUP first. But I agree it can go directly to SUCCEEDED 
when we receive a container completed event (see the sketch below).
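
A toy sketch of the two orderings discussed in point 3; the state names follow 
the discussion above, and this is not the real TaskAttemptImpl state machine:

{noformat}
public class FinishingTransitionSketch {
  // Hypothetical stand-in for the MR AM's internal task-attempt states.
  enum State { FINISHING_CONTAINER, SUCCESS_CONTAINER_CLEANUP, SUCCEEDED }

  // Current patch: on container completion, always pass through the
  // container-cleanup state first.
  static State viaCleanup(State s) {
    return s == State.FINISHING_CONTAINER ? State.SUCCESS_CONTAINER_CLEANUP : s;
  }

  // Alternative raised in the review: the container has already exited, so the
  // attempt can move straight to SUCCEEDED.
  static State directToSucceeded(State s) {
    return s == State.FINISHING_CONTAINER ? State.SUCCEEDED : s;
  }
}
{noformat}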

  

> Container killed before hprof dumps profile.out
> ---
>
> Key: MAPREDUCE-5465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am, mrv2
>Affects Versions: trunk, 2.0.3-alpha
>Reporter: Radim Kolar
>Assignee: Ming Ma
> Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, 
> MAPREDUCE-5465.patch
>
>
> If there is profiling enabled for mapper or reducer then hprof dumps 
> profile.out at process exit. It is dumped after task signaled to AM that work 
> is finished.
> AM kills container with finished work without waiting for hprof to finish 
> dumps. If hprof is dumping larger outputs (such as with depth=4 while depth=3 
> works) , it could not finish dump in time before being killed making entire 
> dump unusable because cpu and heap stats are missing.
> There needs to be better delay before container is killed if profiling is 
> enabled.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows

2014-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964611#comment-13964611
 ] 

Hadoop QA commented on MAPREDUCE-5824:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12639441/MAPREDUCE-5824.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4494//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4494//console

This message is automatically generated.

> TestPipesNonJavaInputFormat.testFormat fails in windows
> ---
>
> Key: MAPREDUCE-5824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: MAPREDUCE-5824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows

2014-04-09 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated MAPREDUCE-5824:
-

Status: Patch Available  (was: Open)

> TestPipesNonJavaInputFormat.testFormat fails in windows
> ---
>
> Key: MAPREDUCE-5824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: MAPREDUCE-5824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows

2014-04-09 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated MAPREDUCE-5824:
-

Attachment: MAPREDUCE-5824.1.patch

> TestPipesNonJavaInputFormat.testFormat fails in windows
> ---
>
> Key: MAPREDUCE-5824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: MAPREDUCE-5824.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows

2014-04-09 Thread Xuan Gong (JIRA)
Xuan Gong created MAPREDUCE-5824:


 Summary: TestPipesNonJavaInputFormat.testFormat fails in windows
 Key: MAPREDUCE-5824
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Xuan Gong
Assignee: Xuan Gong






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5824) TestPipesNonJavaInputFormat.testFormat fails in windows

2014-04-09 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964382#comment-13964382
 ] 

Xuan Gong commented on MAPREDUCE-5824:
--

The reason is that on Windows the file path looks something like 
"C:\hadoop\..\", which cannot pass the StringUtils.unEscapeString() check.
We should use StringUtils.escapeString() to escape the "\" in the path, just as 
we do in FileInputFormat.setInputPaths(), which fixes the test failure.
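
A minimal illustration of the escaping described above; the path is a made-up 
example and the helper is not part of the actual test change:

{noformat}
import org.apache.hadoop.util.StringUtils;

public class EscapeWindowsPathSketch {
  public static void main(String[] args) {
    // Hypothetical Windows-style path; the unescaped backslashes are what trip
    // StringUtils.unEscapeString().
    String rawPath = "C:\\hadoop\\test\\data";
    // Escape it before handing it to the job configuration, the same way
    // FileInputFormat.setInputPaths() escapes its input directories.
    String escaped = StringUtils.escapeString(rawPath);
    System.out.println(escaped);
  }
}
{noformat}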

> TestPipesNonJavaInputFormat.testFormat fails in windows
> ---
>
> Key: MAPREDUCE-5824
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5824
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5823) TestTaskAttempt fails in trunk and branch-2 with NPE

2014-04-09 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964324#comment-13964324
 ] 

Mit Desai commented on MAPREDUCE-5823:
--

The console output here is for TestTaskAttempt#testTooManyFetchFailureAfterKill.

I found something similar with TestTaskAttempt#testFetchFailureAttemptFinishTime; 
it seems to be the same issue:
{noformat}
java.lang.NullPointerException: null
at org.apache.hadoop.security.token.Token.write(Token.java:221)
at 
org.apache.hadoop.mapred.ShuffleHandler.serializeServiceData(ShuffleHandler.java:272)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:715)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:801)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1516)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1493)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testFetchFailureAttemptFinishTime(TestTaskAttempt.java:771)
{noformat}

> TestTaskAttempt fails in trunk and branch-2 with NPE
> 
>
> Key: MAPREDUCE-5823
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5823
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Mit Desai
>
> Here is the console output I got
> {noformat}
> java.lang.NullPointerException: null
>   at org.apache.hadoop.security.token.Token.write(Token.java:221)
>   at 
> org.apache.hadoop.mapred.ShuffleHandler.serializeServiceData(ShuffleHandler.java:272)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:715)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:801)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1516)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1493)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTooManyFetchFailureAfterKill(TestTaskAttempt.java:660)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5823) TestTaskAttempt fails in trunk and branch-2 with NPE

2014-04-09 Thread Mit Desai (JIRA)
Mit Desai created MAPREDUCE-5823:


 Summary: TestTaskAttempt fails in trunk and branch-2 with NPE
 Key: MAPREDUCE-5823
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5823
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Mit Desai


Here is the console output I got

{noformat}
java.lang.NullPointerException: null
at org.apache.hadoop.security.token.Token.write(Token.java:221)
at 
org.apache.hadoop.mapred.ShuffleHandler.serializeServiceData(ShuffleHandler.java:272)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:715)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:801)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1516)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1493)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1058)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTooManyFetchFailureAfterKill(TestTaskAttempt.java:660)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-5820) Unable to process mongodb gridfs collection data in Hadoop Mapreduce

2014-04-09 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved MAPREDUCE-5820.
---

Resolution: Invalid

Closing as invalid:
# this isn't a bug, it's a support question you should raise on a mailing list
# you're asking support questions related to MongoDB and Spark APIs

https://wiki.apache.org/hadoop/InvalidJiraIssues



> Unable to process mongodb gridfs collection data in Hadoop Mapreduce
> 
>
> Key: MAPREDUCE-5820
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5820
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.2.0
> Environment: Hadoop, Mongodb
>Reporter: sivaram
>Priority: Critical
>
> I saved a 2 GB PDF file into MongoDB using GridFS. Now I want to process that 
> GridFS collection data using Java Spark MapReduce. Previously I have 
> successfully processed MongoDB collections with Hadoop MapReduce using the 
> Mongo-Hadoop connector. Now I'm unable to handle the binary data coming from 
> the input GridFS collections.
>  MongoConfigUtil.setInputURI(config, 
>      "mongodb://localhost:27017/pdfbooks.fs.chunks");
>  MongoConfigUtil.setOutputURI(config, "mongodb://localhost:27017/" + output);
>  JavaPairRDD<Object, BSONObject> mongoRDD = sc.newAPIHadoopRDD(config,
>      com.mongodb.hadoop.MongoInputFormat.class, Object.class,
>      BSONObject.class);
>  JavaRDD<String> words = mongoRDD.flatMap(
>      new FlatMapFunction<Tuple2<Object, BSONObject>, String>() {
>        @Override
>        public Iterable<String> call(Tuple2<Object, BSONObject> arg) {
>          System.out.println(arg._2.toString());
>          ...
> In the above code I'm accessing the fs.chunks collection as input to my mapper, 
> so the mapper receives each chunk as a BSONObject. The problem is that the 
> input BSONObject's data is in an unreadable binary format. For example, the 
> "System.out.println(arg._2.toString());" statement above gives the following 
> result:
> { "_id" : { "$oid" : "533e53048f0c8bcb0b3a7ff7"} , "files_id" : { "$oid" : 
> "533e5303fac7a2e2c4afea08"} , "n" : 0 , "data" : }
> How do I print/access that data in a readable format? Can I use the GridFS API 
> to do that? If so, please suggest how to convert the input BSONObject to a 
> GridFS object, and any other best ways to do it. Thank you in advance!
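
A hedged sketch of how the chunk's binary payload could be read inside the 
call() above, assuming the Mongo-Hadoop connector hands the GridFS "data" field 
to the mapper as org.bson.types.Binary (some code paths may return a raw byte[] 
instead, so both are handled):

{noformat}
import org.bson.BSONObject;
import org.bson.types.Binary;

public class GridFsChunkSketch {
  // Illustrative helper, not part of the Mongo-Hadoop connector: extract the
  // raw bytes of one fs.chunks document from its "data" field.
  public static byte[] chunkBytes(BSONObject chunk) {
    Object data = chunk.get("data");
    if (data instanceof Binary) {
      return ((Binary) data).getData();
    }
    return (byte[]) data;
  }
}
{noformat}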



--
This message was sent by Atlassian JIRA
(v6.2#6252)