[ 
https://issues.apache.org/jira/browse/YARN-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758365#comment-16758365
 ] 

Billie Rinaldi commented on YARN-9190:
--------------------------------------

[~tangzhankun] Since I haven't seen this locally, I'll suggest some debugging 
tips:
* Check the client output for ApiServiceClient vs. ServiceClient logs. You 
should see ApiServiceClient, which is the one that goes through the RM REST API.
* Check the RM log for a "POST: createService" entry logged by ApiServer. After 
that there should be ServiceClient logs describing what it's doing: using an 
existing tarball, uploading a new tarball for reuse (only allowed for the hdfs 
or yarn admin), or uploading a temporary tarball just for this app.
* Find the tarball the AM is using (check the AM log and/or look in the 
container directory) and check which jars it contains. Compare the jar list 
with the contents of a good tarball, such as the one uploaded with yarn app 
-enableFastLaunch.
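
For the last tip, the jar comparison could be scripted roughly as below. This 
is only a sketch: it assumes both tarballs have already been copied out of HDFS 
to local disk (for example with hdfs dfs -get), and the function names 
jar_names and compare_tarballs are made up for illustration, not part of any 
Hadoop tooling.

```python
import tarfile


def jar_names(tarball_path):
    """Return the set of .jar file names (without directories) in a tarball."""
    # "r:*" lets tarfile detect the compression (gzip for service-dep.tar.gz).
    with tarfile.open(tarball_path, "r:*") as tar:
        return {
            member.name.rsplit("/", 1)[-1]
            for member in tar.getmembers()
            if member.isfile() and member.name.endswith(".jar")
        }


def compare_tarballs(suspect_path, good_path):
    """Return (jars missing from the suspect tarball, jars only in the suspect)."""
    suspect = jar_names(suspect_path)
    good = jar_names(good_path)
    return sorted(good - suspect), sorted(suspect - good)
```

Calling compare_tarballs with the local path of the AM's tarball first and the 
known-good tarball second returns two sorted lists. Any jar in the first list 
is present in the good tarball but absent from the one the AM used, which is 
the place to start looking for the missing ServiceMaster class.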

> [Submarine] Submarine job will fail to run as a first job on a new created 
> Hadoop 3.2.0 RC1 cluster
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9190
>                 URL: https://issues.apache.org/jira/browse/YARN-9190
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Sunil Govindan
>            Priority: Major
>
> This issue was found while verifying Submarine during Hadoop 3.2.0 RC1 
> planning. The steps to reproduce are:
>  # Initialize a new HDFS and YARN cluster (LinuxContainerExecutor and Docker 
> enabled)
>  # Before running any other YARN service job, submit a Submarine job as the 
> yarn user
> The job fails with the error below:
>  
> {code:java}
> LogType:serviceam-err.txt
> LogLastModifiedTime:Thu Jan 10 21:15:23 +0800 2019
> LogLength:86
> LogContents:
> Error: Could not find or load main class 
> org.apache.hadoop.yarn.service.ServiceMaster
> End of LogType:serviceam-err.txt
> {code}
> This seems to be because the dependencies are not ready, as the service 
> client reported:
> {code:java}
> 2019-01-10 21:50:47,380 WARN client.ServiceClient: Property 
> yarn.service.framework.path has a value 
> /yarn-services/3.2.0/service-dep.tar.gz, but is not a valid file
> 2019-01-10 21:50:47,381 INFO client.ServiceClient: Uploading all dependency 
> jars to HDFS. For faster submission of apps, set config property 
> yarn.service.framework.path to the dependency tarball location. Dependency 
> tarball can be uploaded to any HDFS path directly or by using command: yarn 
> app -enableFastLaunch [<Destination Folder>]{code}
>  
> When this error happens, there is no “/yarn-services” directory in HDFS.
> But after I run “yarn app -launch my-sleeper sleeper”, “/yarn-services” is 
> created in HDFS and the submarine job then runs successfully.
> {code:java}
> yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ 
> hdfs dfs -ls /yarn-services/3.2.0/*
> -rwxr-xr-x 1 yarn supergroup 93596476 2019-01-11 08:23 
> /yarn-services/3.2.0/service-dep.tar.gz{code}
> This seems to be an issue with YARN service in 3.2.0 RC1, so I filed this 
> Jira to track it.
>  
> I also verified that the trunk branch doesn't have this issue.


