That file is very small, so it may have nothing to do with the file itself not being found. It could be permissions or something else.
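If it is permissions, one quick check on the NodeManager hosts is whether each local dir seen in the log exists and is writable by the container user. A minimal sketch, assuming the /data/N/yarn/nm paths from the log (the authoritative list is yarn.nodemanager.local-dirs in yarn-site.xml, and your paths may differ):

```shell
# Sketch only: verify each NodeManager local dir exists and is writable.
# Replace this example list with your yarn.nodemanager.local-dirs values.
for d in /data/8/yarn/nm /data/10/yarn/nm; do
  if [ -d "$d" ] && [ -w "$d" ]; then
    echo "$d: ok"
  else
    echo "$d: missing or not writable"
  fi
done
```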
On Mon, Mar 27, 2017 at 1:17 PM, Sam William <sampri...@gmail.com> wrote:

> I logged into the master host and looked at the nodemanager logs. It fails
> at localizing the application jar. The files are there in HDFS. I can even
> see that it is able to copy the other files just fine (for example the
> launcher jar and runtime.config):
>
> -rw-r--r--   3 sam supergroup       22 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar
> -rw-r--r--   3 sam supergroup  5991970 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/buil.b0458483-23ca-4243-89f6-d1a40210110d.
> -rw-r--r--   3 sam supergroup     5725 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar
> -rw-r--r--   3 sam supergroup     1038 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/localizeFiles.bbe5dc82-9fe9-4249-8964-df15212a1812.json
> -rw-r--r--   3 sam supergroup     2072 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar
> -rw-r--r--   3 sam supergroup 48245414 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/twill.c765e4d8-958e-4811-b138-c4ef71e2a93e.jar
>
> 2017-03-27 12:47:45,632 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar(->/data/8/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/11/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar) transitioned from DOWNLOADING to LOCALIZED
> 2017-03-27 12:47:45,645 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar(->/data/10/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/12/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar) transitioned from DOWNLOADING to LOCALIZED
> 2017-03-27 12:47:45,651 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:sam (auth:SIMPLE) cause:ENOENT: No such file or directory
> 2017-03-27 12:47:45,655 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar, 1490644063924, ARCHIVE, null } failed: No such file or directory
> ENOENT: No such file or directory
>         at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>         at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
>         at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
>         at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
>         at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
>         at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
>         at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>         at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
>
>
> On Mar 27, 2017, at 12:45, Sam William <sampri...@gmail.com> wrote:
> >
> > Hi Terence,
> >   I'm not able to get logs for these jobs. The "yarn logs" command doesn't return anything.
> > Sam
>
> >> On Mar 26, 2017, at 17:32, Terence Yim <cht...@gmail.com> wrote:
> >>
> >> Hi Sam,
> >>
> >> I guess it might be related to the Hadoop conf directory missing from
> >> the container classpath, such that the LocationFactory constructed on
> >> the container side is not correct. Do you have access to the container's
> >> stdout file? It shows the classpath Twill uses.
> >>
> >> Terence
> >>
> >> Sent from my iPhone
>
> >>> On Mar 26, 2017, at 3:16 PM, Sam William <sampri...@gmail.com> wrote:
> >>>
> >>> It works with Twill-0.9.0. So far I have been able to narrow it down
> >>> to one commit:
> >>>
> >>> 5986553 (TWILL-63) Speed up application launch time
> >>>
> >>> Let me see if I can nail it down to a particular change.
> >>>
> >>> Sam
>
> >>>> On Mar 25, 2017, at 13:34, Sam William <sampri...@gmail.com> wrote:
> >>>>
> >>>> Hi Terence,
> >>>>   Our Cloudera installation is CDH-5.7, and I use Hadoop 2.3.0
> >>>> packages for my fat jars.
> >>>>
> >>>> Sam
>
> >>>>> On Mar 25, 2017, at 12:31, Terence Yim <cht...@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I haven't seen this error before. What version of Hadoop is the
> >>>>> cluster running? Also, it seems like $HADOOP_CONF is not in the
> >>>>> classpath, as the FileContext is trying to use the local file system
> >>>>> instead of the distributed one.
> >>>>>
> >>>>> Terence
> >>>>>
> >>>>> Sent from my iPhone
>
> >>>>>> On Mar 25, 2017, at 12:25 PM, Sam William <sampri...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>   I have been using Twill for some time now, and I just tried to
> >>>>>> upgrade our application from Twill-0.8.0 to 0.10.0. I haven't made
> >>>>>> any code changes besides changing the Twill version string in the
> >>>>>> build script. The application fails immediately, and I see this on
> >>>>>> the RM UI. Any idea why this could be happening?
> >>>>>>
> >>>>>> Diagnostics:
> >>>>>> Application application_1484158548936_11154 failed 2 times due to AM Container for appattempt_1484158548936_11154_000002 exited with exitCode: -1000
> >>>>>> For more detailed output, check application tracking page<> Then, click on links to logs of each attempt.
> >>>>>> Diagnostics: No such file or directory
> >>>>>> ENOENT: No such file or directory
> >>>>>>         at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
> >>>>>>         at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
> >>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
> >>>>>>         at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
> >>>>>>         at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
> >>>>>>         at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
> >>>>>>         at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
> >>>>>>         at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
> >>>>>>         at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
> >>>>>>         at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:419)
> >>>>>>         at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:417)
> >>>>>>         at java.security.AccessController.doPrivileged(Native Method)
> >>>>>>         at javax.security.auth.Subject.doAs(Subject.java:422)
> >>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> >>>>>>         at org.apache.hadoop.yarn.util.FSDownload.changePermissions(FSDownload.java:417)
> >>>>>>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:363)
> >>>>>>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
> >>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>>>         at java.lang.Thread.run(Thread.java:745)
> >>>>>> Failing this attempt. Failing the application.
> >>>>>>
> >>>>>> Sam
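For what it's worth, the ENOENT in the traces above is raised by chmod(2) on a local path under the NodeManager's filecache that no longer exists (or was never created) by the time FSDownload.changePermissions runs. A minimal local reproduction of that errno, with throwaway example paths, not the actual NM layout:

```shell
# Sketch: chmod on a nonexistent local path fails with ENOENT, the same
# errno that NativeIO$POSIX.chmodImpl surfaces in the stack traces.
tmp=$(mktemp -d)
chmod 755 "$tmp/filecache/10/missing.jar" 2>"$tmp/err" || true
cat "$tmp/err"   # e.g. chmod: cannot access '...': No such file or directory
rm -rf "$tmp"
```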