I was able to make it work with this one-line change: https://github.com/sampd/twill/compare/branch-0.10.0...sampd:v10_test?expand=1
We have a fat jar with all dependencies included for the application. Could that be the cause?

Sam

> On Mar 27, 2017, at 15:49, Yuliya Feldman <yul...@dremio.com> wrote:
>
> The code of the application you want to run in YARN, I believe :)
>
> On Mon, Mar 27, 2017 at 3:28 PM, Sam William <sampri...@gmail.com> wrote:
>
>> Yes. 22 bytes looks like an empty zip file. Any idea what should be there in
>> the application jar file?
>>
>> Sam
>>> On Mar 27, 2017, at 13:22, Yuliya Feldman <yul...@dremio.com> wrote:
>>>
>>> The file is very small; it may have nothing to do with file not found.
>>> Either permissions or something else.
>>>
>>> On Mon, Mar 27, 2017 at 1:17 PM, Sam William <sampri...@gmail.com> wrote:
>>>
>>>> I logged into the master host and looked at the nodemanager logs. It fails
>>>> at localizing the application jar. The files are there in HDFS. I can
>>>> even see it is able to copy the other files just fine (for example the
>>>> launcher jar and runtime.config).
>>>>
>>>> -rw-r--r-- 3 sam supergroup 22 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar
>>>> -rw-r--r-- 3 sam supergroup 5991970 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/buil.b0458483-23ca-4243-89f6-d1a40210110d.
>>>> -rw-r--r-- 3 sam supergroup 5725 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar
>>>> -rw-r--r-- 3 sam supergroup 1038 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/localizeFiles.bbe5dc82-9fe9-4249-8964-df15212a1812.json
>>>> -rw-r--r-- 3 sam supergroup 2072 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar
>>>> -rw-r--r-- 3 sam supergroup 48245414 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/twill.c765e4d8-958e-4811-b138-c4ef71e2a93e.jar
>>>>
>>>>
>>>> 2017-03-27 12:47:45,632 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar(->/data/8/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/11/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar) transitioned from DOWNLOADING to LOCALIZED
>>>> 2017-03-27 12:47:45,645 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar(->/data/10/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/12/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar) transitioned from DOWNLOADING to LOCALIZED
>>>> 2017-03-27 12:47:45,651 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:sam (auth:SIMPLE) cause:ENOENT: No such file or directory
>>>> 2017-03-27 12:47:45,655 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar, 1490644063924, ARCHIVE, null } failed: No such file or directory
>>>> ENOENT: No such file or directory
>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>>>> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
>>>> at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
>>>> at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
>>>> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>>>> at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
>>>>
>>>>
>>>>> On Mar 27, 2017, at 12:45, Sam William <sampri...@gmail.com> wrote:
>>>>>
>>>>> Hi Terence,
>>>>> I'm not able to get logs for these jobs. The “yarn logs” command doesn't
>>>>> return anything.
>>>>> Sam
>>>>>> On Mar 26, 2017, at 17:32, Terence Yim <cht...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Sam,
>>>>>>
>>>>>> I guess it might be related to the Hadoop conf directory missing from
>>>>>> the container classpath, such that the LocationFactory constructed on
>>>>>> the container side is not correct. Do you have access to the container's
>>>>>> stdout file? It shows the classpath Twill uses.
>>>>>>
>>>>>> Terence
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>>> On Mar 26, 2017, at 3:16 PM, Sam William <sampri...@gmail.com> wrote:
>>>>>>>
>>>>>>> It works with Twill-0.9.0. So far I have been able to narrow it down
>>>>>>> to one commit:
>>>>>>>
>>>>>>> 5986553 (TWILL-63) Speed up application launch time
>>>>>>>
>>>>>>> Let me see if I can nail it down to a particular change.
>>>>>>>
>>>>>>> Sam
>>>>>>>
>>>>>>>
>>>>>>>> On Mar 25, 2017, at 13:34, Sam William <sampri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Terence,
>>>>>>>> Our Cloudera installation is CDH-5.7 and I use Hadoop 2.3.0 packages
>>>>>>>> for my fat jars.
>>>>>>>>
>>>>>>>> Sam
>>>>>>>>> On Mar 25, 2017, at 12:31, Terence Yim <cht...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Haven't seen this error before. What version of Hadoop is the
>>>>>>>>> cluster running? Also, it seems like $HADOOP_CONF is not in the
>>>>>>>>> classpath, as the FileContext is trying to use the local file system
>>>>>>>>> instead of the distributed one.
>>>>>>>>>
>>>>>>>>> Terence
>>>>>>>>>
>>>>>>>>> Sent from my iPhone
>>>>>>>>>
>>>>>>>>>> On Mar 25, 2017, at 12:25 PM, Sam William <sampri...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I have been using Twill for some time now and I just tried to
>>>>>>>>>> upgrade our application from Twill-0.8.0 to 0.10.0. I haven't made
>>>>>>>>>> any code changes besides changing the Twill version string in the
>>>>>>>>>> build script. The application fails immediately and I see this on
>>>>>>>>>> the RM UI. Any idea why this could be happening?
>>>>>>>>>>
>>>>>>>>>> Diagnostics:
>>>>>>>>>> Application application_1484158548936_11154 failed 2 times due to AM Container for appattempt_1484158548936_11154_000002 exited with exitCode: -1000
>>>>>>>>>> For more detailed output, check application tracking page<> Then, click on links to logs of each attempt.
>>>>>>>>>> Diagnostics: No such file or directory
>>>>>>>>>> ENOENT: No such file or directory
>>>>>>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>>>>>>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>>>>>>>>>> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
>>>>>>>>>> at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
>>>>>>>>>> at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
>>>>>>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
>>>>>>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
>>>>>>>>>> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>>>>>>>>>> at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:419)
>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:417)
>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.changePermissions(FSDownload.java:417)
>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:363)
>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>> Failing this attempt. Failing the application.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Sam
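
A side note on the 22-byte application jar mentioned in this thread: 22 bytes is exactly the size of a zip archive with no entries, since such an archive holds only the end-of-central-directory record. The sketch below is a standalone illustration using Python's standard zipfile module, not anything from Twill or Hadoop. The helper names empty_zip_bytes and looks_like_empty_zip are made up for this example.

```python
import io
import zipfile

# Signature that starts the zip end-of-central-directory record.
EOCD_SIGNATURE = b"PK\x05\x06"


def empty_zip_bytes() -> bytes:
    """Build a zip archive with no entries, in memory, and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w"):
        pass  # add nothing; closing writes only the end record
    return buf.getvalue()


def looks_like_empty_zip(data: bytes) -> bool:
    """Hypothetical check: True if data is just a bare end-of-central-directory record."""
    return len(data) == 22 and data.startswith(EOCD_SIGNATURE)


data = empty_zip_bytes()
print(len(data), looks_like_empty_zip(data))
```

Running the same check against the bytes of a jar copied out of HDFS would show whether it is an archive with no entries rather than a truncated or corrupt file.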