Hi Sam,

Do you have Twill classes included in the fat jar as well? I think that might be affecting the application jar that Twill generates. Would you mind filing a JIRA on this?
Thanks,
Terence

> On Mar 27, 2017, at 3:55 PM, Sam William <sampri...@gmail.com> wrote:
>
> I was able to make it work with this 1 line change.
>
> https://github.com/sampd/twill/compare/branch-0.10.0...sampd:v10_test?expand=1
>
> We have a fat jar with all dependencies included for the application. Could that be the cause?
>
> Sam
>
>> On Mar 27, 2017, at 15:49, Yuliya Feldman <yul...@dremio.com> wrote:
>>
>> Code of your application you want to be running in YARN I believe :)
>>
>> On Mon, Mar 27, 2017 at 3:28 PM, Sam William <sampri...@gmail.com> wrote:
>>
>>> Yes. 22 bytes looks like an empty zip file. Any idea what should be there in the application jar file?
>>>
>>> Sam
>>>> On Mar 27, 2017, at 13:22, Yuliya Feldman <yul...@dremio.com> wrote:
>>>>
>>>> File is very small - it may have nothing to do with file not found. Either permissions or something else
>>>>
>>>> On Mon, Mar 27, 2017 at 1:17 PM, Sam William <sampri...@gmail.com> wrote:
>>>>
>>>>> I logged into the master host and looked at the nodemanager logs. It fails at localizing the application jar. The files are there in HDFS. I can even see it is able to copy the other files just fine (for example the launcher jar and runtime.config)
>>>>>
>>>>> -rw-r--r-- 3 sam supergroup 22 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar
>>>>> -rw-r--r-- 3 sam supergroup 5991970 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/buil.b0458483-23ca-4243-89f6-d1a40210110d.
>>>>> -rw-r--r-- 3 sam supergroup 5725 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar
>>>>> -rw-r--r-- 3 sam supergroup 1038 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/localizeFiles.bbe5dc82-9fe9-4249-8964-df15212a1812.json
>>>>> -rw-r--r-- 3 sam supergroup 2072 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar
>>>>> -rw-r--r-- 3 sam supergroup 48245414 2017-03-27 12:47 /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/twill.c765e4d8-958e-4811-b138-c4ef71e2a93e.jar
>>>>>
>>>>> 2017-03-27 12:47:45,632 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar(->/data/8/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/11/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar) transitioned from DOWNLOADING to LOCALIZED
>>>>> 2017-03-27 12:47:45,645 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar(->/data/10/yarn/nm/usercache/sam/appcache/application_1484158548936_11282/filecache/12/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar) transitioned from DOWNLOADING to LOCALIZED
>>>>> 2017-03-27 12:47:45,651 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:sam (auth:SIMPLE) cause:ENOENT: No such file or directory
>>>>> 2017-03-27 12:47:45,655 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar, 1490644063924, ARCHIVE, null } failed: No such file or directory
>>>>> ENOENT: No such file or directory
>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>>>>> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
>>>>> at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
>>>>> at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
>>>>> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>>>>> at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
>>>>>
>>>>>> On Mar 27, 2017, at 12:45, Sam William <sampri...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Terence,
>>>>>> I'm not able to get logs for these jobs. The “yarn logs” command doesn't return anything.
>>>>>> Sam
>>>>>>> On Mar 26, 2017, at 17:32, Terence Yim <cht...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Sam,
>>>>>>>
>>>>>>> I guess it might be related to the Hadoop conf directory missing from the container classpath, such that the LocationFactory constructed on the container side is not correct. Do you have access to the container's stdout file? It shows the classpath Twill uses.
>>>>>>>
>>>>>>> Terence
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On Mar 26, 2017, at 3:16 PM, Sam William <sampri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> It works with Twill-0.9.0. So far I have been able to narrow it down to one commit
>>>>>>>>
>>>>>>>> 5986553 (TWILL-63) Speed up application launch time
>>>>>>>>
>>>>>>>> Let me see if I can nail it down to a particular change.
>>>>>>>>
>>>>>>>> Sam
>>>>>>>>
>>>>>>>>> On Mar 25, 2017, at 13:34, Sam William <sampri...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Terence,
>>>>>>>>> Our Cloudera installation is CDH-5.7 and I use hadoop 2.3.0 packages for my fat jars.
>>>>>>>>>
>>>>>>>>> Sam
>>>>>>>>>> On Mar 25, 2017, at 12:31, Terence Yim <cht...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Haven't seen this error before. What is the version of Hadoop that the cluster is running with? Also, it seems like the $HADOOP_CONF is not in the classpath, as the FileContext is trying to use the local file system instead of the distributed one.
>>>>>>>>>>
>>>>>>>>>> Terence
>>>>>>>>>>
>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>
>>>>>>>>>>> On Mar 25, 2017, at 12:25 PM, Sam William <sampri...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> I have been using Twill for some time now and I just tried to upgrade our application from Twill-0.8.0 to 0.10.0. I haven't made any kind of code changes besides changing the Twill version string in the build script. The application fails immediately and I see this on the RM UI. Any idea why this could be happening?
>>>>>>>>>>>
>>>>>>>>>>> Diagnostics:
>>>>>>>>>>> Application application_1484158548936_11154 failed 2 times due to AM Container for appattempt_1484158548936_11154_000002 exited with exitCode: -1000
>>>>>>>>>>> For more detailed output, check application tracking page<> Then, click on links to logs of each attempt.
>>>>>>>>>>> Diagnostics: No such file or directory
>>>>>>>>>>> ENOENT: No such file or directory
>>>>>>>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>>>>>>>>>>> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>>>>>>>>>>> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:660)
>>>>>>>>>>> at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:206)
>>>>>>>>>>> at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:251)
>>>>>>>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:955)
>>>>>>>>>>> at org.apache.hadoop.fs.FileContext$10.next(FileContext.java:951)
>>>>>>>>>>> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>>>>>>>>>>> at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:951)
>>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:419)
>>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:417)
>>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.changePermissions(FSDownload.java:417)
>>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:363)
>>>>>>>>>>> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>> Failing this attempt. Failing the application.
>>>>>>>>>>>
>>>>>>>>>>> Sam
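
A side note on the 22-byte application jar mentioned in this thread: a ZIP archive with no entries consists only of the end-of-central-directory record, which is 22 bytes when there is no archive comment, so that size is consistent with Sam's guess of an empty zip. Below is a minimal sketch, using only the JDK, that shows the size and counts the entries in a jar copied locally (for example with `hdfs dfs -get`). The class and method names (`EmptyJarCheck`, `emptyZip`, `countEntries`) are made up for illustration and are not part of Twill or Hadoop; it assumes Java 7 or later, where closing a `ZipOutputStream` with no entries is allowed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of Twill or Hadoop.
final class EmptyJarCheck {

    /** Builds a ZIP archive with no entries, entirely in memory. */
    static byte[] emptyZip() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            // Intentionally add no entries; close() writes only the
            // end-of-central-directory record.
        }
        return out.toByteArray();
    }

    /** Counts the entries readable from a ZIP/JAR stream. */
    static int countEntries(InputStream in) throws IOException {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(in)) {
            while (zip.getNextEntry() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("empty zip size in bytes: " + emptyZip().length);
        if (args.length > 0) {
            // Path to a local copy of the jar under suspicion.
            Path jar = Paths.get(args[0]);
            try (InputStream in = Files.newInputStream(jar)) {
                System.out.println(jar + ": " + Files.size(jar)
                        + " bytes, " + countEntries(in) + " entries");
            }
        }
    }
}
```

Running it against a local copy of the `...-application.<uuid>.jar` file should report zero entries if the jar really is empty. That would suggest the problem lies in how the application jar is built, which would then be a separate question from the chmod failure in the stack trace.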