Hi Harsh, Hitesh & Omkar,

Thanks for the replies.

I tried getting the last modified timestamp like this, and it works. Is
this the right thing to do?

    File file = new File("/home_/dsadm/kishore/kk.ksh");
    shellRsrc.setTimestamp(file.lastModified());
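For the HDFS case, if I understand Hitesh's explanation below correctly, I
should be asking the filesystem that actually holds the file for its length
and timestamp, instead of using java.io.File. This (untested) sketch with
the org.apache.hadoop.fs API is what I plan to try:

    // Sketch only: look up size and modification time from HDFS itself.
    Path scriptPath = new Path("hdfs://isredeng/kishore/kk.ksh");
    FileSystem fs = scriptPath.getFileSystem(conf); // conf is my Configuration
    FileStatus status = fs.getFileStatus(scriptPath);
    shellRsrc.setTimestamp(status.getModificationTime());
    shellRsrc.setSize(status.getLen());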
And when I tried using an HDFS file, qualifying it with both the node name
and the port, it didn't work; I get an error similar to the earlier one:

    String shellScriptPath = "hdfs://isredeng:8020//kishore/kk.ksh";

13/08/07 01:36:28 INFO ApplicationMaster: Got container status for
containerID= container_1375853431091_0005_01_000002, state=COMPLETE,
exitStatus=-1000, diagnostics=File does not exist:
hdfs://isredeng:8020/kishore/kk.ksh
13/08/07 01:36:28 INFO ApplicationMaster: Got failure status for a
container : -1000

On Wed, Aug 7, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
> Thanks Hitesh!
>
> P.s. Port isn't a requirement (and with HA URIs, you shouldn't add a
> port), but "isredeng" has to be the authority component.
>
> On Wed, Aug 7, 2013 at 7:37 AM, Hitesh Shah <hit...@apache.org> wrote:
> > @Krishna, your logs showed the file error for
> > "hdfs://isredeng/kishore/kk.ksh"
> >
> > I am assuming you have tried dfs -ls /kishore/kk.ksh and confirmed that
> > the file exists? Also, the qualified path seems to be missing the
> > namenode port. I need to go back and check whether a path without the
> > port works by assuming the default namenode port.
> >
> > @Harsh, adding a helper function seems like a good idea. Let me file a
> > jira to have the above added to one of the helper/client libraries.
> >
> > thanks
> > -- Hitesh
> >
> > On Aug 6, 2013, at 6:47 PM, Harsh J wrote:
> >
> >> It is kinda unnecessary to be asking developers to load in timestamps
> >> and length themselves. Why not provide a java.io.File, or perhaps a
> >> Path accepting API, that gets it automatically on their behalf using
> >> the FileSystem API internally?
> >>
> >> P.s. A HDFS file gave him a FNF, while a Local file gave him a proper
> >> TS/Len error. I'm guessing there's a bug here w.r.t. handling HDFS
> >> paths.
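Such a helper would be really useful. I imagine it would look something
like this (the method name and signature are made up for illustration;
this is not an existing YARN API):

    // Hypothetical convenience method: builds a LocalResource and fills in
    // the URL, size and timestamp via a single FileStatus lookup.
    public static LocalResource createLocalResource(Configuration conf,
        Path path) throws IOException {
      FileStatus status = path.getFileSystem(conf).getFileStatus(path);
      LocalResource rsrc = Records.newRecord(LocalResource.class);
      // status.getPath() is already fully qualified with scheme/authority.
      rsrc.setResource(ConverterUtils.getYarnUrlFromPath(status.getPath()));
      rsrc.setSize(status.getLen());
      rsrc.setTimestamp(status.getModificationTime());
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      return rsrc;
    }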
> >> On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hit...@apache.org> wrote:
> >>> Hi Krishna,
> >>>
> >>> YARN downloads a specified local resource on the container's node from
> >>> the url specified. In all situations, the remote url needs to be a
> >>> fully qualified path. To verify that the file at the remote url is
> >>> still valid, YARN expects you to provide the length and last modified
> >>> timestamp of that file.
> >>>
> >>> If you use an hdfs path such as hdfs://namenode:port/<absolute path to
> >>> file>, you will need to get the length and timestamp from HDFS.
> >>> If you use file:///, the file should exist on all nodes, and all nodes
> >>> should have the file with the same length and timestamp for
> >>> localization to work. (For a single-node setup this works, but it is
> >>> tougher to get right on a multi-node setup - deploying the file via an
> >>> rpm should likely work.)
> >>>
> >>> -- Hitesh
> >>>
> >>> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> You need to match the timestamp. Probably get the timestamp locally
> >>>> before adding it. This is explicitly done to ensure that the file is
> >>>> not updated after the user makes the call, to avoid possible errors.
> >>>>
> >>>> Thanks,
> >>>> Omkar Joshi
> >>>> Hortonworks Inc.
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri
> >>>> <write2kish...@gmail.com> wrote:
> >>>> I tried the following and it works!
> >>>>
> >>>>     String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
> >>>>
> >>>> But now I am getting a timestamp error like the one below, because I
> >>>> passed 0 to setTimestamp():
> >>>>
> >>>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
> >>>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE,
> >>>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
> >>>> changed on src filesystem (expected 0, was 1367580580000
> >>>>
> >>>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> Can you try passing a fully qualified local path? That is, including
> >>>> the file:/ scheme.
> >>>>
> >>>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri"
> >>>> <write2kish...@gmail.com> wrote:
> >>>> Hi Harsh,
> >>>>
> >>>> The setResource() call on LocalResource is expecting an argument of
> >>>> type org.apache.hadoop.yarn.api.records.URL, which is converted from a
> >>>> string in the form of a URI. This happens in the following call of the
> >>>> Distributed Shell example:
> >>>>
> >>>>     shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
> >>>>         shellScriptPath)));
> >>>>
> >>>> So, if I give a local file I get a parsing error like the one below,
> >>>> which is why I changed it to an HDFS file, thinking that it should be
> >>>> given like that only. Could you please give an example of how else it
> >>>> could be used, with a local file as you are saying?
> >>>>
> >>>> 2013-08-06 06:23:12,942 WARN
> >>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> >>>> Failed to parse resource-request
> >>>> java.net.URISyntaxException: Expected scheme name at index 0:
> >>>> :///home_/dsadm/kishore/kk.ksh
> >>>>     at java.net.URI$Parser.fail(URI.java:2820)
> >>>>     at java.net.URI$Parser.failExpecting(URI.java:2826)
> >>>>     at java.net.URI$Parser.parse(URI.java:3015)
> >>>>     at java.net.URI.<init>(URI.java:747)
> >>>>     at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
> >>>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
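In case it helps anyone hitting the same URISyntaxException: building the
URI from a java.io.File should avoid the empty-scheme problem entirely, I
believe. A small untested sketch:

    // Sketch: File.toURI() always produces a file:/ scheme, so the
    // resulting URI is fully qualified for the local filesystem.
    URI localUri = new File("/home_/dsadm/kishore/kk.ksh").toURI();
    shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(localUri));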
> >>>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>> To be honest, I've never tried loading a HDFS file onto the
> >>>> LocalResource this way. I usually just pass a local file and that
> >>>> works just fine. There may be something in the URI transformation
> >>>> possibly breaking a HDFS source, but try passing a local file - does
> >>>> that fail too? The Shell example uses a local file.
> >>>>
> >>>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
> >>>> <write2kish...@gmail.com> wrote:
> >>>>> Hi Harsh,
> >>>>>
> >>>>> Please see if this is useful; I got a stack trace after the error
> >>>>> occurred:
> >>>>>
> >>>>> 2013-08-06 00:55:30,559 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> >>>>> CWD set to /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> = file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
> >>>>> 2013-08-06 00:55:31,017 ERROR
> >>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> >>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
> >>>>> not exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,029 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> >>>>> DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File
> >>>>> does not exist: hdfs://isredeng/kishore/kk.ksh
> >>>>> 2013-08-06 00:55:31,031 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> >>>>> Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING
> >>>>> to FAILED
> >>>>> 2013-08-06 00:55:31,034 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> >>>>> Container container_1375716148174_0004_01_000002 transitioned from
> >>>>> LOCALIZING to LOCALIZATION_FAILED
> >>>>> 2013-08-06 00:55:31,035 INFO
> >>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
> >>>>> Container container_1375716148174_0004_01_000002 sent RELEASE event
> >>>>> on a resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }
> >>>>> not present in cache.
> >>>>> 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
> >>>>> waiting to send rpc request to server
> >>>>> java.lang.InterruptedException
> >>>>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
> >>>>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
> >>>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:94)
> >>>>>     at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
> >>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1285)
> >>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1264)
> >>>>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> >>>>>     at $Proxy22.heartbeat(Unknown Source)
> >>>>>     at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
> >>>>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
> >>>>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
> >>>>>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
> >>>>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
> >>>>>
> >>>>> And here is my code snippet:
> >>>>>
> >>>>>     ContainerLaunchContext ctx =
> >>>>>         Records.newRecord(ContainerLaunchContext.class);
> >>>>>
> >>>>>     ctx.setEnvironment(oshEnv);
> >>>>>
> >>>>>     // Set the local resources
> >>>>>     Map<String, LocalResource> localResources =
> >>>>>         new HashMap<String, LocalResource>();
> >>>>>
> >>>>>     LocalResource shellRsrc = Records.newRecord(LocalResource.class);
> >>>>>     shellRsrc.setType(LocalResourceType.FILE);
> >>>>>     shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
> >>>>>     String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
> >>>>>     try {
> >>>>>       shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
> >>>>>           URI(shellScriptPath)));
> >>>>>     } catch (URISyntaxException e) {
> >>>>>       LOG.error("Error when trying to use shell script path specified"
> >>>>>           + " in env, path=" + shellScriptPath);
> >>>>>       e.printStackTrace();
> >>>>>     }
> >>>>>
> >>>>>     shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
> >>>>>     shellRsrc.setSize(0/*shellScriptPathLen*/);
> >>>>>     String ExecShellStringPath = "ExecShellScript.sh";
> >>>>>     localResources.put(ExecShellStringPath, shellRsrc);
> >>>>>
> >>>>>     ctx.setLocalResources(localResources);
> >>>>>
> >>>>> Please let me know if you need anything else.
> >>>>>
> >>>>> Thanks,
> >>>>> Kishore
> >>>>> On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
> >>>>>>
> >>>>>> The detail is insufficient to answer why. You should also have
> >>>>>> gotten a trace after it; can you post that? If possible, also the
> >>>>>> relevant snippets of code.
> >>>>>>
> >>>>>> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
> >>>>>> <write2kish...@gmail.com> wrote:
> >>>>>>> Hi Harsh,
> >>>>>>>
> >>>>>>> Thanks for the quick and detailed reply, it really helps. I am
> >>>>>>> trying to use it and getting this error in the node manager's log:
> >>>>>>>
> >>>>>>> 2013-08-05 08:57:28,867 ERROR
> >>>>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> >>>>>>> as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> >>>>>>> does not exist: hdfs://isredeng/kishore/kk.ksh
> >>>>>>>
> >>>>>>> This file is there on the machine named "isredeng"; I could do an
> >>>>>>> ls for it as below:
> >>>>>>>
> >>>>>>> -bash-4.1$ hadoop fs -ls kishore/kk.ksh
> >>>>>>> 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
> >>>>>>> native-hadoop library for your platform... using builtin-java
> >>>>>>> classes where applicable
> >>>>>>> Found 1 items
> >>>>>>> -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48 kishore/kk.ksh
> >>>>>>>
> >>>>>>> Note: I am using a single-node cluster.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Kishore
> >>>>>>>
> >>>>>>> On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
> >>>>>>>>
> >>>>>>>> The string for each LocalResource in the map can be anything that
> >>>>>>>> serves as a common identifier name for your application. At
> >>>>>>>> execution time, the passed resource filename will be aliased to
> >>>>>>>> the name you've mapped it to, so that the application code need
> >>>>>>>> not track special names. The behavior is very similar to how you
> >>>>>>>> can, in MR, define a symlink name for a DistributedCache entry
> >>>>>>>> (e.g. foo.jar#bar.jar).
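If I follow this correctly, the usage boils down to something like this
minimal sketch (the alias "my-script.sh", the source name "whatever.ksh"
and the command are mine, not from the Shell example):

    // The map key is the alias: the file lands in the container's working
    // directory under this name, whatever the source file was called.
    localResources.put("my-script.sh", shellRsrc); // source: whatever.ksh
    ctx.setLocalResources(localResources);

    // So the container command refers to the alias, not the source name.
    ctx.setCommands(Collections.singletonList("/bin/sh my-script.sh"));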
> >>>>>>>> For an example, check out the DistributedShell app sources.
> >>>>>>>>
> >>>>>>>> Over [1], you can see we take a user-provided file path to a
> >>>>>>>> shell script. This can be named anything, as it is user-supplied.
> >>>>>>>> Onto [2], we define this as a local resource [2.1] and embed it
> >>>>>>>> with a different name (the string you ask about) [2.2], as defined
> >>>>>>>> at [3] as an application reference-able constant.
> >>>>>>>> Note that in [4], we add to the Container arguments the aliased
> >>>>>>>> name we mapped it to (i.e. [3]) and not the original filename we
> >>>>>>>> received from the user. The resource is placed on the container
> >>>>>>>> with this name instead, so that's what we choose to execute.
> >>>>>>>>
> >>>>>>>> [1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
> >>>>>>>> [2.1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
> >>>>>>>> [2.2] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
> >>>>>>>> [3] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
> >>>>>>>> [4] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
> >>>>>>>>
> >>>>>>>> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
> >>>>>>>> <write2kish...@gmail.com> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Can someone please tell me what the use of calling
> >>>>>>>>> setLocalResources() on ContainerLaunchContext is?
> >>>>>>>>>
> >>>>>>>>> And an example of how to use it would also help...
> >>>>>>>>>
> >>>>>>>>> I couldn't guess what the String in the map that is passed to
> >>>>>>>>> setLocalResources() is, like below:
> >>>>>>>>>
> >>>>>>>>>     // Set the local resources
> >>>>>>>>>     Map<String, LocalResource> localResources =
> >>>>>>>>>         new HashMap<String, LocalResource>();
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Kishore
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Harsh J
> >>>>>>
> >>>>>> --
> >>>>>> Harsh J
> >>>>
> >>>> --
> >>>> Harsh J
> >>
> >> --
> >> Harsh J
>
> --
> Harsh J