It seems unnecessary to ask developers to look up timestamps and lengths themselves. Why not provide a java.io.File-accepting, or perhaps a Path-accepting, API that fetches them automatically on the developer's behalf, using the FileSystem API internally?
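A convenience wrapper along those lines could look roughly like this. The class and method names here are hypothetical; the FileSystem, FileStatus, ConverterUtils and LocalResource calls are the actual Hadoop/YARN APIs of that vintage:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public final class LocalResourceHelper {
    // Hypothetical convenience method: builds a LocalResource from a Path,
    // reading size and modification time via the FileSystem API so the
    // caller does not have to supply them by hand.
    public static LocalResource fromPath(Path path, Configuration conf)
            throws IOException {
        FileSystem fs = path.getFileSystem(conf);
        // Fails fast with FileNotFoundException if the remote file is absent.
        FileStatus status = fs.getFileStatus(path);
        LocalResource rsrc = Records.newRecord(LocalResource.class);
        rsrc.setType(LocalResourceType.FILE);
        rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
        // status.getPath() is fully qualified, avoiding scheme-less URLs.
        rsrc.setResource(ConverterUtils.getYarnUrlFromPath(status.getPath()));
        rsrc.setTimestamp(status.getModificationTime());
        rsrc.setSize(status.getLen());
        return rsrc;
    }
}
```

Because the status is read from the same FileSystem the URL points at, the timestamp/length mismatch and scheme-parsing mistakes discussed below simply cannot arise.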
P.S. An HDFS file gave him a FileNotFoundException, while a local file gave him a proper timestamp/length error. I'm guessing there's a bug here w.r.t. handling HDFS paths.

On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hit...@apache.org> wrote:
> Hi Krishna,
>
> YARN downloads a specified local resource on the container's node from the
> url specified. In all situations, the remote url needs to be a fully
> qualified path. To verify that the file at the remote url is still valid,
> YARN expects you to provide the length and last modified timestamp of that
> file.
>
> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>,
> you will need to get the length and timestamp from HDFS.
> If you use file:///, the file should exist on all nodes, and all nodes should
> have the file with the same length and timestamp for localization to work.
> (For a single-node setup this works, but it is tougher to get right on a
> multi-node setup - deploying the file via an rpm should likely work.)
>
> -- Hitesh
>
> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>
>> Hi,
>>
>> You need to match the timestamp. Probably get the timestamp locally before
>> adding it. This is explicitly done to ensure that the file is not updated
>> after the user makes the call, to avoid possible errors.
>>
>> Thanks,
>> Omkar Joshi
>> Hortonworks Inc.
>>
>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri
>> <write2kish...@gmail.com> wrote:
>> I tried the following and it works!
>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>
>> But now I am getting a timestamp error like the one below, when I passed 0
>> to setTimestamp():
>>
>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for
>> containerID=container_1375784329048_0017_01_000002, state=COMPLETE,
>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh
>> changed on src filesystem (expected 0, was 1367580580000
>>
>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>> Can you try passing a fully qualified local path? That is, including the
>> file:/ scheme.
>>
>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <write2kish...@gmail.com>
>> wrote:
>> Hi Harsh,
>> The setResource() call on LocalResource is expecting an argument of
>> type org.apache.hadoop.yarn.api.records.URL, which is converted from a
>> string in the form of a URI. This happens in the following call of the
>> Distributed Shell example:
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(
>>     shellScriptPath)));
>>
>> So, if I give a local file I get a parsing error like the one below, which
>> is why I changed it to an HDFS file, thinking that it should be given like
>> that only. Could you please give an example of how else it could be used,
>> using a local file as you are saying?
>>
>> 2013-08-06 06:23:12,942 WARN
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0:
>> :///home_/dsadm/kishore/kk.ksh
>>     at java.net.URI$Parser.fail(URI.java:2820)
>>     at java.net.URI$Parser.failExpecting(URI.java:2826)
>>     at java.net.URI$Parser.parse(URI.java:3015)
>>     at java.net.URI.<init>(URI.java:747)
>>     at org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> To be honest, I've never tried loading an HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking an HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <write2kish...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> > Please see if this is useful; I got a stack trace after the error
>> > occurred...
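The URISyntaxException quoted above is plain java.net.URI behavior: when the string handed to the converter arrives with an empty scheme part, parsing fails at index 0, while a fully qualified file:// URI parses fine. A JDK-only illustration (the path is the one from the thread):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSchemeDemo {
    public static void main(String[] args) {
        // A fully qualified local path carries an explicit scheme.
        URI ok = URI.create("file:///home_/dsadm/kishore/kk.ksh");
        System.out.println(ok.getScheme());    // prints "file"

        // The string from the log above has an empty scheme part,
        // so java.net.URI rejects it outright.
        try {
            new URI(":///home_/dsadm/kishore/kk.ksh");
        } catch (URISyntaxException e) {
            System.out.println(e.getReason()); // prints "Expected scheme name"
        }
    }
}
```

This is why Harsh's advice below (pass a fully qualified local path, scheme included) resolves the parse failure.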
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
>> > CWD set to
>> > /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004 =
>> > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >     at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >     at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >     at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >     at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >     at $Proxy22.heartbeat(Unknown Source)
>> >     at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> > And here is my code snippet:
>> >
>> >     ContainerLaunchContext ctx =
>> >         Records.newRecord(ContainerLaunchContext.class);
>> >
>> >     ctx.setEnvironment(oshEnv);
>> >
>> >     // Set the local resources
>> >     Map<String, LocalResource> localResources =
>> >         new HashMap<String, LocalResource>();
>> >
>> >     LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >     shellRsrc.setType(LocalResourceType.FILE);
>> >     shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >     String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >     try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new
>> >             URI(shellScriptPath)));
>> >     } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >     }
>> >
>> >     shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >     shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >     String ExecShellStringPath = "ExecShellScript.sh";
>> >     localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >     ctx.setLocalResources(localResources);
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it; can you post that? If possible, also the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <write2kish...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> > Thanks for the quick and detailed reply, it really helps. I am trying
>> >> > to use it and getting this error in the node manager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> > This file is there on the machine with the name "isredeng"; I could do
>> >> > an ls for that file as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop library for your platform...
>> >> > using builtin-java classes where applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48 kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single-node cluster
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At execution
>> >> >> time, the passed resource filename will be aliased to the name you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> Over [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything, as it is user-supplied.
>> >> >> In [2], we define this as a local resource [2.1] and embed it with a
>> >> >> different name (the string you ask about) [2.2], as defined at [3] as
>> >> >> an application reference-able constant.
>> >> >> Note that in [4], we add to the Container arguments the aliased name
>> >> >> we mapped it to (i.e. [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with this name
>> >> >> instead, so that's what we choose to execute.
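The aliasing described there boils down to the map key passed to setLocalResources(). A minimal sketch (the alias string and variable names are illustrative; the Records/ContainerLaunchContext calls are the YARN API):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.util.Records;

public final class AliasSketch {
    public static ContainerLaunchContext launchContextFor(LocalResource shellRsrc) {
        // The map key is the name the file will appear under in the
        // container's working directory, regardless of its source filename.
        Map<String, LocalResource> localResources =
            new HashMap<String, LocalResource>();
        localResources.put("ExecShellScript.sh", shellRsrc);

        ContainerLaunchContext ctx =
            Records.newRecord(ContainerLaunchContext.class);
        ctx.setLocalResources(localResources);
        // The container command then refers to the alias, e.g.
        // "/bin/sh ExecShellScript.sh", not the original remote path.
        return ctx;
    }
}
```

This mirrors the DistributedShell flow referenced in links [1]-[4] below: the user-supplied path goes into the LocalResource, while the constant alias is what both the map and the launched command use.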
>> >> >>
>> >> >> [1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1] https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2] https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <write2kish...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Can someone please tell me what is the use of calling
>> >> >> > setLocalResources() on ContainerLaunchContext?
>> >> >> >
>> >> >> > And, also an example of how to use this will help...
>> >> >> >
>> >> >> > I couldn't guess what the String in the map that is passed to
>> >> >> > setLocalResources() is for, like below:
>> >> >> >
>> >> >> >     // Set the local resources
>> >> >> >     Map<String, LocalResource> localResources =
>> >> >> >         new HashMap<String, LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >>
>> >> --
>> >> Harsh J
>>
>> --
>> Harsh J

--
Harsh J