It seems rather unnecessary to be asking developers to look up the
timestamp and length themselves. Why not provide a java.io.File-accepting,
or perhaps Path-accepting, API that fetches them automatically on the
developer's behalf using the FileSystem API internally?
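
Something like this minimal sketch is what I have in mind (addLocalResource
is a hypothetical name, not an existing YARN API; it only wraps calls that
exist today):

    import java.io.IOException;
    import java.util.Map;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;
    import org.apache.hadoop.yarn.util.Records;

    // Hypothetical convenience helper: resolves size and modification time
    // through the FileSystem API so callers never pass them by hand.
    public static void addLocalResource(Map<String, LocalResource> resources,
        String alias, Path path, Configuration conf) throws IOException {
      FileSystem fs = path.getFileSystem(conf);
      FileStatus status = fs.getFileStatus(path); // fails fast if the file is missing
      Path qualified = fs.makeQualified(path);    // ensures a fully qualified URL
      LocalResource rsrc = Records.newRecord(LocalResource.class);
      rsrc.setType(LocalResourceType.FILE);
      rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
      rsrc.setResource(ConverterUtils.getYarnUrlFromURI(qualified.toUri()));
      rsrc.setSize(status.getLen());
      rsrc.setTimestamp(status.getModificationTime());
      resources.put(alias, rsrc);
    }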

P.S. An HDFS file gave him a FileNotFoundException, while a local file
gave him a proper timestamp/length error. I'm guessing there's a bug here
w.r.t. handling HDFS paths.

On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah <hit...@apache.org> wrote:
> Hi Krishna,
>
> YARN downloads a specified local resource onto the container's node from the 
> URL specified. In all situations, the remote URL needs to be a fully 
> qualified path. To verify that the file at the remote URL is still valid, 
> YARN expects you to provide the length and last-modified timestamp of that 
> file.
>
> If you use an hdfs path such as hdfs://namenode:port/<absolute path to file>, 
> you will need to get the length and timestamp from HDFS.
> If you use file:///, the file should exist on all nodes, and all nodes should 
> have the file with the same length and timestamp for localization to work. 
> (For a single-node setup this works, but it is tougher to get right on a 
> multi-node setup; deploying the file via an RPM should likely work.)
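>
> For example, a minimal sketch of fetching them from HDFS (assuming conf
> points at your cluster and rsrc is the LocalResource being populated; the
> URI below is illustrative):
>
>     Path p = new Path("hdfs://namenode:8020/path/to/file");
>     FileStatus status = p.getFileSystem(conf).getFileStatus(p);
>     rsrc.setSize(status.getLen());
>     rsrc.setTimestamp(status.getModificationTime());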
>
> -- Hitesh
>
> On Aug 6, 2013, at 11:11 AM, Omkar Joshi wrote:
>
>> Hi,
>>
>> You need to match the timestamp. You should get the file's actual timestamp 
>> locally before adding it. This check is deliberate: it ensures the file is 
>> not updated after the user makes the call, which avoids subtle errors.
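>>
>> For a file:/// resource, a minimal sketch of doing that with java.io.File
>> (the path is the one from your mail):
>>
>>     File script = new File("/home_/dsadm/kishore/kk.ksh");
>>     shellRsrc.setTimestamp(script.lastModified()); // millis since the epoch
>>     shellRsrc.setSize(script.length());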
>>
>>
>> Thanks,
>> Omkar Joshi
>> Hortonworks Inc.
>>
>>
>> On Tue, Aug 6, 2013 at 5:25 AM, Krishna Kishore Bonagiri 
>> <write2kish...@gmail.com> wrote:
>> I tried the following and it works!
>> String shellScriptPath = "file:///home_/dsadm/kishore/kk.ksh";
>>
>> But now I am getting a timestamp error like the one below, because I passed 
>> 0 to setTimestamp():
>>
>> 13/08/06 08:23:48 INFO ApplicationMaster: Got container status for 
>> containerID= container_1375784329048_0017_01_000002, state=COMPLETE, 
>> exitStatus=-1000, diagnostics=Resource file:/home_/dsadm/kishore/kk.ksh 
>> changed on src filesystem (expected 0, was 1367580580000
>>
>>
>>
>>
>> On Tue, Aug 6, 2013 at 5:24 PM, Harsh J <ha...@cloudera.com> wrote:
>> Can you try passing a fully qualified local path? That is, one including 
>> the file:/ scheme.
>>
>> On Aug 6, 2013 4:05 PM, "Krishna Kishore Bonagiri" <write2kish...@gmail.com> 
>> wrote:
>> Hi Harsh,
>>    The setResource() call on LocalResource expects an argument of type 
>> org.apache.hadoop.yarn.api.records.URL, which is converted from a string in 
>> the form of a URI. This happens in the following call from the 
>> DistributedShell example:
>>
>> shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(shellScriptPath)));
>>
>> So, if I give a plain local path I get a parsing error like the one below, 
>> which is why I changed it to an HDFS file, thinking it had to be given that 
>> way. Could you please give an example of how else it could be used, i.e. 
>> with a local file as you suggest?
>>
>> 2013-08-06 06:23:12,942 WARN 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>>  Failed to parse resource-request
>> java.net.URISyntaxException: Expected scheme name at index 0: 
>> :///home_/dsadm/kishore/kk.ksh
>>         at java.net.URI$Parser.fail(URI.java:2820)
>>         at java.net.URI$Parser.failExpecting(URI.java:2826)
>>         at java.net.URI$Parser.parse(URI.java:3015)
>>         at java.net.URI.<init>(URI.java:747)
>>         at 
>> org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:77)
>>         at 
>> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
>>
>>
>>
>> On Tue, Aug 6, 2013 at 3:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> To be honest, I've never tried loading a HDFS file onto the
>> LocalResource this way. I usually just pass a local file and that
>> works just fine. There may be something in the URI transformation
>> possibly breaking a HDFS source, but try passing a local file - does
>> that fail too? The Shell example uses a local file.
>>
>> On Tue, Aug 6, 2013 at 10:54 AM, Krishna Kishore Bonagiri
>> <write2kish...@gmail.com> wrote:
>> > Hi Harsh,
>> >
>> >   Please see if this is useful; here is the stack trace I got after the
>> > error occurred:
>> >
>> > 2013-08-06 00:55:30,559 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
>> > to 
>> > /tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > =
>> > file:/tmp/nm-local-dir/usercache/dsadm/appcache/application_1375716148174_0004
>> > 2013-08-06 00:55:31,017 ERROR
>> > org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not
>> > exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,029 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>> > DEBUG: FAILED { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null }, File does
>> > not exist: hdfs://isredeng/kishore/kk.ksh
>> > 2013-08-06 00:55:31,031 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
>> > Resource hdfs://isredeng/kishore/kk.ksh transitioned from DOWNLOADING to
>> > FAILED
>> > 2013-08-06 00:55:31,034 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
>> > Container container_1375716148174_0004_01_000002 transitioned from
>> > LOCALIZING to LOCALIZATION_FAILED
>> > 2013-08-06 00:55:31,035 INFO
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
>> > Container container_1375716148174_0004_01_000002 sent RELEASE event on a
>> > resource request { hdfs://isredeng/kishore/kk.ksh, 0, FILE, null } not
>> > present in cache.
>> > 2013-08-06 00:55:31,036 WARN org.apache.hadoop.ipc.Client: interrupted
>> > waiting to send rpc request to server
>> > java.lang.InterruptedException
>> >         at
>> > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1290)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:229)
>> >         at java.util.concurrent.FutureTask.get(FutureTask.java:94)
>> >         at
>> > org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:930)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1285)
>> >         at org.apache.hadoop.ipc.Client.call(Client.java:1264)
>> >         at
>> > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>> >         at $Proxy22.heartbeat(Unknown Source)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:249)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:163)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:106)
>> >         at
>> > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:979)
>> >
>> >
>> >
>> > And here is my code snippet:
>> >
>> >       ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
>> >
>> >       ctx.setEnvironment(oshEnv);
>> >
>> >       // Set the local resources
>> >       Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
>> >
>> >       LocalResource shellRsrc = Records.newRecord(LocalResource.class);
>> >       shellRsrc.setType(LocalResourceType.FILE);
>> >       shellRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
>> >       String shellScriptPath = "hdfs://isredeng//kishore/kk.ksh";
>> >       try {
>> >         shellRsrc.setResource(ConverterUtils.getYarnUrlFromURI(new URI(shellScriptPath)));
>> >       } catch (URISyntaxException e) {
>> >         LOG.error("Error when trying to use shell script path specified"
>> >             + " in env, path=" + shellScriptPath);
>> >         e.printStackTrace();
>> >       }
>> >
>> >       shellRsrc.setTimestamp(0/*shellScriptPathTimestamp*/);
>> >       shellRsrc.setSize(0/*shellScriptPathLen*/);
>> >       String ExecShellStringPath = "ExecShellScript.sh";
>> >       localResources.put(ExecShellStringPath, shellRsrc);
>> >
>> >       ctx.setLocalResources(localResources);
>> >
>> >
>> > Please let me know if you need anything else.
>> >
>> > Thanks,
>> > Kishore
>> >
>> >
>> >
>> > On Tue, Aug 6, 2013 at 12:05 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> The detail is insufficient to answer why. You should also have gotten
>> >> a trace after it; can you post that? If possible, also post the relevant
>> >> snippets of code.
>> >>
>> >> On Mon, Aug 5, 2013 at 6:36 PM, Krishna Kishore Bonagiri
>> >> <write2kish...@gmail.com> wrote:
>> >> > Hi Harsh,
>> >> >  Thanks for the quick and detailed reply; it really helps. I am trying
>> >> > to use it and am getting this error in the NodeManager's log:
>> >> >
>> >> > 2013-08-05 08:57:28,867 ERROR
>> >> > org.apache.hadoop.security.UserGroupInformation:
>> >> > PriviledgedActionException
>> >> > as:dsadm (auth:SIMPLE) cause:java.io.FileNotFoundException: File does
>> >> > not
>> >> > exist: hdfs://isredeng/kishore/kk.ksh
>> >> >
>> >> >
>> >> > This file is present on the machine named "isredeng"; I can do an ls on
>> >> > it, as below:
>> >> >
>> >> > -bash-4.1$ hadoop fs -ls kishore/kk.ksh
>> >> > 13/08/05 09:01:03 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop
>> >> > library for your platform... using builtin-java classes where applicable
>> >> > Found 1 items
>> >> > -rw-r--r--   3 dsadm supergroup       1046 2013-08-05 08:48
>> >> > kishore/kk.ksh
>> >> >
>> >> > Note: I am using a single-node cluster.
>> >> >
>> >> > Thanks,
>> >> > Kishore
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Aug 5, 2013 at 3:00 PM, Harsh J <ha...@cloudera.com> wrote:
>> >> >>
>> >> >> The string for each LocalResource in the map can be anything that
>> >> >> serves as a common identifier name for your application. At execution
>> >> >> time, the passed resource filename will be aliased to the name you've
>> >> >> mapped it to, so that the application code need not track special
>> >> >> names. The behavior is very similar to how you can, in MR, define a
>> >> >> symlink name for a DistributedCache entry (e.g. foo.jar#bar.jar).
>> >> >>
>> >> >> For an example, check out the DistributedShell app sources.
>> >> >>
>> >> >> At [1], you can see we take a user-provided file path to a shell
>> >> >> script. This can be named anything, as it is user-supplied.
>> >> >> At [2], we define this as a local resource [2.1] and embed it with a
>> >> >> different name (the string you ask about) [2.2], defined at [3] as
>> >> >> an application-referenceable constant.
>> >> >> Note that in [4], we add to the container arguments the aliased name
>> >> >> we mapped it to (i.e., [3]) and not the original filename we received
>> >> >> from the user. The resource is placed on the container with the alias
>> >> >> instead, so that's what we choose to execute (see the sketch after the
>> >> >> links below).
>> >> >>
>> >> >> [1] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L390
>> >> >>
>> >> >> [2] - [2.1]
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L764
>> >> >> and [2.2]
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L780
>> >> >>
>> >> >> [3] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L205
>> >> >>
>> >> >> [4] -
>> >> >>
>> >> >> https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L791
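>> >> >>
>> >> >> As a tiny sketch of that aliasing (names borrowed from your snippet;
>> >> >> `commands` stands in for the container launch command list, so this is
>> >> >> illustrative, not the exact DistributedShell code):
>> >> >>
>> >> >>     // Map key = the name the file will have in the container's CWD,
>> >> >>     // regardless of the source filename:
>> >> >>     localResources.put("ExecShellScript.sh", shellRsrc);
>> >> >>     // The launch command refers to the alias, not the source path:
>> >> >>     commands.add("/bin/sh ./ExecShellScript.sh");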
>> >> >>
>> >> >> On Mon, Aug 5, 2013 at 2:44 PM, Krishna Kishore Bonagiri
>> >> >> <write2kish...@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >   Can someone please tell me what the use of calling
>> >> >> > setLocalResources() on ContainerLaunchContext is?
>> >> >> >
>> >> >> >   Also, an example of how to use it would help...
>> >> >> >
>> >> >> >   I couldn't guess what the String in the map that is passed to
>> >> >> > setLocalResources() is, as below:
>> >> >> >
>> >> >> >       // Set the local resources
>> >> >> >       Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kishore
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>>
>>
>



-- 
Harsh J
