Yes, my HDFS paths are of the form /home/user-name/, and I have used such paths successfully with DistributedCache's addCacheFile method.
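
For reference, the file case that works for me looks roughly like this (the file name is just a placeholder, and conf is the job's Configuration):

// Sketch of the plain-file case; the file name below is only a placeholder.
DistributedCache.addCacheFile(new URI("/home/akhil1988/some-input.txt"), conf);
DistributedCache.createSymlink(conf);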
Thanks,
Akhil


Amareshwari Sriramadasu wrote:
>
> Is your HDFS path /home/akhil1988/Config.zip? Usually an HDFS path is of the
> form /user/akhil1988/Config.zip.
> Just wondering if you are giving the wrong path in the URI!
>
> Thanks
> Amareshwari
>
> akhil1988 wrote:
>> Thanks Amareshwari for your reply!
>>
>> The file Config.zip does exist in HDFS; if it did not, the error would have
>> been reported by the JobTracker itself while executing the statement:
>> DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
>> conf);
>>
>> But I get the error in the map function when I try to access the Config
>> directory.
>>
>> Now I am using the following statement but still getting the same error:
>> DistributedCache.addCacheArchive(new
>> URI("/home/akhil1988/Config.zip#Config"), conf);
>>
>> Do you think there could be any problem with distributing a zipped
>> directory and having Hadoop unzip it recursively?
>>
>> Thanks!
>> Akhil
>>
>>
>> Amareshwari Sriramadasu wrote:
>>
>>> Hi Akhil,
>>>
>>> DistributedCache.addCacheArchive takes a path on HDFS. From your code, it
>>> looks like you are passing a local path.
>>> Also, if you want to create a symlink, you should pass the URI as
>>> hdfs://<path>#<linkname>, besides calling
>>> DistributedCache.createSymlink(conf);
>>>
>>> Thanks
>>> Amareshwari
>>>
>>>
>>> akhil1988 wrote:
>>>
>>>> Please ask any questions if I am not clear above about the problem I am
>>>> facing.
>>>>
>>>> Thanks,
>>>> Akhil
>>>>
>>>> akhil1988 wrote:
>>>>
>>>>> Hi All!
>>>>>
>>>>> I want a directory to be present in the local working directory of the
>>>>> task, for which I am using the following statements:
>>>>>
>>>>> DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
>>>>> conf);
>>>>> DistributedCache.createSymlink(conf);
>>>>>
>>>>> Here Config is a directory which I have zipped and put at the given
>>>>> location in HDFS.
>>>>>
>>>>> I have zipped the directory because the API doc of DistributedCache
>>>>> (http://hadoop.apache.org/core/docs/r0.20.0/api/index.html) says that
>>>>> the archive files are unzipped in the local cache directory:
>>>>>
>>>>> DistributedCache can be used to distribute simple, read-only data/text
>>>>> files and/or more complex types such as archives, jars etc. Archives
>>>>> (zip, tar and tgz/tar.gz files) are un-archived at the slave nodes.
>>>>>
>>>>> So, from my understanding of the API docs, I expect that the Config.zip
>>>>> file will be unzipped to a Config directory, and since I have symlinked
>>>>> them I can access the directory in the following manner from my map
>>>>> function:
>>>>>
>>>>> FileInputStream fin = new FileInputStream("Config/file1.config");
>>>>>
>>>>> But I get a FileNotFoundException on the execution of this statement.
>>>>> Please let me know where I am going wrong.
>>>>>
>>>>> Thanks,
>>>>> Akhil
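
P.S. To make the discussion concrete, here is a minimal sketch of the whole pattern as I understand Amareshwari's suggestions. It assumes the archive sits at /user/akhil1988/Config.zip on HDFS (the /user/... form suggested above); the class names and the namenode host/port are placeholders, not my actual setup.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ConfigCacheSketch {

    // Driver side: register the archive on the job Configuration before submitting.
    // "namenode:9000" is a placeholder for the cluster's fs.default.name, and the
    // /user/akhil1988 path is the form suggested earlier in this thread.
    public static void addConfigArchive(Configuration conf) throws IOException {
        try {
            // The "#Config" fragment names the symlink that should appear in the
            // task's current working directory once the archive is unpacked.
            DistributedCache.addCacheArchive(
                new URI("hdfs://namenode:9000/user/akhil1988/Config.zip#Config"), conf);
            DistributedCache.createSymlink(conf);
        } catch (URISyntaxException e) {
            throw new IOException("Bad cache archive URI: " + e.getMessage());
        }
    }

    // Task side: the unpacked archive is reachable through the "Config" symlink,
    // so its contents can be opened like ordinary local files.
    public static class ConfigMapper extends Mapper<LongWritable, Text, Text, Text> {
        private String firstConfigLine;

        @Override
        protected void setup(Context context) throws IOException {
            // Same access pattern as in my map function, wrapped in a reader.
            FileInputStream fin = new FileInputStream("Config/file1.config");
            BufferedReader reader = new BufferedReader(new InputStreamReader(fin));
            try {
                firstConfigLine = reader.readLine();
            } finally {
                reader.close();
            }
        }
    }
}

If I understand correctly, the key points are passing a full HDFS URI with the #Config fragment and calling DistributedCache.createSymlink(conf) before the job is submitted.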