Re: DistributedCache - getLocalCacheFiles method returns null
btw, just to let you know, I am running my job in pseudo-distributed mode.

Thanks,
Neeral

From: neeral beladia neeral_bela...@yahoo.com
To: common-user@hadoop.apache.org
Sent: Tue, May 31, 2011 10:00:00 PM
Subject: DistributedCache - getLocalCacheFiles method returns null

Hi,

I have a file on Amazon AWS under:

    s3n://Access Key:Secret Key@Bucket Name/file.txt

I want this file to be accessible by the slave nodes via the Distributed Cache. I put the following after the job configuration statements in the Driver program:

    DistributedCache.addCacheFile(new Path("s3n://Access Key:Secret Key@Bucket Name/file.txt").toUri(), job.getConfiguration());

Also, in the setup method of my mapper class, I have the statement below:

    Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());

cacheFiles is getting assigned null. Could you please let me know what I am doing wrong here? The file does exist on S3.

Thanks,
Neeral
copyToLocal (from Amazon AWS)
Hi,

I am not sure if this question has been asked before. It's more of a hadoop fs question. I am trying to execute the following hadoop fs command:

    hadoop fs -copyToLocal s3n://Access Key:Secret Key@bucket name/file.txt /home/hadoop/workspace/file.txt

When I execute this command directly from the terminal shell, it works perfectly fine. However, the same command does not execute when run from code. In fact, it says:

    Exception in thread main copyToLocal: null

Please note I am using Runtime.getRuntime().exec(cmdStr), where cmdStr is the above hadoop command.

Also, please note that the hadoop fs -cp and hadoop fs -rmr commands work fine when the source and destination are both Amazon AWS locations. In the above command (hadoop fs -copyToLocal) the destination is a local path on my machine (Ubuntu).

Your help would be greatly appreciated.

Thanks,
Neeral
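
One way to narrow down a failure like this is to launch the command with an explicit argument list and capture its exit code and combined output, rather than passing a single string to Runtime.exec (which splits the string on whitespace and does not go through a shell). The following is only a minimal sketch using the standard ProcessBuilder API; the class name CommandRunner and its Result type are hypothetical names introduced for illustration, and it does not claim to identify the cause of the error above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs an external command and collects what it prints.
class CommandRunner {

    // Exit code plus everything the process wrote to stdout and stderr.
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so error messages are not lost
        // and a single reader is enough to drain the process.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        // Output is fully drained at this point, so waiting cannot block on a full pipe.
        int exitCode = process.waitFor();
        return new Result(exitCode, output.toString());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java CommandRunner <command> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.println("exit code: " + result.exitCode);
        System.out.print(result.output);
    }
}
```

For the case above, the argument list would be "hadoop", "fs", "-copyToLocal", the s3n source URI, and /home/hadoop/workspace/file.txt, each as a separate element. Note that the child process inherits the environment of the JVM, which may differ from the interactive terminal's environment (PATH, Hadoop-related variables), so the captured output is worth comparing against what the terminal prints.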
DistributedCache - getLocalCacheFiles method returns null
Hi,

I have a file on Amazon AWS under:

    s3n://Access Key:Secret Key@Bucket Name/file.txt

I want this file to be accessible by the slave nodes via the Distributed Cache. I put the following after the job configuration statements in the Driver program:

    DistributedCache.addCacheFile(new Path("s3n://Access Key:Secret Key@Bucket Name/file.txt").toUri(), job.getConfiguration());

Also, in the setup method of my mapper class, I have the statement below:

    Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());

cacheFiles is getting assigned null. Could you please let me know what I am doing wrong here? The file does exist on S3.

Thanks,
Neeral
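
Since the cache file is registered through a URI with the credentials embedded in it, one thing that may be worth ruling out is whether that URI parses the way it is expected to. The sketch below uses only java.net.URI from the standard library; the class name S3nUriCheck and the key, secret, and bucket values are hypothetical placeholders, and this is a sanity check under those assumptions rather than a diagnosis of the null result.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: checks how a credential-bearing s3n URI is parsed.
class S3nUriCheck {

    // True when the URI parses with both user info (key:secret) and a host (bucket).
    static boolean hasCredentialsAndBucket(String uri) {
        try {
            URI parsed = new URI(uri);
            return parsed.getUserInfo() != null && parsed.getHost() != null;
        } catch (URISyntaxException e) {
            // Unparseable input, for example a URI containing raw spaces.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder values, not real credentials.
        String clean = "s3n://AKID:SECRET@my-bucket/file.txt";
        // A '/' inside the secret ends the authority part early,
        // so the bucket is no longer seen as the host.
        String slashInSecret = "s3n://AKID:abc/def@my-bucket/file.txt";

        System.out.println(clean + " -> " + hasCredentialsAndBucket(clean));
        System.out.println(slashInSecret + " -> " + hasCredentialsAndBucket(slashInSecret));
    }
}
```

If the check fails for the real URI, an alternative is to keep the credentials out of the URI and supply them through the fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey configuration properties instead.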