Hi All,
I am using DistributedCache.addCacheArchive() to distribute a tar file to
the tasktrackers using the following statement:
DistributedCache.addCacheArchive(new URI("/home/akhil1988/sample.tar"),
conf);
According to the documentation it should get unarchived at the tasktrackers.
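[A minimal sketch of the surrounding driver, assuming the 0.20-era API; the
job setup details are placeholders. Per the DistributedCache javadoc, zip,
tar and tgz/tar.gz archives are un-archived on the tasktracker side.]

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class TarCacheDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TarCacheDriver.class);
        // The archive must already sit in HDFS; it is unpacked on each
        // tasktracker before the tasks start.
        DistributedCache.addCacheArchive(
                new URI("/home/akhil1988/sample.tar"), conf);
        // ... configure the mapper and input/output paths, then submit.
    }
}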
>
> Hope this helps,
>
> Chris
>
> On Thu, Jun 25, 2009 at 4:50 PM, akhil1988 wrote:
>
>>
>> Please ask any questions if I have not been clear above about the
>> problem I am facing.
>>
>> Thanks,
>> Akhil
>>
Yes, my HDFS paths are of the form /home/user-name/
And I have used these successfully in DistributedCache's addCacheFile
method.
Thanks,
Akhil
Amareshwari Sriramadasu wrote:
>
> Is your hdfs path /home/akhil1988/Config.zip? Usually hdfs path is of the
> form /user/akhil1988.
Thanks Amareshwari for your reply!
The file Config.zip is present in HDFS; if it were not, then the error
would have been reported by the jobtracker itself while executing the
statement:
DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
conf);
But I get an error.
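[As a sanity check, a minimal sketch using the stock FileSystem API
(nothing thread-specific) that verifies the archive really is visible
under the default filesystem the job will use:]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckCachePath {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Prints true only if the default fs (HDFS here) has the file.
        System.out.println(fs.exists(new Path("/home/akhil1988/Config.zip")));
    }
}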
Please ask any questions if I am not clear above about the problem I am
facing.
Thanks,
Akhil
Hi All!
I want a directory to be present in the local working directory of the task
for which I am using the following statements:
DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
conf);
DistributedCache.createSymlink(conf);
Here Config is a directory.
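[For reference, a minimal sketch of this pattern, assuming the 0.20-era
API: the fragment after '#' is what names the symlink that shows up in the
task's working directory.]

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class SymlinkedConfigDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SymlinkedConfigDriver.class);
        // "#Config" makes the unpacked archive appear as ./Config in
        // each task's current working directory.
        DistributedCache.addCacheArchive(
                new URI("/home/akhil1988/Config.zip#Config"), conf);
        DistributedCache.createSymlink(conf);
        // Tasks can then use relative paths such as "Config/allLayer1.config".
    }
}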
> task has removed it, and I am guessing it is only created at cluster
> start time.
>
> On Mon, Jun 22, 2009 at 6:19 PM, akhil1988 wrote:
>
>>
>> Hi All!
>>
>> I have been running Hadoop jobs through my user account on a cluster, for
>> a
>> while now.
hadoop jar wordcount_classes_dir.jar
org.uiuc.upcrc.extClasses.WordCount /home/akhil1988/input
/home/akhil1988/output
09/06/22 19:19:01 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
org.apache.hadoop.ipc.RemoteException: java.io.FileNotFoundException
One thing that I would like to ask you: can we use the DistributedCache
for transferring directories to the local cache of the tasks?
Thanks,
Akhil
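[A minimal sketch of the usual answer, assuming the 0.20-era API:
directories are not shipped directly; the directory is packed into an
archive first (Data.zip here is hypothetical, e.g. built with
"zip -r Data.zip Data" and copied into HDFS), and the archive is unpacked
back into a directory on each tasktracker.]

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class ShipDirectoryDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ShipDirectoryDriver.class);
        // Ship the zipped directory; it is unpacked on the tasktracker,
        // and "#Data" symlinks it into the task's working directory.
        DistributedCache.addCacheArchive(
                new URI("/home/akhil1988/Ner/OriginalNer/Data.zip#Data"), conf);
        DistributedCache.createSymlink(conf);
    }
}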
akhil1988 wrote:
>
> Hi Jason!
>
> Thanks for going with me to solve my problem.
>
> To restate things and make it more clear, here is what I am doing:
> DistributedCache.addCacheFile(new
> URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);
> DistributedCache.addCacheFile(new
> URI("/home/akhil1988/Ner/OriginalNer/Config/"), conf);
> DistributedCache.createSymlink(conf);
The program executes till the same point as before.
DistributedCache.addCacheFile(new
URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf); Data is a directory
which contains some text as well as some binary files. In the statement
Parameters.readConfigAndLoadExternalData("Config/allLayer1.config"); I can
see (in the output messages) that
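[On the task side, a minimal sketch of how a 0.20-era mapper can find the
tasktracker-local copies of whatever was shipped with addCacheFile(); the
Mapper generics and class name are illustrative.]

import java.io.IOException;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class CachedFileMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private Path[] localFiles;

    public void configure(JobConf job) {
        try {
            // Local filesystem paths of everything added via addCacheFile().
            localFiles = DistributedCache.getLocalCacheFiles(job);
        } catch (IOException e) {
            throw new RuntimeException("could not resolve cache paths", e);
        }
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // localFiles now points at the local copies; read them as needed.
    }
}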
akhil1988 wrote:
>
> Thank you Jason for your reply.
>
> My Map class is a static inner class. Here is the structure of my code.
>
> public class NerTagger {
>
> public static class Map extends MapReduceBase
conf.set("mapred.job.tracker", "local");
conf.set("fs.default.name", "file:///");
DistributedCache.addCacheFile(new
URI("/home/akhil1988/Ner/OriginalNer/Data/"), conf);
DistributedCache.addCacheFile(new
URI("/home/akhil1988/Ner/OriginalNer/Config/"), conf);
Hi All,
I am running my mapred program in local mode by setting mapred.job.tracker
to "local" so that I can debug my code.
The mapred program is a direct port of my original sequential code; there
is no reduce phase.
Basically, I have just put my program in the map class.
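[A minimal sketch of that local-mode setup, assuming the 0.20-era JobConf
API; the class name is illustrative, and the missing reduce phase is
expressed as zero reduce tasks.]

import org.apache.hadoop.mapred.JobConf;

public class LocalDebugDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(LocalDebugDriver.class);
        conf.set("mapred.job.tracker", "local"); // single-JVM LocalJobRunner
        conf.set("fs.default.name", "file:///"); // inputs come from local disk
        conf.setNumReduceTasks(0);               // map-only job, no reduce phase
        // ... set the mapper class and input/output paths, then submit.
    }
}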
Can anyone help me with this issue? I have an account on the cluster, but I
cannot go and start each server process on every tasktracker myself.
Akhil
Hi All,
I am porting a machine learning application on Hadoop using MapReduce. The
architecture of the application goes like this:
1. Run a number of server processes which take around 2-3 minutes to start
and then remain as daemons waiting for a client to request a connection.
During the startup
I wish to give the path of a jar file as an argument when executing the
"hadoop jar . " command, as my mapper uses that jar file for its
operation. I found that the -libjars option can be used, but for me it is
not working; it is giving an exception. Can anyone tell me how to use the
-libjars generic option?
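[The warning earlier in this thread ("Applications should implement Tool")
points at the usual cause: -libjars is a generic option, and it is only
parsed when the driver goes through ToolRunner/GenericOptionsParser. A
minimal sketch, with illustrative names:]

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountDriver extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        // getConf() already carries whatever -libjars, -D, -files set up.
        JobConf conf = new JobConf(getConf(), WordCountDriver.class);
        // ... set the mapper class and input/output paths from args here.
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before run() sees the
        // remaining arguments.
        System.exit(ToolRunner.run(new WordCountDriver(), args));
    }
}

[Invocation would then look like this, with the generic options placed
before the program's own arguments (wordcount.jar and mylib.jar are
placeholders): hadoop jar wordcount.jar WordCountDriver -libjars mylib.jar
/home/akhil1988/input /home/akhil1988/output]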
Hi!
I am working on applying the WordCount example to the entire Wikipedia
dump. The entire English wikipedia is around 200GB, which I have stored in
HDFS on a cluster to which I have access.
The problem: the Wikipedia dump contains many directories (it has a very
big directory structure) containing HTML files.
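[One wrinkle worth flagging for a deep directory tree: as far as I recall,
the 0.20-era FileInputFormat does not recurse into subdirectories on its
own, so the inputs have to be enumerated. A minimal sketch; the class name
and the /home/akhil1988/wikipedia path are hypothetical.]

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class RecursiveInputs {
    // Walk the tree under 'dir' and register every plain file as an input.
    static void addInputsRecursively(FileSystem fs, Path dir, JobConf conf)
            throws IOException {
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isDir()) {
                addInputsRecursively(fs, status.getPath(), conf);
            } else {
                FileInputFormat.addInputPath(conf, status.getPath());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(RecursiveInputs.class);
        FileSystem fs = FileSystem.get(conf);
        addInputsRecursively(fs, new Path("/home/akhil1988/wikipedia"), conf);
        // ... set the mapper/reducer and the output path, then submit.
    }
}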