Hi,
I have run into a problem with the DistributedCache.

There is a file called 'model.txt' that I want every mapper to be able to
read, because it contains data shared by all of them.
So I am using the DistributedCache.

1. In main():
DistributedCache.createSymlink(conf);
String path = "/user/Li/model/model.txt";
Path filePath = new Path(path);
String uriWithLink = filePath.toUri().toString() + "#" + "model.txt";
System.out.println(uriWithLink);
DistributedCache.addCacheFile(new URI(uriWithLink), conf);
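
For reference, the part after "#" in the cache URI is the URI fragment, which the DistributedCache uses as the symlink name in the task's working directory. The sketch below only uses java.net.URI to show how that string splits; the class name CacheUriParts and the method split are hypothetical and not part of Hadoop.

```java
import java.net.URI;

// Hypothetical helper (not a Hadoop class): shows how the cache URI
// built in main() separates into an HDFS path and a link name.
class CacheUriParts {

    // Returns { path, fragment } for a URI such as "/a/b/model.txt#model.txt".
    static String[] split(String uriWithLink) {
        URI uri = URI.create(uriWithLink);
        return new String[] { uri.getPath(), uri.getFragment() };
    }

    public static void main(String[] args) {
        String[] parts = split("/user/Li/model/model.txt#model.txt");
        System.out.println("HDFS path: " + parts[0]);
        System.out.println("link name: " + parts[1]);
    }
}
```

So the mapper should expect the link under the name "model.txt", which matches the name passed to FileReader in step 2.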

2. In the Mapper:
protected void setup(Context context) throws IOException, InterruptedException {
  System.out.println("Now, use the distributed cache and syslink");
  try {
    // "model.txt" is the symlink name given after "#" in the cache URI
    FileReader reader = new FileReader("model.txt");
    BufferedReader br = new BufferedReader(reader);
    String s1 = null;
    while ((s1 = br.readLine()) != null) {
      System.out.println(s1);
    }
    br.close();
    reader.close();
  } catch (Exception e) {
    e.printStackTrace();
  }
}
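
A check along the following lines could help narrow the failure down before the file is opened. It is only a sketch using plain java.io; the class CacheFileCheck and the method describe are hypothetical names, not Hadoop APIs. One possibility worth ruling out (an assumption, the log does not confirm it) is that "model.txt" exists in the working directory but is a directory or an unreadable entry rather than a regular file.

```java
import java.io.File;

// Hypothetical diagnostic (not part of Hadoop): reports what a local
// path looks like from the task's current working directory.
class CacheFileCheck {

    static String describe(String name) {
        File f = new File(name);
        if (!f.exists()) {
            return "missing: " + f.getAbsolutePath();
        }
        if (f.isDirectory()) {
            return "directory: " + f.getAbsolutePath();
        }
        if (!f.canRead()) {
            return "unreadable file: " + f.getAbsolutePath();
        }
        return "readable file: " + f.getAbsolutePath();
    }

    public static void main(String[] args) {
        // In the mapper this call would go at the top of setup().
        System.out.println(describe("model.txt"));
    }
}
```

Printing describe("model.txt") in setup() would show in the task logs which of the four cases applies, together with the absolute path the task is actually looking at.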

3. When I run it, the task logs show:
java.io.FileNotFoundException: model.txt (Access is denied.)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at java.io.FileInputStream.<init>(FileInputStream.java:79)
at java.io.FileReader.<init>(FileReader.java:41)
at NB.NBClusterTrain.UseDistributedCacheBySymbolicLink(NBClusterTrain.java:24)
at NB.NBClusterTrain$NBClusterTrainMapper.setup(NBClusterTrain.java:45)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)

4. When I run /bin/hadoop fs -cat /user/Li/model/model.txt, the file can be read.

What do you think I can do?
Thank you!
