Hello,

On Tue, Apr 5, 2011 at 2:53 AM, W.P. McNeill <bill...@gmail.com> wrote:
> If I try:
>
>      storePath = FileOutputFormat.getPathForWorkFile(context, "my-file",
> ".seq");
>      writer = SequenceFile.createWriter(FileSystem.getLocal(configuration),
>            configuration, storePath, IntWritable.class, itemClass);
>      ...
>      reader = new SequenceFile.Reader(FileSystem.getLocal(configuration),
> storePath, configuration);
>
> I get an exception about a mismatch in file systems when trying to read from
> the file.
>
> Alternately if I try:
>
>      storePath = new Path(SequenceFileOutputFormat.getUniqueFile(context,
> "my-file", ".seq"));
>      writer = SequenceFile.createWriter(FileSystem.get(configuration),
>            configuration, storePath, IntWritable.class, itemClass);
>      ...
>      reader = new SequenceFile.Reader(FileSystem.getLocal(configuration),
> storePath, configuration);

FileOutputFormat.getPathForWorkFile will give back HDFS paths, so
handing one to a local FileSystem is what triggers the mismatch. And
since you are looking to create local temporary files to be used only
by the task itself, you shouldn't really need to worry about unique
filenames (stuff can go wrong).

You're looking for the tmp/ directory created on the local file system
of the node where the task is running (at ${mapred.child.tmp}, which
defaults to ./tmp). You can create a regular file there using vanilla
Java file APIs, or using RawLocalFileSystem with a Path you build
yourself (not derived via OutputFormat/etc.).
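
For the vanilla Java route, here is a rough, untested-on-a-cluster
sketch of the idea. It uses only the JDK, so the tmp directory is passed
in as a plain string; inside a real task you would read it from the
configuration key mapred.child.tmp instead. The class name
TaskLocalScratch, its helper methods and the file name "my-file.bin" are
made up for illustration and are not part of Hadoop.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps task-private scratch data in a local tmp directory.
class TaskLocalScratch {

    // Build a path under tmpDir, creating the directory if it does not exist yet.
    // Assumption: in a task, tmpDir would be the value of mapred.child.tmp.
    static Path scratchFile(String tmpDir, String name) throws IOException {
        Path dir = Paths.get(tmpDir);
        Files.createDirectories(dir);
        return dir.resolve(name);
    }

    // Write the values as fixed-width 4-byte ints, replacing any existing file.
    static void writeInts(Path file, int[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            for (int v : values) {
                out.writeInt(v);
            }
        }
    }

    // Read back every int in the file; the count is derived from the file size.
    static List<Integer> readInts(Path file) throws IOException {
        long count = Files.size(file) / Integer.BYTES;
        List<Integer> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            for (long i = 0; i < count; i++) {
                result.add(in.readInt());
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // "./tmp" mirrors the documented default of mapred.child.tmp.
        Path file = scratchFile("./tmp", "my-file.bin");
        writeInts(file, new int[] {1, 2, 3});
        System.out.println(readInts(file));
        Files.delete(file); // scratch data is only meant to live as long as the task
    }
}
```

Since the file never leaves the task's own working directory, a fixed
name is enough here and no OutputFormat helper is involved.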

>      storePath = new Path(context.getConf().get("mapred.child.tmp"),
> "my-file.seq");
>      writer = SequenceFile.createWriter(FileSystem.getLocal(configuration),
>            configuration, storePath, IntWritable.class, itemClass);
>      ...
>      reader = new SequenceFile.Reader(FileSystem.getLocal(configuration),
> storePath, configuration);

The above should work, I think (haven't tried, but the idea is to use
the mapred.child.tmp).

Also see: 
http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Directory+Structure

-- 
Harsh J
