Hi Marcin,
Did you solve this error? I stumbled into the same thing, and I have no NFS
involved either...
Johannes
> Hi there,
>
> I've got a simple Map Reduce application that works perfectly when I use
> NFS as an underlying filesystem (not using HDFS at all).
> I've got a working HDFS configuration as well - grep example works for
> me with this configuration.
>
> However, when I try to run the same application on HDFS instead of NFS, I
> keep receiving an "IOException: Filesystem closed" exception and the job
> fails.
> I've spent a day searching for a solution with Google and scanning through
> old archives, but no results so far...
>
> Job summary is:
> --->output
> 10/05/26 17:29:13 INFO mapred.JobClient: Job complete: job_201005261710_0002
> 10/05/26 17:29:13 INFO mapred.JobClient: Counters: 4
> 10/05/26 17:29:13 INFO mapred.JobClient: Job Counters
> 10/05/26 17:29:13 INFO mapred.JobClient: Rack-local map tasks=12
> 10/05/26 17:29:13 INFO mapred.JobClient: Launched map tasks=16
> 10/05/26 17:29:13 INFO mapred.JobClient: Data-local map tasks=4
> 10/05/26 17:29:13 INFO mapred.JobClient: Failed map tasks=1
>
> Each map task attempt's log reads something like:
> --->attempt_201005261710_0001_m_000000_3/syslog:
> 2010-05-26 17:13:47,297 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=MAP, sessionId=
> 2010-05-26 17:13:47,470 INFO org.apache.hadoop.mapred.MapTask:
> io.sort.mb = 100
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: data
> buffer = 79691776/99614720
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: record
> buffer = 262144/327680
> 2010-05-26 17:13:47,712 INFO org.apache.hadoop.mapred.MapTask: Starting
> flush of map output
> 2010-05-26 17:13:47,784 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 0
> 2010-05-26 17:13:47,788 INFO org.apache.hadoop.mapred.TaskRunner:
> Task:attempt_201005261710_0001_m_000000_3 is done. And is in the process
> of commiting
> 2010-05-26 17:13:47,797 WARN org.apache.hadoop.mapred.TaskTracker: Error
> running child
> *java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
> at
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.needsTaskCommit(FileOutputCommitter.java:217)
> at org.apache.hadoop.mapred.Task.done(Task.java:671)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)*
> 2010-05-26 17:13:47,802 INFO org.apache.hadoop.mapred.TaskRunner:
> Runnning cleanup for the task
> 2010-05-26 17:13:47,802 WARN
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Error
> discarding output
> *java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:580)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
> at
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.abortTask(FileOutputCommitter.java:179)
> at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:815)
> at org.apache.hadoop.mapred.Child.main(Child.java:191)*
>
> No reduce tasks are run, as the map tasks haven't managed to save their
> output.
>
> These exceptions are visible in the JobTracker's log as well. What is the
> reason for this exception? Is it critical? (I guess it is, but it's listed
> in the JobTracker's log as INFO, not ERROR.)
>
> My config (I'm not sure which directories should be local and which should
> be located on HDFS; maybe the issue is somewhere here?):
>
> ---->core-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://blade02:5432/</value>
>   </property>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/tmp/hadoop/tmp</value> <!-- local -->
>   </property>
> </configuration>
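>
> For reference, a minimal sketch of checking which filesystem this config
> actually resolves to from code (assuming the same core-site.xml is on the
> classpath; the class name and the path below are just examples):
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class FsCheck {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();   // picks up core-site.xml from the classpath
>         FileSystem fs = FileSystem.get(conf);       // should resolve to hdfs://blade02:5432/
>         System.out.println("default FS: " + fs.getUri());
>         System.out.println("/ exists: " + fs.exists(new Path("/")));
>         // note: FileSystem.get() returns a cached instance shared per URI,
>         // so it is deliberately not closed here
>     }
> }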
>
> ---->hdfs-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>dfs.replication</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/tmp/hadoop/name2</value> <!-- local dir for the namenode metadata -->
>   </property>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/tmp/hadoop/data</value> <!-- local dir where HDFS blocks are stored -->
>   </property>
> </configuration>
>
> ---->mapred-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>blade02:5435</value>
>   </property>
>   <property>
>     <name>mapred.temp.dir</name>
>     <value>mapred_tmp</value> <!-- on HDFS, I suppose -->
>   </property>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>system</value> <!-- on HDFS, I suppose -->
>   </property>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/tmp/hadoop/local</value> <!-- local -->
>   </property>
>   <property>
>     <name>mapred.task.tracker.http.address</name>
>     <value>0.0.0.0:0</value>
>   </property>
>   <property>
>     <name>mapred.textoutputformat.separator</name>
>     <value>,</value>
>   </property>
> </configuration>
>
> I'm using Hadoop 0.20.2 (new API -> org.apache.hadoop.mapreduce.*,
> default OutputFormat and RecordWriter), running on a 3-node cluster
> (blade02, blade03, blade04). blade02 is the master; all of them are
> slaves. My OS: Linux blade02 2.6.9-42.0.2.ELsmp #1 SMP Tue Aug 22
> 17:26:55 CDT 2006 i686 i686 i386 GNU/Linux.
>
> Note that there are currently 3 filesystems in my configuration:
> /tmp/* - the local fs on each node
> /home/* - NFS, common to all nodes - this is where Hadoop is installed
> hdfs://blade02:5432/* - HDFS
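>
> To make the split concrete, my understanding of path resolution is roughly
> this (a minimal sketch; the class is hypothetical and just for illustration,
> the paths are taken from the configs above):
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
>
> public class PathCheck {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         String[] examples = {
>             "mapred_tmp",                  // no scheme -> resolved on the default FS (HDFS here)
>             "file:///tmp/hadoop/local",    // explicitly the local filesystem
>             "hdfs://blade02:5432/system"   // fully qualified HDFS path
>         };
>         for (String s : examples) {
>             Path p = new Path(s);
>             System.out.println(s + " -> " + p.getFileSystem(conf).getUri());
>         }
>     }
> }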
>
> I'm not sure if this is relevant, but the intermediate (key, value) pair
> is of type (Text, TermVector), and the TermVector Writable methods are
> implemented like this:
> public class TermVector implements Writable {
>     private Map<Text, IntWritable> vec = new HashMap<Text, IntWritable>();
>
>     @Override
>     public void write(DataOutput out) throws IOException {
>         out.writeInt(vec.size());
>         for (Map.Entry<Text, IntWritable> e : vec.entrySet()) {
>             e.getKey().write(out);
>             e.getValue().write(out);
>         }
>     }
>
>     @Override
>     public void readFields(DataInput in) throws IOException {
>         vec.clear(); // reset first - Hadoop reuses Writable instances between records
>         int n = in.readInt();
>         for (int i = 0; i < n; ++i) {
>             Text t = new Text();
>             t.readFields(in);
>             IntWritable iw = new IntWritable();
>             iw.readFields(in);
>             vec.put(t, iw);
>         }
>     }
>     ...
> }
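>
> For completeness, the job setup is roughly along these lines (a minimal
> sketch, not the exact code - the driver and mapper class names are
> placeholders, only TermVector above is real):
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>
> public class TermVectorDriver {                      // placeholder name
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         Job job = new Job(conf, "term vectors");     // 0.20.x new-API Job
>         job.setJarByClass(TermVectorDriver.class);
>         job.setMapperClass(TermVectorMapper.class);  // placeholder mapper emitting (Text, TermVector)
>         job.setMapOutputKeyClass(Text.class);
>         job.setMapOutputValueClass(TermVector.class);
>         // reducer setup omitted in this sketch;
>         // default OutputFormat/RecordWriter, i.e. TextOutputFormat
>         // with the "," separator from mapred-site.xml
>         FileInputFormat.addInputPath(job, new Path(args[0]));
>         FileOutputFormat.setOutputPath(job, new Path(args[1]));
>         System.exit(job.waitForCompletion(true) ? 0 : 1);
>     }
> }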
>
> Any help appreciated.
>
> Many thanks,
> Marcin Sieniek
>
>