Qiang, I couldn't find just now which one, but there is a JIRA issue about MultipleTextOutputFormat (especially when reducers = 0). If you have no reducers, try running with one or two; then you can see whether your problem is related to that issue.
Cheers,
Rasit

2009/2/25 ma qiang <maqiang1...@gmail.com>

> Thanks for your reply.
> If I increase the number of computers, can we solve this problem of
> running out of file descriptors?
>
> On Wed, Feb 25, 2009 at 11:07 AM, jason hadoop <jason.had...@gmail.com> wrote:
>
> > My first guess is that your application is running out of file
> > descriptors, possibly because your MultipleOutputFormat instance is
> > opening more output files than you expect. Opening lots of files in
> > HDFS is generally a quick route to bad job performance, if not job
> > failure.
> >
> > On Tue, Feb 24, 2009 at 6:58 PM, ma qiang <maqiang1...@gmail.com> wrote:
> >
> >> Hi all,
> >> I have a class that extends MultipleOutputFormat, as below:
> >>
> >> public class MyMultipleTextOutputFormat<K, V> extends
> >>         MultipleOutputFormat<K, V> {
> >>     private TextOutputFormat<K, V> theTextOutputFormat = null;
> >>
> >>     @Override
> >>     protected RecordWriter<K, V> getBaseRecordWriter(FileSystem fs,
> >>             JobConf job, String name, Progressable progress)
> >>             throws IOException {
> >>         if (theTextOutputFormat == null) {
> >>             theTextOutputFormat = new TextOutputFormat<K, V>();
> >>         }
> >>         return theTextOutputFormat.getRecordWriter(fs, job, name, progress);
> >>     }
> >>
> >>     @Override
> >>     protected String generateFileNameForKeyValue(K key, V value, String name) {
> >>         return name + "_" + key.toString();
> >>     }
> >> }
> >>
> >> I also call conf.setOutputFormat(MyMultipleTextOutputFormat.class) in my
> >> job configuration.
> >> But when the program runs, the following errors are printed:
> >>
> >> 09/02/25 10:22:32 INFO mapred.JobClient: Task Id :
> >> attempt_200902250959_0002_r_000001_0, Status : FAILED
> >> java.io.IOException: Could not read from stream
> >>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:119)
> >>     at java.io.DataInputStream.readByte(DataInputStream.java:248)
> >>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:325)
> >>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:346)
> >>     at org.apache.hadoop.io.Text.readString(Text.java:400)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2779)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2704)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> >>
> >> 09/02/25 10:22:42 INFO mapred.JobClient:  map 100% reduce 69%
> >> 09/02/25 10:22:55 INFO mapred.JobClient:  map 100% reduce 0%
> >> 09/02/25 10:22:55 INFO mapred.JobClient: Task Id :
> >> attempt_200902250959_0002_r_000000_1, Status : FAILED
> >> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> >> /user/qiang/output/_temporary/_attempt_200902250959_0002_r_000000_1/part-00000_t0x5y3
> >> could only be replicated to 0 nodes, instead of 1
> >>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
> >>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
> >>     at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
> >>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
> >>
> >>     at org.apache.hadoop.ipc.Client.call(Client.java:696)
> >>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> >>     at $Proxy1.addBlock(Unknown Source)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>     at java.lang.reflect.Method.invoke(Method.java:597)
> >>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> >>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> >>     at $Proxy1.addBlock(Unknown Source)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2815)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2697)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> >>
> >> Of course, the program runs successfully without MyMultipleTextOutputFormat.
> >> Who can help me solve this problem?
> >> Thanks.
> >>
> >> Yours, Qiang

--
M. Raşit ÖZDAŞ
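[Editor's note: Jason's file-descriptor theory fits the failure mode here. Because generateFileNameForKeyValue returns name + "_" + key, each reducer opens one HDFS output file per distinct key, so with many keys the reducer (and the DataNodes serving it) can exhaust open-file limits. A common workaround is to hash keys into a bounded number of output files. The sketch below shows only the bucketing logic in plain Java, under the assumption that its result would be returned from generateFileNameForKeyValue; the class name BucketedNames, the helper bucketFileName, and the bucket count 16 are illustrative, not part of Hadoop's API.]

```java
// Sketch: bound the number of output files per reducer by hashing each
// key into one of a fixed number of buckets, instead of one file per
// distinct key. All names here are illustrative.
public class BucketedNames {

    // Map an arbitrary key to one of numBuckets stable file names.
    // Masking with Integer.MAX_VALUE keeps the hash non-negative.
    static String bucketFileName(String name, String key, int numBuckets) {
        int bucket = (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
        return name + "_" + bucket;
    }

    public static void main(String[] args) {
        // Many distinct keys, but at most 16 distinct output names,
        // so at most 16 files open per reducer.
        java.util.Set<String> names = new java.util.HashSet<String>();
        for (int i = 0; i < 10000; i++) {
            names.add(bucketFileName("part-00000", "key" + i, 16));
        }
        System.out.println("distinct output files: " + names.size());
    }
}
```

The trade-off is that records for different keys end up interleaved in the same file, so a downstream consumer must still group by key; but the open-file count per reducer drops from "number of distinct keys" to the bucket count.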