Right, please use FileSystem#append

From: Stanley Shi [mailto:s...@pivotal.io]
Sent: Thursday, August 28, 2014 2:18 PM
To: user@hadoop.apache.org
Subject: Re: Appending to HDFS file

You should not use this method:

    FSDataOutputStream fp = fs.create(pt, true)

Here's the Javadoc for this "create" method:

    /**
     * Create an FSDataOutputStream at the indicated Path.
     * @param f the file to create
     * @param overwrite if a file with this name already exists, then if true,
     *   the file will be overwritten, and if false an exception will be thrown.
     */
    public FSDataOutputStream create(Path f, boolean overwrite)
        throws IOException {
      return create(f, overwrite,
                    getConf().getInt("io.file.buffer.size", 4096),
                    getDefaultReplication(f),
                    getDefaultBlockSize(f));
    }

On Wed, Aug 27, 2014 at 2:12 PM, rab ra <rab...@gmail.com> wrote:

Hello,

Here is the code snippet I use to append:

    def outFile = "${outputFile}.txt"
    Path pt = new Path("${hdfsName}/${dir}/${outFile}")
    def fs = org.apache.hadoop.fs.FileSystem.get(configuration);
    FSDataOutputStream fp = fs.create(pt, true)
    fp << "${key} ${value}\n"

On 27 Aug 2014 09:46, "Stanley Shi" <s...@pivotal.io> wrote:

Would you please paste the code in the loop?

On Sat, Aug 23, 2014 at 2:47 PM, rab ra <rab...@gmail.com> wrote:

Hi,

By default, it is true in Hadoop 2.4.1. Nevertheless, I have set it to true explicitly in hdfs-site.xml. Still, I am not able to achieve append.

Regards

On 23 Aug 2014 11:20, "Jagat Singh" <jagatsi...@gmail.com> wrote:

What is the value of dfs.support.append in hdfs-site.xml?

https://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

On Sat, Aug 23, 2014 at 1:41 AM, rab ra <rab...@gmail.com> wrote:

Hello,

I am currently using Hadoop 2.4.1. I am running an MR job using the Hadoop streaming utility. The executable needs to write a large amount of information to a file. However, this write is not done in a single attempt: the file needs to be appended with the streams of information as they are generated. In the code, inside a loop, I open a file in HDFS and append some information.
This is not working and I see only the last write. How do I accomplish an append operation in Hadoop? Can anyone share a pointer with me?

Regards,
Bala

--
Regards,
Stanley Shi
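
The symptom reported in this thread ("I see only the last write") is what one would expect when a file is re-created with overwrite enabled on every loop iteration. The sketch below reproduces that behaviour on the local filesystem with plain java.nio, purely as an analogy: it does not use the Hadoop API, and the class name AppendDemo, its two helper methods, and the sample records are hypothetical names chosen for illustration. On HDFS the corresponding pair would be the fs.create(pt, true) call from the snippet above and the FileSystem#append call suggested in the reply.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class AppendDemo {

    // Analogue of calling create(path, overwrite = true) inside the loop:
    // every iteration truncates the file, so only the last record survives.
    static void writeOverwriting(Path file, List<String> records) throws IOException {
        for (String record : records) {
            Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        }
    }

    // Analogue of opening the file for append inside the loop:
    // every iteration adds its record after the existing content.
    static void writeAppending(Path file, List<String> records) throws IOException {
        for (String record : records) {
            Files.write(file, (record + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> records = List.of("k1 v1", "k2 v2", "k3 v3");
        Path file = Files.createTempFile("append-demo", ".txt");

        writeOverwriting(file, records);
        System.out.println("overwrite: " + Files.readAllLines(file)); // overwrite: [k3 v3]

        Files.delete(file); // start from a clean file for the second run
        writeAppending(file, records);
        System.out.println("append:    " + Files.readAllLines(file)); // append:    [k1 v1, k2 v2, k3 v3]

        Files.delete(file);
    }
}
```

The only difference between the two helpers is the open mode, which is the same distinction the reply draws between create with overwrite and append.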