Can't we use getmerge here? If your requirement is to merge the files in a particular directory into a single file:
hadoop fs -getmerge <dir_of_input_files> <mergedsinglefile>

--Senthil

-----Original Message-----
From: Giovanni Mascari [mailto:giovanni.masc...@polito.it]
Sent: Thursday, November 03, 2016 7:24 PM
To: Piyush Mukati <piyush.muk...@gmail.com>; user@hadoop.apache.org
Subject: Re: merging small files in HDFS

Hi,
if I understand your request correctly, you only need to merge some data resulting from an HDFS write operation. In that case, I suppose your best option is to use Hadoop Streaming with the 'cat' command.

Take a look here: https://hadoop.apache.org/docs/r1.2.1/streaming.html

Regards

On 03/11/2016 13:53, Piyush Mukati wrote:
> Hi,
> I want to merge multiple files in one HDFS dir into one file. I am
> planning to write a map-only job using an input format that creates
> only one InputSplit per dir.
> This way my job doesn't need to do any shuffle/sort (only read and
> write back to disk). Is there any such input format already
> implemented? Or is there a better solution for the problem?
>
> thanks.
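For reference, a minimal sketch of both suggestions from the thread. All paths here are hypothetical, and the streaming jar location varies by Hadoop distribution:

```shell
# Approach 1: getmerge -- concatenates every file under an HDFS
# directory into a single file on the LOCAL filesystem.
hadoop fs -getmerge /data/small-files /tmp/merged.txt

# If the merged result is needed back in HDFS, upload it again:
hadoop fs -put /tmp/merged.txt /data/merged.txt

# Approach 2: Hadoop Streaming with 'cat' as both mapper and reducer,
# so the merge runs on the cluster instead of pulling data to the
# client. A single reducer forces a single output part file; note the
# shuffle will reorder lines by their sort key, so this is only
# suitable when line order does not matter.
hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -D mapreduce.job.reduces=1 \
  -input  /data/small-files \
  -output /data/merged \
  -mapper cat \
  -reducer cat
```

The trade-off: getmerge streams everything through the client machine's disk, which is fine for modest data sizes, while the streaming job keeps the work on the cluster at the cost of a shuffle.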