I'd recommend making a SequenceFile[1] to store each XML file as a value.

-Joey
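
To make the suggestion concrete, here is a minimal plain-Java sketch of the underlying idea: many small XML documents packed into one container as key/value records, with the file name as the key and the whole document as the value. This is only an illustration. It is not the Hadoop SequenceFile API or its on-disk format, and the names SmallFilePacker, pack and unpack are hypothetical. With Hadoop itself, the same step would go through SequenceFile.Writer, appending one key/value pair per XML file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a plain-Java stand-in for the key/value record idea,
// NOT the Hadoop SequenceFile format or API. All names are hypothetical.
class SmallFilePacker {

    // Pack each (file name -> XML content) pair as one length-prefixed record.
    static byte[] pack(Map<String, String> xmlByName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, String> e : xmlByName.entrySet()) {
                byte[] value = e.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeUTF(e.getKey());    // key: the original file name
                out.writeInt(value.length);  // length prefix for the value
                out.write(value);            // value: the whole XML document
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Read the records back in the order they were written.
    static Map<String, String> unpack(byte[] packed) {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(packed))) {
            while (in.available() > 0) {
                String key = in.readUTF();
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                result.put(key, new String(value, StandardCharsets.UTF_8));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.xml", "<doc id=\"1\"/>");
        files.put("b.xml", "<doc id=\"2\"/>");
        byte[] packed = pack(files);
        System.out.println(unpack(packed).keySet());
    }
}
```

Because every record carries its own key and length, a reader can walk the container sequentially and recover each original XML file intact, which is the property the single large file is meant to provide.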

[1] http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/SequenceFile.html

On Tue, Feb 21, 2012 at 12:15 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> We have small XML files. Currently I am planning to append these small
> files to one file in HDFS so that I can take advantage of splits, larger
> blocks and sequential IO. What I am unsure about is whether it's OK to
> append one file at a time to this HDFS file.
>
> Could someone suggest if this is OK? Would like to know how others do it.

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434