Your best bet would be to take a look at the synthetic load generator. 10^8 files would be a problem in most cases because you'd need a really beefy NameNode (NN) for that (~48GB of JVM heap and all that). The biggest one I've heard about holds something on the order of 1.15*10^8 objects (files & dirs) and serves the largest Hadoop cluster in the world, in Yahoo!'s production setup. You might want to check YDN for more details about this case, I guess.
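To put the heap figure in perspective, here is a rough back-of-the-envelope sketch in Python. It only rearranges the numbers quoted above (~48GB for 10^8 files). The function names are made up for illustration, and it assumes, simplistically, that the whole heap is spent on namespace objects and that "GB" means GiB. Real NameNode usage also depends on block counts and JVM overhead, so treat the result as an order-of-magnitude guide only.

```python
GIB = 1024 ** 3  # assumption: the "48GB" above is read as GiB


def heap_bytes_per_object(heap_gib: float, num_objects: int) -> float:
    """Heap cost per namespace object implied by a heap size and object count."""
    return heap_gib * GIB / num_objects


def estimated_heap_gib(num_objects: int, bytes_per_object: float) -> float:
    """Heap (in GiB) needed for num_objects at a given per-object cost."""
    return num_objects * bytes_per_object / GIB


# Per-object cost implied by the figures in this mail: ~48GB for 10^8 files.
per_object = heap_bytes_per_object(48, 10 ** 8)
print(f"implied cost: ~{per_object:.0f} bytes per object")

# Same per-object cost applied to the 1.15*10^8 objects mentioned above.
print(f"1.15*10^8 objects: ~{estimated_heap_gib(115_000_000, per_object):.1f} GiB")
```

At that rate each file costs roughly half a kilobyte of NN heap, which is why the object count, not the data volume, is the limiting factor here.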
Hope it helps,
  Cos

On Mon, May 30, 2011 at 10:44AM, ccxixicc wrote:
> Hi all
> I'm doing a test and need to create lots of files (100 million) in
> HDFS. I use a shell script to do this, and it's very, very slow. How can I
> create a lot of files in HDFS quickly?
> Thanks