Re: Exception while closing the stream

2011-05-29 Thread sudhanshu arora
Since I did not get any response, I am reposting to get attention... On Fri, May 27, 2011 at 7:57 PM, sudhanshu arora wrote: > I am writing multiple files using multiple FSOutputStreams through > different threads in HDFS. All the files are getting written properly and I > see that the namenode…
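The setup described (several threads, each writing its own file through its own output stream) can be sketched as follows. This is a minimal illustration, not the poster's actual code: the paths, thread count, and payload are made up, and it assumes a reachable HDFS plus the hadoop-client jars on the classpath. Note that close() is where write failures often surface, which is why it runs inside try-with-resources here.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelHdfsWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Picks up fs.defaultFS from the usual Hadoop config files
        FileSystem fs = FileSystem.get(conf);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            final Path p = new Path("/tmp/parallel-test/file-" + i);
            pool.submit(() -> {
                // try-with-resources guarantees close() runs; exceptions
                // thrown during close show up here rather than being lost
                try (FSDataOutputStream out = fs.create(p, true)) {
                    out.write("hello".getBytes(StandardCharsets.UTF_8));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        fs.close();
    }
}
```

One FileSystem instance is shared across threads (it is designed to be thread-safe for this pattern), while each thread owns exactly one output stream.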

RE: Can't start datanode?

2011-05-29 Thread Stuti Awasthi
Keeping alias in the loop -----Original Message----- From: Stuti Awasthi Sent: Monday, May 30, 2011 10:56 AM To: 'Jain, Prem' Subject: RE: Can't start datanode? Hi Prem, The datanode pid file is named "hadoop-[USERNAME]-datanode.pid" and by default it is located in the /tmp directory. Here…

Re: How to create a lot of files in HDFS quickly?

2011-05-29 Thread Konstantin Boudnik
Your best bet would be to take a look at the synthetic load generator. 10^8 files would be a problem in most cases because you'd need a really beefy NN for that (~48GB of JVM heap and all that). The biggest I've heard about holds something on the order of 1.15×10^8 objects (files & dirs)…
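The heap figure above can be sanity-checked with the widely cited rule of thumb of roughly 150 bytes of namenode heap per namespace object. A minimal sketch, where the per-object size and the one-block-per-file assumption are illustrative rules of thumb rather than measured values from the thread:

```java
// Rough namenode heap estimate for 10^8 small files.
// Assumptions (not from the thread): ~150 bytes per namespace object,
// one block per file, so objects = files + blocks.
public class NnHeapEstimate {
    static double estimateGb(long files, long blocksPerFile, long bytesPerObject) {
        long objects = files + files * blocksPerFile;
        return objects * bytesPerObject / 1e9;
    }

    public static void main(String[] args) {
        // ~30 GB of raw object data; JVM and GC overhead push the practical
        // requirement well above that, in line with the ~48 GB quoted above.
        System.out.printf("~%.0f GB%n", estimateGb(100_000_000L, 1, 150));
    }
}
```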

Re: How to create a lot of files in HDFS quickly?

2011-05-29 Thread Ted Dunning
First, it is virtually impossible to create 100 million files in HDFS because the name node can't hold that many. Secondly, file creation is bottlenecked by the name node, so files can't be created at more than about 1000 per second (and achieving more than half that rate is…
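The ~1000 creates/sec figure quoted above makes the timescale easy to check; the rest below is plain arithmetic, no cluster required:

```java
// How long 10^8 file creations take at the ~1000 creates/sec rate
// quoted above (the name-node-bound ceiling, not a measured benchmark).
public class CreateTimeEstimate {
    static long secondsToCreate(long files, long createsPerSecond) {
        return files / createsPerSecond;
    }

    public static void main(String[] args) {
        long secs = secondsToCreate(100_000_000L, 1000L);
        // 100000 seconds is roughly 27.8 hours at the full rate;
        // at half that rate, well over two days.
        System.out.printf("%d seconds = %.1f hours%n", secs, secs / 3600.0);
    }
}
```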

How to create a lot of files in HDFS quickly?

2011-05-29 Thread ccxixicc
Hi all, I'm doing a test and need to create a lot of files (100 million) in HDFS. I'm using a shell script to do this and it's very, very slow. How can I create a lot of files in HDFS quickly? Thanks