Copying data to hdfs

2011-12-13 Thread Steve Ed
Sorry for the layman question. What's the best way of writing data into HDFS from outside the cluster? My customer is looking for wire-speed data ingest into HDFS. We are considering Flume, but initial performance results from Flume are very discouraging. Thanks in advance. Steve

Moving data into HDFS

2011-11-22 Thread Steve Ed
Sorry for this novice question. I am trying to find the best way of moving (copying) data in and out of HDFS. There are a bunch of tools available, and I need to pick the one that offers the easiest way. I have seen a MapR presentation in which they claim to offer direct NFS mounts to feed data into HDFS. Is

RE: Sizing help

2011-11-11 Thread Steve Ed
You would need close to 170 servers, each with a 12 TB disk pack installed (with a replication factor of 2). That's a conservative estimate. > CPUs: 4 cores with 16 GB of memory > > Namenode: 4 cores with 32 GB of memory should be OK. > > > On Fri, Oct 21, 2011 at 5:40 PM, Steve Ed
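
For reference, the arithmetic behind the "close to 170 servers" figure can be sketched roughly as below. This is only a back-of-the-envelope check, assuming the 1 PB of usable data mentioned in the original question of this thread; the function name servers_needed is made up for illustration, and the sketch ignores headroom for intermediate data, the OS, and future growth.

```python
import math


def servers_needed(usable_tb, replication, raw_tb_per_server):
    """Estimate the datanode count needed to store usable_tb of data.

    Raw capacity is the usable data multiplied by the replication factor;
    the result is rounded up to a whole number of servers.
    """
    raw_tb = usable_tb * replication
    return math.ceil(raw_tb / raw_tb_per_server)


# 1 PB usable, replication factor 2, 12 TB of disk per server.
# Assumption: 1 PB is taken as 1000 TB here; using 1024 TB instead
# gives a slightly higher count.
print(servers_needed(1000, replication=2, raw_tb_per_server=12))
print(servers_needed(1024, replication=2, raw_tb_per_server=12))
```

Both variants land in the high 160s to low 170s, which is consistent with the estimate quoted above. With a replication factor of 3 the same calculation gives a correspondingly larger number.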

RE: NameNode corruption: NPE addChild at start up

2011-10-26 Thread Steve Ed
Did you ever consider keeping a backup copy of the FSImage on an NFS share? The best practice is to have reliable NFS storage mounted on the namenode and to configure the site XML to keep a copy on the NFS mount. This protects against FSImage loss. -Original Message- From: Markus Jelsma [mailto:ma

Sizing help

2011-10-21 Thread Steve Ed
I am a newbie to Hadoop and am trying to understand how to size a Hadoop cluster. What factors should I consider when deciding the number of datanodes? Datanode configuration? CPU, memory? Amount of memory required for the namenode? My client is looking at 1 PB of usable data and will be