> On 17 Mar 2016, at 12:28, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> Thanks Steve,
> 
> For the NN it all depends on how fast you want start-up to be. 1GB of NameNode
> memory accommodates around 42T, so if you are talking about 100GB of NN memory
> then SSD may make sense to speed up the start-up. RAID 10 is the best option
> available, assuming all internal disks.

I wasn't really thinking of startup: in larger clusters startup time is often 
determined by how long it takes for all the datanodes to report in, and for 
HDFS to exit safe mode. But of course, the NN doesn't start listening for DN 
block reports until it's read in the FS image *and replayed the log*, so start 
time will be O(image + log-events + DNs).
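
To make that O(image + log-events + DNs) shape concrete, here is a rough back-of-envelope sketch. It is not anything HDFS ships: the function name, the parameters, and every rate in the usage note are hypothetical, and it assumes each phase is linear and runs strictly after the previous one, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class StartupEstimate:
    """Seconds spent in each phase of a (simplified) NameNode start."""
    image_load_s: float    # reading the FS image from disk
    log_replay_s: float    # replaying edit-log events on top of the image
    block_report_s: float  # waiting for datanodes to report in

    @property
    def total_s(self) -> float:
        # The phases are assumed to run one after another, so the costs add.
        return self.image_load_s + self.log_replay_s + self.block_report_s


def estimate_startup(image_bytes: float,
                     image_read_bytes_per_s: float,
                     edit_log_events: int,
                     replay_events_per_s: float,
                     datanodes: int,
                     reports_per_s: float) -> StartupEstimate:
    """Linear model of start time: O(image + log-events + DNs).

    All rates are assumptions supplied by the caller, not measured values.
    """
    return StartupEstimate(
        image_load_s=image_bytes / image_read_bytes_per_s,
        log_replay_s=edit_log_events / replay_events_per_s,
        block_report_s=datanodes / reports_per_s,
    )
```

With made-up inputs such as `estimate_startup(10e9, 500e6, 1_000_000, 50_000, 2000, 20)`, the `block_report_s` term is the largest of the three. In this model a faster disk only shrinks `image_load_s`, which is why SSD helps less once the datanode count grows.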

> 
> In general it is also suggested that fsimage are copied across to NFS mount 
> directory between primary and fail-over in case of an issue.

Yes.

If you're curious, there's a 2011 paper on Yahoo!'s experience:

https://www.usenix.org/system/files/login/articles/chansler_0.pdf

There are also traces of HDFS failure events in some of the JIRAs, HDFS-599 
being the classic, as is HADOOP-572. Both of these document cascade failures in 
Facebook's HDFS clusters. Scale brings interesting problems.

