> On 17 Mar 2016, at 12:28, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Thanks Steve,
>
> For NN it all depends how fast you want a start-up. 1GB of NameNode memory
> accommodates around 42T so if you are talking about 100GB of NN memory then
> SSD may make sense to speed up the start-up. Raid 10 is the best one that one
> can get assuming all internal disks.

I wasn't really thinking of startup: in larger clusters, startup time is often determined by how long it takes for all the datanodes to report in and for HDFS to exit safe mode. But of course, the NN doesn't start listening for DN block reports until it has read in the FS image *and replayed the log*, so start time will be O(image + log-events + DNs).

> In general it is also suggested that fsimage are copied across to NFS mount
> directory between primary and fail-over in case of an issue.

Yes.

If you're curious, there's a 2011 paper on Y!'s experience:

https://www.usenix.org/system/files/login/articles/chansler_0.pdf

There are also traces of HDFS failure events in some of the JIRAs, HDFS-599 being the classic, as is HADOOP-572. Both of these document cascade failures in Facebook's HDFS clusters.

Scale brings interesting problems.
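
To make the O(image + log-events + DNs) point concrete, here is a rough back-of-envelope sketch. It is only an illustration: the function name and every rate in it are hypothetical placeholders, not measurements from any real cluster, and it assumes the three phases simply run one after another.

```python
# Rough cost model only. All default rates are made-up placeholders,
# not measured values; plug in numbers from your own cluster.
def estimate_nn_startup_seconds(
    image_bytes,
    edit_log_events,
    datanodes,
    image_read_bytes_per_s=200e6,  # hypothetical read rate for the FS image
    edits_replayed_per_s=50_000,   # hypothetical edit-log replay rate
    block_reports_per_s=10,        # hypothetical rate of handling DN block reports
):
    # Phase 1: read the FS image from disk.
    load_image = image_bytes / image_read_bytes_per_s
    # Phase 2: replay the edit log on top of the image.
    replay_log = edit_log_events / edits_replayed_per_s
    # Phase 3: DN block reports are only accepted after phases 1 and 2,
    # so this wait adds to the total instead of overlapping with it.
    wait_for_reports = datanodes / block_reports_per_s
    return load_image + replay_log + wait_for_reports
```

In this model a faster disk (SSD) only shrinks the first term; with many datanodes or a long edit log, the other two terms can still dominate.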