Thank you for the info, Steve. I have always believed there is an optimal point where one can plot the projected NN memory (assuming 1GB of NN heap per ~40TB of data) against the number of nodes. For example, heuristically, how many nodes would be sufficient for 1PB of storage with nodes each having 512GB of memory, 50TB of storage and 32 cores? That will require 25GB of RAM for the NN, with 20 DNs in the cluster. But then one could halve the number of nodes to 10 and increase the storage to 100TB on each. So the question is the optimal balance between storage and nodes: would one go for more DNs with less storage each, or fewer DNs with more storage in each DN? The proponent may argue that more nodes provide better MPP, but at what cost to operation, start-up and maintenance?
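To make the arithmetic above concrete, here is a minimal back-of-envelope sketch. It assumes the rule of thumb quoted in this thread (roughly 1GB of NameNode heap per ~40TB of raw data); the function names are illustrative, not from any Hadoop tooling:

```python
# Rough sizing sketch, assuming ~1 GB of NameNode heap per ~40 TB of raw data
# (the rule of thumb discussed in this thread; actual heap use depends on the
# number of files and blocks, not raw capacity alone).

def nn_heap_gb(total_storage_tb, tb_per_gb_heap=40):
    """Estimated NameNode heap (GB) for a given raw capacity."""
    return total_storage_tb / tb_per_gb_heap

def datanode_count(total_storage_tb, storage_per_node_tb):
    """Number of DataNodes needed to reach the target raw capacity."""
    return -(-total_storage_tb // storage_per_node_tb)  # ceiling division

total_tb = 1000  # 1 PB

print(nn_heap_gb(total_tb))           # 25.0 GB of NN heap, as in the example
print(datanode_count(total_tb, 50))   # 20 nodes at 50 TB each
print(datanode_count(total_tb, 100))  # 10 nodes at 100 TB each
```

Either layout needs the same ~25GB of NN heap, since that is driven by the data volume; the trade-off between 20 and 10 nodes is purely about parallelism versus per-node cost.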
Cheers

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 18 March 2016 at 11:42, Steve Loughran <ste...@hortonworks.com> wrote:

> > On 17 Mar 2016, at 12:28, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> >
> > Thanks Steve,
> >
> > For NN it all depends how fast you want a start-up. 1GB of NameNode memory accommodates around 42T, so if you are talking about 100GB of NN memory then SSD may make sense to speed up the start-up. RAID 10 is the best one can get, assuming all internal disks.
>
> I wasn't really thinking of startup: in larger clusters startup time is often determined by how long it takes for all the datanodes to report in, and for HDFS to exit safe mode. But of course, the NN doesn't start listening for DN block reports until it's read in the FS image *and replayed the log*, so start time will be O(image + log-events + DNs)
>
> > In general it is also suggested that the fsimage is copied across to an NFS-mounted directory between the primary and the fail-over in case of an issue.
>
> yes
>
> if you're curious, there's a 2011 paper on Y!'s experience
>
> https://www.usenix.org/system/files/login/articles/chansler_0.pdf
>
> there are also traces of HDFS failure events in some of the JIRAs, HDFS-599 being the classic, as is HADOOP-572. Both of these document cascade failures in Facebook's HDFS clusters. Scale brings interesting problems