Hi all,

I have a quick question regarding SmartFrog and managing Hadoop clusters. I'm setting up a reasonably sized Hadoop cluster that's expected to grow fairly quickly. However, I'm not really sure what the appropriate cluster management and administration tools are. From what I understand, I'll need tools to manage software configurations (packages + config files), obtain Hadoop-specific metrics, obtain overall machine metrics (CPU load, memory usage, etc.), and continuously perform process health/service lifecycle management (e.g. restarting datanodes that crash, and reporting critical errors such as EDAC errors in Linux that indicate uncorrectable DRAM flaws).

It appears that right now folks like Yahoo! use Yum (or apt) + bcfg2 for packages/configuration, some variant of Ganglia for Hadoop metrics, and Nagios for overall metrics, and that there is nothing publicly available for process health. I'm fine using Ganglia and Nagios for metrics unless someone can point me to better tools, but I'd rather not cobble something together from bcfg2 and hacked-up shell scripts for configuration and process health management.

It looks like SmartFrog in conjunction with HADOOP-3628 would provide both reasonable configuration management and process health/service lifecycle management. If I understand correctly, SmartFrog will allow me to manage Hadoop configuration files, manage the installation of my Hadoop packages, _and_ monitor the health of Hadoop nodes so that I can automatically log and restart Hadoop nodes when they crash, etc. Is this a reasonably accurate description of the state of the art with Hadoop and cluster administration?

Also, are there any estimates of when the HADOOP-3628 branch will be committed to SVN trunk, and when that happens, will the SmartFrog project still need to maintain its own Hadoop branch?

Lastly, I was unable to find a list of real-world projects running SmartFrog: are there any large-scale (> 1000 node) clusters running SF?

Thanks for your help.

Best regards,
Mike
------------------------------------------------------------------------------
_______________________________________________
Smartfrog-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/smartfrog-users
