I posted to the list some time ago about problems scaling Gmetad. Once I passed a certain number of monitored nodes, Gmetad started showing nodes as failed.
I've finally worked out the issue: my disk was the bottleneck. It appears that Gmetad serially updates RRDs for a node/cluster. Unable to tune around this, I decided to put the RRDs on tmpfs, with a special startup script and a cron job that rsyncs the data out of tmpfs every 5 minutes.

Here's my startup script (executed by SMF on Solaris):

#!/usr/bin/bash
## SMF Start method for Gmetad
## -benr

if ( mount | grep "/opt/ganglia/data on swap" >/dev/null )
then
    echo "Tmpfs already mounted."
else
    echo "Mounting tmpfs..."
    chown nobody /opt/ganglia/data
    mount -F tmpfs -o size=100m,noxattr swap /opt/ganglia/data
fi

## Now sync the data in if it's empty:
if [ -d /opt/ganglia/data/__SummaryInfo__/ ]
then
    echo "Cache primed, ready to start."
    chown -R nobody /opt/ganglia/data
else
    echo "Priming the cache..."
    /opt/csw/bin/rsync -at /opt/ganglia/data-disk/ /opt/ganglia/data/
    chown -R nobody /opt/ganglia/data
fi

## Finally, start ganglia:
/opt/ganglia/sbin/gmetad

#====================================

The cronjob to sync is simple:

# Sync gmetad ram-disk to physical-disk
#
5,10,15,20,25,30,35,40,45,50,55,0 * * * * /opt/csw/bin/rsync -at /opt/ganglia/data/ /opt/ganglia/data-disk/

The question I have for the list is: has anyone else run into this bottleneck? If so, how did you solve it? I know several people out there run multiple gmetads, perhaps unnecessarily. CPU usage and disk consumption are low; it's just doing a lot of I/O. My data/ dir is only 27MB.

benr.
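
For anyone who wants to try the same approach on different paths, here is a minimal sketch of the prime-and-sync logic from the script and cron job above, pulled out into shell functions. The function names (copy_tree, prime_cache, sync_to_disk), the cp fallback used when rsync is not installed, and the refusal to sync from an unprimed RAM directory are all assumptions added for illustration; they are not part of the original setup. Mounting tmpfs, chown, and starting gmetad are deliberately left out.

```shell
#!/bin/sh
# Sketch only: prime a RAM-backed data directory from its on-disk copy,
# and sync it back out. Function names are hypothetical.

# copy_tree SRC DST: copy the contents of SRC into DST, keeping times and modes.
copy_tree() {
    ct_src=$1
    ct_dst=$2
    if command -v rsync >/dev/null 2>&1; then
        # -a already implies -t (preserve modification times).
        rsync -a "$ct_src"/ "$ct_dst"/
    else
        # Assumed fallback when rsync is missing; not what the original uses.
        cp -Rp "$ct_src"/. "$ct_dst"/
    fi
}

# prime_cache RAM_DIR DISK_DIR [MARKER]
# If MARKER (default __SummaryInfo__) exists under RAM_DIR the cache is
# treated as already primed; otherwise it is filled from DISK_DIR.
prime_cache() {
    pc_ram=$1
    pc_disk=$2
    pc_marker=${3:-__SummaryInfo__}
    if [ -d "$pc_ram/$pc_marker" ]; then
        echo "Cache primed, ready to start."
    else
        echo "Priming the cache..."
        copy_tree "$pc_disk" "$pc_ram"
    fi
}

# sync_to_disk RAM_DIR DISK_DIR [MARKER]
# Copy the RAM copy back to disk. Returns 1 without copying if RAM_DIR
# has not been primed yet (an added safety check, not in the original cron job).
sync_to_disk() {
    sd_ram=$1
    sd_disk=$2
    sd_marker=${3:-__SummaryInfo__}
    if [ ! -d "$sd_ram/$sd_marker" ]; then
        echo "RAM copy not primed, skipping sync." >&2
        return 1
    fi
    copy_tree "$sd_ram" "$sd_disk"
}
```

With these functions loaded, the start method would call something like `prime_cache /opt/ganglia/data /opt/ganglia/data-disk` after mounting tmpfs, and the cron entry would call `sync_to_disk /opt/ganglia/data /opt/ganglia/data-disk`. Note that anything written since the last sync is still lost if the machine goes down, so the sync interval is the amount of history you are willing to lose.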