I noticed a potential problem that results from the use of the gethostbyXXX functions. We have a small group of Sun file servers that we want to monitor with ganglia. The problem arises when the /etc/hosts file contains an entry for the local host that defines both the short and FQDN forms of the hostname, like:
    xxx.xxx.xxx.xxx host host.domain.name

but no other hosts in your organization. If you have a setup like that, and the default or standard config in the appropriate /etc/nsswitch.* config file:

    hosts: files dns

then a gethostbyXXX lookup will return the short name for your local host, but the FQDN for all other hosts in your multicast cluster. This is a problem because, depending on which host in your cluster gmetad queries, you will end up with one short hostname and the rest long. Also, if your first host fails to respond to gmetad, the next host might be queried, resulting in two previously unknown hostnames (the first host's FQDN and the second host's short name) that will now be displayed and saved to new rrds.

I have been told that changing the local config is not really an option for us. I am not a Sun/Solaris admin, but our admins here tell me that configuring /etc/hosts this way once solved some strange problems by some weird magic. It may be mostly historical now, but our admins are understandably reluctant to change an otherwise working config.

I am no expert on this, but others have told me that a better way to do hostname lookups is to always use DNS lookups (if your facility is configured to use DNS) through the bind API and not rely on the local unix config and the gethostbyXXX functions.

~Jason

--
/------------------------------------------------------------------\
| Jason A. Smith                          Email: [EMAIL PROTECTED] |
| Atlas Computing Facility, Bldg. 510M    Phone: (631)344-4226     |
| Brookhaven National Lab, P.O. Box 5000  Fax: (631)344-7616       |
| Upton, NY 11973-5000                                             |
\------------------------------------------------------------------/
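
As a footnote to the lookup-order problem described above, here is a minimal sketch of why "hosts: files dns" yields a short name for the local host and an FQDN for everything else. It does not call the real resolver or gethostbyXXX; it only imitates the "first source that answers wins" rule with in-memory tables. The addresses (taken from the 192.0.2.0/24 documentation range), the second host name, and the function names parse_hosts and lookup are all assumptions made for illustration.

```python
# Sketch only: imitates "hosts: files dns" ordering with in-memory data.
# Nothing here touches the real resolver; all names/addresses are hypothetical.

# Stand-in for /etc/hosts: only the local host is listed, short name first.
HOSTS_FILE = """\
192.0.2.10 host host.domain.name
"""

# Stand-in for reverse DNS: every host in the cluster maps to its FQDN.
DNS_PTR = {
    "192.0.2.10": "host.domain.name",
    "192.0.2.11": "other.domain.name",
}


def parse_hosts(text):
    """Map each address to its canonical name (the first name on the line)."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2:
            table.setdefault(fields[0], fields[1])  # remaining fields are aliases
    return table


def lookup(addr, order=("files", "dns")):
    """Return the name from the first source in `order` that knows `addr`."""
    sources = {"files": parse_hosts(HOSTS_FILE), "dns": DNS_PTR}
    for src in order:
        name = sources[src].get(addr)
        if name is not None:
            return name
    return None


print(lookup("192.0.2.10"))                  # host  (answered by the hosts file)
print(lookup("192.0.2.11"))                  # other.domain.name  (falls through to DNS)
print(lookup("192.0.2.10", order=("dns",)))  # host.domain.name  (DNS only)
```

The last call shows the effect of consulting DNS alone: every host, including the local one, comes back in the same FQDN form, which is the consistency gmetad needs.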