Thanks for the feedback. I have modified my configuration as suggested,
so now only one of my machines is the data source. The problem is that
only one node shows up in the web interface; that's why I had previously
added each node as a data source.
Is there something wrong with my configuration? I've attached the
relevant portions of the files below.
Thanks,
Matt
##############
gmetad.conf:
data_source "g" 192.168.7.10
##############
gmond.conf on 192.168.7.10:
globals {
  setuid = yes
  user = nobody
  cleanup_threshold = 300 /* secs */
}
cluster {
  name = "Geo Cluster"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
####################
gmond.conf on all others:
globals {
  mute = "no"
  deaf = "yes"
  debug_level = "0"
  setuid = "yes"
  user = "nobody"
  gexec = "no"
  host_dmax = "0"
}
cluster {
  name = "Geo Cluster"
}
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
}
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}
tcp_accept_channel {
  port = 8649
}
On Mon, 2005-05-23 at 15:01, Paul Henderson wrote:
> The way I would do it is this:
>
> Define only one data source. Then, in /etc/gmond.conf on the four
> systems that are not data sources, set "mute = no" and "deaf = yes" in
> the global variables section, i.e.:
>
> /* global variables */
> globals {
>   mute = "no"
>   deaf = "yes"
>   debug_level = "0"
>   setuid = "yes"
>   user = "nobody"
>   gexec = "no"
>   host_dmax = "0"
> }
>
>
> Ian Cunningham wrote:
>
> > Matt Klaric,
> >
> > I am seeing this problem as well. There seems to be a problem with
> > how gmetad computes the summaries for each grid: it seems to reset
> > its count of machines on each processing loop, and when the front
> > end asks, it has not yet finished counting, so you get incomplete
> > numbers for the grid summary. The odd thing is that the cluster
> > summaries work just fine.
> >
> > As an aside, I noticed that you are using each machine in your
> > cluster as a data_source. Normally you would have just one of the
> > machines in the cluster as a data source, plus backup nodes for
> > redundancy. If you are using multicast, all your nodes already share
> > their information on the multicast channel; this is all defined in
> > your gmond.conf. Using the data in your example, I would suggest
> > that your gmetad.conf look more like:
> >
> > data_source "foo" 192.168.7.10 192.168.7.11
> >
> > This way you do not need to define every node in the cluster in the
> > gmetad config file (as separate clusters).
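> >
> > If I remember correctly, gmetad tries the addresses on a data_source
> > line in order, so the extra ones act purely as failover. You can
> > also give an optional polling interval in seconds before the first
> > address, e.g.:
> >
> > data_source "foo" 15 192.168.7.10 192.168.7.11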
> >
> > Ian
> >
> > Matt Klaric wrote:
> >
> >> I've installed Ganglia v3.0.1 and set up the web interface to
> >> gmetad. I set this up on a small cluster of 5 machines using the
> >> default configuration for gmond, generated with 'gmond -t', and put
> >> that config file on all the nodes.
> >> Then I set up my gmetad.conf file as follows:
> >> data_source "a" 192.168.7.10
> >> data_source "b" 192.168.7.11
> >> data_source "c" 192.168.7.12
> >> data_source "d" 192.168.7.13
> >> data_source "e" 192.168.7.14
> >> gridname "foo"
> >>
> >> When I look at the Ganglia web interface, I notice that the image
> >> showing the number of CPUs in the cluster is not accurate. It
> >> oscillates up and down over time even though no nodes are being
> >> added to or removed from the cluster: it reports anywhere from 8 to
> >> 14 CPUs when there are really 20 CPUs in the 5 boxes. (The text to
> >> the left of this image does correctly indicate 20 CPUs in 5 hosts.)
> >> Additionally, the "Total In-core Memory" shown for the cluster is
> >> lower than the sum of the RAM in all the boxes, and it varies over
> >> time.
> >> However, if I look at the stats for any single node in the cluster,
> >> the values are correct and constant over time.
> >> Has anyone seen these kinds of problems? How have you addressed them?
> >>
> >> Thanks,
> >> Matt