josh-

i bet most people on this list wish they had a "problem" of having 1100
nodes with 4 GB of memory each.  :)

you are right.  the memory metrics for ganglia are xdr_long (32-bit)
variables, and that won't work for you: your cluster-wide total
overflows 32 bits.
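
to see the wraparound concretely: your 4,613,734,400 KB total, forced
through a 32-bit value, comes out as 4,613,734,400 mod 2^32 =
318,767,104 KB, i.e. about 304 GB reported instead of ~4.3 TB.  a
little untested sketch of the arithmetic (not ganglia code):

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      uint64_t total_kb = 1100ULL * 4194304ULL; /* 4,613,734,400 KB */
      uint32_t wrapped  = (uint32_t) total_kb;  /* what 32 bits keep */

      printf("real total: %llu KB (~4.3 TB)\n",
             (unsigned long long) total_kb);
      printf("after wrap: %u KB (~304 GB)\n", wrapped);
      return 0;
  }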

in future releases of ganglia this will be addressed by allowing you to
reduce the data size with a unit reduction, e.g. from KB to MB to GB to
TB to PB.  alternately, a big-number library may be used (although it's
much slower).
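
the unit-reduction idea, roughly, is to keep dividing by 1024 until the
value fits in 32 bits and ship the unit along with the number, trading
low-order precision for range.  a hypothetical helper (reduce_units is
my name for it, nothing in the tree yet):

  #include <stdint.h>

  static const char *units[] = { "KB", "MB", "GB", "TB", "PB" };

  /* shrink *value until it fits in a uint32; returns the unit used */
  const char *reduce_units(uint64_t *value)
  {
      int i = 0;
      while (*value > 0xFFFFFFFFULL && i < 4) {
          *value >>= 10;  /* divide by 1024, dropping precision */
          i++;
      }
      return units[i];
  }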

for now, i'm sorry to say, you will have to use one of two workarounds.

a quick workaround would be to split your cluster up into smaller
sub-clusters on separate multicast channels.  the memory statistics for
each sub-cluster will be correct; however, the summary statistics for
the overall cluster will not be.
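
with a 2.5-era gmond.conf that looks roughly like the following (one
config per sub-cluster; directive names may vary slightly with your
version).  at 550 nodes * 4 GB each, each half stays safely under the
2^32 KB ceiling:

  # gmond.conf for the first 550 nodes
  name          "cluster-a"
  mcast_channel 239.2.11.71
  mcast_port    8649

  # gmond.conf for the other 550 nodes
  name          "cluster-b"
  mcast_channel 239.2.11.72
  mcast_port    8649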

a better solution is to alter the gmond code to send memory information
as an xdr_hyper (long long).  this will ensure accurate summary
information.  it's not trivial to do, but i will help you work through
it if that is something you are interested in doing.
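
for flavor: the 64-bit routines you want are xdr_hyper/xdr_u_hyper from
the sun rpc library, so there is a "long long" xdr call, it's just not
named xdr_longlong.  an illustrative wrapper only (encode_mem_total is
made up; the real gmond code path looks different):

  #include <rpc/rpc.h>

  /* hypothetical wrapper: wherever gmond encodes a memory metric
   * with a 32-bit call today, switch to the 64-bit routine.  note
   * that the decode side (gmetad etc.) must change in lockstep. */
  bool_t encode_mem_total(XDR *xdrs, u_quad_t *mem_total_kb)
  {
      /* old, truncates above 2^32 - 1 KB:
       *   return xdr_u_long(xdrs, (u_long *) mem_total_kb);
       */
      return xdr_u_hyper(xdrs, mem_total_kb);  /* 64 bits on the wire */
  }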


On Tue, 2003-12-30 at 15:01, Josh Durham wrote:
> I have another problem.
> 
> I have 1100 nodes, with 4GB of RAM each.
> 
> 1100 nodes * 4,194,304 kilobytes of RAM = 4,613,734,400
>                  uint32 max value = 2^32 - 1 = 4,294,967,295
> 
> I have a feeling, from what I've looked at, that moving all the memory 
> stuff to uint64 is going to be painful.  Not to mention there doesn't 
> seem to be an xdr_longlong, or whatever it should be.
> 
> Am I walking down the wrong path?

nope.  just a new one.  :)

-matt

-- 
Mobius strippers never show you their back side
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E  F40B 242A 5984 ACBC 91D3'
