On 05/10/2020 09:40, Simon Thompson wrote:
I now need to check IBM are not going to throw a wobbler down the
line if I need to get support before deploying it to the DSS-G
nodes :-)
I know there were a lot of other emails about this ...
I think you may want to be careful doing this. Whilst it might work
when you set up the DSS-G like this, remember that the memory usage
you are seeing at this point in time may not be what you always need.
For example, if you fail over the recovery groups, you need to have
enough free memory to handle this, e.g. after a node failure or, more
likely, while you are upgrading the building blocks.
I think there is a lack of understanding of exactly how lightweight
keepalived is.
It's the same code as on my routers, which admittedly have different CPUs
(MIPS to be precise), but memory usage (taking out shared memory usage -
libc for example is loaded anyway) is under 200KB. A bash shell uses
more memory...
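If anyone wants to check that sort of figure on their own nodes, here is
a minimal sketch of one way to do it. It is an illustration only, not how
I arrived at the number above: it sums the Private_Clean and Private_Dirty
fields from /proc/<pid>/smaps, which is my assumption of a reasonable
reading of "taking out shared memory". The function names are made up for
the example, and reading another user's smaps generally needs root.

```python
import sys
from pathlib import Path


def private_kb(smaps_text):
    """Sum Private_Clean and Private_Dirty (in kB) from smaps-style text."""
    total = 0
    for line in smaps_text.splitlines():
        # Field lines look like "Private_Dirty:        60 kB".
        key, _, rest = line.partition(":")
        if key in ("Private_Clean", "Private_Dirty"):
            total += int(rest.split()[0])
    return total


def private_kb_of_pid(pid):
    """Private (non-shared) memory of a running process, in kB (Linux only)."""
    return private_kb(Path(f"/proc/{pid}/smaps").read_text())


if __name__ == "__main__":
    # Usage: python3 private_mem.py <pid> [<pid> ...]
    for arg in sys.argv[1:]:
        print(arg, private_kb_of_pid(int(arg)), "kB private")
```

Run it against the PIDs of keepalived and of a bash shell on the same
machine to compare the two. Shared pages such as libc are left out of the
sum, so the figure will be well below the RSS that ps or top reports.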
Personally I wouldn't run other things like this on my DSS-G storage
nodes. We do run e.g. nrpe monitoring to collect and report faults,
but this is pretty lightweight compared to everything else. They even
removed support for running the gui packages on the IO nodes - the
early DSS-G builds used the IO nodes for this, but now you need
separate systems for this.
And keepalived is in the same range as nrpe, which you do run :-) I have
seen nrpe get out of hand and consume significant amounts of resources
on a machine; the machine was ground to a halt due to nrpe. One of the
standard plugins was failing and sitting there busy-waiting. Every five
minutes it ran again. It of course decided to wait till ~7pm on a Friday
to go wonky. By mid-morning on Saturday it was virtually unresponsive,
taking several minutes to get a shell...
I would note that you can run keepalived quite happily on a Ubiquiti
EdgeRouter X which has a dual core 880 MHz MIPS CPU with 256MB of RAM.
Mikrotik have models with similar specs that run it too.
On a dual Xeon Gold 6142 machine the usage of RAM and CPU by keepalived
is noise.
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss