Owww!

If they were all a name brand like HP, there are tools from the manufacturer 
that you could buy to manage them.

Otherwise, at most, if they all have the same CPU, then depending on the 
hardware you might be able to use something from Intel.

Otherwise, all you've got is in-band management.  SNMP does exist for the OS 
and you can get some stats, such as disk space, from it.
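
To make the disk space part concrete, here is a rough sketch (not a finished 
tool) that turns the text output of a walk of the HOST-RESOURCES-MIB storage 
table into used/total bytes per mount point. It assumes the servers run an 
SNMP agent and that the text looks like net-snmp's default snmpwalk output; 
the function name parse_storage and the sample values are made up for 
illustration.

```python
import re

# Matches lines such as:
#   HOST-RESOURCES-MIB::hrStorageUsed.31 = INTEGER: 250
# (assumed to be net-snmp's default textual output format)
LINE_RE = re.compile(r"^HOST-RESOURCES-MIB::hrStorage(\w+)\.(\d+) = \w+: (.*)$")


def parse_storage(walk_text):
    """Return {description: (used_bytes, total_bytes)} from snmpwalk text."""
    rows = {}
    for line in walk_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            column, index, value = match.groups()
            rows.setdefault(index, {})[column] = value

    usage = {}
    for row in rows.values():
        try:
            # AllocationUnits is printed like "4096 Bytes"; keep the number.
            unit = int(row["AllocationUnits"].split()[0])
            size = int(row["Size"])
            used = int(row["Used"])
            descr = row["Descr"]
        except (KeyError, ValueError, IndexError):
            continue  # skip incomplete or unparsable rows
        usage[descr] = (used * unit, size * unit)
    return usage


# Hypothetical sample data, not taken from a real server.
SAMPLE = """\
HOST-RESOURCES-MIB::hrStorageDescr.31 = STRING: /
HOST-RESOURCES-MIB::hrStorageAllocationUnits.31 = INTEGER: 4096 Bytes
HOST-RESOURCES-MIB::hrStorageSize.31 = INTEGER: 1000
HOST-RESOURCES-MIB::hrStorageUsed.31 = INTEGER: 250
"""

if __name__ == "__main__":
    for mount, (used, total) in parse_storage(SAMPLE).items():
        print(f"{mount}: {used} of {total} bytes used")
```

In practice the text would come from running snmpwalk against each server and 
capturing its output; that step is left out here because the community string 
and SNMP version depend on how the agents are set up.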

Ted

-----Original Message-----
From: PLUG <plug-boun...@lists.pdxlinux.org> On Behalf Of Ben Koenig
Sent: Saturday, March 2, 2024 2:30 PM
To: Portland Linux/Unix Group <plug@lists.pdxlinux.org>
Subject: Re: [PLUG] Linux Software for Data Center Monitoring

On Saturday, March 2nd, 2024 at 10:50 AM, Ted Mittelstaedt 
<t...@portlandia-it.com> wrote:

> Are these 800 servers virtual or physical? 

Physical. 

> Are the physical servers home-built or commercial from a major brand (HP 
> Proliant, etc.)

Home-built... but often with parts from major brands. Or copycat brands.

> Are the servers all the same brand and model or are they a mishmash of pieces 
> from different makers?

Uhh.. Ever seen a graphics card with a Gigabyte logo and EVGA silkscreened onto 
the PCB?

> Are the servers yours or owned by customers? That is, if they are virtual 
> servers owned by remote customers do you have any responsibility to monitor 
> them?

We own them. And the racks, cabinets, PDUs. 

> For "emergency notifications" the go-to for FOSS is "Big Sister" 
> https://bigsister.ch/ Set that up to ping the server interface and if it 
> trips a breaker and goes offline then have Big Sister email a text-to-SMS 
> gateway for your cell phone number
> 
> For monitoring power consumption you have to configure the PDUs for that. 
> I've yet to see one of these that supports current monitoring but does not 
> support SNMP, so once you get that going you can monitor power consumption 
> with mrtg or, if you want to get fancy, https://www.cacti.net/ Cacti is based 
> on RRDtool, which is the successor to MRTG https://oss.oetiker.ch/rrdtool/
> 
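
The "check the server and text me if it drops" idea quoted above can be 
sketched in a few lines. This is not how Big Sister does it; it is a minimal 
stand-in that assumes a TCP connect to the SSH port is an acceptable 
substitute for ping (ICMP needs raw sockets), that a local SMTP relay is 
listening on localhost, and that the carrier offers an email-to-SMS address. 
All function names and addresses here are hypothetical.

```python
import smtplib
import socket
from email.message import EmailMessage


def is_reachable(host, port=22, timeout=3.0):
    """TCP connect check, used here in place of an ICMP ping."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_hosts(hosts, probe=is_reachable):
    """Return the hosts for which the probe reports failure."""
    return [host for host in hosts if not probe(host)]


def build_alert(host, sender, sms_gateway_addr):
    """Build a short email aimed at a text-to-SMS gateway address."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sms_gateway_addr
    msg["Subject"] = f"DOWN: {host}"
    msg.set_content(f"{host} did not answer the reachability check.")
    return msg


def send_alerts(down_hosts, sender, sms_gateway_addr, smtp_host="localhost"):
    """Send one alert per down host through an assumed local SMTP relay."""
    if not down_hosts:
        return
    with smtplib.SMTP(smtp_host) as smtp:
        for host in down_hosts:
            smtp.send_message(build_alert(host, sender, sms_gateway_addr))
```

Run from cron every few minutes, something like 
send_alerts(check_hosts(host_list), sender, gateway_addr) would cover the 
basic case. It has no flap suppression, so a host that stays down will send a 
text on every run.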

The PDUs have SNMP so I may have to take a look at those. 

I've used RT in the past and it's a bit on the excessive side. IIRC it uses 
Perl and I know next to nothing about Perl. As of right now, it basically is a 
one-man show: I am the only one regularly on site for the physical hardware. 
That said, they want to hire a second person which is where these tools will 
start to come in handy. Creating a custom tool to manage all this stuff is not 
outside the realm of possibility, but that might end up meaning that I spend 
all my time maintaining said tool. 

My instinct is to start setting up some sort of relational database and build 
it up piece by piece simply because there is literally NOTHING used to manage 
this stuff. Especially since the servers are already installed and running. But 
like anything else the first step is to list all options and make my list of 
pros and cons. ;)
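
As a starting point for the "build it up piece by piece" route, here is a 
minimal sketch using SQLite from Python. The tables and columns (server, 
component, rack, pdu) are guesses at what a first pass might track, not a 
recommended schema, and the helper names are made up.

```python
import sqlite3

# Hypothetical first-pass schema: one row per server, one row per part in it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS server (
    id       INTEGER PRIMARY KEY,
    hostname TEXT NOT NULL UNIQUE,
    rack     TEXT NOT NULL,
    pdu      TEXT
);
CREATE TABLE IF NOT EXISTS component (
    id        INTEGER PRIMARY KEY,
    server_id INTEGER NOT NULL REFERENCES server(id) ON DELETE CASCADE,
    kind      TEXT NOT NULL,   -- e.g. 'gpu', 'motherboard', 'psu'
    vendor    TEXT,
    model     TEXT
);
"""


def open_inventory(path=":memory:"):
    """Open (or create) the inventory database and make sure tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_server(conn, hostname, rack, pdu=None):
    """Insert a server and return its row id."""
    cur = conn.execute(
        "INSERT INTO server (hostname, rack, pdu) VALUES (?, ?, ?)",
        (hostname, rack, pdu),
    )
    conn.commit()
    return cur.lastrowid


def add_component(conn, server_id, kind, vendor=None, model=None):
    """Attach a hardware component to an existing server."""
    conn.execute(
        "INSERT INTO component (server_id, kind, vendor, model) "
        "VALUES (?, ?, ?, ?)",
        (server_id, kind, vendor, model),
    )
    conn.commit()


def servers_on_pdu(conn, pdu):
    """List hostnames fed by a given PDU, sorted by name."""
    rows = conn.execute(
        "SELECT hostname FROM server WHERE pdu = ? ORDER BY hostname", (pdu,)
    )
    return [row[0] for row in rows]
```

A query like servers_on_pdu(conn, "pdu-a") is the kind of question this makes 
cheap to answer, for example when a breaker trips and the affected machines 
need to be listed quickly.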
