Re: [gridengine users] Hardware thoughts?

Chris Dagdigian Wed, 20 Jul 2016 09:06:06 -0700

In environments where you do tens of thousands of jobs per day, or tons of really short jobs, or a constant flow of jobs always active, you may need a master node that is somewhat beefy. If you've never seen your head node get slammed then you can downsize. If there is a chance that your workload could change significantly then keep the size as is.

I'm in favor of massive login nodes. They are often used by users who are prototyping job scripts and we can't always train them to 'qlogin' or 'qrsh' into a remote node for testing. All you need is a couple of people running large R or Matlab tasks plus some other people doing a massive set of array job prep combined with a couple of people who constantly "qstat" and you can run the login node out of resources pretty quickly.
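One common way to soften the "constant qstat" load is a small per-user cache in front of the command. The following is only a minimal sketch, not something from this thread: the cache path, the 30-second refresh interval, and the function names are all hypothetical, and it assumes a plain `qstat` with no arguments is what users are polling.

```python
import subprocess
import time
from pathlib import Path

# Hypothetical per-user cache location and refresh interval (assumptions).
CACHE_FILE = Path.home() / ".qstat_cache.txt"
MAX_AGE_SECONDS = 30


def run_qstat():
    """Run the real qstat once and return its text output."""
    result = subprocess.run(
        ["qstat"], capture_output=True, text=True, check=True
    )
    return result.stdout


def cached_qstat(cache_file=CACHE_FILE, max_age=MAX_AGE_SECONDS,
                 runner=run_qstat, now=time.time):
    """Return qstat output, calling the runner only when the cache is stale."""
    try:
        age = now() - cache_file.stat().st_mtime
        if age < max_age:
            # Cache is fresh enough: do not touch the master at all.
            return cache_file.read_text()
    except FileNotFoundError:
        pass  # No cache yet, fall through and populate it.
    output = runner()
    cache_file.write_text(output)
    return output

# Typical use from a wrapper script: print(cached_qstat(), end="")
```

A user polling in a tight loop then hits the master at most once per interval, and everything else is served from a local file.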

The cost of CPU and RAM at this scale is dirt cheap. Effectively noise relative to the cost of networking and storage, so I also tend to make login and interactive nodes larger than strictly necessary.


My $.02!

Chris



Notorious Biggles wrote:

Hi all,
I have some money available to replace the infrastructure nodes of one of my company's grid engine clusters and I wanted a sanity check before I order anything new.
Initially we contacted the company we originally bought the cluster from and they quoted us for a combined login/storage/master node with loads of everything and a hefty price tag. I feel an aversion to combining login nodes with storage and master nodes - we already have that on one of the clusters, and a user being able to crash the entire cluster seems a bad thing to me, and it happened often enough.
I read Rayson's blog post about scaling grid engine to 10k nodes at http://blogs.scalablelogic.com/2012/11/running-10000-node-grid-engine-cluster.html and it seems that 4 cores and 1 GB of memory is more than enough to run a grid engine master. Given that I'd be lucky to have 100 nodes to a master, can anybody see a reason to spec a high powered master node? I look at my existing master nodes with 8+ cores and 24+ GB of memory and in Ganglia all I see is acres of green from memory being used as cache and buffers. It seems rather a waste.
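To put a number on the "acres of green", one option is to split total memory into what processes actually use and what the kernel is holding as cache and buffers. This is a rough sketch that assumes a Linux host exposing the usual `MemTotal`, `MemFree`, `Buffers` and `Cached` fields in /proc/meminfo; the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # Skip anything that is not a "Name: value" line.
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_breakdown(info):
    """Split total memory into free, cache/buffers, and the remainder."""
    total = info["MemTotal"]
    free = info["MemFree"]
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "total_kb": total,
        "free_kb": free,
        "cache_kb": cache,
        # Simple estimate: whatever is neither free nor cache/buffers.
        "used_kb": total - free - cache,
    }

# On a real node:
#   with open("/proc/meminfo") as fh:
#       print(memory_breakdown(parse_meminfo(fh.read())))
```

If `used_kb` stays a small fraction of `total_kb` over a representative period, that would support the impression that the master has far more RAM than the daemons need.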
The other thing I was curious about is what kind of spec seems reasonable to you for a login node. My one cluster with separate login nodes has similar specs to the master nodes - 8 cores, 24 GB memory and it seems wasted. I can see an argument for these nodes to be more than just a low end box, especially if anybody is trying to do some kind of visualization on them, but I've never had complaints about them being under-powered yet.
Any thoughts you might have are appreciated.

Thanks
Biggles


