Just to add 2 points to what Chris wrote: The number of cores is less important. What matters is enough memory, fast access to the storage where spooling goes and, on a really busy system, the performance of the cores. Depending on the exact workload profile and policy set-up, one can become more important than the others.
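To put a rough number on the spool storage point, something like the Python sketch below might help. It is only an illustration: the function name, the SPOOL_DIR environment variable and the write counts are assumptions made for this example, not anything Grid Engine ships, and timing small synced writes is just a proxy for how the qmaster actually spools.

```python
import os
import tempfile
import time


def fsync_write_latency(directory, writes=200, size=512):
    """Return the mean seconds per small synced write in `directory`."""
    payload = b"x" * size
    # Scratch file inside the directory under test; removed afterwards.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push each write to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    # SPOOL_DIR is a hypothetical variable for this sketch; point it at the
    # file system that holds your qmaster spool directory.
    spool_dir = os.environ.get("SPOOL_DIR", tempfile.gettempdir())
    latency_ms = fsync_write_latency(spool_dir) * 1e3
    print(f"{latency_ms:.3f} ms per synced write in {spool_dir}")
```

Running it once against local disk and once against the shared file system gives a quick feel for how much slower the spool location is, which is usually more telling than the core count.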
Cheers,
Fritz

On Wed, Jul 20, 2016 at 5:57 PM, Chris Dagdigian <[email protected]> wrote:

>
> In environments where you do tens of thousands of jobs per day or tons of
> really short jobs or a constant flow of jobs always active you may need a
> master node that is somewhat beefy. If you've never seen your head node get
> slammed then you can downsize. If there is a chance that your workload
> could change significantly then keep the size as is.
>
> I'm in favor of massive login nodes. They are often used by users who are
> prototyping job scripts and we can't always train them to 'qlogin' or
> 'qrsh' into a remote node for testing. All you need is a couple of people
> running large R or Matlab tasks plus some other people doing a massive set
> of array job prep combined with a couple of people who constantly "qstat"
> and you can run the login node out of resources pretty quickly.
>
> The cost of CPU and RAM at this scale is dirt cheap. Effectively noise
> relative to cost of networking and storage so I also tend to make login and
> interactive nodes larger than strictly necessary.
>
> My $.02!
>
> Chris
>
>
> Notorious Biggles wrote:
>
>> Hi all,
>>
>> I have some money available to replace the infrastructure nodes of one of
>> my company's grid engine clusters and I wanted a sanity check before I
>> order anything new.
>>
>> Initially we contacted the company we originally bought the cluster from
>> and they quoted us for a combined login/storage/master node with loads of
>> everything and a hefty price tag. I feel an aversion to combining login
>> nodes with storage and master nodes - we already have that on one of the
>> clusters and a user being able to crash the entire cluster seems a bad
>> thing to me and it happened often enough.
>>
>> I read Rayson's blog post about scaling grid engine to 10k nodes at
>> http://blogs.scalablelogic.com/2012/11/running-10000-node-grid-engine-cluster.html
>> and it seems that 4 cores and 1 GB of memory is more than enough to run a
>> grid engine master. Given that I'd be lucky to have 100 nodes to a master,
>> can anybody see a reason to spec a high powered master node? I look at my
>> existing master nodes with 8+ cores and 24+ GB of memory and in Ganglia all
>> I see is acres of green from memory being used as cache and buffers. It
>> seems rather a waste.
>>
>> The other thing I was curious about is what kind of spec seems reasonable
>> to you for a login node. My one cluster with separate login nodes has
>> similar specs to the master nodes - 8 cores, 24 GB memory and it seems
>> wasted. I can see an argument for these nodes to be more than just a low
>> end box, especially if anybody is trying to do some kind of visualization
>> on them, but I've never had complaints about them being under-powered yet.
>>
>> Any thoughts you might have are appreciated.
>>
>> Thanks
>> Biggles
>>
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
>
