Just to add two points to what Chris wrote: the number of cores matters
less. What matters is enough memory, fast access to the storage where
spooling goes and, on a really busy system, the performance of the cores.
Depending on the exact workload profile and policy set-up, one can become
more important than the others.
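
A rough way to compare candidate spool filesystems is to time small synced writes. The sketch below is only an illustration in Python: the function name, write count and block size are assumptions of mine, not anything Grid Engine ships, and it does not reproduce the master's real I/O pattern.

```python
import os
import tempfile
import time


def mean_fsync_latency(directory, writes=50, size=4096):
    """Return the mean seconds per small synchronous write in `directory`."""
    payload = b"x" * size
    # Create the scratch file inside the directory being measured.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to stable storage
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.remove(path)
    return elapsed / writes


if __name__ == "__main__":
    # Hypothetical target: replace with the filesystem that holds the spool.
    target = tempfile.gettempdir()
    print(f"{mean_fsync_latency(target) * 1000:.3f} ms per synced write")
```

Running it against local disk and against a network share gives a first impression of how much slower synced writes are on the latter; treat the numbers as relative, not absolute.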

Cheers,

Fritz

On Wed, Jul 20, 2016 at 5:57 PM, Chris Dagdigian <[email protected]> wrote:

>
> In environments where you do tens of thousands of jobs per day, or tons of
> really short jobs, or a constant flow of jobs always active, you may need a
> master node that is somewhat beefy. If you've never seen your head node get
> slammed then you can downsize. If there is a chance that your workload
> could change significantly then keep the size as is.
>
> I'm in favor of massive login nodes. They are often used by users who are
> prototyping job scripts and we can't always train them to 'qlogin' or
> 'qrsh' into a remote node for testing. All you need is a couple of people
> running large R or Matlab tasks plus some other people doing a massive set
> of array job prep combined with a couple of people who constantly "qstat"
> and you can run the login node out of resources pretty quickly.
>
> The cost of CPU and RAM at this scale is dirt cheap, effectively noise
> relative to the cost of networking and storage, so I also tend to make
> login and interactive nodes larger than strictly necessary.
>
> My $.02!
>
> Chris
>
>
>
>
> Notorious Biggles wrote:
>
>> Hi all,
>>
>> I have some money available to replace the infrastructure nodes of one of
>> my company's grid engine clusters and I wanted a sanity check before I
>> order anything new.
>>
>> Initially we contacted the company we originally bought the cluster from
>> and they quoted us for a combined login/storage/master node with loads of
>> everything and a hefty price tag. I feel an aversion to combining login
>> nodes with storage and master nodes - we already have that on one of the
>> clusters and a user being able to crash the entire cluster seems a bad
>> thing to me and it happened often enough.
>>
>> I read Rayson's blog post about scaling grid engine to 10k nodes at
>> http://blogs.scalablelogic.com/2012/11/running-10000-node-grid-engine-cluster.html
>> and it seems that 4 cores and 1 GB of memory is more than enough to run a
>> grid engine master. Given that I'd be lucky to have 100 nodes to a master,
>> can anybody see a reason to spec a high powered master node? I look at my
>> existing master nodes with 8+ cores and 24+ GB of memory and in Ganglia all
>> I see is acres of green from memory being used as cache and buffers. It
>> seems rather a waste.
>>
>> The other thing I was curious about is what kind of spec seems reasonable
>> to you for a login node. My one cluster with separate login nodes has
>> similar specs to the master nodes - 8 cores, 24 GB memory and it seems
>> wasted. I can see an argument for these nodes to be more than just a low
>> end box, especially if anybody is trying to do some kind of visualization
>> on them, but I've never had complaints about them being under-powered yet.
>>
>> Any thoughts you might have are appreciated.
>>
>> Thanks
>> Biggles
>>
>>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
>