Hi Kent,

>
> *Here’s my question*: In our case, we have nodes that have 8, 12 and even
> 24 cores. A much different world than not too long ago when everything
> was single or dual or (max) quad core. And while I have no doubt the boxes
> can handle the CPU load of many jobs, I think I’m hitting network
> limitations and stuff is getting dropped. Can anyone here speak to
> opinions, experiences, etc. when it comes to “max simultaneous jobs per
> execution host” as it relates to networking?

I have a cluster of 94 nodes, all dual hexacore (12 cores per node)
and using QDR IB. I have not run into any networking issues as a
result of an overloaded network, and my storage system (when healthy) has
withstood multiple nodes each running 12 jobs, as well as loads of
1000 jobs per day across the cluster as a whole (it has done over 500K
jobs in 9 months of production, and just recently over 100K jobs in two
months).

Ian

-- 
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
