Hi Kent,

> *Here’s my question*: In our case, we have nodes that have 8, 12 and even
> 24 cores. A much different world than not too long ago when everything
> was single or dual or (max) quad core. And while I have no doubt the boxes
> can handle the cpu load of many jobs, I think I’m hitting network
> limitations and stuff is getting dropped. Can anyone here speak to
> opinions, experiences, etc- when it comes to “max simultaneous jobs per
> executions host” as relates to networking?

I have a cluster of 94 nodes, all dual hexacore (12 cores per node), using
QDR IB. I have not run into any networking issues from an overloaded
network. My storage system (when healthy) has withstood multiple nodes
running 12 jobs each, as well as loads of 1000 jobs per day across the
cluster as a whole (it has done over 500K jobs in 9 months of production,
and just recently over 100K jobs in two months).

Ian

--
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering
ikaufman AT ucsd DOT edu
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users