>> > "Owned compute" has some advantages over "rented compute." In general, the
>> > control one has over one's owned resources enables applications to run with
>> > greater performance. Some optimizations just demand root access!
>
> As someone who has been Scientific Computing/HPC System Admin, I can
> tell you this is a complete myth.
You need at least access to the person with root access. If (s)he's a colleague who needs to keep 10 more users in the department happy, you can sit down and find a solution if your program doesn't work. Try to convince Amazon that you need SHMMAX increased, and that their default processor affinity settings slow your code down.

Herbert

Herbert, hello again! You make a very good point there.

Very often I find that when diagnosing problems in HPC you have to have root access, and you have to start with the simple things first, which are almost always the root cause.

For instance, when user jobs crash or are going slowly, I'm often asked "Is the Infiniband down?" I always, always take a step back and look at the nodes running the jobs: look in their system logs etc. Often you will see things such as port ranges being exhausted for rsh (not on current systems, but a long time ago) or OOM killer events, or just that processes from old jobs haven't died properly and are still running on the machine.

99% of the time it is just simple system admin which points the way. And as you say, you need root access to (say) increase port ranges or futz with the OOM killer tunings.

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
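
As a rough illustration of the "look at the node logs first" step described above, here is a minimal Python sketch that scans kernel log text for OOM killer events. The function name, the sample log lines and the regular expression are assumptions made for this example; the exact wording of the kernel message differs between kernel versions, so treat the pattern as a starting point rather than a definitive parser.

```python
import re

# Assumed message shape: "Out of memory: Kill process 1234 (name) ..." on older
# kernels and "Out of memory: Killed process 1234 (name) ..." on newer ones.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_events(log_text):
    """Return a list of (pid, command) tuples for OOM kills found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = OOM_PATTERN.search(line)
        if match:
            events.append((int(match.group(1)), match.group(2)))
    return events


# Hypothetical sample log lines, made up for this example.
SAMPLE_LOG = """\
Jan 10 03:12:01 node042 kernel: Out of memory: Kill process 4711 (solver.x) score 905 or sacrifice child
Jan 10 03:12:01 node042 kernel: Killed process 4711 (solver.x) total-vm:65011712kB
Jan 10 03:15:44 node042 sshd[991]: Accepted publickey for alice
"""

if __name__ == "__main__":
    for pid, command in find_oom_events(SAMPLE_LOG):
        print(f"OOM killer hit pid {pid} ({command})")
```

On a real node the text would come from the kernel log rather than a sample string, and reading that log may itself need root, depending on how the system is configured.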
