>> >
>> >"Owned compute" has some advantages over "rented compute."  In general, the
>> >control one has over one's owned resources enables applications to run with
>> >greater performance.  Some optimizations just demand root access!
>> >
> As someone who has been a Scientific Computing/HPC System Admin, I can
> tell you this is a complete myth.

You need at least access to the person with root access. If (s)he's a colleague 
who needs to keep 10 more users in the department happy, you can sit down and 
find a solution if your program doesn't work. Try to convince Amazon that you 
need SHMMAX increased, and that their default processor affinity settings slow 
your code down.

   Herbert

Herbert, hello again!

You make a very good point there.
Very often I find that to diagnose problems in HPC you have to have root
access, and you have to start with the simple things first, since they are
almost always the root cause.

For instance, when user jobs crash or are running slowly, I'm often asked
"Is the Infiniband down?"
I always, always take a step back and look at the nodes running the jobs:
their system logs and so on.
Often you will see things such as port ranges being exhausted for rsh (not on
current systems, but a long time ago) or OOM killer events, or just that
processes from old jobs haven't died properly and are still running on the
machine.
99% of the time it is just simple system admin which points the way.
And as you say, you need root access to (say) increase port ranges or futz with
the OOM killer tunings.
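
As a rough illustration of that "look at the node first" step, here is a
minimal Python sketch, not a definitive tool. It scans a system log for lines
that look like OOM killer events and prints a few kernel tunables. The function
names are made up for this example. The log path /var/log/messages is an
assumption (it differs between distributions and usually needs root to read),
and the exact kernel message wording varies between kernel versions. Which
port-range setting matters for a given site is also an assumption;
ip_local_port_range is shown only as an example.

```python
import re
from pathlib import Path

# Phrases the Linux kernel commonly logs when the OOM killer fires.
# Wording differs between kernel versions; treat this as a starting point.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process", re.IGNORECASE
)


def find_oom_events(log_lines):
    """Return the log lines that look like OOM killer events."""
    return [line.rstrip("\n") for line in log_lines if OOM_PATTERN.search(line)]


def read_tunable(name, proc_sys=Path("/proc/sys")):
    """Read a sysctl value such as 'kernel/shmmax'; None if it cannot be read."""
    try:
        return (proc_sys / name).read_text().strip()
    except OSError:
        return None


def main(log_path="/var/log/messages"):
    # Assumed log location; adjust for the distribution in use.
    try:
        with open(log_path, errors="replace") as fh:
            for event in find_oom_events(fh):
                print(event)
    except OSError as exc:
        print(f"cannot read {log_path}: {exc}")

    # Example tunables only; reading is harmless, changing them needs root.
    for name in ("kernel/shmmax",
                 "net/ipv4/ip_local_port_range",
                 "vm/overcommit_memory"):
        print(name, "=", read_tunable(name))


if __name__ == "__main__":
    main()
```

Run on a compute node, this only reports what is there. Actually changing any
of these values still needs root, which is the whole point above.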


_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
