my $.02
SGE can run 100% local without NFS - the main thing (in my experience)
that you lose in this config is the easy troubleshooting ability of going
into a central $SGE_ROOT/$SGE_CELL/ and seeing all of the various node
spool and message files. It's annoying but not a dealbreaker, especially
given what you are experiencing.
That said, I do a ton of SGE work with classic spooling on EMC Isilon
storage, including some environments that push close to 1 million
jobs/month in throughput, and we've never seen a catastrophic loss of
jobs or spool data. Most are without Bright, although I know of at least
one group running Bright on 1000 cores sitting on top of Isilon storage
and they've not seen anything like this either.
If you go 100% local my recommendation would just be to put the whole
$SGE_ROOT out on the local nodes. The time it would take to winnow down
to the minimal file set is not worth it relative to the size of the
whole thing.
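
If you do move things local, it is worth confirming on each node that
nothing SGE touches still resolves to an NFS mount. Below is a minimal
sketch of one way to check, assuming Linux nodes where /proc/mounts is
readable. The function names (parse_mounts, fs_type_for, is_on_nfs) are
hypothetical helpers written for this illustration, not part of SGE or
Bright.

```python
import os

# Filesystem type strings that /proc/mounts reports for NFS mounts.
NFS_TYPES = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Format: device mount_point fs_type options dump pass
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, entries):
    """Return the fs type of the longest mount point containing path."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type in entries:
        mp = os.path.normpath(mount_point)
        if path == mp or mp == "/" or path.startswith(mp + "/"):
            # ">=" lets a later mount on the same point shadow an earlier one.
            if best is None or len(mp) >= len(best[0]):
                best = (mp, fs_type)
    return best[1] if best else None


def is_on_nfs(path, entries):
    """True if path is backed by an NFS mount according to entries."""
    return fs_type_for(path, entries) in NFS_TYPES


if __name__ == "__main__":
    sge_root = os.environ.get("SGE_ROOT")
    if sge_root and os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            entries = parse_mounts(f.read())
        # realpath resolves symlinks so the check sees the real location.
        resolved = os.path.realpath(sge_root)
        print(resolved, "->", fs_type_for(resolved, entries))
```

Run on an exec node with $SGE_ROOT set, this prints the filesystem type
backing the resolved path. The same check can be pointed at the spool
directory to confirm it is local.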
-Chris
Peskin, Eric <mailto:[email protected]>
November 12, 2014 at 8:26 AM
All,
Does SGE have to use NFS or can it work locally on each node?
If parts of it have to be on NFS, what is the minimal subset?
How much of this changes if you want redundant masters?
We have a cluster running CentOS 6.3, Bright Cluster Manager 6.0, and
SGE 2011.11. Specifically, SGE is provided by a Bright package:
sge-2011.11-360_cm6.0.x86_64
Twice, we have lost all the running SGE jobs when the cluster failed
over from one head node to the other. =( Not supposed to happen.
Since then, we have also had many individual jobs get lost. The latter
situation correlates with messages in the system logs saying
That file lives on an NFS mount on our Isilon storage.
Surely, the executables don't have to be on NFS?
Interestingly, we are using local spooling; the spool directory on each
node is /cm/local/apps/sge/var/spool , which is, indeed local.
But the $SGE_ROOT , /cm/shared/apps/sge/2011.11 lives on NFS.
Does any of it need to?
Maybe just the var part would need to: /cm/shared/apps/sge/var ?
Thanks,
Eric
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users