[gridengine users] SGE supports heterogeneous network?

2015-01-26 Thread Sangmin Park
Hello, We have three HPC systems called A, B, and C, all accessible through the login node. SGE is installed on the login node. The A and B HPC systems each consist of a master node and compute nodes, connected by gigabit Ethernet. But the C HPC system has ideal configurat

Re: [gridengine users] Cannot request resource if it is a load value of memory type: SGE reports it as unknown resource

2015-01-26 Thread Ilya M
Yes, it does list the nodes OK:
>qhost -F mem_free -l mem_free=80g
gpu001 lx24-amd64 16 3.24 126.1G 24.9G 4.0G 0.0
    Host Resource(s): hl:mem_free=101.250G
gpu002 lx24-amd64 16 2.21 126.1G 27.4G 4.0G 0.0
    Host Resource(s): hl:mem_free=98.770G
gp
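For anyone who wants to check such a listing from a script, here is a minimal sketch that reads text in the shape of the qhost output quoted above and returns the hosts whose hl:mem_free value meets a threshold. The function names (parse_mem, hosts_with_mem_free) are made up for illustration and are not part of SGE. The suffix handling (lowercase = powers of 1000, uppercase = powers of 1024) reflects my understanding of the sge_types man page and should be treated as an assumption.

```python
import re

# Assumed multipliers: lowercase suffixes are decimal, uppercase are binary.
_UNITS = {
    "k": 1000, "m": 1000**2, "g": 1000**3, "t": 1000**4,
    "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4,
}

def parse_mem(value):
    """Convert a string such as '101.250G' or '80g' to a number of bytes."""
    match = re.fullmatch(r"([0-9.]+)([kKmMgGtT]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised memory value: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS.get(unit, 1)

def hosts_with_mem_free(qhost_text, minimum):
    """Return host names whose reported hl:mem_free is at least `minimum`."""
    threshold = parse_mem(minimum)
    hosts = []
    current = None  # host name from the most recent unindented line
    for line in qhost_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        found = re.search(r"hl:mem_free=(\S+)", line)
        if found:
            if current is not None and parse_mem(found.group(1)) >= threshold:
                hosts.append(current)
        elif not line.startswith((" ", "\t")):
            current = fields[0]
    return hosts

# Sample text copied from the two complete host entries in the message.
SAMPLE = """\
gpu001 lx24-amd64 16 3.24 126.1G 24.9G 4.0G 0.0
    Host Resource(s): hl:mem_free=101.250G
gpu002 lx24-amd64 16 2.21 126.1G 27.4G 4.0G 0.0
    Host Resource(s): hl:mem_free=98.770G
"""

if __name__ == "__main__":
    print(hosts_with_mem_free(SAMPLE, "80g"))
```

This only inspects what qhost already reports; it says nothing about why a job request with the same resource might be rejected as unknown.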

Re: [gridengine users] serial and mpi jobs running on the same nodes

2015-01-26 Thread Reuti
Hi, On 26.01.2015 at 21:39, Winkler, Ursula (ursula.wink...@uni-graz.at) wrote: > I'm trying to find a solution for an environment running serial jobs as well as MPI jobs on 6 hosts where each host has 32 cores/slots. Due to the small number of nodes, assigning each kind of job to se

[gridengine users] serial and mpi jobs running on the same nodes

2015-01-26 Thread Winkler, Ursula (ursula.wink...@uni-graz.at)
Hi gridengine mailing list members, I'm trying to find a solution for an environment running serial jobs as well as MPI jobs on 6 hosts where each host has 32 cores/slots. Due to the small number of nodes, assigning each kind of job to separate nodes (e.g. nodes 1-2 for serial, nodes 3-6 for m

Re: [gridengine users] Huge amount of files generated in local disk

2015-01-26 Thread Reuti
Hi, > On 26.01.2015 at 17:15, Feng Zhang wrote: > I just found a strange behavior of SGE 2011. > One user's job generates 1+ million small files on local disk ($TEMPDIR). Hence in the local scratch directory provided by SGE? > It looks like it makes the execd very busy and from the si

[gridengine users] Huge amount of files generated in local disk

2015-01-26 Thread Feng Zhang
Hi Guys, I just found a strange behavior of SGE 2011. One user's job generates 1+ million small files on local disk ($TEMPDIR). It looks like it makes the execd very busy, and from the side of qmaster the node is lost and unavailable, while I can still log in via ssh. On the node, execd generates huge I/O ( a f
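To confirm that a job's scratch directory really holds that many files, a small sketch like the following could be run on the node. The function name count_files and the optional limit parameter are assumptions for illustration, not anything SGE provides; the directory to inspect is passed in by the caller.

```python
import os

def count_files(root, limit=None):
    """Count regular directory entries reported as files under `root`.

    If `limit` is given, stop walking as soon as the count exceeds it,
    so a directory with millions of files does not have to be fully scanned.
    """
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
        if limit is not None and total > limit:
            break
    return total

if __name__ == "__main__":
    import sys
    # Usage: python count_files.py /path/to/scratch
    print(count_files(sys.argv[1], limit=1_000_000))
```

The early exit matters here because walking a tree with over a million entries adds I/O of its own on a node that is already struggling.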

Re: [gridengine users] Epilog to print out usage summary?

2015-01-26 Thread James Abbott
Should do...some processes which hold their log files open need to be HUPped or restarted after rotating the logs, but although 'fuser' reports the accounting file is open the qmaster doesn't seem to have a problem with this configuration, so the reporting file will probably be the same. If nec