On 01.11.2013 at 14:39, Sylvain Foisy Ph. D. wrote:

> Hi Reuti,
> 
> Everything seems to be working fine now. My $SGE_ROOT is located on a SAN 
> volume, connected to my cluster via NFS. Could network saturation issues 
> cause this type of behaviour?

Yes. It's best to keep the spool directories for the exechosts local as well, 
instead of having everything in a shared $SGE_ROOT: 
http://arc.liv.ac.uk/SGE/howto/nfsreduce.html
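
To see where a spool directory currently lives, a minimal sketch like the 
following may help. It is only an illustration: the function name 
`check_spool_dir` is made up here, and `df -T` assumes GNU coreutils (the 
script falls back to plain `df -P` otherwise). It prints the filesystem type 
(so an NFS mount shows up as such), block and inode usage, and whether a 
scratch file can be created.

```shell
#!/bin/sh
# check_spool_dir is a hypothetical helper, not part of Grid Engine.
# It reports filesystem type, usage, and writability of one directory.
check_spool_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Filesystem type plus block usage; -T is a GNU extension, so fall
    # back to plain POSIX output if it is not supported.
    df -PT "$dir" 2>/dev/null || df -P "$dir"
    # Inode usage; silently skipped if this df has no -i option.
    df -Pi "$dir" 2>/dev/null || true
    # Create and remove a scratch file to confirm the directory is writable
    # for the user running this check.
    probe="$dir/.spool_probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}

# Example call (cell name "default" assumed):
#   check_spool_dir "$SGE_ROOT/default/spool/qmaster"
```

Run it as the user the daemons run as, since a directory that is writable 
for root on the node need not be writable for that user over NFS.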

Unless you have a shadow qmaster, the qmaster's spool directory should also be 
local, IMO.

During installation you can specify different directories for the qmaster spool 
(must already exist) and the exechost spool (a subdirectory named after each 
exechost is created when that exechost starts up). The latter can also be 
changed quite easily by editing each exechost's configuration 
(`qconf -mconf node001`...).
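
As a rough sketch of how one might look up the current setting before 
editing it: `qconf -sconf` prints a host's local configuration (or the global 
one), and `execd_spool_dir` is the parameter in question. The function name 
`show_execd_spool_dir` and the host name are only placeholders for 
illustration.

```shell
#!/bin/sh
# show_execd_spool_dir is a hypothetical helper, not part of Grid Engine.
# It prints the execd_spool_dir line from the host's local configuration
# and falls back to the global configuration if the host has no override.
show_execd_spool_dir() {
    host=$1
    qconf -sconf "$host" 2>/dev/null | grep '^execd_spool_dir' ||
        qconf -sconf global | grep '^execd_spool_dir'
}

# Example (host name taken from the command above):
#   show_execd_spool_dir node001
# To change the value, edit the host's configuration interactively:
#   qconf -mconf node001
```

As far as I know, the new directory has to exist on the node's local disk and 
be writable for the admin user, and the execd on that node needs a restart 
before the change takes effect.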

-- Reuti


> Thanks in advance
> 
> S 
> 
> On 2013-10-31, at 1:25 PM, Reuti wrote:
> 
>> Hi,
>> 
>> On 31.10.2013 at 15:38, Sylvain Foisy Ph. D. wrote:
>> 
>>> I sent a whole bunch of next-gen sequencing alignment jobs to our cluster. 
>>> They completed just fine on the slaves, but my qmaster process died along 
>>> the way and I had to restart it. Following this, I tried to submit 
>>> sleeper.sh test jobs to check if everything was fine, but they get stuck in 
>>> the queue in qw state and are never dispatched for execution. When I look 
>>> into the qmaster log file, I see this message a number of times (I guess 
>>> once each time the master tries to dispatch):
>>> 
>>> rule "default rule (spool dir)" in spooling context "flatfile spooling" 
>>> failed writing an object
>>> 
>>> Ok, I did my googling on this and found out that the problem is lack of 
>>> space for spooling into the $SGE_ROOT folder. All good and fine but my df 
>>> inspection shows me that my $SGE_ROOT is only at 90% free...
>> 
>> The spool directory is at the location you specified during installation. So 
>> are all the flat files in $SGE_ROOT/default/spool/qmaster? Is this location 
>> writable too?
>> 
>> -- Reuti
>> 
>> 
>>> Before I go and restart the master server, is there anything that I should 
>>> be looking for?
>>> 
>>> Best regards and thanks in advance
>>> 
>>> Sylvain
>>> 
>>> ==============================================================
>>> Sylvain Foisy, Ph. D.
>>> Chargé de projet | Project Manager
>>> Bioinformatics
>>> Labo. de génétique et médecine génomique de l'inflammation
>>> Centre de recherche
>>> Institut de cardiologie de Montréal
>>> 5000 Bélanger Est
>>> Montréal, Qc  H1T 1C8
>>> CANADA
>>> ==============================================================
>>> 
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
>> 
>> 
> 
> 

