On 20.05.2013 at 15:50, Jacques Foucry wrote:

> On 15/05/2013 18:23, Reuti wrote:
> Hello,
> 
>> It's necessary to specify a cluster queue or queue domain/instance (see
>> `man sge_types` => wc_queue):
>> 
>> $ qsub -q "*@sge1" test.sh
>> 
>> You can also request a dedicated host this way:
>> 
>> $ qsub -l h=sge1 test.sh
> 
> Ok, understood.
> 
> I still cannot launch a script on the sge1 host.
> 
> qstat -f does not show its queue:
> 
> # qstat -f
> queuename                      qtype resv/used/tot. load_avg arch    states
> ---------------------------------------------------------------------------------
> [email protected]             BIP   0/0/1          0.00     linux-x64
> ---------------------------------------------------------------------------------
> [email protected]           BIP   0/0/1          -NA-     -NA-    au

Does `qhost` show similar output?


> On this host it seems that sge_execd is not running, so I tried to start it
> by hand:
> 
> # sudo /etc/init.d/sge_execd start
> sed: can't read dist/util/rctemplates/sgeexecd_template: No such file or directory
>   starting sge_execd

Maybe an old instance is still running:

`ps -e f | grep sge`

There might also be a file in /tmp with some information about it.
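
A rough sketch of that check, assuming a Linux host where `netstat -ltn` is available. Note that `netstat -ntu` lists only established connections (hence "w/o servers" in its header), so a socket that is merely bound and listening on 6445 would not appear there. The helper name `port_is_listening` is made up for illustration and is not part of SGE; it only parses netstat-style text from stdin:

```shell
# Hypothetical helper (not part of SGE): reads netstat-style lines on stdin
# and succeeds if a TCP socket is in LISTEN state on the given port.
port_is_listening() {
    awk -v port="$1" '
        # $4 is the local address (addr:port), $6 is the socket state
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

On the execution host this might be used as `netstat -ltn | port_is_listening 6445 && echo "6445 is already bound"`. Together with `ps -e f | grep '[s]ge_execd'`, that should show whether a stale execd still holds the port.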

-- Reuti


> And it does not start:
> 05/20/2013 15:40:30|  main|sge1|E|communication error for "sge1.ns42.fr/execd/1" running on port 6445: "can't bind socket"
> 05/20/2013 15:40:31|  main|sge1|E|commlib error: can't bind socket (no additional information available)
> 05/20/2013 15:40:59|  main|sge1|C|abort qmaster registration due to communication errors
> 05/20/2013 15:41:00|  main|sge1|W|daemonize error: child exited before sending daemonize state
> 
> $SGE_CELL is correctly mounted from the qmaster, and I can launch jobs that
> are executed on the other host (sge0).
> 
> Port 6445 does not seem to be in use already:
> # netstat -ntu
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address               Foreign Address             State
> tcp        0      0 192.168.77.154:36491        192.168.77.150:6444         ESTABLISHED
> tcp        0      0 192.168.77.154:57260        192.168.72.12:389           ESTABLISHED
> tcp        0      0 192.168.77.154:57258        192.168.72.12:389           ESTABLISHED
> tcp        0      0 192.168.77.154:57256        192.168.72.12:389           ESTABLISHED
> tcp        0      0 192.168.77.154:22           192.168.64.10:38962         ESTABLISHED
> tcp        0      0 192.168.77.154:831          192.168.77.150:2049         ESTABLISHED
> tcp        0      0 192.168.77.154:57259        192.168.72.12:389           ESTABLISHED
> tcp        0      0 192.168.77.154:22           192.168.64.10:38889         ESTABLISHED
> tcp        0      0 192.168.77.154:57257        192.168.72.12:389           ESTABLISHED
> 
> I made a mistake somewhere, but I cannot figure out where.
> 
> Thanks for your help,
> Jacques
> -- 
> Jacques Foucry
> *NOVΛSPARKS *
> IT Manager
> Tel : +33 (0)1 42 68 12 61
> [email protected]
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
> 

