Am 29.08.2014 um 17:44 schrieb François-Michel L'Heureux:

> What I'm leaning toward is
>       • From docker, ssh to the server running the said docker
>       • Call qrsh from there
> Already tested and working!

But then you need to know beforehand where you want to get a `qrsh` session? 
There is one docker running on each exechost - maybe I misunderstand your 
configuration, as I don't use docker.

-- Reuti


> 2014-08-29 11:35 GMT-04:00 Reuti <[email protected]>:
> Hi,
> 
> Am 29.08.2014 um 16:01 schrieb François-Michel L'Heureux:
> 
> > I've made some tests and it seems that regarding the connection flow it's 
> > the opposite. On the submit host, I activated debugging and launched qrsh.
> >
> > I got
> >
> >      9   4931         main     starting commlib server
> >     10   4931         main     trying to create commlib handle
> >     11   4931         main     (*handle)->connect_port = 0
> >     12   4931         main     (*handle)->service_port = 46780
> >     13   4931         main     B E F O R E     S E N D I N G! ! ! ! ! ! ! ! ! ! ! ! ! !
> >     14   4931         main     =====================================================
> >     15   4931         main     sge_set_auth_info: username(uid) = root(0), groupname = root(0)
> >     16   4931         main     JSV client context
> >     17   4931         main     JSV list for current thread updated
> >     18   4931         main     job id is: 32
> >     19   4931         main     R E A D I N G    J O B ! ! ! ! ! ! ! ! ! ! !
> >     20   4931         main     ============================================
> >     21   4931         main     random polling set to 3
> >     22   4931         main     waiting for connection
> >
> > Then ctrl+z, ran "netstat -plnt" and got
> >
> > Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> > tcp        0      0 0.0.0.0:46780           0.0.0.0:*               LISTEN      4931/qrsh
> >
> > So it seems that it is in fact the submit host that opens a port and waits 
> > for the exec host to connect.
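
The check described above (find which port `qrsh` is listening on) can be sketched in Python. This is only an illustration under assumptions: the function name `listening_ports` and the variable `sample` are hypothetical, and the parser expects the column layout of the `netstat -plnt` output quoted in this message (program name in the seventh column as `PID/name`).

```python
from typing import List


def listening_ports(netstat_output: str, program: str) -> List[int]:
    """Return TCP ports in LISTEN state owned by `program`.

    Assumes the column layout of `netstat -plnt`:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
    """
    ports = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a TCP socket row.
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        # The last column looks like "4931/qrsh"; it is "-" without privileges.
        _, _, name = fields[6].partition("/")
        if name == program:
            # The local address looks like "0.0.0.0:46780" (or ":::46780" for tcp6).
            ports.append(int(fields[3].rsplit(":", 1)[1]))
    return ports


# Sample taken from the netstat output quoted in this thread.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:46780           0.0.0.0:*               LISTEN      4931/qrsh
"""

print(listening_ports(sample, "qrsh"))
```

In practice the text would come from running `netstat -plnt` while `qrsh` is waiting for the connection; the port it reports is the one the exec host has to be able to reach.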
> 
> Aha, interesting. Maybe it's different when it's set up to use `ssh` or 
> `rsh`.
> 
> http://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
> 
> 
> > Since docker doesn't currently allow binding ports on demand, I think I have 
> > to declare my attempt a failure. (A very instructive one though!) It may 
> > work someday if docker changes the way it handles ports, but in its current 
> > state I don't see what can be done. I'll keep thinking about it...
> 
> What I can think of: after you know the port you need to forward, `ssh` from 
> the exechost to the submithost for this user and use a remote forward for 
> this port.
> 
> -- Reuti
> 
> 
> > Thanks a lot.
> > Mich
> >
> >
> > 2014-08-29 6:27 GMT-04:00 Reuti <[email protected]>:
> > Hi,
> >
> > Am 28.08.2014 um 23:28 schrieb François-Michel L'Heureux:
> >
> > > Ok, I'm pretty sure now that my issue is related to port exposure.
> > > Is there a known/configurable port range used by SGE? There seems to be 
> > > none...
> >
> > SGE will instruct the shepherd on the selected exechost to start a daemon 
> > (either builtin/sshd/rshd) solely for your job and listen on a randomly 
> > selected port. On the other side, the submission host will be told to 
> > connect to the exechost on exactly this port.
> >
> > I'm also not aware of any option to restrict this to a range of ports 
> > (besides doing it somewhere in the source).
> >
> > -- Reuti
> >
> >
> > > 2014-08-28 16:57 GMT-04:00 François-Michel L'Heureux 
> > > <[email protected]>:
> > > Hello!
> > >
> > > I'll start by admitting that I'm working on a somewhat complex setup: I'm 
> > > trying to submit jobs to SGE from a docker environment.
> > >
> > > So far, I've managed to mount the proper directory, run the installation 
> > > and so on. qhost and qsub work perfectly well. qrsh and qlogin however 
> > > don't. When I check /opt/sge6/default/spool/qmaster/messages I have
> > >
> > > 08/28/2014 20:41:31|worker|master|W|job 23.1 failed on host master assumedly after job because: can't read usage file for job 23.1
> > >
> > > for all of my qrsh/qlogin attempts.
> > >
> > > Googling the error did not help much. Anyone ever encountered that?
> > >
> > > Thanks a lot!
> > > Mich
> > >
> > > _______________________________________________
> > > users mailing list
> > > [email protected]
> > > https://gridengine.org/mailman/listinfo/users
> >
> >
> 
> 


