The docker is on the submit host.

2014-08-29 11:57 GMT-04:00 Reuti <[email protected]>:

> On 29.08.2014 at 17:44, François-Michel L'Heureux wrote:
>
> > What I'm leaning toward is
> >       • From docker, ssh to the server running the said docker
> >       • Call qrsh from there
> > Already tested and working!
>
> But then you need to know beforehand where you want to get a `qrsh`
> session? There is one docker running on each exechost - maybe I'm
> misunderstanding your configuration, as I don't use docker.
>
> -- Reuti
>
>
> > 2014-08-29 11:35 GMT-04:00 Reuti <[email protected]>:
> > Hi,
> >
> > On 29.08.2014 at 16:01, François-Michel L'Heureux wrote:
> >
> > > I ran some tests, and it seems that the connection flow is actually
> > > the opposite. On the submit host, I activated debugging and launched
> > > qrsh.
> > >
> > > I got
> > >
> > >      9   4931         main     starting commlib server
> > >     10   4931         main     trying to create commlib handle
> > >     11   4931         main     (*handle)->connect_port = 0
> > >     12   4931         main     (*handle)->service_port = 46780
> > >     13   4931         main     B E F O R E     S E N D I N G! ! ! ! ! ! ! ! ! ! ! ! ! !
> > >     14   4931         main     =====================================================
> > >     15   4931         main     sge_set_auth_info: username(uid) = root(0), groupname = root(0)
> > >     16   4931         main     JSV client context
> > >     17   4931         main     JSV list for current thread updated
> > >     18   4931         main     job id is: 32
> > >     19   4931         main     R E A D I N G    J O B ! ! ! ! ! ! ! ! ! ! !
> > >     20   4931         main     ============================================
> > >     21   4931         main     random polling set to 3
> > >     22   4931         main     waiting for connection
> > >
> > > Then ctrl+z, ran "netstat -plnt" and got
> > >
> > > Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> > > tcp        0      0 0.0.0.0:46780           0.0.0.0:*               LISTEN      4931/qrsh
> > >
> > > So it seems that it is in fact the submit host that opens a port and
> > > waits for the exec host to connect.
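The trace above (connect_port = 0, then a service_port of 46780 and a LISTEN socket owned by qrsh) matches the usual pattern of asking the kernel for any free port and then waiting for the peer to call back. A minimal Python sketch of that pattern follows. It is not SGE's commlib code, and the function names `open_callback_listener` and `connect_back` are made up for illustration.

```python
import socket


def open_callback_listener(host="127.0.0.1"):
    """Bind to port 0 so the kernel picks a free port, then listen on it.

    The netstat output in this thread shows qrsh bound to 0.0.0.0; this
    sketch defaults to loopback only so it is safe to run anywhere.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))          # port 0 means "any free port"
    srv.listen(1)
    port = srv.getsockname()[1]  # the port the kernel actually assigned
    return srv, port


def connect_back(host, port, payload):
    """Play the role of the peer that connects to the advertised port."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)


if __name__ == "__main__":
    srv, port = open_callback_listener()
    print("waiting for connection on port", port)
    connect_back("127.0.0.1", port, b"hello from the exec host side")
    conn, _addr = srv.accept()
    with conn:
        print(conn.recv(1024))
    srv.close()
```

Because the port is only chosen at bind time, it is not known before the container starts, which is why a fixed set of published ports does not help in this setup.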
> >
> > Aha, interesting. Maybe it's different when it's set up to use `ssh`
> > or `rsh`.
> >
> > http://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
> >
> >
> > > Since docker doesn't currently allow binding ports on demand, I think I
> > > have to declare my attempt a failure. (A very instructive one though!) It
> > > may work someday if docker changes the way it handles ports, but in its
> > > current state I don't see what can be done. I'll keep thinking about it...
> >
> > What I can think of: after you know the port you need to forward, `ssh`
> > from the exechost to the submithost for this user and use a remote forward
> > for this port.
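A sketch of what such a forward could look like once the port is known, written as a small Python helper that only builds the `ssh` command line and does not run it. The helper name, the user name and the host name are hypothetical. With `-R port:host:hostport`, the far end of the ssh connection listens and relays back to the near end; `-L` is the mirror image. Which of the two fits depends on which side has to reach the listening qrsh, so the helper can build either.

```python
def build_forward_command(port, user, remote_host, remote=True,
                          target_host="localhost"):
    """Return the argv for an ssh forward of a single TCP port.

    remote=True  -> -R: the ssh server side (remote_host) listens on `port`
                    and relays to target_host:port as seen from the ssh client.
    remote=False -> -L: the ssh client side listens on `port` and relays to
                    target_host:port as seen from remote_host.
    """
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % (port,))
    flag = "-R" if remote else "-L"
    spec = "%d:%s:%d" % (port, target_host, port)
    # -N: do not run a remote command, only keep the forward open.
    return ["ssh", "-N", flag, spec, "%s@%s" % (user, remote_host)]


if __name__ == "__main__":
    # 46780 is the service_port from the trace in this thread; the user and
    # host names are placeholders.
    print(" ".join(build_forward_command(46780, "mich", "submithost")))
```

Building the argument list separately from running it keeps the port, which changes on every qrsh call, as the only moving part.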
> >
> > -- Reuti
> >
> >
> > > Thanks a lot.
> > > Mich
> > >
> > >
> > > 2014-08-29 6:27 GMT-04:00 Reuti <[email protected]>:
> > > Hi,
> > >
> > > On 28.08.2014 at 23:28, François-Michel L'Heureux wrote:
> > >
> > > > Ok, I'm pretty sure now that my issue is related to port exposure.
> > > > Is there a known/configurable port range used by SGE? There seems to
> > > > be none...
> > >
> > > SGE will instruct the shepherd on the selected exechost to start a
> > > daemon (either builtin/sshd/rshd) solely for your job and listen on a
> > > randomly selected port. On the other side, the submission host will get
> > > the information to connect to the exechost on exactly this port.
> > >
> > > I'm also not aware of any option to restrict this to a range of ports
> > > (besides doing it somewhere in the source).
> > >
> > > -- Reuti
> > >
> > >
> > > > 2014-08-28 16:57 GMT-04:00 François-Michel L'Heureux <[email protected]>:
> > > > Hello!
> > > >
> > > > I'll start by admitting that I'm working on a somewhat complex
> > > > setup: I'm trying to submit jobs to SGE from a docker environment.
> > > >
> > > > So far, I've managed to mount the proper directory, run the
> > > > installation and so on. qhost and qsub work perfectly well. qrsh and
> > > > qlogin however don't. When I check /opt/sge6/default/spool/qmaster/messages
> > > > I have
> > > >
> > > > 08/28/2014 20:41:31|worker|master|W|job 23.1 failed on host master assumedly after job because: can't read usage file for job 23.1
> > > >
> > > > for all of my qrsh/qlogin attempts.
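When many such attempts pile up, it can help to pull the failing job ids out of the messages file. A rough Python sketch, assuming every line has the five pipe-separated fields seen in the line quoted above. The field names are illustrative labels rather than official SGE terminology, and `parse_line` and `failed_jobs` are hypothetical helpers.

```python
import re
from collections import namedtuple

# Labels for the five pipe-separated fields; chosen for this sketch only.
LogEntry = namedtuple("LogEntry", "timestamp thread host level text")

# Matches e.g. "job 23.1 failed on host master".
JOB_RE = re.compile(r"\bjob (\d+)\.(\d+) failed on host (\S+)")


def parse_line(line):
    """Split one messages line into a LogEntry, or return None if malformed."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogEntry(*parts)


def failed_jobs(lines):
    """Yield (job_id, task_id, host) for warning lines reporting a failed job."""
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry.level != "W":
            continue
        match = JOB_RE.search(entry.text)
        if match:
            yield int(match.group(1)), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    sample = ("08/28/2014 20:41:31|worker|master|W|job 23.1 failed on host "
              "master assumedly after job because: can't read usage file "
              "for job 23.1")
    print(list(failed_jobs([sample])))
```

Passing an open file object as `lines` would scan a whole messages file the same way.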
> > > >
> > > > Googling the error did not help much. Anyone ever encountered that?
> > > >
> > > > Thanks a lot!
> > > > Mich
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > [email protected]
> > > > https://gridengine.org/mailman/listinfo/users
> > >
> > >
> >
> >
>
>
