Hi Everyone,

 

I am looking to get some help in setting up my cluster to use interactive
sessions.

 

My cluster currently looks like this:

  qmaster: "GridMaster" - CentOS 6.5 VM running Son of Grid Engine 8.1.6.

  execution hosts: ~40 CentOS 6.6 physical servers of varying capability.
All execution daemons (execd) were installed from GridMaster using
'start_gui_installer'.

  Users: Managed by FreeIPA 3.x authentication server. Each user has
passwordless SSH between hosts via public keys.

 

My goals:

1. A user runs something like 'qlogin -l mf=20G,h_vmem=20G,h_rt=1:00:00'
and is presented with a terminal on a system with the requested resources /
slots reserved.

2. If a user exceeds their h_vmem or h_rt, their shell is terminated and any
processes launched from it are terminated / cleaned up.

3. The shell must be capable of X-forwarding.

 

I think my goals are achievable through some configuration of SoGE using the
qlogin wrapper script and the cgroups functions, but I am not sure how.

 

I was able to set up the wrapper script, but could not get it to work with
the port number that Grid Engine passes as the second argument, so the
script below ignores it. I fear this has to do with FreeIPA and its control
over SSH.

--

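  #!/bin/sh

  # Grid Engine passes the target host as $1 and a port number as $2.
  HOST=$1

  PORT=$2   # currently unused: the port is not passed to ssh

  /usr/bin/ssh -Y "$HOST"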

--
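
For comparison, here is a minimal sketch of a wrapper that does pass the
port through to ssh. It assumes (and this is only an assumption about my
setup) that qlogin_daemon is configured to start an sshd for the session on
the port Grid Engine chooses, for example '/usr/sbin/sshd -i'. The helper
name is_valid_port is hypothetical, and I have not confirmed whether this
works alongside FreeIPA.

```shell
#!/bin/sh
# Sketch of a port-aware qlogin wrapper (assumption: an sshd started by
# Grid Engine is listening on the port given as $2).

# Return success if the argument is a plain decimal port in 1..65535.
is_valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Only connect when arguments were actually supplied.
if [ "$#" -gt 0 ]; then
    HOST=$1
    PORT=${2:-}
    if [ -z "$HOST" ] || ! is_valid_port "$PORT"; then
        echo "usage: $0 HOST PORT" >&2
        exit 1
    fi
    # -Y enables trusted X11 forwarding; -p selects the port from Grid Engine.
    exec /usr/bin/ssh -Y -p "$PORT" "$HOST"
fi
```

The port check is there only so a missing or malformed second argument
fails loudly instead of silently falling back to the default SSH port.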

 

I was able to get the qlogin command working: 

--

  $ hostname

  charmander.server.com

  $ qlogin -m beas -N "grid-gui" -l mf=20G -l h_vmem=20G -l h_rt=1:00:00 -w e

  Your job 140263 ("grid-gui") has been submitted

  waiting for interactive job to be scheduled ...

  Your interactive job 140263 has been successfully scheduled.

  Establishing /import/stargate/grid/scripts/qlogin_ssh_wrapper.sh session to host bulbasaur.server.com ...

  Warning: Permanently added 'bulbasaur.server.com' (RSA) to the list of known hosts.

  Last login: Fri Jun  5 15:59:37 2015 from charmander.server.com

  -bash-4.1$ hostname

  bulbasaur.server.com

--

For some reason, Grid Engine seems to consider the job aborted after 60
seconds (maybe because the port is not being used?). I still have an open
terminal and can still run anything I want, but Grid Engine has released the
mem_free and slots for others to use.

 

If anyone has advice on how I can set this up to accomplish my three goals,
it would be very much appreciated. I can post any configuration / logs /
details required.
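
In case it helps, this is roughly how I would pull the interactive-session
settings out of the global configuration to post here. It is a sketch only:
show_interactive_settings is a hypothetical helper name, and it assumes the
relevant parameters are the qlogin_*, rlogin_* and rsh_* command / daemon
entries printed by 'qconf -sconf'.

```shell
#!/bin/sh
# Keep only the interactive-session entries from configuration text on stdin.
show_interactive_settings() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# Query the live configuration only when explicitly asked to, e.g.:
#   sh show_settings.sh --live
if [ "${1:-}" = "--live" ]; then
    qconf -sconf | show_interactive_settings
fi
```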

 

Thanks,

-Chris Tobey
