Most batch systems have an option that makes the submit command wait until the job has finished before it returns. PBS uses "-W block=true", and SGE and LSF have similar options ("-sync y" for SGE's qsub and "-K" for LSF's bsub).

If your batch system doesn't provide that, I'd recommend adding a shell loop that polls the queue and does not return until the job has completed. The sleep approach would work, but it wouldn't exit when the server finishes, leaving the ssh tunnels (and anything else, such as portfwd, that you start in your scripts) lying around.
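
A minimal sketch of such a polling loop follows. It assumes a PBS-style `qstat <jobid>` that exits non-zero once the scheduler no longer knows the job; other batch systems (or PBS with job history enabled) may need a different check. The function name `wait_for_job` and the batch script name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: block until a batch job has left the queue, so that the enclosing
# ssh session (and its tunnel) ends when the job does.
# Assumption: `qstat <jobid>` fails once the job is no longer in the queue.

wait_for_job() {
    jobid=$1
    interval=${2:-30}   # seconds between polls
    while qstat "$jobid" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Usage (hypothetical batch script name):
#   jobid=$(qsub my_pvserver_job.pbs) && wait_for_job "$jobid"
```

Used as the last step of the remote command, this keeps the ssh session alive only for as long as the job is queued or running.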

Incidentally, this brings up an interesting point about ParaView with client/server. It doesn't try to clean up its child processes, AFAIK. For example, if you set up this ssh tunnel inside the ParaView GUI (e.g., using a command instead of a manual connection), and you cancel the connection, it will leave the ssh running. You have to track down the ssh process and kill it yourself. It's a minor thing, but it can also prevent future connections if you don't realize there's a leftover ssh that is still holding your ports open.
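
One possible workaround, if the tunnel is started from a wrapper script rather than from inside the GUI, is to have the wrapper kill the ssh process when it exits. The sketch below assumes that setup; `reap_on_exit` is a hypothetical helper name, not a ParaView feature.

```shell
#!/bin/sh
# Sketch: make sure a background ssh tunnel dies together with the wrapper
# script that started it.

reap_on_exit() {
    # $1: PID to kill when this shell exits
    trap "kill $1 2>/dev/null" EXIT
    # Turn Ctrl-C / TERM into a normal exit so the EXIT trap fires.
    trap 'exit 1' INT TERM
}

# Usage (XXXX, YYYY and remote_machine are the placeholders from this thread):
#   ssh -N -R XXXX:localhost:YYYY remote_machine &
#   reap_on_exit $!
#   paraview
```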


On 02/08/10 21:03, burlen wrote:
I am curious to hear what Sean has to say.

But say the batch system returns right away after the job is submitted.
I think we can doctor the command so that it lives for a while longer.
What about something like this:

ssh -R XXXX:localhost:YYYY remote_machine "submit_my_job.sh && sleep 100d"


pat marion wrote:
Hey just checked out the wiki page, nice! One question, wouldn't this
command hang up and close the tunnel after submitting the job?
ssh -R XXXX:localhost:YYYY remote_machine submit_my_job.sh
Pat

On Mon, Feb 8, 2010 at 8:12 PM, pat marion <pat.mar...@kitware.com
<mailto:pat.mar...@kitware.com>> wrote:

Actually I didn't write the notes at the hpc.mil link.

Here is something- and maybe this is the problem that Sean refers
to- in some cases, when I have set up a reverse ssh tunnel from
login node to workstation (command executed from workstation) then
the forward does not work when the compute node connects to the
login node. However, if I have the compute node connect to the
login node on port 33333, then use portfwd to forward that to
localhost:11111, where the ssh tunnel is listening on port 11111,
it works like a charm. The portfwd tricks it into thinking the
connection is coming from localhost and allows the ssh tunnel to
work. Hope that made a little sense...

Pat


On Mon, Feb 8, 2010 at 6:29 PM, burlen <burlen.lor...@gmail.com
<mailto:burlen.lor...@gmail.com>> wrote:

Nice, thanks for the clarification. I am guessing that your
example should probably be the recommended approach rather
than the portfwd method suggested on the PV wiki. :) I took
the initiative to add it to the Wiki. KW let me know if this
is not the case!

http://paraview.org/Wiki/Reverse_connection_and_port_forwarding#Reverse_connection_over_an_ssh_tunnel


Would you mind taking a look to be sure I didn't miss anything
or bollix it up?

The sshd config options you mentioned may be why your method
doesn't work on the Pleiades system; either that, or there is a
firewall between the front ends and compute nodes. In either
case I doubt the NAS sys admins are going to reconfigure for
me :) So at least for now I'm stuck with the two hop ssh
tunnels and interactive batch jobs. If there were some way to
script the ssh tunnel in my batch script I would be golden...

By the way I put the details of the two hop ssh tunnel on the
wiki as well, and a link to Pat's hpc.mil
notes. I don't dare try to summarize them since I've never
used portfwd and it refuses to compile both on my workstation
and the cluster.

Hopefully putting these notes on the Wiki will save future
ParaView users some time and headaches.


Sean Ziegeler wrote:

Not quite- the pvsc calls ssh with both the tunnel options
and the commands to submit the batch job. You don't even
need a pvsc; it just makes the interface fancier. As long
as you or PV executes something like this from your machine:
ssh -R XXXX:localhost:YYYY remote_machine submit_my_job.sh

This means that port XXXX on remote_machine will be the
port to which the server must connect. Port YYYY (e.g.,
11111) on your client machine is the one on which PV
listens. You'd have to tell the server (in the batch
submission script, for example) the name of the node and
port XXXX to which to connect.
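
As a rough sketch of the server side, the batch script could start pvserver in reverse-connection mode and point it at the tunnel's listening end. The helper name `pvserver_reverse_cmd` and the example host and port are hypothetical; the launcher line in the usage comment is an assumption about the site's MPI setup.

```shell
#!/bin/sh
# Sketch: build the pvserver command line for a reverse connection.
# $1: node where the tunnel listens (remote_machine, i.e. the login node)
# $2: port XXXX from "ssh -R XXXX:localhost:YYYY"

pvserver_reverse_cmd() {
    printf 'pvserver --reverse-connection --client-host=%s --server-port=%s\n' "$1" "$2"
}

# Usage inside a batch script (hypothetical host, port and launcher):
#   mpirun -np 16 $(pvserver_reverse_cmd login1 33333)
```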

One caveat that might be causing you problems: port
forwarding (and "gateway ports" if the server is running
on a different node than the login node) must be enabled
in the remote_machine's sshd_config. If not, no ssh
tunnels will work at all (see: man ssh and man
sshd_config). That's something that an administrator
would need to set up for you.
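
A quick way to see what the administrator has set, assuming sshd_config is readable by ordinary users on the login node (it often is, but not always), is to grep for the two relevant options. The function name `check_sshd_forwarding` is hypothetical. If an option is not listed, the OpenSSH defaults apply: AllowTcpForwarding yes and GatewayPorts no.

```shell
#!/bin/sh
# Sketch: print the forwarding-related settings from an sshd_config file.
# $1: path to the config file (defaults to /etc/ssh/sshd_config)

check_sshd_forwarding() {
    grep -Ei '^[[:space:]]*(AllowTcpForwarding|GatewayPorts)[[:space:]]' \
        "${1:-/etc/ssh/sshd_config}"
}

# Usage on remote_machine:
#   check_sshd_forwarding
```

With "GatewayPorts clientspecified" the client also has to ask for a non-loopback bind address, for example ssh -R '*:XXXX:localhost:YYYY' remote_machine.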

On 02/08/10 12:26, burlen wrote:

So to be sure about what you're saying: your .pvsc script ssh's to
the front end and submits a batch job; when it's scheduled, your
batch script creates a -R style tunnel and starts pvserver using PV
reverse connection? Or are you using portfwd or a second ssh session
to establish the tunnel?

If you're doing this all from your .pvsc script without a second ssh
session and/or portfwd, that's awesome! I haven't been able to script
this; something about the batch system prevents the tunnel created
within the batch job's ssh session from working. I don't know if
that's particular to this system or a general fact of life about
batch systems.

Question: How are you creating the tunnel in your batch script?

Sean Ziegeler wrote:

Both ways will work for me in most cases, i.e. a "forward" connection
with ssh -L or a reverse connection with ssh -R.

However, I find that the reverse method is more scriptable. You can
set up a .pvsc file that the client can load and will call ssh with
the appropriate options and commands for the remote host, all from the
GUI. The client will simply wait for the reverse connection from the
server, whether it takes 5 seconds or 5 hours for the server to get
through the batch queue.

Using the forward connection method, if the server isn't started soon
enough, the client will attempt to connect and then fail. I've always
had to log in separately, wait for the server to start running, then
tell my client to connect.

-Sean

On 02/06/10 12:58, burlen wrote:

Hi Pat,

My bad. I was looking at the PV wiki, and thought you were talking
about doing this without an ssh tunnel and using only port forward and
paraview's --reverse-connection option. Now that I am reading your
hpc.mil post I see what you mean :)

Burlen


pat marion wrote:

Maybe I'm misunderstanding what you mean by local firewall, but
usually as long as you can ssh from your workstation to the login node
you can use a reverse ssh tunnel.


_______________________________________________
Powered by www.kitware.com
<http://www.kitware.com>

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Please keep messages on-topic and check the
ParaView Wiki at:
http://paraview.org/Wiki/ParaView

Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview



