On Thu, Nov 08, 2007 at 04:21:25PM +0100, Arnau Bria alleged:
> Hi,
> 
> 
> a couple of days ago I sent this e-mail to the torque list. I got no reply, so
> I decided to post here too; maybe someone has seen this error before.
> 
> Sorry in advance for the cross-posting.

Sorry, I meant to respond to the first one.


> we're getting sporadic errors when a job finishes running on a WN and
> has to copy its output to the submitter host.
> 
> We've configured ssh on our submitter/executor hosts so that it does not
> ask for a password, so for example:
> 
> [EMAIL PROTECTED] ~]# su - ops006
> [EMAIL PROTECTED] ~]$ ssh ce07 date
> Scientific Linux CERN Release 3.0.8 (SL)
> Tue Nov  6 12:09:05 CET 2007
> [EMAIL PROTECTED] ~]$

Non-interactive shells shouldn't print anything other than the command output.
Extra output tends to confuse programs that run their own remote shells (like scp).

For example, I can't do this on your cluster:
  date=`ssh ce07 date`
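
A minimal sketch of the usual guard, assuming the release banner is printed
from a shell startup file (for example ~/.bashrc or a script under
/etc/profile.d; where it actually comes from on this cluster is not known).
The helper name is_interactive and the banner text are hypothetical. The idea
is to print the banner only when the shell's option flags in $- contain "i":

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the given option flags contain "i".
# With no argument it inspects the current shell's own flags ($-).
is_interactive() {
    if [ "$#" -gt 0 ]; then
        flags=$1
    else
        flags=$-
    fi
    case "$flags" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Print the banner only for interactive sessions, so that
# `ssh host command` and scp see nothing but the command's own output.
if is_interactive; then
    echo "login banner goes here"   # placeholder text, not the real banner
fi
```

With a guard like this in place, a command such as `ssh ce07 true` should
produce no output at all, which is a quick way to check that the startup
files are quiet for non-interactive sessions.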

 
> But looking at the job's log on the WN we find:
> Oct 24 02:20:48 td237 pbs_mom: req_cpyfile, Unable to copy file
> [EMAIL PROTECTED]:/home/ops006/.lcgjm/globus-cache-export.Q18475/globus-cache-export.Q18475.gpg
> to globus-cache-export.Q18475.gpg

The actual copy file error message should be in syslog.


_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
