ForwardX11 yes
Just change it to no or comment it out, as I think the default is no. If you want to do it at the user level, just create a ~/.ssh/config file with that same line in it, which will override any global configuration parameter, e.g.
ForwardX11 no
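As a slightly fuller sketch of the per-user file, something along these lines should work with the OpenSSH client. The "Host *" stanza is an assumption on my part (it applies the setting to every host you connect to); narrow the pattern to the compute nodes if you still want X11 forwarding elsewhere.

```
# ~/.ssh/config (per-user; values here are used before the global ssh_config)
# "Host *" is an illustrative pattern, not taken from your current setup.
Host *
    ForwardX11 no
```

The client uses the first value it finds for each option and reads the user file before the system-wide one, which is why this takes precedence over the global setting.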
I'm cc'ing the OSCAR users list to keep them in the loop as well.
Jeremy
At 03:29 PM 5/8/2004, Alexander V Shirokov wrote:
Dear Jeremy,
I attended the Beowulf cluster workshop at MIT that you presented two years ago. Since then I have been using Beowulf clusters all the time. I have been trying to solve a problem (a bug) for two weeks already. I am supposed to defend my PhD in August, and time is short. I would really appreciate your help, since it would get things moving again. Please help me solve this problem, if possible.
When I run the code directly, without submitting it to PBSPro through qsub, the program crashes after about 40 timesteps.
When I run the code by submitting it through PBSPro's qsub, I get the following error diagnostics after about 10 timesteps, and the run dies:
1) Standard error PBSPro file int2.pbs.e919:
Warning: No xauth data; using fake authentication data for X11 forwarding.
=>> PBS: job killed: node 17 (node18) requested job die, code 15009
2) File /var/spool/PBS/mom_logs/20040508 on node18:
13:31:17;0008;pbs_mom;Job;919.antares.mit.edu;JOIN JOB as node 17
15:04:46;0004;pbs_mom;Job;919.antares.mit.edu;polling stopped
15:04:46;0008;pbs_mom;Job;919.antares.mit.edu;kill_job
3) File /var/spool/PBS/mom_logs/20040508 on node1:
11:43:54;0008;pbs_mom;Job;790.antares.mit.edu;Started, pid = 13919
13:18:12;0008;pbs_mom;Job;844.antares.mit.edu;Started, pid = 12919
13:31:17;0008;pbs_mom;Job;919.antares.mit.edu;Started, pid = 14043
15:06:46;0008;pbs_mom;Job;919.antares.mit.edu;send POLL failed
15:06:46;0008;pbs_mom;Job;919.antares.mit.edu;node 17 (node18) requested job die, code 15009
15:06:46;0008;pbs_mom;Job;919.antares.mit.edu;kill_job
15:06:48;0080;pbs_mom;Job;919.antares.mit.edu;task 1 terminated
15:06:48;0008;pbs_mom;Job;919.antares.mit.edu;Terminated
15:06:58;0008;pbs_mom;Job;919.antares.mit.edu;kill_job
15:06:58;0100;pbs_mom;Job;919.antares.mit.edu;Obit sent
4) The error messages in the standard output files on these nodes look the same:
p67_5862: p4_error: net_recv read: probable EOF on socket: 1
However, on node16 it is
p64_6016: (5813.998720) net_recv failed for fd = 3
p64_6016: p4_error: net_recv read, errno = : 104
and on node4 it is
p16_6446: (5832.189857) net_recv failed for fd = 3
p16_6446: p4_error: net_recv read, errno = : 104
Thank you, and I would really appreciate your help.
Regards, Alex
_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
