We're running SGE 6.2u5, OpenMPI 1.3.3 compiled with SGE integration. Our cluster has some AMD and some Intel-based servers, but all are managed through the same kickstart build and the same cfengine configuration. The only deliberate differences in the nodes are:

    ATLAS and BLAS libraries are optimized per-CPU type
    the AMD nodes have disk drives that correctly report SMART readings, so the smartd monitoring process runs on those machines

There are 3 PEs defined:

    openmpi          all nodes
    openmpi-Intel    Intel CPU nodes
    openmpi-AMD      AMD CPU nodes

The only known differences in the PEs are:

    the hostgroup assigned to the PE (all nodes, just Intel nodes, just AMD nodes)
    the number of slots per PE
    the environment variable "ARCHPATH" is set to the directory where libraries optimized per-architecture are stored
    the environment variable "ARCH" is set to the architecture (as in: Intel-Nehalem)

(The environment variables are set outside of SGE jobs, so they will exist when a job is launched via mpirun or submitted via qsub.)

I can run MPI jobs using "mpirun" on the Intel, AMD, or mixed sets of nodes, using a machines file. Running the same commands as an SGE job fails on the AMD nodes. For example, running:

    mpirun -np 50 -machinefile machines.AMD /bin/hostname       succeeds
    mpirun -np 50 -machinefile machines.AMD hello_world.mpi     succeeds
    mpirun -np 50 -machinefile machines.Intel /bin/hostname     succeeds
    mpirun -np 50 -machinefile machines.Intel hello_world.mpi   succeeds
    mpirun -np 50 -machinefile machines.mixed /bin/hostname     succeeds
    mpirun -np 50 -machinefile machines.mixed hello_world.mpi   succeeds

    qsub -pe openmpi-Intel 50 mpirun /bin/hostname              succeeds
    qsub -pe openmpi-Intel 50 mpirun hello_world.mpi            succeeds
    qsub -pe openmpi 50 mpirun /bin/hostname                    * FAILS if AMD nodes are used *
    qsub -pe openmpi 50 mpirun hello_world.mpi                  * FAILS if AMD nodes are used *
    qsub -pe openmpi-AMD 50 mpirun /bin/hostname                * FAILS *
    qsub -pe openmpi-AMD 50 mpirun hello_world.mpi              * FAILS *

When I run the job on the openmpi-AMD PE with debugging statements, I can see that it starts on a node, that the slave MPI processes are dispatched. As expected, all processes are run only on AMD nodes. However, there are no results and the job finishes without an error.
It does take longer (~minutes) for the job to finish than the jobs that work correctly on the Intel nodes. Perhaps the job 'finishes' when there's some orted timeout, but no error is reported.

Any suggestions for more troubleshooting? Please see below for output from a test job.

Thanks,

Mark

----------------------------------------------
Command as submitted via qsub:

    mpirun --verbose \
        --display-map \
        --tag-output \
        --debug-daemons \
        --display-allocation \
        --mca orte_forward_job_control 1 \
        --mca pls_gridengine_verbose 1 \
        --mca pls_gridengine_debug 1 \
        --mca OMPI_MCA_mca_verbose 1 \
        --mca btl_base_verbose 30 \
        --mca routed direct \
        --prefix $OPENMPI -np $NSLOTS ~/hello_openmpi

----- STDOUT from ~/hello_openmpi below this line -----
Command: ~/hello_openmpi
Arguments:
Executing in: /acme/home/bergman/sge_job_output
Executing on: acme-c5-8.example.com
Executing at: Thu Dec 20 17:00:46 EST 2012
----- STDERR from ~/hello_openmpi below this line -----

====================== ALLOCATED NODES ======================

Data for node: Name: acme-c5-8.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-9.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-10.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-11.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-12.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-13.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-14.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-15.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-16.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-17.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-18.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-19.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c5-20.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c4-9.example.com Num slots: 1 Max slots: 0
Data for node: Name: acme-c4-11.example.com Num slots: 1 Max slots: 0

=================================================================

======================== JOB MAP ========================

Data for node: Name: acme-c5-8.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 0
Data for node: Name: acme-c5-9.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 1
Data for node: Name: acme-c5-10.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 2
Data for node: Name: acme-c5-11.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 3
Data for node: Name: acme-c5-12.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 4
Data for node: Name: acme-c5-13.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 5
Data for node: Name: acme-c5-14.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 6
Data for node: Name: acme-c5-15.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 7
Data for node: Name: acme-c5-16.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 8
Data for node: Name: acme-c5-17.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 9
Data for node: Name: acme-c5-18.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 10
Data for node: Name: acme-c5-19.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 11
Data for node: Name: acme-c5-20.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 12
Data for node: Name: acme-c4-9.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 13
Data for node: Name: acme-c4-11.example.com Num procs: 1
    Process OMPI jobid: [58179,1] Process rank: 14

=============================================================
Daemon was launched on acme-c5-10.example.com - beginning to initialize
Daemon [[58179,0],2] checking in as pid 9734 on host acme-c5-10.example.com
Daemon [[58179,0],2] not using static ports
[acme-c5-10.example.com:09734] [[58179,0],2] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-20.example.com - beginning to initialize
Daemon was launched on acme-c5-9.example.com - beginning to initialize
Daemon [[58179,0],12] checking in as pid 7292 on host acme-c5-20.example.com
Daemon [[58179,0],12] not using static ports
Daemon [[58179,0],1] checking in as pid 31954 on host acme-c5-9.example.com
[acme-c5-20.example.com:07292] [[58179,0],12] orted: up and running - waiting for commands!
Daemon [[58179,0],1] not using static ports
[acme-c5-9.example.com:31954] [[58179,0],1] orted: up and running - waiting for commands!
Daemon was launched on acme-c4-11.example.com - beginning to initialize
Daemon was launched on acme-c5-12.example.com - beginning to initialize
Daemon was launched on acme-c5-11.example.com - beginning to initialize
Daemon [[58179,0],14] checking in as pid 13717 on host acme-c4-11.example.com
Daemon [[58179,0],14] not using static ports
[acme-c4-11.example.com:13717] [[58179,0],14] orted: up and running - waiting for commands!
Daemon [[58179,0],4] checking in as pid 1010 on host acme-c5-12.example.com
Daemon [[58179,0],4] not using static ports
Daemon was launched on acme-c5-15.example.com - beginning to initialize
[acme-c5-12.example.com:01010] [[58179,0],4] orted: up and running - waiting for commands!
Daemon was launched on acme-c4-9.example.com - beginning to initialize
Daemon [[58179,0],3] checking in as pid 6876 on host acme-c5-11.example.com
Daemon [[58179,0],3] not using static ports
[acme-c5-11.example.com:06876] [[58179,0],3] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-16.example.com - beginning to initialize
Daemon [[58179,0],7] checking in as pid 7819 on host acme-c5-15.example.com
Daemon [[58179,0],7] not using static ports
[acme-c5-15.example.com:07819] [[58179,0],7] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-17.example.com - beginning to initialize
Daemon was launched on acme-c5-18.example.com - beginning to initialize
Daemon [[58179,0],13] checking in as pid 28397 on host acme-c4-9.example.com
Daemon [[58179,0],13] not using static ports
[acme-c4-9.example.com:28397] [[58179,0],13] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-19.example.com - beginning to initialize
Daemon [[58179,0],8] checking in as pid 21432 on host acme-c5-16.example.com
Daemon [[58179,0],8] not using static ports
[acme-c5-16.example.com:21432] [[58179,0],8] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-14.example.com - beginning to initialize
Daemon [[58179,0],9] checking in as pid 26411 on host acme-c5-17.example.com
Daemon [[58179,0],9] not using static ports
[acme-c5-17.example.com:26411] [[58179,0],9] orted: up and running - waiting for commands!
Daemon [[58179,0],10] checking in as pid 11348 on host acme-c5-18.example.com
Daemon [[58179,0],10] not using static ports
[acme-c5-18.example.com:11348] [[58179,0],10] orted: up and running - waiting for commands!
Daemon was launched on acme-c5-13.example.com - beginning to initialize
Daemon [[58179,0],11] checking in as pid 18318 on host acme-c5-19.example.com
Daemon [[58179,0],11] not using static ports
[acme-c5-19.example.com:18318] [[58179,0],11] orted: up and running - waiting for commands!
Daemon [[58179,0],6] checking in as pid 3987 on host acme-c5-14.example.com
Daemon [[58179,0],6] not using static ports
[acme-c5-14.example.com:03987] [[58179,0],6] orted: up and running - waiting for commands!
Daemon [[58179,0],5] checking in as pid 21829 on host acme-c5-13.example.com
Daemon [[58179,0],5] not using static ports
[acme-c5-13.example.com:21829] [[58179,0],5] orted: up and running - waiting for commands!
[acme-c5-8.example.com:27764] [[58179,0],0] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-8.example.com:27764] [[58179,0],0] orted_cmd: received add_local_procs
[acme-c5-16.example.com:21432] [[58179,0],8] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-18.example.com:11348] [[58179,0],10] orted_cmd: received add_local_procs
[acme-c5-10.example.com:09734] [[58179,0],2] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-10.example.com:09734] [[58179,0],2] orted_cmd: received add_local_procs
[acme-c5-9.example.com:31954] [[58179,0],1] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-11.example.com:06876] [[58179,0],3] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c4-11.example.com:13717] [[58179,0],14] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c4-11.example.com:13717] [[58179,0],14] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-14.example.com:03987] [[58179,0],6] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-12.example.com:01010] [[58179,0],4] orted_cmd: received add_local_procs
[acme-c5-13.example.com:21829] [[58179,0],5] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-13.example.com:21829] [[58179,0],5] node[1].name acme-c5-9 daemon 1 arch ffca0200
[acme-c5-16.example.com:21432] [[58179,0],8] orted_cmd: received add_local_procs
[acme-c5-20.example.com:07292] [[58179,0],12] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[8].name acme-c5-16 daemon 8 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[10].name acme-c5-18 daemon 10 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[11].name acme-c5-19 daemon 11 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[12].name acme-c5-20 daemon 12 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[13].name acme-c4-9 daemon 13 arch ffca0200
[acme-c5-20.example.com:07292] [[58179,0],12] node[14].name acme-c4-11 daemon 14 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[2].name acme-c5-10 daemon 2 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[3].name acme-c5-11 daemon 3 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[4].name acme-c5-12 daemon 4 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[5].name acme-c5-13 daemon 5 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[6].name acme-c5-14 daemon 6 arch ffca0200
[acme-c5-17.example.com:26411] [[58179,0],9] node[7].name acme-c5-15 daemon 7 arch ffca0200
[acme-c5-9.example.com:31954] [[58179,0],1] node[9].name acme-c5-17 daemon 9 arch ffca0200
[acme-c5-15.example.com:07819] [[58179,0],7] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c5-19.example.com:18318] [[58179,0],11] node[[acme-c4-9.example.com:28397] [[58179,0],13] node[0].name acme-c5-8 daemon 0 arch ffca0200
[acme-c4-9.example.com:28397] [[58179,0],13] node[1].name acme-c5-9 daemon 1 arch ffca0200
----------------------------------------------

-----
Mark Bergman

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
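
Because the daemon output above is interleaved across 15 hosts, it can help to tally it per daemon rather than read it top to bottom. The following is only a rough sketch, assuming the message wording seen in this particular --debug-daemons output (other Open MPI versions may word these messages differently). The function names summarize and missing_launch_command are made up for illustration and are not part of Open MPI or SGE. Note that on a truncated capture, a missing "add_local_procs" line shows only that the message is absent from the capture, not that the daemon never received it.

```python
import re
import sys
from collections import defaultdict

# Patterns copied from the wording of the debug output in this post.
# Assumption: the daemon job family always ends in ",0" as it does here.
CHECKIN = re.compile(
    r"Daemon \[\[(\d+),0\],(\d+)\] checking in as pid (\d+) on host (\S+)"
)
ADD_PROCS = re.compile(
    r"\[\[(\d+),0\],(\d+)\] orted_cmd: received add_local_procs"
)


def summarize(text):
    """Return {daemon_vpid: {"host": str or None, "add_local_procs": bool}}.

    Uses findall over the whole text, so it does not matter whether the
    log still has its line breaks or has been flattened.
    """
    daemons = defaultdict(lambda: {"host": None, "add_local_procs": False})
    for _job, vpid, _pid, host in CHECKIN.findall(text):
        daemons[int(vpid)]["host"] = host
    for _job, vpid in ADD_PROCS.findall(text):
        daemons[int(vpid)]["add_local_procs"] = True
    return dict(daemons)


def missing_launch_command(text):
    """Sorted vpids of daemons that checked in but never logged add_local_procs."""
    return sorted(
        vpid for vpid, info in summarize(text).items()
        if not info["add_local_procs"]
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_orted.py <file holding the job's stderr>
    with open(sys.argv[1]) as fh:
        log = fh.read()
    for vpid, info in sorted(summarize(log).items()):
        state = "add_local_procs seen" if info["add_local_procs"] else "no add_local_procs seen"
        print(vpid, info["host"], state)
```

Daemon 0 is mpirun itself, which does not print a "checking in" line, so it is listed with host None. Comparing the result for a failing openmpi-AMD job against a working openmpi-Intel job may show at which step the two runs diverge.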