On 16 Nov 2010, at 17:25, Terry Dontje wrote:
>>> 
>> Sure.   Here's the stderr of a job submitted to my cluster with 'qsub -pe 
>> mpi 8 -binding linear:2 myScript.com'  where myScript.com runs 'mpirun -mca 
>> ras_gridengine_verbose 100 --report-bindings ./unterm':
>> 
>> [exec4:17384] System has detected external process binding to cores 0022
>> [exec4:17384] ras:gridengine: JOB_ID: 59352
>> [exec4:17384] ras:gridengine: PE_HOSTFILE: /usr/sge/default/spool/exec4/active_jobs/59352.1/pe_hostfile
>> [exec4:17384] ras:gridengine: exec4.cluster.stats.local: PE_HOSTFILE shows slots=2
>> [exec4:17384] ras:gridengine: exec2.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec4:17384] ras:gridengine: exec7.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec4:17384] ras:gridengine: exec3.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec4:17384] ras:gridengine: exec6.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec4:17384] ras:gridengine: exec1.cluster.stats.local: PE_HOSTFILE shows slots=1
>> [exec4:17384] ras:gridengine: exec5.cluster.stats.local: PE_HOSTFILE shows slots=1
>> 
>> 
>> 
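As a quick sanity check on the allocation above, here is a minimal sketch that totals the slots reported by the verbose output. The function name slots_per_host and the regular expression are illustrative assumptions, not part of Open MPI or Grid Engine; the log text is the output quoted above with the mail-client line wraps removed.

```python
import re

# Hypothetical helper: matches the "PE_HOSTFILE shows slots=N" lines
# emitted when ras_gridengine_verbose is enabled.
SLOT_LINE = re.compile(r"ras:gridengine: (\S+): PE_HOSTFILE shows slots=(\d+)")

def slots_per_host(log_text: str) -> dict:
    """Map each host name to the slot count reported in the log."""
    # Collapse all whitespace so lines wrapped by a mail client are rejoined.
    flat = " ".join(log_text.split())
    return {host: int(n) for host, n in SLOT_LINE.findall(flat)}

LOG = """
[exec4:17384] ras:gridengine: JOB_ID: 59352
[exec4:17384] ras:gridengine: exec4.cluster.stats.local: PE_HOSTFILE shows slots=2
[exec4:17384] ras:gridengine: exec2.cluster.stats.local: PE_HOSTFILE shows slots=1
[exec4:17384] ras:gridengine: exec7.cluster.stats.local: PE_HOSTFILE shows slots=1
[exec4:17384] ras:gridengine: exec3.cluster.stats.local: PE_HOSTFILE shows slots=1
[exec4:17384] ras:gridengine: exec6.cluster.stats.local: PE_HOSTFILE shows slots=1
[exec4:17384] ras:gridengine: exec1.cluster.stats.local: PE_HOSTFILE shows slots=1
[exec4:17384] ras:gridengine: exec5.cluster.stats.local: PE_HOSTFILE shows slots=1
"""

slots = slots_per_host(LOG)
print(len(slots), "hosts,", sum(slots.values()), "slots")  # 7 hosts, 8 slots
```

The total of 8 matches the '-pe mpi 8' request, with exec4 being the only host that received two slots.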
> Is that all that came out?  I would have expected some output from each 
> process after the orted forked the processes but before the exec of unterm.

Yes.  It appears that if orted detects a binding set by an external process, 
this is all the output you get.  Remove the GE-enforced binding, and you get:

[exec4:17670] [[23443,0],0] odls:default:fork binding child [[23443,1],0] to cpus 0001
[exec4:17670] [[23443,0],0] odls:default:fork binding child [[23443,1],1] to cpus 0002
[exec7:06781] [[23443,0],2] odls:default:fork binding child [[23443,1],3] to cpus 0001
[exec2:24160] [[23443,0],1] odls:default:fork binding child [[23443,1],2] to cpus 0001
[exec6:30097] [[23443,0],4] odls:default:fork binding child [[23443,1],5] to cpus 0001
[exec5:02736] [[23443,0],6] odls:default:fork binding child [[23443,1],7] to cpus 0001
[exec1:30779] [[23443,0],5] odls:default:fork binding child [[23443,1],6] to cpus 0001
[exec3:12818] [[23443,0],3] odls:default:fork binding child [[23443,1],4] to cpus 0001
.....
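
For reading the masks in both outputs, here is a minimal sketch. It assumes the four-digit values are hexadecimal CPU bitmasks in which bit n set means logical CPU n; I have not confirmed that against the Open MPI source. The function name cpus_from_mask is illustrative only.

```python
def cpus_from_mask(mask_hex: str) -> list:
    """Return the indices of the bits set in a hexadecimal mask string."""
    # Assumption: the mask is hexadecimal and bit n corresponds to logical CPU n.
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]

for mask in ("0022", "0001", "0002"):
    print(mask, "->", cpus_from_mask(mask))
# 0022 -> [1, 5]
# 0001 -> [0]
# 0002 -> [1]
```

Under that assumption, the externally imposed mask 0022 covers two CPUs (1 and 5), consistent with '-binding linear:2', while without the GE binding each child is bound to a single CPU, starting from CPU 0 on every host.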


C
--
Dr Chris Jewell
Department of Statistics
University of Warwick
Coventry
CV4 7AL
UK
Tel: +44 (0)24 7615 0778





