Chris' output is coming solely from the HNP, which is correct given the way
things were executed. My comment was from another email where he did what I
asked, which was to include the flags:

--report-bindings --leave-session-attached

so we could see the output from each orted. In that email, it was clear that
while mpirun was bound to multiple cores, the orteds are being bound to a
-single- core.

Hence the problem.
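
For anyone reading the "external process binding to cores 0028" line in the quoted output below, here is a small Python sketch for decoding it. It assumes the reported value is a hexadecimal bitmask with one bit per core, which is my reading of the message and is not confirmed anywhere in this thread. The helper name cores_from_mask is made up for illustration.

```python
def cores_from_mask(mask_hex):
    """Return the core indices whose bits are set in a hex mask string.

    Assumption: bit N of the mask corresponds to core N.
    """
    mask = int(mask_hex, 16)
    # Walk each bit position up to the highest set bit and keep the set ones.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Value taken from the quoted mpirun output.
    print(cores_from_mask("0028"))
```

Under that assumption the mask covers two cores, which would be consistent with the "-binding linear:2" request applying to the process that printed the line. It says nothing about what each orted was bound to.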

HTH
Ralph


On Wed, Nov 17, 2010 at 6:57 AM, Terry Dontje <terry.don...@oracle.com> wrote:

>  On 11/17/2010 07:41 AM, Chris Jewell wrote:
>
> On 17 Nov 2010, at 11:56, Terry Dontje wrote:
>
>  You are absolutely correct, Terry, and the 1.4 release series does include 
> the proper code. The point here, though, is that SGE binds the orted to a 
> single core, even though other cores are also allocated. So the orted detects 
> an external binding of one core, and binds all its children to that same core.
>
>  I do not think you are right here.  Chris sent the following which looks 
> like OGE (fka SGE) actually did bind the hnp to multiple cores.  However that 
> message I believe is not coming from the processes themselves and actually is 
> only shown by the hnp.  I wonder if Chris adds a "-bind-to-core" option  
> we'll see more output from the a.out's before they exec unterm?
>
>  As requested using
>
> $ qsub -pe mpi 8 -binding linear:2 myScript.com
>
> and
>
> mpirun -mca ras_gridengine_verbose 100 --report-bindings -by-core -bind-to-core ./unterm
>
> [exec5:06671] System has detected external process binding to cores 0028
> [exec5:06671] ras:gridengine: JOB_ID: 59434
> [exec5:06671] ras:gridengine: PE_HOSTFILE: /usr/sge/default/spool/exec5/active_jobs/59434.1/pe_hostfile
> [exec5:06671] ras:gridengine: exec5.cluster.stats.local: PE_HOSTFILE shows slots=2
> [exec5:06671] ras:gridengine: exec1.cluster.stats.local: PE_HOSTFILE shows slots=2
> [exec5:06671] ras:gridengine: exec4.cluster.stats.local: PE_HOSTFILE shows slots=1
> [exec5:06671] ras:gridengine: exec3.cluster.stats.local: PE_HOSTFILE shows slots=1
> [exec5:06671] ras:gridengine: exec2.cluster.stats.local: PE_HOSTFILE shows slots=1
> [exec5:06671] ras:gridengine: exec7.cluster.stats.local: PE_HOSTFILE shows slots=1
>
> No more info.  I note that the external binding is slightly different to what 
> I had before, but our cluster is busier today :-)
>
>
>  I would have expected more output.
>
> --td
>
>  Chris
>
>
> --
> Dr Chris Jewell
> Department of Statistics
> University of Warwick
> Coventry
> CV4 7AL
> UK
> Tel: +44 (0)24 7615 0778
>
>
>
>
>
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
>  Oracle * - Performance Technologies*
>  95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
