I was recently reminded of a comment near the top of
opal_init_util():

    /* JMS See note in runtime/opal.h -- this is temporary; to be
       replaced with real hwloc information soon (in trunk/v1.5 and
       beyond, only).  This *used* to be a #define, so it's important
       to define it very early.  */
    opal_cache_line_size = 128;

A few points:

1. On my platforms, hwloc tells me that my cache line size is 64, not 128.  
Probably not a tragedy, but...

2. I see opal_cache_line_size being used in a lot of BTL and PML initialization 
locations.  I see it being used in opal/class/free_list.*, too.

3. I poked around with this yesterday to see if we could have hwloc initialize 
the opal_cache_line_size value.  Points to remember:

- we initialize the opal hwloc framework in opal_init(), but we do not load the 
local machine's architecture then (because it can be expensive, particularly if 
lots of MPI processes are all doing it simultaneously)
- instead, the local machine topology is discovered once by each orted (using 
hwloc) and then RML sent to each local MPI process, where it is locally loaded 
into each MPI proc's hwloc tree
- this happens during the orte_init() in ompi_mpi_init()

Meaning: we can initialize the opal_cache_line_size in MPI processes during 
orte_init().

Is this acceptable to everyone?  

If so, I can go ahead and code this up.  I would probably leave the initial 
value hard-coded to 128 (just in case something uses it before orte_init()), 
and then later during orte_init(), reset it to the smallest L1 cache line size 
that hwloc finds on the machine.
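
To make the proposal concrete, here is a minimal sketch of the reset logic in C. It deliberately does not call hwloc: cache_info_t, smallest_l1_line_size(), reset_cache_line_size() and the cache_line_size variable are hypothetical stand-ins for whatever the hwloc tree and opal_cache_line_size would provide, and the real code would fill the array from the topology loaded during orte_init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cache record; a stand-in for what the topology
 * library would report.  Not an hwloc type. */
typedef struct {
    int depth;           /* 1 for L1, 2 for L2, ... */
    unsigned line_size;  /* in bytes; 0 means "unknown" */
} cache_info_t;

/* Hard-coded initial value, as in the current opal_init_util(). */
#define DEFAULT_CACHE_LINE_SIZE 128

/* Stand-in for opal_cache_line_size: usable before any topology exists. */
unsigned cache_line_size = DEFAULT_CACHE_LINE_SIZE;

/* Return the smallest known L1 line size, or 0 if none was found. */
unsigned smallest_l1_line_size(const cache_info_t *caches, size_t n)
{
    unsigned best = 0;
    size_t i;

    assert(caches != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        /* Skip non-L1 caches and entries with no line size reported. */
        if (caches[i].depth != 1 || caches[i].line_size == 0) {
            continue;
        }
        if (best == 0 || caches[i].line_size < best) {
            best = caches[i].line_size;
        }
    }
    return best;
}

/* Called once the topology is available; keeps the current value
 * if no usable L1 information was discovered. */
void reset_cache_line_size(const cache_info_t *caches, size_t n)
{
    unsigned found = smallest_l1_line_size(caches, n);

    if (found != 0) {
        cache_line_size = found;
    }
}
```

Keeping the current value when nothing usable is found means that anything reading the variable before or after the reset always sees a non-zero line size.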

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

