On 02/28/2014 03:32 AM, Brice Goglin wrote:
On 28/02/2014 02:48, Ralph Castain wrote:
Remember, hwloc doesn't actually "sense" hardware - it just parses files in the 
/proc area. So if something is garbled in those files, hwloc will report errors. Doesn't 
mean anything is wrong with the hardware at all.

For the record, that's not really true:

hwloc looks at /sys (and a few /proc files), but it also uses cpuid
instructions. 90% of the time, the former is better because the kernel
has already taken care of cleaning up the hardware mess and reporting
useful/correct info in /proc and /sys. Sometimes the kernel is too old
and misses some hardware quirks (like L1i sharing on Gus' machine),
causing the /sys files to be incompatible.
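
To see what the kernel side of this looks like, here is a minimal sketch that prints the cache entries sysfs exposes for one CPU, together with the CPUs sharing each cache. The function name show_cpu_caches and its optional directory argument are illustrative assumptions, not part of hwloc. The level, type, size and shared_cpu_list files are the usual Linux sysfs cache attributes.

```shell
#!/bin/sh
# List the caches that sysfs reports for one CPU, with the CPUs sharing each.
# Usage: show_cpu_caches CPU [CPU_DIR]
# CPU_DIR defaults to the live /sys/devices/system/cpu; passing another
# directory (an assumption, for offline use) lets you inspect a saved copy.
show_cpu_caches() {
    cpu="$1"
    cpu_dir="${2:-/sys/devices/system/cpu}"
    for idx in "$cpu_dir/cpu$cpu"/cache/index*; do
        [ -d "$idx" ] || continue    # the glob matched nothing
        printf 'L%s %s size=%s shared_cpu_list=%s\n' \
            "$(cat "$idx/level")" "$(cat "$idx/type")" \
            "$(cat "$idx/size")" "$(cat "$idx/shared_cpu_list")"
    done
}
```

Running `show_cpu_caches 0` on a node shows whether the kernel reports the L1 instruction cache as shared with a neighbouring CPU, which is the quirk mentioned above.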


Hi Brice

The (pdf) output of lstopo shows one L1d (16k) for each core,
and one L1i (64k) for each *pair* of cores.
Is this wrong?
Anything else wrong that reported by

Sorry for my ignorance of the specifics of the AMD cache structure.
BTW, if there are any helpful web links, or references, or graphs
about the AMD cache structure, I would love to know.

In the end, the vast majority of problems come from buggy BIOS, and
these cause both cpuid and the kernel to report invalid info. Aside from
upgrading the BIOS, the only solution there is to replace the topology
with a correct XML one.
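
A minimal sketch of that XML workaround, assuming a correct topology has already been exported to a file (for example with `lstopo file.xml` on a node that reports correctly). The helper name use_topology_xml is hypothetical. HWLOC_XMLFILE and HWLOC_THISSYSTEM are hwloc environment variables; whether a given Open MPI build with its embedded hwloc honors them is an assumption to verify.

```shell
#!/bin/sh
# Point hwloc at a saved XML topology instead of letting it discover one.
# Usage: use_topology_xml /path/to/topology.xml
use_topology_xml() {
    xml="$1"
    if [ ! -r "$xml" ]; then
        echo "use_topology_xml: cannot read '$xml'" >&2
        return 1
    fi
    # HWLOC_XMLFILE makes hwloc load the topology from this file.
    export HWLOC_XMLFILE="$xml"
    # HWLOC_THISSYSTEM=1 asserts that the XML describes the local machine,
    # so that binding remains allowed.
    export HWLOC_THISSYSTEM=1
}
```

After calling the function in a shell, a plain `lstopo` run from that same shell should display the topology from the XML file rather than the one discovered on the node.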

Brice


I am a bit skeptical that the BIOS is the culprit because I replaced
two motherboards (node14 and node16), and only node14 doesn't pass
the hwloc-gather-topology test.
Just in case, I attach the diagnostic for node16 also,
if you want to take a look.  :)

FYI, the two new motherboards (nodes 14 and 16)
have a *newer* BIOS version (AMI, version 3.5, 11/25/2013)
than the one in the original nodes (node15 below)
(AMI, version 3.0, 08/31/2012).
I even thought of upgrading the old nodes' BIOSes ...
... but now I am not so sure about this ...  :(

New motherboards:

[root@node14 ~]# dmidecode -s bios-vendor
American Megatrends Inc.
[root@node14 ~]# dmidecode -s bios-version
3.5
[root@node14 ~]# dmidecode -s bios-release-date
11/25/2013

**

[root@node16 ~]# dmidecode -s bios-vendor
American Megatrends Inc.
[root@node16 ~]# dmidecode -s bios-version
3.5
[root@node16 ~]# dmidecode -s bios-release-date
11/25/2013

**

Original motherboard:

[root@node15 ~]# dmidecode -s bios-vendor
American Megatrends Inc.
[root@node15 ~]# dmidecode -s bios-version
3.0
[root@node15 ~]# dmidecode -s bios-release-date
08/31/2012

**

Thanks again for your help and advice.

Gus Correa


Machine (P#0 total=134199400KB DMIProductName=H8DGU DMIProductVersion=1234567890 DMIProductSerial=1234567890 DMIProductUUID=534D4349-0002-F190-2500-F1902500697D DMIBoardVendor=Supermicro DMIBoardName=H8DGU DMIBoardVersion=1234567890 DMIBoardSerial=NM141S600145 DMIBoardAssetTag="To Be Filled By O.E.M." DMIChassisVendor=Supermicro DMIChassisType=17 DMIChassisVersion=1234567890 DMIChassisSerial=1234567890 DMIChassisAssetTag="To Be Filled By O.E.M." DMIBIOSVendor="American Megatrends Inc." DMIBIOSVersion="3.5       " DMIBIOSDate=11/25/2013 DMISysVendor=Supermicro Backend=Linux LinuxCgroup=/)
  Socket L#0 (P#0 total=67106920KB CPUModel="AMD Opteron(tm) Processor 6376                 ")
    NUMANode L#0 (P#0 local=33552488KB total=33552488KB)
      L3Cache L#0 (size=6144KB linesize=64 ways=64)
        L2Cache L#0 (size=2048KB linesize=64 ways=16)
          L1iCache L#0 (size=64KB linesize=64 ways=2)
            L1dCache L#0 (size=16KB linesize=64 ways=4)
              Core L#0 (P#0)
                PU L#0 (P#0)
            L1dCache L#1 (size=16KB linesize=64 ways=4)
              Core L#1 (P#1)
                PU L#1 (P#1)
        L2Cache L#1 (size=2048KB linesize=64 ways=16)
          L1iCache L#1 (size=64KB linesize=64 ways=2)
            L1dCache L#2 (size=16KB linesize=64 ways=4)
              Core L#2 (P#2)
                PU L#2 (P#2)
            L1dCache L#3 (size=16KB linesize=64 ways=4)
              Core L#3 (P#3)
                PU L#3 (P#3)
        L2Cache L#2 (size=2048KB linesize=64 ways=16)
          L1iCache L#2 (size=64KB linesize=64 ways=2)
            L1dCache L#4 (size=16KB linesize=64 ways=4)
              Core L#4 (P#4)
                PU L#4 (P#4)
            L1dCache L#5 (size=16KB linesize=64 ways=4)
              Core L#5 (P#5)
                PU L#5 (P#5)
        L2Cache L#3 (size=2048KB linesize=64 ways=16)
          L1iCache L#3 (size=64KB linesize=64 ways=2)
            L1dCache L#6 (size=16KB linesize=64 ways=4)
              Core L#6 (P#6)
                PU L#6 (P#6)
            L1dCache L#7 (size=16KB linesize=64 ways=4)
              Core L#7 (P#7)
                PU L#7 (P#7)
    NUMANode L#1 (P#1 local=33554432KB total=33554432KB)
      L3Cache L#1 (size=6144KB linesize=64 ways=64)
        L2Cache L#4 (size=2048KB linesize=64 ways=16)
          L1iCache L#4 (size=64KB linesize=64 ways=2)
            L1dCache L#8 (size=16KB linesize=64 ways=4)
              Core L#8 (P#0)
                PU L#8 (P#8)
            L1dCache L#9 (size=16KB linesize=64 ways=4)
              Core L#9 (P#1)
                PU L#9 (P#9)
        L2Cache L#5 (size=2048KB linesize=64 ways=16)
          L1iCache L#5 (size=64KB linesize=64 ways=2)
            L1dCache L#10 (size=16KB linesize=64 ways=4)
              Core L#10 (P#2)
                PU L#10 (P#10)
            L1dCache L#11 (size=16KB linesize=64 ways=4)
              Core L#11 (P#3)
                PU L#11 (P#11)
        L2Cache L#6 (size=2048KB linesize=64 ways=16)
          L1iCache L#6 (size=64KB linesize=64 ways=2)
            L1dCache L#12 (size=16KB linesize=64 ways=4)
              Core L#12 (P#4)
                PU L#12 (P#12)
            L1dCache L#13 (size=16KB linesize=64 ways=4)
              Core L#13 (P#5)
                PU L#13 (P#13)
        L2Cache L#7 (size=2048KB linesize=64 ways=16)
          L1iCache L#7 (size=64KB linesize=64 ways=2)
            L1dCache L#14 (size=16KB linesize=64 ways=4)
              Core L#14 (P#6)
                PU L#14 (P#14)
            L1dCache L#15 (size=16KB linesize=64 ways=4)
              Core L#15 (P#7)
                PU L#15 (P#15)
  Socket L#1 (P#1 total=67092480KB CPUModel="AMD Opteron(tm) Processor 6376                 ")
    NUMANode L#2 (P#2 local=33554432KB total=33554432KB)
      L3Cache L#2 (size=6144KB linesize=64 ways=64)
        L2Cache L#8 (size=2048KB linesize=64 ways=16)
          L1iCache L#8 (size=64KB linesize=64 ways=2)
            L1dCache L#16 (size=16KB linesize=64 ways=4)
              Core L#16 (P#0)
                PU L#16 (P#16)
            L1dCache L#17 (size=16KB linesize=64 ways=4)
              Core L#17 (P#1)
                PU L#17 (P#17)
        L2Cache L#9 (size=2048KB linesize=64 ways=16)
          L1iCache L#9 (size=64KB linesize=64 ways=2)
            L1dCache L#18 (size=16KB linesize=64 ways=4)
              Core L#18 (P#2)
                PU L#18 (P#18)
            L1dCache L#19 (size=16KB linesize=64 ways=4)
              Core L#19 (P#3)
                PU L#19 (P#19)
        L2Cache L#10 (size=2048KB linesize=64 ways=16)
          L1iCache L#10 (size=64KB linesize=64 ways=2)
            L1dCache L#20 (size=16KB linesize=64 ways=4)
              Core L#20 (P#4)
                PU L#20 (P#20)
            L1dCache L#21 (size=16KB linesize=64 ways=4)
              Core L#21 (P#5)
                PU L#21 (P#21)
        L2Cache L#11 (size=2048KB linesize=64 ways=16)
          L1iCache L#11 (size=64KB linesize=64 ways=2)
            L1dCache L#22 (size=16KB linesize=64 ways=4)
              Core L#22 (P#6)
                PU L#22 (P#22)
            L1dCache L#23 (size=16KB linesize=64 ways=4)
              Core L#23 (P#7)
                PU L#23 (P#23)
    NUMANode L#3 (P#3 local=33538048KB total=33538048KB)
      L3Cache L#3 (size=6144KB linesize=64 ways=64)
        L2Cache L#12 (size=2048KB linesize=64 ways=16)
          L1iCache L#12 (size=64KB linesize=64 ways=2)
            L1dCache L#24 (size=16KB linesize=64 ways=4)
              Core L#24 (P#0)
                PU L#24 (P#24)
            L1dCache L#25 (size=16KB linesize=64 ways=4)
              Core L#25 (P#1)
                PU L#25 (P#25)
        L2Cache L#13 (size=2048KB linesize=64 ways=16)
          L1iCache L#13 (size=64KB linesize=64 ways=2)
            L1dCache L#26 (size=16KB linesize=64 ways=4)
              Core L#26 (P#2)
                PU L#26 (P#26)
            L1dCache L#27 (size=16KB linesize=64 ways=4)
              Core L#27 (P#3)
                PU L#27 (P#27)
        L2Cache L#14 (size=2048KB linesize=64 ways=16)
          L1iCache L#14 (size=64KB linesize=64 ways=2)
            L1dCache L#28 (size=16KB linesize=64 ways=4)
              Core L#28 (P#4)
                PU L#28 (P#28)
            L1dCache L#29 (size=16KB linesize=64 ways=4)
              Core L#29 (P#5)
                PU L#29 (P#29)
        L2Cache L#15 (size=2048KB linesize=64 ways=16)
          L1iCache L#15 (size=64KB linesize=64 ways=2)
            L1dCache L#30 (size=16KB linesize=64 ways=4)
              Core L#30 (P#6)
                PU L#30 (P#30)
            L1dCache L#31 (size=16KB linesize=64 ways=4)
              Core L#31 (P#7)
                PU L#31 (P#31)
depth 0:        1 Machine (type #1)
 depth 1:       2 Socket (type #3)
  depth 2:      4 NUMANode (type #2)
   depth 3:     4 L3Cache (type #4)
    depth 4:    16 L2Cache (type #4)
     depth 5:   16 L1iCache (type #4)
      depth 6:  32 L1dCache (type #4)
       depth 7: 32 Core (type #5)
        depth 8:        32 PU (type #6)
latency matrix between NUMANodes (depth 2) by logical indexes:
  index     0     1     2     3
      0 1.000 1.600 1.600 1.600
      1 1.600 1.000 1.600 1.600
      2 1.600 1.600 1.000 1.600
      3 1.600 1.600 1.600 1.000
Topology not from this system
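
To check the "one L1i per pair of cores" reading of this output without counting by eye, here is a minimal sketch that tallies the Core objects listed under each L1iCache in lstopo's text output. The function name cores_per_l1i is hypothetical, and the script assumes the indented text layout shown above, where each Core line follows the L1iCache line it belongs to.

```shell
#!/bin/sh
# Count how many Core objects appear under each L1iCache in lstopo text output.
# Usage: cores_per_l1i [FILE...]   (reads standard input when no file is given)
cores_per_l1i() {
    awk '
        # Remember the most recent L1iCache and the order in which they appear.
        /L1iCache L#/ { cur = $2; order[++n] = cur }
        # Attribute each Core to the last L1iCache seen.
        /Core L#/ && cur != "" { count[cur]++ }
        END {
            for (i = 1; i <= n; i++)
                printf "L1iCache %s: %d cores\n", order[i], count[order[i]]
        }
    ' "$@"
}
```

With the listing saved to a file, `cores_per_l1i thatfile` prints one line per L1i cache, so a node whose counts differ from the others stands out.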

Attachment: node16.tar.bz2
Description: application/bzip
