Sorry for the typo in the subject, I meant "Topology" ;)

On 5/9/16 5:58 PM, Mehmet Belgin wrote:
Greetings!

We've been receiving this error for a while on our 64-core Interlagos AMD machines:

****************************************************************************
* hwloc has encountered what looks like an error from the operating system.
*
* Socket (P#2 cpuset 0x0000ffff,0x0) intersects with NUMANode (P#3 cpuset 0x0000ff00,0xff000000) without inclusion!
* Error occurred in topology.c line 940
*
* Please report this error message to the hwloc user's mailing list,
* along with the output+tarball generated by the hwloc-gather-topology script.
****************************************************************************
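
Decoding the two cpusets makes the complaint concrete. This is a minimal Python sketch, assuming hwloc prints bitmaps as comma-separated 32-bit hexadecimal words with the most significant word first. The helper names are hypothetical and not part of hwloc.

```python
def parse_cpuset(text):
    """Turn a bitmap string such as "0x0000ffff,0x0" into an int bitmask.

    Assumption: words are 32 bits wide, most significant word first.
    """
    mask = 0
    for word in text.split(","):
        mask = (mask << 32) | int(word, 16)
    return mask


def cpu_ids(mask):
    """List the indexes of the bits set in the mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def intersects_without_inclusion(a, b):
    """True if a and b share bits but neither one contains the other."""
    common = a & b
    return common != 0 and common != a and common != b


# Values copied from the error message above.
socket = parse_cpuset("0x0000ffff,0x0")
numa = parse_cpuset("0x0000ff00,0xff000000")

print("Socket P#2 PUs:  ", cpu_ids(socket))
print("NUMANode P#3 PUs:", cpu_ids(numa))
print("Conflict:", intersects_without_inclusion(socket, numa))
```

Under that assumption, the socket covers bits 32-47 while the NUMA node covers bits 24-31 and 40-47, so the two overlap on bits 40-47 and neither contains the other.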

I've found some information in the hwloc list archives suggesting that this is caused by a buggy AMD platform and that the impact should be limited to hwloc missing L3 cache info (thanks, Brice). If that's the case and the processor representation is otherwise correct, then I am sure we can live with this. I still wanted to ask the list (1) whether this is really harmless and (2) whether there are any known solutions other than upgrading the BIOS/kernel.

The hwloc-gather-topology output is also attached.

Our schedulers (Torque/Moab) and MPI stacks rely heavily on hwloc, and I need to make sure this is not a critical issue, so any suggestions would help.

Thank you!
-Mehmet




_______________________________________________
hwloc-users mailing list
hwloc-us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
Link to this post: 
http://www.open-mpi.org/community/lists/hwloc-users/2016/05/1272.php

--
=========================================
Mehmet Belgin, Ph.D. (mehmet.bel...@oit.gatech.edu)
Scientific Computing Consultant | OIT - Academic and Research Technologies
Georgia Institute of Technology
258 4th Str NW, Rich Building, Room 326
Atlanta, GA  30332-0700
Office: (404) 385-0665
