Re: [hwloc-users] [EXTERNAL] Re: How to show all cores in lstopo output?

2024-06-25 Thread Brice Goglin
You may also hit the 'f' key from the graphical X11 output to toggle 
factorizing of cores and collapsing of PCI devices (those shortcuts are 
shown in the terminal text output while the graphical window is running).


Brice


Le 25/06/2024 à 21:34, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND 
APPLICATIONS INC] via hwloc-users a écrit :


Ahhh! That's the stuff! Nice and unreadable. Need me a wide-widescreen 

I tried a few different options (like --no-collapse) but I just didn't 
get the right one.


Matt

--

Matt Thompson, SSAI, Ld Scientific Prog/Analyst/Super

NASA GSFC,    Global Modeling and Assimilation Office

Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771

Phone: 301-614-6712 Fax: 301-614-6246

http://science.gsfc.nasa.gov/sed/bio/matthew.thompson

*From: *Samuel Thibault 
*Date: *Tuesday, June 25, 2024 at 3:29 PM
*To: *Hardware locality user list 
*Cc: *Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS 
INC] 
*Subject: *[EXTERNAL] Re: [hwloc-users] How to show all cores in 
lstopo output?







Hello,

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] via 
hwloc-users, le mar. 25 juin 2024 19:24:24 +, a ecrit:
> But when I ran lstopo, the resulting SVG sort of has all the cores
> "smooshed" so you see core 0, core 1, ... core 71.
>
> Is there an lstopo option that will let me defy logic and show All
> The Cores™?


I guess you mean the --no-factorize option.
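
For example (a sketch; the output filename is arbitrary), combining it with
--no-collapse should draw every core and every PCI device individually:

  $ lstopo --no-factorize --no-collapse topo-full.svg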

Samuel



Re: [hwloc-users] MI300A support

2024-02-08 Thread Brice Goglin

Hello

I don't have access to a MI300A but I worked with AMD several months ago 
to solve a very similar issue. It was caused by a buggy ACPI HMAT in the 
BIOS.


Try setting HWLOC_USE_NUMA_DISTANCES=0 in the environment to disable the 
hwloc code that uses this HMAT info. If the warning goes away then you 
need to get a more recent firmware. That said, it would be annoying if 
this old buggy firmware is still in the wild 4 months later.
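
For a quick test from the shell, something like this should do (lstopo is
just an example of an hwloc-based command):

  $ HWLOC_USE_NUMA_DISTANCES=0 lstopo

If the warning disappears with the variable set, the ACPI HMAT data is the
culprit and a firmware update is the real fix.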


Brice


Le 08/02/2024 à 18:23, Hartman, John a écrit :


Is there a timeline for hwloc to support the MI300A? Currently, hwloc 
isn’t happy when it encounters one:




* hwloc 2.9.0 received invalid information from the operating system.

*

* Failed with: intersection without inclusion

* while inserting Group0 (cpuset 
0x00ff,0x,0x00ff,0x00ff,0x,0x00ff) at 
Group0 (cpuset 0x,0x,,0x,0x)


* coming from: linux:sysfs:numa

*

* The following FAQ entry in the hwloc documentation may help:

* What should I do when hwloc reports "operating system" warnings?

* Otherwise please report this error message to the hwloc user's 
mailing list,


* along with the files generated by the hwloc-gather-topology script.

*

* hwloc will now ignore this invalid topology information and continue.



John



Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin


Le 24/11/2023 à 08:51, John Hearns a écrit :

Good question.  Maybe not an answer referring to hwloc.
When managing a large NUMA machine, SGI UV, I ran the OS processes in 
a boot cpuset which was restricted to (AFAIR) the first 8 CPUs.
On Intel architectures with E and P cores, could we think of running the OS 
on E cores only and having the batch system schedule compute tasks on 
P cores?




That's certainly possible. Linux has things like isolcpus to forcibly 
isolate some cores away from the OS tasks, which should work for these 
platforms too (by the way, the same applies to ARM big.LITTLE platforms 
running Linux, including Apple M1, etc.).


However, keep in mind that splitting P+E CPUs is not like splitting NUMA 
platforms: isolating NUMA node #0 on SGI left tons of cores available 
for HPC tasks on the other NUMA nodes. Current P+E CPUs from Intel usually 
have more E than P, and several models are even 2P+8E, which would be a lot 
of E-cores for the OS and very few P-cores for real apps. Your idea would 
apply better if we had 2E+8P instead, but that's not the trend.


Things might be more interesting with MeteorLake which (according to 
https://www.hardwaretimes.com/intel-14th-gen-meteor-lake-cpu-cores-almost-identical-to-13th-gen-its-a-tic/) 
has P+E as usual but also 2 "low-power E" on the side. There, you could 
put the OS on those 2 Low-Power E.


By the way, the Linux scheduler is supposed to get enhanced to 
automatically find out which tasks to put on P and E cores, but they've 
been discussing things for a long time and it's hard to know what's 
actually working well already.


Brice





Re: [hwloc-users] Support for Intel's hybrid architecture - can I restrict hwloc-distrib to P cores only?

2023-11-24 Thread Brice Goglin

Le 23/11/2023 à 19:29, Jirka Hladky a écrit :

Hi Brice,

I have a question about the hwloc's support for Intel's hybrid 
architectures, like in Alder Lake CPUs:

https://en.wikipedia.org/wiki/Alder_Lake

There are P (performance) and E (efficiency) cores. Is hwloc able to 
detect which core is which? Can I, for example, restrict hwloc-distrib 
to P cores only?


Pseudocode:
hwloc-distrib --single --restrict <> <>



Hello

On machines with 2 kinds of cores (P and E), P cores are the second cpukind 
in hwloc (cpukinds are ordered by energy-efficiency first):


$ lstopo  --cpukinds
CPU kind #0 efficiency 0 cpuset 0x000ff000
  CoreType = IntelAtom
CPU kind #1 efficiency 1 cpuset 0x0fff
  CoreType = IntelCore

These cpusets may be returned by hwloc-calc either by index or by info 
attribute:


$ hwloc-calc  --cpukind 1 all
0x0fff
$ hwloc-calc --cpukind CoreType=IntelCore all
0x0fff

So just pass --restrict $(hwloc-calc --cpukind 1 all) to hwloc-distrib 
and you should be good.
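
Putting the two commands together, a sketch (the trailing "4" is just an
arbitrary number of locations to distribute over):

  $ hwloc-distrib --single --restrict $(hwloc-calc --cpukind 1 all) 4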


Brice





Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin
The cgroup information under /sys/fs/cgroup/ should be fixed. 
cpuset.cpus should contain 0-3 and cpuset.mems should contain 0. In the 
meantime, hwloc may ignore this cgroup info if you set HWLOC_ALLOW=all 
in the environment.
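
As a quick check (lstopo stands for any hwloc-based command here):

  $ HWLOC_ALLOW=all lstopo

Note that this only makes hwloc ignore the cgroup restrictions; fixing
cpuset.cpus and cpuset.mems in the cgroup remains the proper solution.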


The x86 CPUID information is also wrong on this machine. All 4 cores 
report the same "APIC id" (sort of hardware core ID), I guess all your 4 
cores are virtualized over a single hardware core and the hypervisor 
doesn't care about emulating topology information correctly.


Brice



Le 02/08/2023 à 15:23, Max R. Dechantsreiter a écrit :

Hi Brice,

Well, the VPS gives me a 4-core slice of an Intel(R) Xeon(R)
CPU E5-2620 node, which is Sandy Bridge EP, with 6 physical
cores, so probably 12 cores on the node.  The numbering does
seem wacky: it seems to describe a node with 2 8-core CPUs.

This is the VPS on which I host my Web site; I use its shell
account for sundry testing, mostly of build procedures.

Is there anything I could do to get hwloc to work?

Regards,

Max
---


On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote:

Hello

There's something wrong in this machine. It exposes 4 cores (number 0 to 3)
and no NUMA node, but says the only allowed resources are cores 8-15,24-31
and NUMA node 1. That's why hwloc says the topology is empty (running lstopo
--disallowed shows NUMA 0 and cores 0-3 in red, which means they aren't
allowed). How did this get configured so badly?

Brice



Le 02/08/2023 à 14:54, Max R. Dechantsreiter a écrit :

Hello,

On my VPS I tested my build of hwloc-2.9.2 by running lstopo:

./lstopo
hwloc: Topology became empty, aborting!
Segmentation fault

On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:

./lstopo
hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
Machine (7430MB total)
 Package L#0
   NUMANode L#0 (P#0 7430MB)
   L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core 
L#0
 PU L#0 (P#0)
 PU L#1 (P#1)
 HostBridge
   PCI 00:03.0 (Other)
 Block(Disk) "sda"
   PCI 00:04.0 (Ethernet)
 Net "ens4"
   PCI 00:05.0 (Other)

(from which I conclude my build procedure is correct).

At the suggestion of Brice Goglin (in response to my post of the same
issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
then I also ran

hwloc-gather-topology hwloc-gather-topology

The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
as I was unable to recognize the underlying problem, although I believe it
could be a system issue, for my builds of OpenMPI on the VPS used to work
before a new OS image was installed.

Max






Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Brice Goglin

Hello

There's something wrong in this machine. It exposes 4 cores (number 0 to 
3) and no NUMA node, but says the only allowed resources are cores 
8-15,24-31 and NUMA node 1. That's why hwloc says the topology is empty 
(running lstopo --disallowed shows NUMA 0 and cores 0-3 in red, which 
means they aren't allowed). How did this get configured so badly?


Brice



Le 02/08/2023 à 14:54, Max R. Dechantsreiter a écrit :

Hello,

On my VPS I tested my build of hwloc-2.9.2 by running lstopo:

./lstopo
hwloc: Topology became empty, aborting!
Segmentation fault

On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:

./lstopo
hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
Machine (7430MB total)
Package L#0
  NUMANode L#0 (P#0 7430MB)
  L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core 
L#0
PU L#0 (P#0)
PU L#1 (P#1)
HostBridge
  PCI 00:03.0 (Other)
Block(Disk) "sda"
  PCI 00:04.0 (Ethernet)
Net "ens4"
  PCI 00:05.0 (Other)

(from which I conclude my build procedure is correct).

At the suggestion of Brice Goglin (in response to my post of the same
issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
then I also ran

hwloc-gather-topology hwloc-gather-topology

The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
as I was unable to recognize the underlying problem, although I believe it
could be a system issue, for my builds of OpenMPI on the VPS used to work
before a new OS image was installed.

Max


OpenPGP_signature
Description: OpenPGP digital signature
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Problems with binding memory

2022-03-02 Thread Brice Goglin

Le 02/03/2022 à 09:39, Mike a écrit :

Hello,

Please run "lstopo -.synthetic" to compress the output a lot. I
will be able to reuse it from here and understand your binding mask.

Package:2 [NUMANode(memory=270369247232)] L3Cache:8(size=33554432) 
L2Cache:8(size=524288) L1dCache:1(size=32768) L1iCache:1(size=32768) 
Core:1 PU:2(indexes=2*128:1*2)




Ok then your mask 0x,0x,,,0x,0x 
corresponds exactly to NUMA node 0 (socket 0). Object cpusets can be 
displayed on the command-line with "lstopo --cpuset" or "hwloc-calc numa:0".


This would be OK if you're only spawning threads to the first socket. Do 
you see the same mask for threads on the other socket?


Brice





Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin


Le 01/03/2022 à 17:34, Mike a écrit :

Hello,

Usually you would rather allocate and bind at the same time so
that the memory doesn't need to be migrated when bound. However,
if you do not touch the memory after allocation, pages are not
actually physically allocated, hence there's nothing to migrate. Might
work but keep this in mind.


I need all the data in one allocation, so that is why I opted to 
allocate and then bind via the area function. The way I understand it 
is that by using the memory binding policy HWLOC_MEMBIND_BIND with 
hwloc_set_area_membind() the pages will actually get allocated on the 
specified cores. If that is not the case I suppose the best solution 
would be to just touch the allocated data with my threads.



set_area_membind() doesn't allocate pages, but it tells the operating 
system "whenever you allocate them, do it on that NUMA node". Anyway, 
what you're doing makes sense.
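
For reference, a minimal sketch of the allocate-and-bind-at-the-same-time
approach mentioned above (the helper name is just for illustration, error
handling is omitted, and whether to pass a cpuset or a nodeset depends on
the flags you choose):

  #include <hwloc.h>

  /* Allocate 'len' bytes already bound to the location described by 'set'.
   * Sketch only: real code should check for a NULL return. */
  static void *alloc_bound(hwloc_topology_t topology, size_t len,
                           hwloc_const_bitmap_t set)
  {
    return hwloc_alloc_membind(topology, len, set,
                               HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_STRICT);
  }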





Can you print memory binding like below instead of printing only
the first PU in the set returned by get_area_membind?

    char *s;
    hwloc_bitmap_asprintf(&s, set);
    /* s is now a C string of the bitmap, use it in your std::cout line */

I tried that and now get_area_membind returns that all memory is bound 
to 0x,0x,,,0x,0x




Please run "lstopo -.synthetic" to compress the output a lot. I will be 
able to reuse it from here and understand your binding mask.


Brice





Re: [hwloc-users] Problems with binding memory

2022-03-01 Thread Brice Goglin


Le 01/03/2022 à 15:17, Mike a écrit :


Dear list,

I have a program that utilizes Openmpi + multithreading and I want the 
freedom to decide on which hardware cores my threads should run. By 
using hwloc_set_cpubind() that already works, so now I also want to 
bind memory to the hardware cores. But I just can't get it to work.


Basically, I wrote the memory binding into my allocator, so the memory 
will be allocated and then bound.




Hello

Usually you would rather allocate and bind at the same time so that the 
memory doesn't need to be migrated when bound. However, if you do not 
touch the memory after allocation, pages are not actually physically 
allocated, hence there's nothing to migrate. Might work but keep this in mind.



I use hwloc 2.4.1, run the code on a Linux system and I did check with 
“hwloc-info --support” if hwloc_set_area_membind() and 
hwloc_get_area_membind() are supported and they are.


Here is a snippet of my code, which runs through without any error. 
But the hwloc_get_area_membind() always returns that all memory is 
bound to PU 0, when I think it should be bound to different PUs. Am I 
missing something?




Can you print memory binding like below instead of printing only the 
first PU in the set returned by get_area_membind?


char *s;
hwloc_bitmap_asprintf(&s, set);
/* s is now a C string of the bitmap, use it in your std::cout line */

And send the output of lstopo on your machine so that I can understand it.

Or you could print the smallest object that contains the binding by 
calling hwloc_get_obj_covering_cpuset(topology, set). It returns an 
object whose type may be printed as a C-string with 
hwloc_obj_type_string(obj->type).
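
A short sketch combining both suggestions (the helper name is arbitrary;
assumes the area was bound with the default cpuset-based flags and that
<hwloc.h>, <stdio.h> and <stdlib.h> are included):

  /* Print the memory binding of [addr, addr+len) and the smallest object covering it. */
  static void print_area_binding(hwloc_topology_t topology, const void *addr, size_t len)
  {
    hwloc_bitmap_t set = hwloc_bitmap_alloc();
    hwloc_membind_policy_t policy;
    char *s = NULL;
    if (hwloc_get_area_membind(topology, addr, len, set, &policy, 0) == 0) {
      hwloc_bitmap_asprintf(&s, set);
      hwloc_obj_t obj = hwloc_get_obj_covering_cpuset(topology, set);
      printf("bound to %s (covered by a %s)\n",
             s, obj ? hwloc_obj_type_string(obj->type) : "unknown object");
      free(s);
    }
    hwloc_bitmap_free(set);
  }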


You may also do the same before set_area_membind() if you want to verify 
that you're binding where you really want.





T* allocate(size_t n, hwloc_topology_t topology, int rank)
{
  // allocate memory
  T* t = (T*)hwloc_alloc(topology, sizeof(T) * n);
  // elements per thread
  size_t ept = 1024;
  hwloc_bitmap_t set;
  size_t offset = 0;
  size_t threadcount = 4;

  set = hwloc_bitmap_alloc();
  if(!set) {
    fprintf(stderr, "failed to allocate a bitmap\n");
  }
  // bind memory to every thread
  for(size_t i = 0; i < threadcount; i++)
  {
    // logical index of where to bind the memory
    auto logid = (i + rank * threadcount) * 2;
    auto logobj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, logid);
    hwloc_bitmap_only(set, logobj->os_index);
    // set the memory binding
    // I use HWLOC_MEMBIND_BIND as policy so I do not have to touch
    // the memory first to allocate it
    auto err = hwloc_set_area_membind(topology, t + offset, sizeof(T) * ept,
                                      set, HWLOC_MEMBIND_BIND,
                                      HWLOC_MEMBIND_STRICT | HWLOC_MEMBIND_THREAD);

    if(err < 0)
      std::cout << "Error: memory binding failed" << std::endl;
    std::cout << "Rank=" << rank << " Tid=" << i << " on PU logical index="

Re: [hwloc-users] [OMPI users] hwloc error

2021-08-23 Thread Brice Goglin

Hello

For Windows, we have prebuilt zipballs on the download pages. Try lstopo 
using such a zipball from a text console first.


If you want to build, the process depends on whether you use MSVC or 
cygwin or mingw. Let's avoid that for now.


Which Windows version are you using? Is there some sort of virtual 
machine on top of it? Can you run `coreinfo -cgnlsm` and send the 
output? I have never seen a Windows report such invalid information.


Brice



Le 24/08/2021 à 00:20, Dwaipayan Sarkar a écrit :


Hello Brice

Thanks for your reply.

I forgot to mention that my machine is a windows one and not Linux.

I did download the new version of hwloc.

Could you brief me the steps for installing it? Are the steps similar 
to this?


cd $HWLOC

./configure --prefix=

make -j install

Thanks

Dwaipayan

*From:* Dwaipayan Sarkar
*Sent:* August-23-21 6:13 PM
*To:* hwloc-users@lists.open-mpi.org; Brice Goglin 
*Subject:* RE: [OMPI users] hwloc error

Hello Brice

Thanks for your reply.

I forgot to mention that my machine is a windows one and not Linux.

I did download the new version of hwloc.

Could you brief me the steps for installing it? Are the steps similar 
to this?


cd $HWLOC

./configure --prefix=

make -j install

Thanks

Dwaipayan

*From:* users <users-boun...@lists.open-mpi.org> *On Behalf Of *Brice Goglin 
via users

*Sent:* August-23-21 5:32 PM
*To:* us...@lists.open-mpi.org
*Cc:* Brice Goglin <brice.gog...@inria.fr>
*Subject:* Re: [OMPI users] hwloc error

Hello Dwaipayan

You seem to be running a very old hwloc (maybe embedded inside an old 
Open MPI release?). Can you install a more recent hwloc from 
https://www.open-mpi.org/projects/hwloc/, build it, and run its 
"lstopo" to check whether the error remains?


If so, could you open an issue on the hwloc GitHub at 
https://github.com/open-mpi/hwloc/issues/new?


Your error looks strange. We've seen issues with such "intersection" 
errors in the past because the BIOS or ACPI was reporting invalid 
cache or NUMA affinity with respect to CPU packages. But intersecting 
packages is really unexpected. Among what's requested in the issue 
template, the important information that we'll need is the tarball 
generated by hwloc-gather-topology; it will allow us to check that 
Linux is indeed reporting invalid socket information.


By the way, this mailing list is for Open MPI; the hwloc mailing list 
is hwloc-users@lists.open-mpi.org. Please use that list if you 
want to discuss by email instead of in a GitHub issue.


Thanks

Brice

Le 23/08/2021 à 19:36, Dwaipayan Sarkar via users a écrit :

Hello

I am Dwaipayan, a PhD graduate student at the Western University
Canada.

Recently, I have been facing issues while using the CFD software
package ANSYS Fluent in my local machine which have two Xeon
processors with 12 cores each.

Whenever I am trying to run a simulation in parallel, it gives me
this warning

[screenshot of the hwloc warning message]

And then when I run the simulation it abruptly stops giving me a
segmentation fault

Can you help me fix this issue, please?

I can provide with you more information of the local desktop machine.

Thanks

Dwaipayan




Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin

Le 23/03/2021 à 08:08, Brice Goglin a écrit :
> Le 23/03/2021 à 02:28, ro...@uberware.net a écrit :
>> Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is
>> to build it as a universal binary. Right now, I cannot even get the git
>> master to autogen. This is what I get:
>>
>> robin@Robins-Mac-mini hwloc % ./autogen.sh
>> autoreconf: Entering directory `.'
>> autoreconf: configure.ac: not using Gettext
>> autoreconf: running: aclocal --force -I ./config
>> autoreconf: configure.ac: tracing
>> configure.ac:77: error: libtool version 2.2.6 or higher is required
>
> Hello
>
> There's something strange in your environment if libtool 2.2.6+ couldn't
> be found. It likely explains the rest of the messages.


I read somewhere else that brew provides glibtool and glibtoolize while
libtool/libtoolize still points to the old Apple libtool. If so,
aliasing libtool/libtoolize to glibtool/glibtoolize may help.
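
For instance, something along these lines may work (a sketch; autoreconf is
documented to honor the LIBTOOLIZE environment variable):

  $ export LIBTOOLIZE=glibtoolize
  $ ./autogen.sh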

Brice



Re: [hwloc-users] Build an OS-X Universal version

2021-03-23 Thread Brice Goglin
Le 23/03/2021 à 02:28, ro...@uberware.net a écrit :
> Hi. I'm trying to build hwloc on OS-X Big Sur on an M1. Ultimate plan is
> to build it as a universal binary. Right now, I cannot even get the git
> master to autogen. This is what I get:
>
> robin@Robins-Mac-mini hwloc % ./autogen.sh
> autoreconf: Entering directory `.'
> autoreconf: configure.ac: not using Gettext
> autoreconf: running: aclocal --force -I ./config
> autoreconf: configure.ac: tracing
> configure.ac:77: error: libtool version 2.2.6 or higher is required


Hello

There's something strange in your environment if libtool 2.2.6+ couldn't
be found. It likely explains the rest of the messages.

Brice



> configure.ac:77: the top level
> autom4te: /usr/bin/m4 failed with exit status: 63
> autoreconf: configure.ac: not using Libtool
> autoreconf: running: /opt/homebrew/Cellar/autoconf/2.69/bin/autoconf --force
> configure.ac:77: error: libtool version 2.2.6 or higher is required
> configure.ac:77: the top level
> autom4te: /usr/bin/m4 failed with exit status: 63
> autoreconf: /opt/homebrew/Cellar/autoconf/2.69/bin/autoconf failed with
> exit status: 63
> Checking whether configure needs patching for MacOS Big Sur libtool.m4
> bug... grep: configure: No such file or directory
> grep: configure: No such file or directory
> yes
> Trying to patch configure...
> can't find file to patch at input line 9
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --
> |Updated from libtool.m4 patch:
> |
> |[PATCH] Improve macOS version detection to support macOS 11 and simplify
> legacy logic
> |
> |Signed-off-by: Jeremy Huddleston Sequoia 
> |
> |--- hwloc/configure.old  2020-11-25 16:03:04.225097149 +0100
> |+++ hwloc/configure  2020-11-25 16:02:29.368995613 +0100
> --
> File to patch:
>
> It hangs there waiting for me to supply a file. I see the patch is trying
> to generate configure from configure.old, but there is no configure.old in
> the repo. I know next to nothing about autogen and auto configure and have
> always just followed basic instructions that have always worked before.
>
> I'm using homebrew to get the gnu tools, and have autoconf 2.69, automake
> 1.16.3, and libtool 2.4.6. Thanks for any help to get this building.

[hwloc-users] getting the latest snapshot version string

2021-03-11 Thread Brice Goglin
Hello

The "latest_snapshot.txt" files on the website were broken (for years).
Things are now fixed and improved. And they are also explicitly
documented on the main web page.

If you want the version string of the latest release or release
candidate, read
https://www.open-mpi.org/software/hwloc/current/downloads/latest_snapshot.txt
=> currently returns "2.4.1"

If you want the latest of a specific release series, replace "current"
with "vX.Y", for instance
https://www.open-mpi.org/software/hwloc/v2.0/downloads/latest_snapshot.txt
=> returns "2.0.4"

Note that release candidates are returned when they exist, so you may
get "2.4.2rc1" until "2.4.2" is really released. This doesn't apply to
major release candidates such as 2.5.0rcX (because "current" won't point
to "v2.5" until the final 2.5.0 is released).

Brice





Re: [hwloc-users] Netloc questions

2021-02-16 Thread Brice Goglin
Hello Kevin

There is some very experimental support for Cray networks as well as
Intel OmniPath. But the entire subproject has been unmaintained for a
while and I don't expect anybody to revive it anytime soon unfortunately.

Brice


Le 16/02/2021 à 17:00, ke...@continuum-dynamics.com a écrit :
> Hello,
>
> Can someone tell me if there is a version of netloc that supports more
> networks than Ethernet and InfiniBand?
>
> Thanks,
> Kevin Olson

Re: [hwloc-users] [Bug] Topology incorrect when CPU 0 offline

2021-02-05 Thread Brice Goglin
Hello
I am not sure we ever tested this because offlining cpu0 was impossible in 
Linux until recently. I knew things would change because arm kernel devs were 
modifying Linux to allow it. Looks like it matters to x86 too, now. I'll take a 
look.
Brice


Le 5 février 2021 20:43:33 GMT+01:00, "Clay, Garrett"  
a écrit :
>Hello,
>
>Has anyone experienced invalid topology creation from hwloc when CPU 0
>is offline (disabled)? Details of the issue here:
>https://github.com/open-mpi/hwloc/issues/450
>
>In short, seems hwloc does not handle CPU 0 being offline properly and
>mixes up L3 cache in the topology hierarchy. In contrast, CPU 1 offline
>creates a topology as expected. Seems like it could be a simple edge
>case bug.
>
>Thanks,
>Garrett

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-02 Thread Brice Goglin

Le 02/10/2020 à 01:59, Jirka Hladky a écrit :
>
> I'll see if I can make things case-insensitive in the tools (not
> in the C API).
>
> Yes, it would be a nice improvement.  Currently, there is a mismatch
> between different commands.  hwloc-info supports both bandwidth and
> Bandwidth, but hwloc-annotate requires a capital letter.

I just fixed that, and pushed some manpage updates as discussed earlier.

We have several minor issues (spurious runtime warnings) that may
justify doing a 2.3.1 in the near future; your changes will be in there.

Thanks

Brice



Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin
Le 01/10/2020 à 22:17, Jirka Hladky a écrit :
>
> This is interesting! ACPI tables are often wrong - having the option
> to annotate more accurate data to the hwloc is great.


The ACPI SLIT table (reported by numactl -H) was indeed often dumb or
even wrong. But SLIT wasn't widely used anyway, so vendors didn't care
much about putting valid info there, it didn't break anything in most
applications. Hopefully it won't be the case for HMAT because HMAT will
be the official way to figure out which target memory is fast or not. If
vendors don't fill it properly, the OS may use HBM or NVDIMMs by default
instead of DDR, which will likely cause more problems than a broken SLIT.


>
> We have a simple C program to measure the bandwidth between NUMA
> nodes, producing a table similar to the output of numactl -H (but with
> values in GB/s). 
>
> node   0   1   2   3  
>  0:  10  16  16  16  
>  1:  16  10  16  16  
>  2:  16  16  10  16  
>  3:  16  16  16  10 
>
> I was trying to annotate it using hwloc-annotate, but I have not
> succeeded. :
>
> lstopo in.xml
> hwloc-annotate in.xml out.xml node:0 memattr bandwidth node:0 18
> Failed to find memattr by name bandwidth
>
> Is there some example of how to do this?


There's an example at the end of the manpage of hwloc-annotate. It's
very similar to your line, but you likely need a capital "B" in "Bandwidth".
I'll see if I can make things case-insensitive in the tools (not in the
C API).
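
With your example, that would likely be just a capitalization change:

  $ hwloc-annotate in.xml out.xml node:0 memattr Bandwidth node:0 18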


>
> Also, are there any plans for having a tool, which would measure the
> memory bandwidth and annotate the results to XML for later usage with
> hwloc commands?


We've been talking about this for years. Having a good performance
measurement tool isn't easy. I see people sending patches for adding
some assembly because this corner case on this processor isn't well
optimized by GCC :/ I am not sure we want to put this inside hwloc.

Brice


>
>
> On Thu, Oct 1, 2020 at 7:28 PM Brice Goglin  <mailto:brice.gog...@inria.fr>> wrote:
>
>
> Le 01/10/2020 à 19:16, Jirka Hladky a écrit :
>> Hi Brice,
>>
>> this new feature sounds very interesting! 
>>
>> Add hwloc/memattrs.h for exposing latency/bandwidth information
>>     between initiators (CPU sets for now) and target NUMA nodes,
>>     typically on heterogeneous platforms.
>>
>>
>> If I get it right, I need to have an ACPI HMAT table on the
>> system to use the new functionality, right?
>
>
> Hello Jirka
>
> It's also possible to add memory attribute using the C API or with
> hwloc-annotate to modify a XML (you may create attribute, or add
> values for a given attribute).
>
>
>> I have tried following on Fedora
>> acpidump -o acpidump.bin
>> acpixtract -a acpidump.bin
>>
>> but there is no HMAT table reported. So it seems I'm out of luck,
>> and I cannot test the new functionality, right?
>
>
> Besides KNL (which is too old to have HMAT, but hwloc now provides
> hardwired bandwidth/latency values), the only platforms with
> heterogeneous memories right now are Intel machines with Optane
> DCPMM (NVDIMMs). Some have a HMAT, some don't. If your machine
> doesn't, it's possible to provide a custom HMAT table in the
> initrd. That's not easy, so adding attribute values with
> hwloc-annotate might be easier.
>
>
>>
>> Also, where can we find the list of attributes supported
>> by --best-memattr?
>>   --best-memattr  Only display the best target among the
>> local nodes
>
>
> There are 4 standard attributes defined in hwloc/memattrs.h:
> capacity, locality, latency and bandwidth. They are also visible in
> lstopo -vv or lstopo --memattrs. I'll add something in the doc.
>
>
>>
>> By trial and error, I have found out that latency and bandwidth
>> are supported. Are there any other? Could you please add the list
>> to hwloc-info -h?
>
>
> I could add the default ones, but I'll need to specify that
> additional user-given attributes may exist.
>
> Thanks for the feedback.
>
> Brice
>
>
>
>>
>> hwloc-info --best-memattr bandwidth
>> hwloc-info --best-memattr latency
>>
>> Thanks a lot!
>> Jirka
>>
>>
>> On Thu, Oct 1, 2020 at 12:45 AM Brice Goglin
>> mailto:brice.gog...@inria.fr>> wrote:
>>
>> hwloc (Hardware Locality) 2.3.0 is now available for download.
>>
>>  https://www.open-mpi.org/software/hwloc/v2.3/ 
>> <https://www.open-mpi.org/software/hwloc/v2.0/>
>>
>

Re: [hwloc-users] [hwloc-announce] hwloc 2.3.0 released

2020-10-01 Thread Brice Goglin

Le 01/10/2020 à 19:16, Jirka Hladky a écrit :
> Hi Brice,
>
> this new feature sounds very interesting! 
>
> Add hwloc/memattrs.h for exposing latency/bandwidth information
>     between initiators (CPU sets for now) and target NUMA nodes,
>     typically on heterogeneous platforms.
>
>
> If I get it right, I need to have an ACPI HMAT table on the system to
> use the new functionality, right?


Hello Jirka

It's also possible to add memory attributes using the C API or with
hwloc-annotate to modify an XML (you may create attributes, or add values
for a given attribute).
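
A minimal sketch of the C-API route (assumes hwloc >= 2.3; the function name,
the choice of NUMA node 0 and the whole-machine initiator are just examples,
and error checking is omitted):

  #include <hwloc.h>
  #include <hwloc/memattrs.h>

  /* Record a bandwidth value between all of the machine's CPUs and NUMA node 0. */
  static void annotate_bandwidth(hwloc_topology_t topology, hwloc_uint64_t value)
  {
    hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, 0);
    struct hwloc_location initiator;
    initiator.type = HWLOC_LOCATION_TYPE_CPUSET;
    initiator.location.cpuset = hwloc_get_root_obj(topology)->cpuset;
    hwloc_memattr_set_value(topology, HWLOC_MEMATTR_ID_BANDWIDTH,
                            node, &initiator, 0, value);
  }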


> I have tried following on Fedora
> acpidump -o acpidump.bin
> acpixtract -a acpidump.bin
>
> but there is no HMAT table reported. So it seems I'm out of luck, and
> I cannot test the new functionality, right?


Besides KNL (which is too old to have HMAT, but hwloc now provides
hardwired bandwidth/latency values), the only platforms with
heterogeneous memories right now are Intel machines with Optane DCPMM
(NVDIMMs). Some have a HMAT, some don't. If your machine doesn't, it's
possible to provide a custom HMAT table in the initrd. That's not easy,
so adding attribute values with hwloc-annotate might be easier.


>
> Also, where can we find the list of attributes supported
> by --best-memattr?
>   --best-memattr  Only display the best target among the local nodes


There are 4 standard attributes defined in hwloc/memattrs.h: capacity,
locality, latency and bandwidth. They are also visible in lstopo -vv or
lstopo --memattrs. I'll add something in the doc.


>
> By trial and error, I have found out that latency and bandwidth are
> supported. Are there any other? Could you please add the list to
> hwloc-info -h?


I could add the default ones, but I'll need to specify that additional
user-given attributes may exist.

Thanks for the feedback.

Brice



>
> hwloc-info --best-memattr bandwidth
> hwloc-info --best-memattr latency
>
> Thanks a lot!
> Jirka
>
>
> On Thu, Oct 1, 2020 at 12:45 AM Brice Goglin  <mailto:brice.gog...@inria.fr>> wrote:
>
> hwloc (Hardware Locality) 2.3.0 is now available for download.
>
>   https://www.open-mpi.org/software/hwloc/v2.3/ 
> <https://www.open-mpi.org/software/hwloc/v2.0/>
>
> v2.3.0 brings quite a lot of changes. The biggest one is the addition
> of the memory attribute API to expose hardware information that vendors
> are (slowly) adding to ACPI tables to describe heterogeneous memory
> platforms (mostly DDR+NVDIMMs right now).
>
> The following is a summary of the changes since v2.2.0.
>
> Version 2.3.0
> -
> * API
>   + Add hwloc/memattrs.h for exposing latency/bandwidth information
> between initiators (CPU sets for now) and target NUMA nodes,
> typically on heterogeneous platforms.
> - When available, bandwidths and latencies are read from the ACPI HMAT
>   table exposed by Linux kernel 5.2+.
> - Attributes may also be customized to expose user-defined performance
>   information.
>   + Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are
> local to some locality.
>   + The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes
> support arrays to be loaded from XML exported with hwloc 2.3+.
> - hwloc_topology_get_support() now returns an additional "misc"
>   array with feature "imported_support" set when support was imported.
>   + Add hwloc_topology_refresh() to refresh internal caches after 
> modifying
> the topology and before consulting the topology in a multithread 
> context.
> * Backends
>   + Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting
> the locality of AMD GPUs, now exposed as "rsmi" OS devices.
> Thanks to Mike Li.
>   + Remove POWER device-tree-based topology on Linux,
> (it was disabled by default since 2.1).
> * Tools
>   + Command-line options for specifying flags now understand 
> comma-separated
> lists of flag names (substrings).
>   + hwloc-info and hwloc-calc have new --local-memory --local-memory-flags
> and --best-memattr options for reporting local memory nodes and 
> filtering
> by memory attributes.
>   + hwloc-bind has a new --best-memattr option for filtering by memory 
> attributes
> among the memory binding set.
>   + Tools that have a --restrict option may now receive a nodeset or
> some custom flags for restricting the topology.
>   + lstopo now has a --thickness option for changing line thickness in the
> graphical output.
>   + Fix lstopo drawing w

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin
If you don't care about the overhead, tell python to use the output of
shell command "hwloc-calc -N pu all".

Brice


Le 31/08/2020 à 18:38, Brock Palen a écrit :
> Thanks,
>
> yeah I was looking for an API that would take into consideration most
> cases, like I find with hwloc-bind --get where I can find the number
> the process has access to.  Whether it's cgroups, other sorts of
> affinity settings, etc.
>
> Brock Palen
> IG: brockpalen1984
> www.umich.edu/~brockp 
> Director Advanced Research Computing - TS
> bro...@umich.edu 
> (734)936-1985
>
>
> On Mon, Aug 31, 2020 at 12:37 PM Guy Streeter  > wrote:
>
> I forgot that the cpuset value is still available in cgroups v2. You
> would want the cpuset.cpus.effective value.
> More information is available here:
> https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
>
> On Mon, Aug 31, 2020 at 11:19 AM Guy Streeter
> mailto:guy.stree...@gmail.com>> wrote:
> >
> > As I said, cgroups doesn't limit the group to a number of cores, it
> > limits processing time, either as an absolute amount or as a
> share of
> > what is available.
> > A docker process can be restricted to a set of cores, but that
> is done
> > with cpu affinity, not cgroups.
> >
> > You could try to figure out an equivalency. For instance if you are
> > using cpu.shares to limit the cgroups, then figure the ratio of a
> > cgroup's share to the shares of all the cgroups at that level, and
> > apply that ratio to the number of available cores to get an
> estimated
> > number of threads you should start.
> >
> > On Mon, Aug 31, 2020 at 10:40 AM Brock Palen  > wrote:
> > >
> > > Sorry if wasn't clear, I'm trying to find out what is
> available to my process before it starts up threads.  If the user
> is jailed in a cgroup (docker, slurm, other)  and the program
> tries to start 36 threads, when it only has access to 4 cores,
> it's probably not a huge deal, but not desirable.
> > >
> > > I do allow the user to specify number of threads, but would
> like to automate it for least astonishment.
> > >
> > > Brock Palen
> > > IG: brockpalen1984
> > > www.umich.edu/~brockp 
> > > Director Advanced Research Computing - TS
> > > bro...@umich.edu 
> > > (734)936-1985
> > >
> > >
> > > On Mon, Aug 31, 2020 at 11:34 AM Guy Streeter
> mailto:guy.stree...@gmail.com>> wrote:
> > >>
> > >> My very basic understanding of cgroups is that it can be used
> to limit
> > >> cpu processing time for a group, and to ensure fair
> distribution of
> > >> processing time within the group, but I don't know of a way
> to use
> > >> cgroups to limit the number of CPUs available to a cgroup.
> > >>
> > >> On Mon, Aug 31, 2020 at 8:56 AM Brock Palen  > wrote:
> > >> >
> > >> > Hello,
> > >> >
> > >> > I have a small utility,  it is currently using 
> multiprocess.cpu_count()
> > >> > Which currently ignores cgroups etc.
> > >> >
> > >> > I see https://gitlab.com/guystreeter/python-hwloc
> > >> > But appears stale,
> > >> >
> > >> > How would you detect number of threads that are safe to
> start in a cgroup from Python3 ?
> > >> >
> > >> > Thanks!
> > >> >
> > >> > Brock Palen
> > >> > IG: brockpalen1984
> > >> > www.umich.edu/~brockp 
> > >> > Director Advanced Research Computing - TS
> > >> > bro...@umich.edu 
> > >> > (734)936-1985

Re: [hwloc-users] hwloc Python3 Bindings - Correctly Grab number cores available

2020-08-31 Thread Brice Goglin

Le 31/08/2020 à 18:19, Guy Streeter a écrit :
> As I said, cgroups doesn't limit the group to a number of cores, it
> limits processing time, either as an absolute amount or as a share of
> what is available.
> A docker process can be restricted to a set of cores, but that is done
> with cpu affinity, not cgroups.


cgroup can actually do lots of things: limit the available cores
(example below with only PU #0), the available NUMA nodes, the amount of
RAM in those nodes, processing time, etc.

$ cat /sys/fs/cgroup/cpuset/foobar/cpuset.cpus
0

Brice



Re: [hwloc-users] hwloc 1.11.13 incorrect PCI locality information Xeon Platinum 9242

2020-08-30 Thread Brice Goglin
Hello

Do you know which lstopo is correct here? Do you have a way to know if
the IB interface is indeed connected to the first NUMA node of the 2nd package,
or to the 2nd NUMA node of the 1st package? Benchmarking IB bandwidth when
memory/cores are in NUMA node #1 vs #2 would be nice.

The warning/fixup was added to hwloc 1.11 for Haswell-Broadwell-Xeon
bugs in the Linux kernel. It was removed in hwloc 2.x because the kernel
was fixed a while ago. It looks like the warning/fixup detection was too
broad and also matches Xeon 9200, unfortunately.

From what I guess from some diagram on the web, PCI slots on this
platform all go to the first package, none to the 2nd package. If so, then
lstopo 2.0 is correct and the 1.11.13 fixup should be disabled in this case.

So I guess you should just export the environment variables such as
HWLOC_PCI__40_LOCALCPUS= (empty value) as said in the warning. We're
likely not going to release a 1.11.14 ever, so don't expect a proper fix
for this. We wouldn't be able to test the code on the broken HSW/BDW
platforms anymore anyway.

If you can confirm this through IB benchmarking, it'd be very nice.

Thanks

Brice



Le 29/08/2020 à 14:41, Christian Tuma a écrit :
> Dear hwloc experts,
>
> Using hwloc 1.11.13 I receive an "incorrect PCI locality information"
> error message. The complete message is attached as file
> "lstopo_1.11.13.err".
>
> I get this error on a dual socket Xeon Platinum 9242 system running
> CentOS 7.8.
>
> I don't see this error on a dual socket Xeon Gold 6148 system running
> the same CentOS release (7.8).
>
> And if I remember correctly, I also did not see that error earlier
> with our dual socket Xeon Platinum 9242 system before it was updated
> to version 7.8 of CentOS.
>
> So to me it is the combination of that specific CentOS release (7.8)
> and that particular CPU type (Xeon Platinum 9242) which triggers the
> error in hwloc 1.11.13.
>
> With hwloc 2.1.0, however, I do not see any error message. For your
> reference, I am attaching the XML output files obtained from hwloc
> 1.11.13 and 2.1.0.
>
> Unfortunately, I cannot switch from hwloc 1.x to 2.x because I need to
> compile OpenMPI 3.x where hwloc 1.x is required. And simply setting
> HWLOC_HIDE_ERRORS is not a true solution.
>
> Could someone please provide a fix for this particular problem in
> hwloc 1.x?
>

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-23 Thread Brice Goglin
Good to know. Now I don't understand why it occurs. My testing seems to
show that we don't even enter hwloc_topology_init(). If Windows lazily
loads DLLs, it could mean that libhwloc-15.dll is only loaded when the
first hwloc function is called, and that loading would fail for some
reason such as ABI mismatch. Do you know if that's possible on Windows?
If so, would we get an error popup in such a case? And is there a way to
debug dynamic linking issues? On Linux we have ldd and nm/objdump for
checking which libraries are used and what symbols they contain.

Brice



Le 23/07/2020 à 19:32, Jon Dart a écrit :
> That was it - the older DLL was in the path. Thanks for looking into it.
>
> --Jon
>
> On 7/22/2020 6:02 AM, Brice Goglin wrote:
>>
>> Hello Jon
>>
>> Sorry for the delay. I finally got some time to look at this. I can only
>> reproduce the issue when I am compiling against hwloc 2.0.4 and using
>> 2.2.0 at runtime (placing libhwloc-15.dll in the current directory). If
>> using the same version (either 2.0.4 or 2.2.0) for both compiling and
>> running, the program works fine. Can you double-check if you were mixing
>> versions?
>>
>> If so, it may look like some sort of ABI break. I am trying to find out
>> where hwloc_topology_init() crashes.
>>
>> Brice
>>
>>

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-07-22 Thread Brice Goglin

Le 01/07/2020 à 15:55, Jon Dart a écrit :
> On 6/30/2020 4:00 PM, Brice Goglin wrote:
>>
>> Hello
>>
>> We don't have many windows-specific changes in 2.1 except some late
>> MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of
>> 2.1.0? It's not visible on the download page but it's actually
>> available, for instance at
>> https://download.open-mpi.org/release/hwloc/v2.1/hwloc-win64-build-2.1.0rc1.zip
>>
> Tried it, that one does not work, either.


Hello Jon

Sorry for the delay. I finally got some time to look at this. I can only
reproduce the issue when I am compiling against hwloc 2.0.4 and using
2.2.0 at runtime (placing libhwloc-15.dll in the current directory). If
using the same version (either 2.0.4 or 2.2.0) for both compiling and
running, the program works fine. Can you double-check if you were mixing
versions?

If so, it may look like some sort of ABI break. I am trying to find out
where hwloc_topology_init() crashes.

Brice



Re: [hwloc-users] Error occurred in topology.c line 940

2020-07-20 Thread Brice Goglin
Hello

It looks like your hardware and/or OS is reporting buggy information. We'd
need more details to debug this. Can you open a GitHub issue at
https://github.com/open-mpi/hwloc/issues/new ? This page lists what
information you need to provide for debugging.

It looks like you're using hwloc inside an MPI implementation. If so, you'll
need to install hwloc itself to check whether running "lstopo" fails the same way.

Brice



Le 12/07/2020 à 17:10, muhamed badawi a écrit :
>
> I have faced this error and the Fluent setup page turned off, and I
> can’t solve this problem.
>
>  
>
> MPI Application rank 0 exited before MPI_Finalize() with status 2
> **
> **
> * hwloc has encountered what looks like an error from the operating
> system.
> *
> * Socket (cpuset 0xfff000ff) intersects with Socket (cpuset
> 0x000f) without inclusion!
> * Error occurred in topology.c line 940
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with any relevant topology information from your platform.
> **
> **
> The fl process could not be started.
>
>  
>
> Regards,
> Muhamed Badawi
>
>

Re: [hwloc-users] issue with MSVC Community Edition 2019

2020-06-30 Thread Brice Goglin
Hello

We don't have many windows-specific changes in 2.1 except some late
MSVC-related changes added after rc1. Can you try 2.1.0rc1 instead of
2.1.0? It's not visible on the download page but it's actually
available, for instance at
https://download.open-mpi.org/release/hwloc/v2.1/hwloc-win64-build-2.1.0rc1.zip

Also, can you clarify whether the cygwin build that works fine was using that
released zipball or a cygwin-built libhwloc?

Thanks

Brice



Le 30/06/2020 à 21:51, Jon Dart a écrit :
> I have had some trouble with even a simple hwloc program on Windows 10
> when building with Visual Studio 2019 Community Edition.
>
> The attached program works fine with cygwin when built like this:
>
> g++ -c -I/cygdrive/e/chess/hwloc-win64-build-2.2.0/include -O2
> main.cpp topo.cpp
> g++ -o main -L /cygdrive/e/chess/hwloc-win64-build-2.2.0/lib main.o
> topo.o -lstdc++ -lhwloc
>
> It just initializes the topology and prints out some basic
> information, such as:
>
> detected 1 socket(s), 8 core(s), 16 logical processing units.
>
> If I compile the same program with MSVC (64-bit compiler):
>
> cl /EHsc /c -DNOMINMAX -IE:\chess\hwloc-win64-build-2.2.0/include -O2
> main.cpp topo.cpp
> link /out:main.exe main.obj topo.obj kernel32.lib user32.lib winmm.lib
> E:\chess\hwloc-win64-build-2.2.0/lib/libhwloc.lib /nologo
> /incremental:no /opt:ref /subsystem:console
>
> Then it does not output anything, it just terminates. Running in the
> debugger indicates that there is an exception in the first call to hwloc.
>
> This program works with MSVC and the released build of version 2.0.4
> of hwloc. It does not work with 2.1.0 or 2.2.0.
>
> --Jon
>
>

Re: [hwloc-users] Unused function

2020-05-29 Thread Brice Goglin
Oh sure, I thought we fixed this a while ago. I pushed it to master. Do
you need it in 2.2 only or also in earlier stable series?

Brice


Le 29/05/2020 à 05:32, Balaji, Pavan via hwloc-users a écrit :
> Hello,
>
> We are maintaining this patch for hwloc internally in mpich.  Can this be 
> upstreamed?
>
> https://github.com/pmodels/hwloc/commit/a6d7018f092a0754433a0a2b17a527e64a125d38
>
> It was throwing a warning when compiled with clang (and our usual strict 
> flags):
>
> 8<
>   CC   topology-hardwired.lo
> topology-linux.c:458:1: warning: unused function 'hwloc_lstat' 
> [-Wunused-function]
> hwloc_lstat(const char *p, struct stat *st, int d __hwloc_attribute_unused)
> ^
>   CC   topology-x86.lo
> 1 warning generated.
> In directory: 
> /var/lib/jenkins-slave/workspace/mpich-warnings-gpu/compiler/clang-8/config/strict/device/ch4-ucx/gpu/cuda/label/centos64_review/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc
>   CC   topology-hardwired.lo
> topology-linux.c:458:1: warning: unused function 'hwloc_lstat' 
> [-Wunused-function]
> hwloc_lstat(const char *p, struct stat *st, int d __hwloc_attribute_unused)
> ^
>   CC   topology-x86.lo
> 8<
>
> Thanks,
>
>   -- Pavan
>

Re: [hwloc-users] Multi-Node Topologies in hwloc 2.0+

2020-05-12 Thread Brice Goglin
Hello Stephen

There's no equivalent in hwloc 2.x unfortunately, even with netloc.

"custom" caused too many issues for core maintenance (mostly because of
cpusets being different between machines) while use cases were very rare.

Brice



Le 12/05/2020 à 08:01, Herbein, Stephen via hwloc-users a écrit :
> Hi,
>
> I assume with the removal of `hwloc-assembler` and
> `hwloc_topology_set_custom` in 2.0+ that multi-node topology support
> has been relegated to netloc, but I figured I would run my use-case by
> you all just in case.
>
> Essentially I want to do what `hwloc-assembler` did (i.e., combine
> multiple single-node topologies into one big one) with the 2.0+ API. 
> No need for any complex network connections/topologies, a simple flat
> `system:1 machine:X core:Y PU:Z` topology will do. With that topology,
> I'd like to do two things: 1) print out a human readable summary
> equivalent to what you get with `lstopo-no-graphics` and 2) visualize
> the topology with `lstopo`.
>
> Is multi-node topology like this feasible with hwloc 2.0+?  If so, are
> there any examples or documentation on how to achieve that?
>
> Thanks,
> Stephen
>

[hwloc-users] heterogeneous memory in hwloc

2020-03-19 Thread Brice Goglin
Hello

Several people asked recently how hwloc exposes heterogeneous memory and
how to recognize which NUMA node is which kind of memory. The short answer
is that it's currently ugly but we're working on it for hwloc 2.3. I put
all the details in this wiki page:

https://github.com/open-mpi/hwloc/wiki/Heterogeneous-Memory

Brice




Re: [hwloc-users] PCI to NUMA node mapping.

2020-02-03 Thread Brice Goglin
Hello Liam

dmidecode is usually reserved to root only because it uses SMBIOS or
whatever hardware/ACPI/... tables. Those tables are read by the Linux
kernel and exported to non-root users in sysfs:

$ cat /sys/bus/pci/devices/:ae:0c.6/numa_node 
1

However this file isn't that good because some old platforms had PCI
buses attached to 2 NUMA nodes (which cannot be exposed in the above sysfs
"numa_node" file). So we read the list of local CPUs instead:

$ cat /sys/bus/pci/devices/:ae:0c.6/local_cpus
,,
$ cat /sys/bus/pci/devices/:ae:0c.6/local_cpulist 
1,5,9,13,17,21,25,29,33,37,41,45,49,53,57,61,65,69,73,77

Brice



Le 03/02/2020 à 17:46, Murphy, Liam a écrit :
>
> Newbie question.
>
>  
>
> I know that dmidecode uses the num_node files under
> /sys/devices/pcie…, but hwloc does not seem
>
> to use the same mechanism to determine which PCI devices are on which
> numa node? From which
>
> file is it deriving the information?
>
>  
>
> Regards,
>
> Liam
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Oops wrong list, sorry :)

Le 15/11/2019 à 10:49, Brice Goglin a écrit :
> Hello
>
> We have a platform with an old MLX4 partition and another OPA partition.
> We want a single OMPI installation working for both kinds of nodes. When
> we enable UCX in OMPI for MLX4, UCX ends up being used on the OPA
> partition too, and the performance is poor (3GB/s instead of 10). The
> problem seems to be that UCX gets enabled because they added support for
> OPA in UCX 1.6 even that's just poor OPA support through Verbs.
>
> The only solution we found is to bump the mtl_psm2_priority to 52 so
> that PSM2 gets used before PML UCX. Seems to work fine but I am not sure
> it's a good idea. Could OMPI rather tell UCX to disable itself when it
> only finds OPA?
>
> Thanks
>
> Brice
>
>

[hwloc-users] disabling ucx over omnipath

2019-11-15 Thread Brice Goglin
Hello

We have a platform with an old MLX4 partition and another OPA partition.
We want a single OMPI installation working for both kinds of nodes. When
we enable UCX in OMPI for MLX4, UCX ends up being used on the OPA
partition too, and the performance is poor (3GB/s instead of 10). The
problem seems to be that UCX gets enabled because they added support for
OPA in UCX 1.6 even that's just poor OPA support through Verbs.

The only solution we found is to bump the mtl_psm2_priority to 52 so
that PSM2 gets used before PML UCX. Seems to work fine but I am not sure
it's a good idea. Could OMPI rather tell UCX to disable itself when it
only finds OPA?

Thanks

Brice


___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] Embedded hwloc and Name Mangling Convention

2019-10-10 Thread Brice Goglin
Le 10/10/2019 à 17:38, Gutierrez, Samuel K. via hwloc-users a écrit :
> Good morning,
>
> I have a question about expected name mangling behavior when using 
> HWLOC_SET_SYMBOL_PREFIX in hwloc v2.1.0 (and perhaps other versions).
>
> Say, for example, I do the following in a project embedding hwloc:
>
> HWLOC_SET_SYMBOL_PREFIX(foo_internal_)
> HWLOC_SETUP_CORE(…)
> ...
>
> Now, entry points into hwloc are prefixed with foo_internal_ (e.g., 
> foo_internal_hwloc_topology_init()). This all works great.
>
> Next, let’s consider what happens to hwloc-exported constants such as 
> HWLOC_OBJ_MACHINE when using the same setup above. I would expect something 
> like this: 
> HWLOC_OBJ_MACHINE now becomes FOO_INTERNAL_HWLOC_OBJ_MACHINE. Instead, I 
> notice the following curious mangling convention:  
> FOO_INTERNAL_hwloc_OBJ_MACHINE. Is this intentional? If so, that’s 
> fine—functionally, it works as expected. However, this seems like a bug.


Hello

#define HWLOC_NAME(name) HWLOC_MUNGE_NAME(HWLOC_SYM_PREFIX, hwloc_ ## name)
#define HWLOC_NAME_CAPS(name) HWLOC_MUNGE_NAME(HWLOC_SYM_PREFIX_CAPS, hwloc_ ## name)

Indeed I don't see any reason not to use HWLOC_ on the second line. It
looks like we've been doing this forever.

Even if users are supposed to only use official (non-renamed) names, I
guess there might exist a ugly hack that explicitly depends on renamed
names somewhere in hwloc users' code. So I'd rather not touch this as
long as it doesn't break anything.

Brice


___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Netloc feature suggestion

2019-08-19 Thread Brice Goglin
Hello


Indeed we would like to expose this kind of info but Netloc is
unfortunately short on manpower these days. The code in git master is
outdated. We have a big rework in a branch but it still needs quite a
lot of polishing before being merged.


The API is still mostly Scotch-oriented (i.e. for process placement
using communication graphs) because that's pretty much the only clear
user request we got in recent years (most people said "we want netloc"
but never gave any idea of what API they actually needed). Of course,
there will be a way to say "I want the entire machine" or "only my
allocated nodes".


The non-scotch API for exposing topology details has been made private
until we understand better what users want. And your request would
definitely help there.


Brice




Le 19/08/2019 à 09:31, Rigel Falcao do Couto Alves a écrit :
>
> Thanks John and Jeff for the replies.
>
>
> Indeed, we are using Slurm here at our cluster; so, for now, I can
> stick with the runtime reading of the network topology's description
> file​, explained here:
>
>
> https://slurm.schedmd.com/topology.conf.html
>
>
> But given the idea of the project is to produce a library that can be
> distributed to anyone in the world, it would still worth it to have a
> way to gather such information on-the-go -- as I can already do
> with /hwloc/​'s topology information. No problem about starting
> simple, i.e. only single-path hierarchies supported in the beginning.
>
>
> The additional /switch/ information (coming from /netloc/) would then
> be added to the graphical output of our tools, allowing users to
> visually analyse how resources placement (both /intra/ and /inter/​
> node) affect their applications.
>
>
>
> 
> *From:* hwloc-users  on behalf of
> John Hearns via hwloc-users 
> *Sent:* Friday, 16 August 2019 07:16
> *To:* Hardware locality user list
> *Cc:* John Hearns
> *Subject:* Re: [hwloc-users] Netloc feature suggestion
>  
> Hi Rigel. This is very interesting.
> First though I should say - most batch systems have built in node
> grouping utilities.
> PBSPro has bladesets - I think they are called placement groups now.
> I used these when running CFD codes in a Formula 1 team.
> The systems administrator has to set these up manually, using
> knowledge of the switch topology.
> In PBSPro jobs would then 'prefer' to run within the smallest bladeset
> which could accommodate them.
> So you define bladesets for (say) 8/16/24/48 node jobs.
>
> https://pbspro.atlassian.net/wiki/spaces/PD/pages/455180289/Finer+grained+node+grouping
>
> Similarly for Slurm
> https://slurm.schedmd.com/topology.html
>
>
> On Wed, 14 Aug 2019 at 18:53, Rigel Falcao do Couto Alves
> mailto:rigel.al...@tu-dresden.de>> wrote:
>
> Hi,
>
>
> I am doing a PhD in performance analysis of highly parallel CFD
> codes and would like to suggest a feature for Netloc: from
> topic /Build Scotch sub-architectures/
> (at https://www.open-mpi.org/projects/hwloc/doc/v2.0.3/a00329.php),
> create a function-version of /netloc_get_resources/, which could
> retrieve at runtime the network details of the available cluster
> resources (i.e. the nodes allocated to the job). I am mostly
> interested about how many switches (the gray circles in the figure
> below) need to be traversed in order for any pair of
> allocated nodes to communicate with each other:
>
> [removed 200kB image]
>
>
> For example, suppose my job is running within 4 nodes in the
> cluster, illustrated by the numbers above. All I would love to get
> from Netloc - at runtime - is some sort of classification of the
> nodes, like:
>
>
> 1: aa
>
> 2: ab
>
> 3: ba
>
> 4: ca
>
>
> The difference between nodes 1 and 2 is on the last digit, which
> means their MPI communications only need to traverse 1 switch;
> however, between any of them and nodes 3 or 4, the difference
> starts on the second-last digit, which means their communications
> need to traverse two switches. More digits may be left-added to
> the string, per necessity; i.e. if the central gray circle on the
> above figure is connected to another switch, which in turnleads to
> another part of the cluster's structure (with its own switches,
> nodes etc.). For me, it is at the present moment irrelevant
> whether e.g. nodes 1 and 2 are physically - or logically -
> consecutive to each other: /a/, /b/, /c/ etc. would be just
> arbitrary identifiers.
>
>
> I would then use this data to plot the process placement, using
> open-source tools developed here in the University of Dresden
> (Germany); i.e. Scotch is not an option for me. The results of my
> study will be open-source as well and I can gladly share them with
> you once the thesis is finished.
>
>
> I hope I have clearly explained what I have in mind; please let me
> 

Re: [hwloc-users] Hang with SunOS

2019-07-08 Thread Brice Goglin
Hello

It may be similar to https://github.com/open-mpi/hwloc/issues/290 but we
weren't able to find the exact issue unfortunately :/

Setting HWLOC_COMPONENTS=-x86 in the environment would disable that code
path, possibly making the topology less precise.

Brice



Le 08/07/2019 à 20:43, Junchao Zhang a écrit :
> Hello Brice,
>   When I was installing PETSc with --download-mpich on a machine with
> uname -a = "SunOS n-gage 5.11 illumos-a22312a201 i86pc i386 i86pc", 
> petsc configure script hung in a conftest. I have the following stack
> trace. The compiler is "Sun C 5.10 SunOS_i386 2009/06/03". 
>   Note MPICH was successfully installed by petsc. The error happened
> when running a conftest program. Hope you can find clues from that.  
>   Thanks.
>
> $cat test.c
> #include <mpi.h>
>
> int main() {
> MPI_Aint size;
> int ierr;
> MPI_Init(0,0);
> ierr = MPI_Type_extent(MPI_LONG_DOUBLE, &size);
> if(ierr || (size == 0)) exit(1);
> MPI_Finalize();
> return 0;
> }
>
> (gdb) bt
> #0  0xfc8db765 in hwloc_x86_cpuid (eax=0x803d5b4, ebx=0x803d5b8,
> ecx=0x803d5b0, edx=0x803d5bc)
>     at
> /export/home/jczhang/petsc/arch-opensolaris-pkgs-dbg/externalpackages/mpich-3.3.1/src/hwloc/include/private/cpuid-x86.h:79
> #1  0xfc8dbbf1 in cpuid_or_from_dump (eax=0x803d5b4, ebx=0x803d5b8,
> ecx=0x803d5b0, edx=0x803d5bc, src_cpuiddump=0x0) at topology-x86.c:165
> #2  0xfc8dc33a in look_proc (backend=0x8412d80, infos=0x842dfe0,
> highest_cpuid=13, highest_ext_cpuid=13, features=0x803d6d0,
> cpuid_type=intel, src_cpuiddump=0x0)
>     at topology-x86.c:505
> #3  0xfc8dda2a in look_procs (backend=0x8412d80, infos=0x842dfe0,
> fulldiscovery=0, highest_cpuid=13, highest_ext_cpuid=13,
> features=0x803d6d0, cpuid_type=intel,
>     get_cpubind=0xfc8d61c0 ,
> set_cpubind=0xfc8d605c ) at
> topology-x86.c:1083
> #4  0xfc8de067 in hwloc_look_x86 (backend=0x8412d80, fulldiscovery=0)
> at topology-x86.c:1279
> #5  0xfc8de1ac in hwloc_x86_discover (backend=0x8412d80) at
> topology-x86.c:1348
> #6  0xfc89b899 in hwloc_discover (topology=0x84129f0) at topology.c:3007
> #7  0xfc89c974 in hwloc_topology_load (topology=0x84129f0) at
> topology.c:3618
> #8  0xfbf150d8 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0,
> provided=0x803d8ac) at src/mpi/init/initthread.c:375
> #9  0xfbf0c53b in PMPI_Init (argc=0x0, argv=0x0) at
> src/mpi/init/init.c:180
> #10 0x08050acf in main () at test.c:6
> (gdb) f 0
> #0  0xfc8db765 in hwloc_x86_cpuid (eax=0x803d5b4, ebx=0x803d5b8,
> ecx=0x803d5b0, edx=0x803d5bc)
>     at
> /export/home/jczhang/petsc/arch-opensolaris-pkgs-dbg/externalpackages/mpich-3.3.1/src/hwloc/include/private/cpuid-x86.h:79
> 79  : "+a" (*eax), "=" (*ebx), "+c" (*ecx), "=" (*edx));
> (gdb) l
> 74 #elif defined(HWLOC_X86_32_ARCH)
> 75  __asm__(
> 76  "mov %%ebx,%1\n\t"
> 77  "cpuid\n\t"
> 78  "xchg %%ebx,%1\n\t"
> 79  : "+a" (*eax), "=" (*ebx), "+c" (*ecx), "=" (*edx));
> 80 #else
> 81 #error unknown architecture
> 82 #endif
> 83 #endif /* HWLOC_HAVE_MSVC_CPUIDEX */
>
>
> --Junchao Zhang
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-03-18 Thread Brice Goglin
Hello Pavan

I am planning to fix this in 2.1 (to be released before summer). I'll
backport the trivial pieces to 2.0.x too.

Do you care about a specific value passed to -Wstack-usage=X? Or do you
just want to avoid dynamic/unbounded allocs on the stack?

Brice



Le 18/03/2019 à 15:04, Balaji, Pavan via hwloc-users a écrit :
> Brice, all,
>
> Any update on this?  Are you guys planning on fixing these?
>
>   -- Pavan
>
>> On Feb 25, 2019, at 7:33 AM, Balaji, Pavan via hwloc-users 
>>  wrote:
>>
>> Hi Brice,
>>
>>> On Feb 25, 2019, at 2:27 AM, Brice Goglin  wrote:
>>> Are you sure you're not passing -Wstack-usage? My Ubuntu 18.04 with
>>> latest gcc-7 (7.3.0-27ubuntu1~18.04) doesn't show any of those warnings.
>> Yes, you are right, -Wstack-usage was explicitly added too.  Sorry, I missed 
>> the fact that it wasn't default in -Wall.
>>
>>> It looks like all these warnings are caused by C99 variable-length
>>> arrays (except 2 that I don't understand). I know the kernel devs
>>> stopped using VLA recently, and it looks like C11 made them optional.
>>> But are we really supposed to stop using VLA already?
>> They are optional, which means we cannot assume them for portability 
>> reasons.  FWIW, we have made the rest of mpich stack-usage clean.
>>
>>  -- Pavan
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Build warnings with hwloc-2.0.3

2019-02-25 Thread Brice Goglin
Hello Pavan,

Are you sure you're not passing -Wstack-usage? My Ubuntu 18.04 with
latest gcc-7 (7.3.0-27ubuntu1~18.04) doesn't show any of those warnings.

It looks like all these warnings are caused by C99 variable-length
arrays (except 2 that I don't understand). I know the kernel devs
stopped using VLA recently, and it looks like C11 made them optional.
But are we really supposed to stop using VLA already?

Brice



Le 25/02/2019 à 02:07, Balaji, Pavan via hwloc-users a écrit :
> Folks,
>
> I'm getting the below build warnings with hwloc-2.0.3, gcc-7.3 on Ubuntu 
> (with -Wall -O2):
>
> 8<
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/distances.c:
>  In function 'hwloc__groups_by_distances':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/distances.c:817:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc__groups_by_distances(struct hwloc_topology *topology,
>  ^~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology.c:
>  In function 'hwloc_propagate_symmetric_subtree':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology.c:2388:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc_propagate_symmetric_subtree(hwloc_topology_t topology, hwloc_obj_t 
> root)
>  ^
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-synthetic.c:
>  In function 'hwloc_synthetic_process_indexes':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-synthetic.c:71:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc_synthetic_process_indexes(struct hwloc_synthetic_backend_data_s *data,
>  ^~~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-xml.c:
>  In function 'hwloc__xml_export_object_contents':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-xml.c:1920:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc__xml_export_object_contents (hwloc__xml_export_state_t state, 
> hwloc_topology_t topology, hwloc_obj_t obj, unsigned long flags)
>  ^
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:
>  In function 'hwloc_linux_get_area_membind':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:1883:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc_linux_get_area_membind(hwloc_topology_t topology, const void *addr, 
> size_t len, hwloc_nodeset_t nodeset, hwloc_membind_policy_t *policy, int 
> flags __hwloc_attribute_unused)
>  ^~~~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:
>  In function 'hwloc_linux_set_thisthread_membind':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:1737:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc_linux_set_thisthread_membind(hwloc_topology_t topology, 
> hwloc_const_nodeset_t nodeset, hwloc_membind_policy_t policy, int flags)
>  ^~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:
>  In function 'hwloc_linux_get_thisthread_membind':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:1848:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  hwloc_linux_get_thisthread_membind(hwloc_topology_t topology, 
> hwloc_nodeset_t nodeset, hwloc_membind_policy_t *policy, int flags 
> __hwloc_attribute_unused)
>  ^~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:
>  In function 'hwloc_linux__get_allowed_resources':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-linux.c:4426:13:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  static void hwloc_linux__get_allowed_resources(hwloc_topology_t topology, 
> const char *root_path, int root_fd, char **cpuset_namep)
>  ^~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-x86.c:
>  In function 'cpuiddump_read':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-x86.c:69:1:
>  warning: stack usage might be unbounded [-Wstack-usage=]
>  cpuiddump_read(const char *dirpath, unsigned idx)
>  ^~
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-x86.c:
>  In function 'hwloc_x86_component_instantiate':
> ../../../../../../../../../mpich/src/pm/hydra/tools/topo/hwloc/hwloc/hwloc/topology-x86.c:1456:1:
>  warning: stack usage might be unbounded 

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
Only the value in brackets is the current setting; the others are the available (unset) alternatives.

If you write "madvise" in that file, it'll become "always [madvise] never".

Brice


Le 29/01/2019 à 15:36, Biddiscombe, John A. a écrit :
> On the 8 numa node machine
>
> $cat /sys/kernel/mm/transparent_hugepage/enabled 
> [always] madvise never
>
> is set already, so I'm not really sure what should go in there to disable it.
>
> JB
>
> -Original Message-
> From: Brice Goglin  
> Sent: 29 January 2019 15:29
> To: Biddiscombe, John A. ; Hardware locality user list 
> 
> Subject: Re: [hwloc-users] unusual memory binding results
>
> Oh, that's very good to know. I guess lots of people using first touch will 
> be affected by this issue. We may want to add a hwloc memory flag doing 
> something similar.
>
> Do you have root access to verify that writing "never" or "madvise" in 
> /sys/kernel/mm/transparent_hugepage/enabled fixes the issue too?
>
> Brice
>
>
>
> Le 29/01/2019 à 14:02, Biddiscombe, John A. a écrit :
>> Brice
>>
>> madvise(addr, n * sizeof(T), MADV_NOHUGEPAGE)
>>
>> seems to make things behave much more sensibly. I had no idea it was a 
>> thing, but one of my colleagues pointed me to it.
>>
>> Problem seems to be solved for now. Thank you very much for your insights 
>> and suggestions/help.
>>
>> JB
>>
>> -Original Message-
>> From: Brice Goglin 
>> Sent: 29 January 2019 10:35
>> To: Biddiscombe, John A. ; Hardware locality user 
>> list 
>> Subject: Re: [hwloc-users] unusual memory binding results
>>
>> Crazy idea: 512 pages could be replaced with a single 2MB huge page.
>> You're not requesting huge pages in your allocation but some systems 
>> have transparent huge pages enabled by default (e.g. RHEL
>> https://access.redhat.com/solutions/46111)
>>
>> This could explain why 512 pages get allocated on the same node, but it 
>> wouldn't explain crazy patterns you've seen in the past.
>>
>> Brice
>>
>>
>>
>>
>> Le 29/01/2019 à 10:23, Biddiscombe, John A. a écrit :
>>> I simplified things and instead of writing to a 2D array, I allocate a 1D 
>>> array of bytes and touch pages in a linear fashion.
>>> Then I call syscall(__NR_move_pages, ...) and retrieve a status array for 
>>> each page in the data.
>>>
>>> When I allocate 511 pages and touch alternate pages on alternate numa 
>>> nodes
>>>
>>> Numa page binding 511
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
>>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>>
>>> but as soon as I increase to 512 pages, it breaks.
>>>
>>> Numa page binding 512
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
Oh, that's very good to know. I guess lots of people using first touch
will be affected by this issue. We may want to add a hwloc memory flag
doing something similar.

Do you have root access to verify that writing "never" or "madvise" in
/sys/kernel/mm/transparent_hugepage/enabled fixes the issue too?

Brice



Le 29/01/2019 à 14:02, Biddiscombe, John A. a écrit :
> Brice
>
> madvise(addr, n * sizeof(T), MADV_NOHUGEPAGE)
>
> seems to make things behave much more sensibly. I had no idea it was a thing, 
> but one of my colleagues pointed me to it.
>
> Problem seems to be solved for now. Thank you very much for your insights and 
> suggestions/help.
>
> JB
>
> -Original Message-
> From: Brice Goglin  
> Sent: 29 January 2019 10:35
> To: Biddiscombe, John A. ; Hardware locality user list 
> 
> Subject: Re: [hwloc-users] unusual memory binding results
>
> Crazy idea: 512 pages could be replaced with a single 2MB huge page.
> You're not requesting huge pages in your allocation but some systems have 
> transparent huge pages enabled by default (e.g. RHEL
> https://access.redhat.com/solutions/46111)
>
> This could explain why 512 pages get allocated on the same node, but it 
> wouldn't explain crazy patterns you've seen in the past.
>
> Brice
>
>
>
>
> Le 29/01/2019 à 10:23, Biddiscombe, John A. a écrit :
>> I simplified things and instead of writing to a 2D array, I allocate a 1D 
>> array of bytes and touch pages in a linear fashion.
>> Then I call syscall(__NR_move_pages, ...) and retrieve a status array for 
>> each page in the data.
>>
>> When I allocate 511 pages and touch alternate pages on alternate numa 
>> nodes
>>
>> Numa page binding 511
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
>> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
>> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
>>
>> but as soon as I increase to 512 pages, it breaks.
>>
>> Numa page binding 512
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
>> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>
>> On the 8 numa node machine it sometimes gives the right answer even with 512 
>> pages.
>>
>> Still baffled
>>
>> JB
>>
>> -Original Message-
>> From: hwloc-users  On Behalf Of 
>> Biddiscombe, John A.
>> Sent: 28 January 2019 16:14
>> To: Brice Goglin 
>> Cc: Hardware locality user list 
>> Subject: Re: [hwloc-users] unusual memory binding results
>>
>> Brice
>>
>>> Can you print the pattern before and after thread 1 touched its pages, or 
>>> even in the middle ?
>>> It looks like somebody is touching too many pages here.
>> Expe

Re: [hwloc-users] unusual memory binding results

2019-01-29 Thread Brice Goglin
Crazy idea: 512 pages could be replaced with a single 2MB huge page.
You're not requesting huge pages in your allocation but some systems
have transparent huge pages enabled by default (e.g. RHEL
https://access.redhat.com/solutions/46111)

This could explain why 512 pages get allocated on the same node, but it
wouldn't explain crazy patterns you've seen in the past.

Brice
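
A minimal sketch (Linux-specific, illustrative values) of opting an allocation
out of transparent huge pages before first touch, so that pages can be
distributed across NUMA nodes at 4kB granularity:

#include <stdlib.h>
#include <sys/mman.h>

void *alloc_without_thp(size_t bytes)
{
    void *p = NULL;
    if (posix_memalign(&p, 4096, bytes) != 0)
        return NULL;
    /* ask the kernel not to back this range with 2MB huge pages */
    madvise(p, bytes, MADV_NOHUGEPAGE);
    return p;
}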




Le 29/01/2019 à 10:23, Biddiscombe, John A. a écrit :
> I simplified things and instead of writing to a 2D array, I allocate a 1D 
> array of bytes and touch pages in a linear fashion.
> Then I call syscall(__NR_move_pages, ...) and retrieve a status array for each 
> page in the data.
>
> When I allocate 511 pages and touch alternate pages on alternate numa nodes
>
> Numa page binding 511
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 
> 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 
> 1 0 1 0
>
> but as soon as I increase to 512 pages, it breaks.
>
> Numa page binding 512
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
> 0 0 0 0 0
>
> On the 8 numa node machine it sometimes gives the right answer even with 512 
> pages.
>
> Still baffled
>
> JB
>
> -Original Message-
> From: hwloc-users  On Behalf Of 
> Biddiscombe, John A.
> Sent: 28 January 2019 16:14
> To: Brice Goglin 
> Cc: Hardware locality user list 
> Subject: Re: [hwloc-users] unusual memory binding results
>
> Brice
>
>> Can you print the pattern before and after thread 1 touched its pages, or 
>> even in the middle ?
>> It looks like somebody is touching too many pages here.
> Experimenting with different threads touching one or more pages, I get 
> unpredicatable results
>
> here on the 8 numa node device, the result is perfect. I am only allowing 
> thread 3 and 7 to write a single memory location
>
> get_numa_domain() 8 Domain Numa pattern
> 
> 
> 
> 3---
> 
> 
> 
> 7---
> 
>
> 
> Contents of memory locations
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 26 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 63 0 0 0 0 0 0 0 
> 
>
> you can see that core 26 (numa domain 3) wrote to memory, and so did core 63 
> (domain 8)
>
> Now I run it a second time and look, its rubbish
>
> get_numa_domain() 8 Domain Numa pattern
> 3---
> 3---
> 3---
> 3---
> 3---
> 3---
> 3---
> 3---
> 
>
> 
> Contents of memory locations
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 26 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0 0 0 0 0 0 
> 0 0 0

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
Le 28/01/2019 à 11:28, Biddiscombe, John A. a écrit :
> If I disable thread 0 and allow thread 1 then I get this pattern on 1 machine 
> (clearly wrong)
> 
> 
> 
> 
> 


Can you print the pattern before and after thread 1 touched its pages,
or even in the middle?

It looks like somebody is touching too many pages here.

Brice


> and on another I get
> -1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1
> 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-
> -1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1
> 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-
> -1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1
> 1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-
> which is correct because the '-' is a negative status. I will run again and 
> see if it's -14 or -2
>
> JB
>
>
> -Original Message-
> From: Brice Goglin  
> Sent: 28 January 2019 10:56
> To: Biddiscombe, John A. 
> Cc: Hardware locality user list 
> Subject: Re: [hwloc-users] unusual memory binding results
>
> Can you try again disabling the touching in one thread to check whether the 
> other thread only touched its own pages? (others' status should be
> -2 (ENOENT))
>
> Recent kernels have ways to migrate memory at runtime
> (CONFIG_NUMA_BALANCING) but this should only occur when it detects that some 
> thread does a lot of remote access, which shouldn't be the case here, at 
> least at the beginning of the program.
>
> Brice
>
>
>
> Le 28/01/2019 à 10:35, Biddiscombe, John A. a écrit :
>> Brice
>>
>> I might have been using the wrong params to hwloc_get_area_memlocation 
>> in my original version, but I bypassed it and have been calling
>>
>> int get_numa_domain(void *page)
>> {
>> HPX_ASSERT( (std::size_t(page) & 4095) ==0 );
>>
>> void *pages[1] = { page };
>> int  status[1] = { -1 };
>> if (syscall(__NR_move_pages, 0, 1, pages, nullptr, status, 0) == 
>> 0) {
>> if (status[0]>=0 && 
>> status[0]<=HPX_HAVE_MAX_NUMA_DOMAIN_COUNT) {
>> return status[0];
>> }
>> return -1;
>> }
>> throw std::runtime_error("Failed to get numa node for page");
>> }
>>
>> this function instead. Just testing one page address at a time. I 
>> still see this kind of pattern
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> 00101101010010101001010101011010011011010101110101110111010101
>> 010101
>> when I should see
>> 0101010101010101010101010101010101010101010101010101010101010101010101
>> 0101010101
>> 1010101010101010101010101010101010101010101010101010101010101010101010
>> 1010101010
>> 0101010101010101010101010101010101010101010101010101010101010101010101
>> 0101010101
>> 1010101010101010101010101010101010101010101010101010101010101010101010
>> 1010101010
>> 0101010101010101010101010101010101010101010101010101010101010101010101
>> 0101010101
>> 1010101010101010101010101010101010101010101010101010101010101010101010
>> 1010101010
>> 0101010101010101010101010101010101010101010101010101010101010

Re: [hwloc-users] unusual memory binding results

2019-01-28 Thread Brice Goglin
Can you try again disabling the touching in one thread to check whether
the other thread only touched its own pages? (others' status should be
-2 (ENOENT))

Recent kernels have ways to migrate memory at runtime
(CONFIG_NUMA_BALANCING) but this should only occur when it detects that
some thread does a lot of remote access, which shouldn't be the case
here, at least at the beginning of the program.

Brice
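
For reference, a minimal sketch of checking where a range is currently
allocated through hwloc rather than calling move_pages() directly (pages that
are not allocated yet simply do not show up in the resulting nodeset):

#include <stdio.h>
#include <hwloc.h>

void print_memlocation(hwloc_topology_t topology, const void *addr, size_t len)
{
    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();
    char buf[128];

    if (!hwloc_get_area_memlocation(topology, addr, len, nodeset,
                                    HWLOC_MEMBIND_BYNODESET)) {
        hwloc_bitmap_list_snprintf(buf, sizeof(buf), nodeset);
        printf("range of %zu bytes is allocated on NUMA node(s): %s\n", len, buf);
    }
    hwloc_bitmap_free(nodeset);
}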



Le 28/01/2019 à 10:35, Biddiscombe, John A. a écrit :
> Brice
>
> I might have been using the wrong params to hwloc_get_area_memlocation in my 
> original version, but I bypassed it and have been calling
>
> int get_numa_domain(void *page)
> {
> HPX_ASSERT( (std::size_t(page) & 4095) ==0 );
>
> void *pages[1] = { page };
> int  status[1] = { -1 };
> if (syscall(__NR_move_pages, 0, 1, pages, nullptr, status, 0) == 
> 0) {
> if (status[0]>=0 && 
> status[0]<=HPX_HAVE_MAX_NUMA_DOMAIN_COUNT) {
> return status[0];
> }
> return -1;
> }
> throw std::runtime_error("Failed to get numa node for page");
> }
>
> this function instead. Just testing one page address at a time. I still see 
> this kind of pattern
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> 00101101010010101001010101011010011011010101110101110111010101010101
> when I should see
> 01010101010101010101010101010101010101010101010101010101010101010101010101010101
> 10101010101010101010101010101010101010101010101010101010101010101010101010101010
> 01010101010101010101010101010101010101010101010101010101010101010101010101010101
> 10101010101010101010101010101010101010101010101010101010101010101010101010101010
> 01010101010101010101010101010101010101010101010101010101010101010101010101010101
> 10101010101010101010101010101010101010101010101010101010101010101010101010101010
> 01010101010101010101010101010101010101010101010101010101010101010101010101010101
> 10101010101010101010101010101010101010101010101010101010101010101010101010101010
> 01010101010101010101010101010101010101010101010101010101010101010101010101010101
> 10101010101010101010101010101010101010101010101010101010101010101010101010101010
>
> I am deeply troubled by this and can't think of what to try next since I can 
> see the memory contents hold the correct CPU ID of the thread that touched 
> the memory, so either the syscall is wrong, or the kernel is doing something 
> else. I welcome any suggestions on what might be wrong.
>
> Thanks for trying to help.
>
> JB
>
> -Original Message-
> From: Brice Goglin  
> Sent: 26 January 2019 10:19
> To: Biddiscombe, John A. 
> Cc: Hardware locality user list 
> Subject: Re: [hwloc-users] unusual memory binding results
>
> Le 25/01/2019 à 23:16, Biddiscombe, John A. a écrit :
>>> move_pages() returning 0 with -14 in the status array? As opposed to 
>>> move_pages() returning -1 with errno set to 14, which would definitely be a 
>>> bug in hwloc.
>> I think it was move_pages returning zero with -14 in the status array, and 
>> then hwloc returning 0 with an empty nodeset (which I then messed up by 
>> calling get bitmap first and assuming 0 meant numa node zero and not 
>> checking for an empty nodeset).
>>
>> I'm not sure why I get -EFAULT status rather than -NOENT, but that's what 
>> I'm seeing in the status field when I pass the pointer returned from the 
>> alloc_membind call.
> The only reason I see for getting -EFAULT there would be that you pass the 
> buffer to move_pages (what hwloc_get_area_memlocation() wants, a start 
> pointer and length) instead of a pointer to an array of page addresses 
> (move_pages wants a void** pointing to individual pages).
>
> Brice
>
>
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] unusual memory binding results

2019-01-25 Thread Brice Goglin

Le 25/01/2019 à 14:17, Biddiscombe, John A. a écrit :
> Dear List/Brice
>
> I experimented with disabling the memory touch on threads except for 
> N=1,2,3,4 etc and found a problem in hwloc, which is that the function 
> hwloc_get_area_memlocation was returning '0' when the status of the memory 
> null move operation was -14 (#define EFAULT 14 /* Bad address */). This was 
> when I call get area memlocation immediately after allocating and then 'not' 
> touching. I think if the status is an error, then the function should 
> probably return -1, but anyway. I'll file a bug and send a patch if this is 
> considered to be a bug.


Just to be sure, you're talking about move_pages() returning 0 with -14 in
the status array? As opposed to move_pages() returning -1 with errno set
to 14, which would definitely be a bug in hwloc.


When the page is valid but not allocated yet, move_pages() is supposed
to return status = -ENOENT. This case is not an error, so returning 0
with an empty nodeset looks fine to me (pages are not allocated, hence
they are allocated on an empty set of nodes).

-EFAULT means that the page is invalid (you'd get a segfault if you
touch it). I am not sure what we should return in that case. It's also
true that pages are allocated nowhere :)

Anyway, if you get -EFAULT in status, it should mean that an invalid
address was passed to hwloc_get_area_memlocation() or an invalid length.

Brice


___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] unusual memory binding results

2019-01-21 Thread Brice Goglin

Le 21/01/2019 à 17:08, Biddiscombe, John A. a écrit :
> Dear list,
>
> I'm allocating a matrix of size (say) 2048*2048 on a node with 2 numa domains 
> and initializing the matrix by using 2 threads, one pinned on each numa 
> domain - with the idea that I can create tiles of memory bound to each numa 
> domain rather than having pages assigned all to one, interleaved, or possibly 
> random. The tiling pattern can be user defined, but I am using a simple 
> strategy that touches pages based on a simple indexing scheme using (say) a 
> tile size of 256 elements and should give a pattern like this


Hello John,

First idea:

A tile of 256 elements means you're switching between tiles every 2kB
(if elements are double precision), hence half of each 4kB page belongs to
one thread and the other half to another thread, and only the thread that
touches the page first will actually allocate it locally.

One way to debug would be to disable touching in N-1 threads to check
that everything is allocated on the right node.

Can you share the code, or at least part of it?

Brice
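
For illustration only (not the original code): a minimal sketch of first-touch
initialization where each thread touches whole pages, so a page can never be
shared between two threads' tiles; pinning the threads (e.g. with
hwloc_set_cpubind()) is assumed to happen elsewhere.

#include <unistd.h>
#include <string.h>

void first_touch_pages(double *data, size_t n_elems, int thread_id, int n_threads)
{
    size_t page_elems = (size_t)sysconf(_SC_PAGESIZE) / sizeof(double);
    size_t n_pages = (n_elems + page_elems - 1) / page_elems;

    /* round-robin whole pages across threads */
    for (size_t p = (size_t)thread_id; p < n_pages; p += (size_t)n_threads) {
        size_t first = p * page_elems;
        size_t count = (first + page_elems <= n_elems) ? page_elems : n_elems - first;
        memset(data + first, 0, count * sizeof(double)); /* first touch */
    }
}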


___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] mem bind

2018-12-21 Thread Brice Goglin
Hello

That's not how current operating systems work, hence hwloc cannot do it.
Usually you can bind a process' virtual memory to a specific part of the
physical memory (a NUMA node is basically a big static range), but the
reverse isn't allowed by any OS I know.

If you can tweak the hardware, you could try tweaking the ACPI tables so
that a specific range of physical memory becomes a new dedicated NUMA node :)

Another crazy idea is to tell the Linux kernel at boot that your ranges
aren't RAM but non-volatile memory. They won't be used by anybody by
default, but you can make them "dax" devices that programs could mmap.

Brice
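
For completeness, a minimal sketch of what hwloc can do: bind a range of a
process' virtual memory to one NUMA node (the OS index parameter is
illustrative; assumes hwloc 2.x):

#include <hwloc.h>

int bind_range_to_node(hwloc_topology_t topology, void *addr, size_t len,
                       unsigned node_os_index)
{
    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();
    int err;

    hwloc_bitmap_set(nodeset, node_os_index);
    err = hwloc_set_area_membind(topology, addr, len, nodeset,
                                 HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_BYNODESET);
    hwloc_bitmap_free(nodeset);
    return err;
}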




Le 21/12/2018 à 21:11, Dahai Guo a écrit :
> Hi, 
>
> I was wondering if there is a good way in hwloc to bind a particular
> range of memory to a process? For example, suppose there are totally
> 1000MB on the node, how to bind memory range [50, 100]  to a process,
> and [101,200] to another one?
>
> If hwloc can, an example will be greatly appreciated.
>
> D.
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Travis CI unit tests failing with HW "operating system" error

2018-09-13 Thread Brice Goglin
This is actually just a warning. Usually it causes the topology to be
wrong (like a missing object), but it shouldn't prevent the program from
working. Are you sure your programs are failing because of hwloc? Do you
have a way to run lstopo on that node?

By the way, you shouldn't use hwloc 2.0.0rc2, at least because it's old,
it has a broken ABI, and it's an RC :)

Brice



Le 13/09/2018 à 16:12, Jeff Hammond a écrit :
> I am running ARMCI-MPI over MPICH in a Travis CI Linux instance and
> topology is causing it to fail.  I do not care about topology in a
> virtualized environment.  How do I fix this?
>
> 
> * hwloc 2.0.0rc2-git has encountered what looks like an error from the
> operating system.
> *
> * Group0 (cpuset 0x,0x) intersects with L3 (cpuset
> 0x1000,0x0212) without inclusion!
> * Error occurred in topology.c line 1384
> *
> * The following FAQ entry in the hwloc documentation may help:
> *   What should I do when hwloc reports "operating system" warnings?
> * Otherwise please report this error message to the hwloc user's
> mailing list
> * along with the files generated by the hwloc-gather-topology script.
> 
>
> https://travis-ci.org/jeffhammond/armci-mpi/jobs/425342479 has all of
> the details.
>
> Jeff
>
>
> --
> Jeff Hammond
> jeff.scie...@gmail.com 
> http://jeffhammond.github.io/
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] How to get pid in hwloc?

2018-09-04 Thread Brice Goglin
Hello

The only public portability layer we have for PIDs is hwloc_pid_t when
passed to things like set_proc_cpubind(). But we don't have a portable
getpid() or printf(). You'll have to use getpid() and printf("%ld",
(long)pid) on Unix.

On Windows, hwloc_pid_t is a HANDLE, you don't want to print that. You
can print a process number, and get a HANDLE from a process number using
something like
https://github.com/open-mpi/hwloc/blob/master/utils/hwloc/misc.h#L323

Brice
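
A minimal sketch of the Unix case described above (printing the pid and
binding a process given by its pid; error handling omitted):

#include <stdio.h>
#include <unistd.h>
#include <hwloc.h>

int bind_process(hwloc_topology_t topology, pid_t pid, hwloc_const_cpuset_t set)
{
    /* on Unix, hwloc_pid_t is just pid_t */
    printf("binding pid %ld\n", (long)pid);
    return hwloc_set_proc_cpubind(topology, pid, set, 0);
}

For the current process you would call it as bind_process(topology, getpid(), set).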



Le 05/09/2018 à 00:01, Junchao Zhang a écrit :
> Hi,
>   hwloc_set_proc_cpubind() has a pid argument. But how to get the pid
> portably? In addition, I want to convert a pid to an integer and then
> print it out. Does hwloc has APIs to support the needs?
>   Thank you.
> --Junchao Zhang
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] conflicts of multiple hwloc libraries

2018-09-01 Thread Brice Goglin
This was also addressed offline while the mailing list was (again) broken.

Some symbols weren't renamed in old releases. This was fixed a couple
months ago. It will be in 2.0.2 and 1.11.11 (to be released on Monday
Sept 3rd).

Brice



Le 30/08/2018 à 06:31, Junchao Zhang a écrit :
> Hi,
>    My program calls a third party library, which in turn contains an
> embedded hwloc library.  My program itself also calls hwloc, and I
> installed a higher version of hwloc than the library's.  It seems this
> setting works with dynamic build. But on Cray machines with static
> library, I have a "multiple definition of `hwloc_linux_component'"
> error when linking my code. How to fix that?
>   Thank you.
> --Junchao Zhang
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Question about hwloc_bitmap_singlify

2018-08-28 Thread Brice Goglin
Hello

If you bind a thread to a newset that contains 4 PUs (4 bits), the
operating system scheduler is free to run that thread on any of these
PUs. It means the thread may run on one PU, then be migrated to another
PU, then migrated back, etc. If these PUs do not share all caches, you
will see a performance drop because the data you put in the cache when
running on one PU has to be stored/migrated into the cache of another PU
when the thread is migrated by the OS scheduler. If the PUs share all
caches, the performance drop is much lower, but it still exists because
migrating tasks between PUs takes a bit of time.

If you call hwloc_bitmap_singlify(newset) before binding, you basically
say "I am allowed to run on any of these 4 PUs, but I am actually going
to run on a specific one". Singlify takes your set of PUs in the bitmap
and keeps a single one. Your original binding is respected (you run
inside the original binding), but you don't use all of it.

HOWEVER, if you bind multiple threads to the same identical newset, you
don't want to singlify because all of them would run on the SAME PU. You
can either bind without singlify() so that the OS scheduler spreads your
threads across different PUs within newset, or manually split newset
into multiple subsets (hwloc_distrib can do that).

I'll try to improve the doc.

Brice
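
A minimal sketch of the single-thread case (keep the current binding but pin
the thread to one PU inside it):

#include <hwloc.h>

int pin_to_single_pu(hwloc_topology_t topology)
{
    hwloc_cpuset_t set = hwloc_bitmap_alloc();
    int err = hwloc_get_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
    if (!err) {
        hwloc_bitmap_singlify(set);      /* keep a single PU from the set */
        err = hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
    }
    hwloc_bitmap_free(set);
    return err;
}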



Le 29/08/2018 à 06:26, Junchao Zhang a écrit :
> Hi,   
>   On cpu binding, hwloc manual says "It is often useful to call
> hwloc_bitmap_singlify() first so that a single CPU remains in the set.
> This way, the process will not even migrate between different CPUs
> inside the given set" . I don't understand it. If I do not do
> hwloc_bitmap_singlify, what will happen? Suppose a process's old cpu
> binding is oldset, and I want to bind it to newset. What should I do
> to use hwloc_bitmap_singlify?
>   Thank you.
> --Junchao Zhang
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] How to combine bitmaps on MPI ranks?

2018-08-28 Thread Brice Goglin
This question was addressed offline while the mailing lists were offline.

We had things like hwloc_bitmap_set_ith_ulong() and
hwloc_bitmap_from_ith_ulong() for packing/unpacking but they weren't
very convenient unless you already knew how many ulongs were actually
needed to store the bitmap.

We added new functions to ease things
(hwloc_bitmap_nr/from/to_ulongs()). They will be in the upcoming hwloc 2.1.

Brice
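
A minimal sketch of the use case above with those new functions (assumes
hwloc >= 2.1; NULONGS is a fixed upper bound chosen here for simplicity,
hwloc_bitmap_nr_ulongs() can be used to compute the exact count):

#include <mpi.h>
#include <hwloc.h>

#define NULONGS 16 /* assumed large enough for this machine's cpusets */

void reduce_cpusets(hwloc_const_bitmap_t local, hwloc_bitmap_t result, MPI_Comm comm)
{
    unsigned long in[NULONGS], out[NULONGS];
    int rank;

    MPI_Comm_rank(comm, &rank);
    hwloc_bitmap_to_ulongs(local, NULONGS, in);         /* pack local cpuset */
    MPI_Reduce(in, out, NULONGS, MPI_UNSIGNED_LONG, MPI_BOR, 0, comm);
    if (rank == 0)
        hwloc_bitmap_from_ulongs(result, NULONGS, out); /* unpack OR'ed set */
}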



Le 23/08/2018 à 04:57, Junchao Zhang a écrit :
> Hello,
>   Suppose I call hwloc on two MPI ranks and get a bitmap on each.  On
> rank 0, I want to bitwise OR the two. How to do that?  I did not find
> bitmap APIs to pack/unpack bitmaps to/from ulongs for MPI send/recv
> purpose. 
>   Thank you.
> --Junchao Zhang
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Please help interpreting reported topology - possible bug?

2018-05-17 Thread Brice Goglin
Hello Hartmut

The mailing list address changed a while ago, there's an additional
"lists." in the domaine name.

Regarding your question, I would assume you are running in a cgroup with
the second NUMA node disallowed (while all the corresponding cores are
allowed). lstopo with --whole-system would confirm that by showing
disallowed stuff.

Brice



Le 17/05/2018 à 15:58, Hartmut Kaiser a écrit :
> Let me rephrase my question below:
>
> Why does the second socket not show up as a NUMA domain (as the first
> socket does)?
> Is this a problem in HWLOC or is this expected?
>
> Thanks!
> Regards Hartmut
> ---
> http://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx
>
>> -Original Message-
>> From: Hartmut Kaiser [mailto:hartmut.kai...@gmail.com]
>> Sent: Wednesday, May 16, 2018 3:48 PM
>> To: hwloc-us...@open-mpi.org
>> Subject: Please help interpreting reported topology
>>
>> All,
>>
>> We're seeing some topology reported by hwloc V2.0 we're not able to
>> interpret. Here is what lstopo gives us:
>>
>> Machine (63GB total)
>>   Package L#0
>> NUMANode L#0 (P#0 63GB)
>> L3 L#0 (30MB)
>>   L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0
>> (P#0)
>>   L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1
>> (P#1)
>>   L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2
>> (P#2)
>>   L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3
>> (P#3)
>>   L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4
>> (P#4)
>>   L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5
>> (P#5)
>>   L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6
>> (P#6)
>>   L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7
>> (P#7)
>>   L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8
>> (P#8)
>>   L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9
>> (P#9)
>>   L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU
>> L#10 (P#10)
>>   L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU
>> L#11 (P#11)
>> HostBridge
>>   PCIBridge
>> PCI 04:00.0 (VGA)
>>   PCI 00:11.4 (SATA)
>>   PCIBridge
>> PCI 07:00.0 (Ethernet)
>>   Net "eno1"
>>   PCIBridge
>> PCI 08:00.0 (Ethernet)
>>   Net "eno2"
>>   PCI 00:1f.2 (SATA)
>> Block(Removable Media Device) "sr0"
>> Block(Disk) "sda"
>>   PCI 00:1f.5 (IDE)
>>   Package L#1 + L3 L#1 (30MB)
>> L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU
>> L#12 (P#12)
>> L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU
>> L#13 (P#13)
>> L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU
>> L#14 (P#14)
>> L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU
>> L#15 (P#15)
>> L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU
>> L#16 (P#16)
>> L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU
>> L#17 (P#17)
>> L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU
>> L#18 (P#18)
>> L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU
>> L#19 (P#19)
>> L2 L#20 (256KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU
>> L#20 (P#20)
>> L2 L#21 (256KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU
>> L#21 (P#21)
>> L2 L#22 (256KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU
>> L#22 (P#22)
>> L2 L#23 (256KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU
>> L#23 (P#23)
>>
>> The machine has 2 sockets of Intel E5-2670 v3 and has HT disabled.
>> Everything looks ok for the first socket, but the second does not make any
>> sense to us.
>>
>> Any help would be appreciated.
>>
>> Thanks!
>> Regards Hartmut
>> ---
>> http://boost-spirit.com
>> http://stellar.cct.lsu.edu
>>
>
>

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Netloc integration with hwloc

2018-04-04 Thread Brice Goglin
Le 04/04/2018 à 16:49, Madhu, Kavitha Tiptur a écrit :
>
> — I tried building older netloc with hwloc 2.0 and it throws compiler errors. 
> Note that netloc was cloned from its git repo.

My guess is that the "map" part that joins netloc's info about the
fabric with hwloc's info about the nodes doesn't like hwloc 2.0. But
that should be easy to disable in the Makefiles and/or to update for
hwloc 2.0.

>>> The plan should rather be to tell us what you need from netloc so that
>>> we can reenable it with a good API. We hear lots of people saying they
>>> are interested in netloc, but *nobody* ever told us anything about what
>>> they want to do for real. And I am not even sure anybody ever played
>>> with the old API. This software cannot go forward unless we know where
>>> it's going. There are many ways to design the netloc API.
> — At this point, our requirement is to expose graph construction from raw 
> topology xml and mapping and traversal at best.
> I see some of these already defined in private/hwloc.h in the newer version. 
> Our problem here is that we couldn’t build it in embedded mode, which is how 
> we are using hwloc.

Can't you hack your build system to build hwloc in standalone instead of
embedded mode for testing? Or use an external hwloc instead of your
embedded one?
I'd like to get feedback about private/netloc.h before making some of it
public.

I'll look at making libnetloc embeddable in 2.1.

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
If you really want the old netloc API now, you could try hwloc 2.x with
the old netloc. But that's certainly not maintained anymore, and that
only works for IB while the new netloc should have OPA and Cray support
soon.

The plan should rather be to tell us what you need from netloc so that
we can reenable it with a good API. We hear lots of people saying they
are interested in netloc, but *nobody* ever told us anything about what
they want to do for real. And I am not even sure anybody ever played
with the old API. This software cannot go forward unless we know where
it's going. There are many ways to design the netloc API.

* We had an explicit graph API in the old netloc but that API implied
expensive graph algorithms in the runtimes using it. It seemed
unusable for taking decisions at runtime anyway, but again nobody ever
tried. Also it was rather strange to expose the full graph when you know
the fabric is a 3D dragonfly on Cray, etc.

* In the new netloc, we're thinking of having higher-level implicit
topologies for each class of fabric (dragon-fly, fat-tree, clos-network,
etc) that require more work on the netloc side and easier work in the
runtime using it. However that's less portable than exposing the full
graph. Not sure which one is best, or if both are needed.

* There are also issues regarding node/link failures, etc. How do we
expose topology changes at runtime? Do we have a daemon running as root
in the background, etc?

Lots of questions need to be discussed before we expose a new API in
the wild. Unfortunately, we lost several years because of the lack of
users' feedback. I don't want to invest time and rush for a new API if
MPICH never actually uses it like other people did in the past.

Brice




Le 04/04/2018 à 01:36, Balaji, Pavan a écrit :
> Brice,
>
> We want to use both hwloc and netloc in mpich.  What are our options here?  
> Move back to hwloc-1.x?  That’d be a bummer because we already invested a lot 
> of effort to migrate to hwloc-2.x.
>
>   — Pavan
>
> Sent from my iPhone
>
>> On Apr 3, 2018, at 6:19 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>
>> It's not possible now but that would certainly be considered whenever
>> people start using the API and linking against libnetloc.
>>
>> Brice
>>
>>
>>
>>
>>> Le 03/04/2018 à 21:34, Madhu, Kavitha Tiptur a écrit :
>>> Hi
>>> A follow up question, is it possible to build netloc along with hwloc in 
>>> embedded mode?
>>>
>>>
>>>> On Mar 30, 2018, at 1:34 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>>>
>>>> Hello
>>>>
>>>> In 2.0, netloc is still highly experimental. Hopefully, a large rework
>>>> will be merged in git master next month for being released in hwloc 2.1.
>>>>
>>>> Most of the API from the old standalone netloc was made private when
>>>> integrated in hwloc because there wasn't any actual user. The API was
>>>> quite large (things for traversing the graph of both the fabric and the
>>>> servers' internals). We didn't want to expose such a large API before
>>>> getting actual user feedback.
>>>>
>>>> In short, if you need features, please let us know, so that we can
>>>> discuss what to expose in the public headers and how.
>>>>
>>>> Brice
>>>>
>>>>
>>>>
>>>>
>>>>> Le 30/03/2018 à 20:14, Madhu, Kavitha Tiptur a écrit :
>>>>> Hi
>>>>>
>>>>> I need some info on the status of netloc integration with hwloc. I see 
>>>>> the include/netloc.h header is almost empty in hwloc 2.0 and lots of 
>>>>> functionality missing compared to the previous standalone netloc release, 
>>>>> even in private/netloc.h. Am I missing something here?
>>>>>
>>>>> Thanks
>>>>> Kavitha
>>>>>
>>>> ___
>>>> hwloc-users mailing list
>>>> hwloc-users@lists.open-mpi.org
>>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Netloc integration with hwloc

2018-04-03 Thread Brice Goglin
It's not possible now but that would certainly be considered whenever
people start using the API and linking against libnetloc.

Brice




Le 03/04/2018 à 21:34, Madhu, Kavitha Tiptur a écrit :
> Hi
> A follow up question, is it possible to build netloc along with hwloc in 
> embedded mode?
>
>
>> On Mar 30, 2018, at 1:34 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>
>> Hello
>>
>> In 2.0, netloc is still highly experimental. Hopefully, a large rework
>> will be merged in git master next month for being released in hwloc 2.1.
>>
>> Most of the API from the old standalone netloc was made private when
>> integrated in hwloc because there wasn't any actual user. The API was
>> quite large (things for traversing the graph of both the fabric and the
>> servers' internals). We didn't want to expose such a large API before
>> getting actual user feedback.
>>
>> In short, if you need features, please let us know, so that we can
>> discuss what to expose in the public headers and how.
>>
>> Brice
>>
>>
>>
>>
>> Le 30/03/2018 à 20:14, Madhu, Kavitha Tiptur a écrit :
>>> Hi
>>>
>>> I need some info on the status of netloc integration with hwloc. I see the 
>>> include/netloc.h header is almost empty in hwloc 2.0 and lots of 
>>> functionality missing compared to the previous standalone netloc release, 
>>> even in private/netloc.h. Am I missing something here?
>>>
>>> Thanks
>>> Kavitha
>>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Netloc integration with hwloc

2018-03-30 Thread Brice Goglin
Hello

In 2.0, netloc is still highly experimental. Hopefully, a large rework
will be merged in git master next month for being released in hwloc 2.1.

Most of the API from the old standalone netloc was made private when
integrated in hwloc because there wasn't any actual user. The API was
quite large (things for traversing the graph of both the fabric and the
servers' internals). We didn't want to expose such a large API before
getting actual user feedback.

In short, if you need features, please let us know, so that we can
discuss what to expose in the public headers and how.

Brice




Le 30/03/2018 à 20:14, Madhu, Kavitha Tiptur a écrit :
> Hi
>
> I need some info on the status of netloc integration with hwloc. I see the 
> include/netloc.h header is almost empty in hwloc 2.0 and lots of 
> functionality missing compared to the previous standalone netloc release, 
> even in private/netloc.h. Am I missing something here?
>
> Thanks
> Kavitha
>

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

[hwloc-users] libhwloc soname change in 2.0.1rc1

2018-03-21 Thread Brice Goglin
Hello

In case you missed the announcement yesterday, hwloc 2.0.1rc1 changes the
library soname from 12:0:0 to 15:0:0. On Linux, it means that we'll now
build libhwloc.so.15 instead of libhwloc.so.12. That means any
application built for hwloc 2.0.0 will need to be recompiled against 2.0.1.

I should have set the soname to 15:0:0 in 2.0.0 but I forgot. It may
cause issues because hwloc 1.11.x uses 12:x:y (we have "12" in both).
Given that 2.0.0 isn't widely used yet, I hope this way-too-late change
won't cause too many issues. Sorry.

As said on the download page, we want people to stop using 2.0.0 so that
we can forget this issue. If you already switched to hwloc 2.0.0 (and if
some applications are linked with libhwloc), please try to upgrade to
2.0.1 as soon as possible (final release expected next monday).

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
Good point. In theory, that's possible because we only look at cpusets
(NUMA nodes have cpusets, I/O don't). So the name of the function still
matches its behavior.

However it won't happen in practice with the current code because I/O
are always attached to CPU objects. But it may change in the future with
things like processing-in-memory etc.

Instead of calling this function, you could simply walk up the tree:
while (!hwloc_obj_type_is_normal(obj->type)) obj = obj->parent;
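For instance, a rough sketch of that walk (untested, against the hwloc 2.0 API):

  #include <hwloc.h>

  /* Walk up from a NUMA/I/O/Misc object to its closest "normal" ancestor
   * and return that ancestor's depth (the root Machine is normal, so the
   * loop always terminates). */
  static int normal_ancestor_depth(hwloc_obj_t obj)
  {
      while (obj && !hwloc_obj_type_is_normal(obj->type))
          obj = obj->parent;
      return obj ? obj->depth : HWLOC_TYPE_DEPTH_UNKNOWN;
  }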

I'll update the doc too. Thanks.

Brice



Le 14/03/2018 à 22:16, Madhu, Kavitha Tiptur a écrit :
> A follow up question, can the call to hwloc_get_non_io_ancestor_obj() return 
> a numa object? 
>
>> On Mar 14, 2018, at 3:09 PM, Madhu, Kavitha Tiptur <kma...@anl.gov> wrote:
>>
>> Hi
>> This function was used to query depth of hardware objects of a certain type 
>> to bind processes to objects at the depth or above in Hydra previously. As 
>> you pointed out, the functionality makes no sense with NUMA/IO objects 
>> possibly being at different depths or for objects.
>>
>>> On Mar 14, 2018, at 3:00 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>>
>>> Hello
>>>
>>> I can fix the documentation to say that the function always succeeds and
>>> returns the virtual depth for NUMA/IO/Misc.
>>>
>>> I don't understand your third sentence. If by "actual depth", you mean
>>> the depth of a (normal) parent where NUMA are attached (for instance the
>>> depth of Package if NUMAs are attached to Packages), see
>>> hwloc_get_memory_parents_depth(). However, you may have NUMA/IO/Misc
>>> attached to parents at different depths, so it doesn't make much sense
>>> in the general case.
>>>
>>> What do you use this function for? I thought of removing it from 2.0
>>> because it's hard to define a "usual" order for object types (for
>>> instance L3 can be above or below NUMA for different modern platforms).
>>>
>>> Brice
>>>
>>>
>>>
>>> Le 14/03/2018 à 20:24, Madhu, Kavitha Tiptur a écrit :
>>>> Hello folks,
>>>>
>>>> The function hwloc_get_type_or_above_depth() is supposed to return the 
>>>> depth of objects of type “type" or above. It internally calls 
>>>> hwloc_get_type_depth which returns virtual depths to NUMA, IO and misc 
>>>> objects. In order to retrieve the actual depth of these objects, one needs 
>>>> to call hwloc_get_obj_depth() with virtual depth. Can the documentation be 
>>>> updated to cover this? Or are there plans of changing this behavior?
>>>>
>>>> Thanks
>>>> Kavitha
>>>> ___
>>>> hwloc-users mailing list
>>>> hwloc-users@lists.open-mpi.org
>>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] NUMA, io and miscellaneous object depths

2018-03-14 Thread Brice Goglin
Hello

I can fix the documentation to say that the function always succeeds and
returns the virtual depth for NUMA/IO/Misc.

I don't understand your third sentence. If by "actual depth", you mean
the depth of a (normal) parent where NUMA are attached (for instance the
depth of Package if NUMAs are attached to Packages), see
hwloc_get_memory_parents_depth(). However, you may have NUMA/IO/Misc
attached to parents at different depths, so it doesn't make much sense
in the general case.
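As a rough fragment (assuming <stdio.h>/<hwloc.h> are included and "topology" is already loaded):

  int pdepth = hwloc_get_memory_parents_depth(topology);
  if (pdepth >= 0)
      printf("all NUMA nodes are attached to parents at depth %d\n", pdepth);
  else /* HWLOC_TYPE_DEPTH_MULTIPLE: parents at different depths */
      printf("NUMA nodes are attached to parents at different depths\n");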

What do you use this function for? I thought of removing it from 2.0
because it's hard to define a "usual" order for object types (for
instance L3 can be above or below NUMA for different modern platforms).

Brice



Le 14/03/2018 à 20:24, Madhu, Kavitha Tiptur a écrit :
> Hello folks,
>
> The function hwloc_get_type_or_above_depth() is supposed to return the depth 
> of objects of type “type" or above. It internally calls hwloc_get_type_depth 
> which returns virtual depths to NUMA, IO and misc objects. In order to 
> retrieve the actual depth of these objects, one needs to call 
> hwloc_get_obj_depth() with virtual depth. Can the documentation be updated to 
> cover this? Or are there plans of changing this behavior?
>
> Thanks
> Kavitha
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
Yes, one and only one: the top-level root object of the topology.

This is somewhat implied by the doc of HWLOC_OBJ_MACHINE itself ("The
root object type"); I guess I should make it even clearer there? If
there's another place in the doc where this should be added/clarified,
please let me know.
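For what it's worth, a tiny sketch (assuming the usual headers are included and "topology" is already loaded) illustrating the invariant:

  hwloc_obj_t root = hwloc_get_root_obj(topology);
  /* the root is the single Machine object, always at depth 0 */
  printf("root type=%s depth=%d machines=%d\n",
         hwloc_obj_type_string(root->type), root->depth,
         hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_MACHINE));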

Brice



Le 05/02/2018 à 23:19, Madhu, Kavitha Tiptur a écrit :
> Hi
>
> Thanks for the response. Could you also confirm if hwloc topology
> object would have only machine node?
>
> Thanks,
> Kavitha
>
>
>
>> On Feb 5, 2018, at 4:14 PM, Brice Goglin <brice.gog...@inria.fr
>> <mailto:brice.gog...@inria.fr>> wrote:
>>
>> Hello,
>>
>> Oops, sorry, this sentence is obsolete, I am removing it from the doc
>> right now.
>>
>> We don't support the assembly of multiple machines in a single hwloc
>> topology anymore. For the record, this feature was a very small
>> corner case and it had important limitations (you couldn't bind
>> things or use cpusets unless you were very careful about which host
>> you were talking about), and it made the core hwloc code much more
>> complex.
>>
>> Thanks for the report
>> Brice
>>
>>
>> Le 05/02/2018 à 23:02, Madhu, Kavitha Tiptur a écrit :
>>> Hi
>>>
>>> I have a question on topology query. The hwloc 2.0.0 documentation
>>> states that "Additionally it may assemble the topologies of multiple
>>> machines into a single one so as to let applications consult the
>>> topology of an entire fabric or cluster at once.”. Since “system”
>>> object type has been removed from hwloc, does this statement mean
>>> that multiple “machine” nodes in the topology object would be
>>> combined to one? I can see in function“hwloc_topology_check” that
>>> machine node is at depth 0 and there are no machine nodes at depth
>>> other than 0. Can anyone confirm this?   
>>>
>>> Thanks 
>>> Kavitha
>>>
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org <mailto:hwloc-users@lists.open-mpi.org>
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Machine nodes in hwloc topology

2018-02-05 Thread Brice Goglin
Hello,

Oops, sorry, this sentence is obsolete, I am removing it from the doc
right now.

We don't support the assembly of multiple machines in a single hwloc
topology anymore. For the record, this feature was a very small corner
case and it had important limitations (you couldn't bind things or use
cpusets unless you were very careful about which host you were talking
about), and it made the core hwloc code much more complex.

Thanks for the report
Brice


Le 05/02/2018 à 23:02, Madhu, Kavitha Tiptur a écrit :
> Hi
>
> I have a question on topology query. The hwloc 2.0.0 documentation
> states that "Additionally it may assemble the topologies of multiple
> machines into a single one so as to let applications consult the
> topology of an entire fabric or cluster at once.”. Since “system”
> object type has been removed from hwloc, does this statement mean that
> multiple “machine” nodes in the topology object would be combined to
> one? I can see in function“hwloc_topology_check” that machine node is
> at depth 0 and there are no machine nodes at depth other than 0. Can
> anyone confirm this?   
>
> Thanks 
> Kavitha
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

[hwloc-users] need help for testing new Mac OS support

2018-01-26 Thread Brice Goglin
Hello

I need people running Mac OS to test some patches before releasing them
in 2.0rc2 (which is likely delayed to Monday).

Just build this tarball, run lstopo, and report any difference with
older lstopo outputs:

https://ci.inria.fr/hwloc/job/zbgoglin-0-tarball/lastSuccessfulBuild/artifact/hwloc-master-20180126.1056.git55b1b3a.tar.gz

Hopefully, things won't change on most machines. On laptops without
hyperthreading, you should now see the right number of cores (instead of
N/2 dual-thread cores).

If anybody has a dual-socket machine, I am interested in seeing whether
we properly detect the NUMA nodes. My only test case so far is a
dual-Nehalem machine which reports a single NUMA node. Not sure if the
BIOS is set to NUMA interleaving or if Mac OS reports wrong information
in sysctl.

Thanks

Brice


___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] Puzzled by the number of cores on i5-7500

2018-01-25 Thread Brice Goglin
It looks like our Mac OS X backend doesn't properly handle processors
that support hyperthreading without actually having hyperthreads enabled
in hardware. Your processor has 4 cores without HT but it's based on a
processor with up to 8 cores and 16 threads. Our current code uses the
latter and therefore wrongly assumes you have HT cores, hence reporting 2
HT cores instead of 4 non-HT cores.

I guess we would also be wrong if some cores or HT are disabled in
software in Mac OS X.


Anybody reading this from a Mac, could you send the output of these
commands on your machine?
sysctl -a | grep ^hw
sysctl -a | grep ^machdep.cpu
lstopo -

Brice



>
>
> Le 25/01/2018 à 07:14, Olivier Cessenat a écrit :
>> Hello,
>>
>> I’m puzzled by the report from lstopo about the number of physical
>> cores on an iMac with
>> I5-7500. It is specified by Intel as a quad core processor and lstopo
>> reports only 2 cores:
>> lstopo
>> <<
>> Machine (8192MB total) + NUMANode L#0 (P#0 8192MB) + L3 L#0 (6144KB)
>>   Core L#0
>>     L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + PU L#0 (P#0)
>>     L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + PU L#1 (P#1)
>>   Core L#1
>>     L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + PU L#2 (P#2)
>>     L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + PU L#3 (P#3)
>> >>
>> When running system_profiler SPHardwareDataType
>> I obtain:
>> <<
>> Hardware:
>>
>>     Hardware Overview:
>>
>>       Model Name: iMac
>>       Model Identifier: iMac18,3
>>       Processor Name: Intel Core i5
>>       Processor Speed: 3,4 GHz
>>       Number of Processors: 1
>>       Total Number of Cores: 4
>>       L2 Cache (per Core): 256 KB
>>       L3 Cache: 6 MB
>>       Memory: 8 GB
>>       Boot ROM Version: IM183.0151.B00
>>       SMC Version (system): 2.41f1
>>       Serial Number (system): DGKV7HJCJ1GN
>>       Hardware UUID: 3FDAD77B-F4E8-50AB-B0FF-AA5C41CA35FA
>> >>
>>
>> Is there a trick ?
>>
>> Thanks for you help,
>>
>> Olivier Cessenat
>>
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
It is actually easy to fix: we just need to move hwloc's #include before
the headers that base64.c itself includes. That'll be fixed in rc2 too.
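For the record, this is the usual feature-test-macro ordering problem. A contrived sketch of the idea (not the actual hwloc patch; "myconfig.h" is made up):

  /* the header that defines _GNU_SOURCE / _POSIX_C_SOURCE must be included
   * before any system header, otherwise strncasecmp() may stay undeclared
   * under -std=c99 */
  #include "myconfig.h"   /* hypothetical config header defining _GNU_SOURCE */
  #include <string.h>
  #include <strings.h>    /* strncasecmp() on POSIX systems */

  int prefix_matches(const char *a, const char *b, size_t n)
  {
      return strncasecmp(a, b, n) == 0;
  }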

Brice



Le 25/01/2018 à 10:56, Brice Goglin a écrit :
> Like the error below?
>
> This code hasn't changed recently. Did you ever build with these flags
> before?
>
> I am not sure I'll have time to fix yet another header craziness before rc2.
>
> Brice
>
>
>
>   CC   base64.lo
> In file included from
> /builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/private.h:29:0,
>  from base64.c:128:
> /builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/misc.h: In
> function 'hwloc_strncasecmp':
> /builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/misc.h:370:10:
> error: implicit declaration of function 'strncasecmp'; did you mean
> 'strncmp'? [-Werror=implicit-function-declaration]
>    return strncasecmp(s1, s2, n);
>   ^~~
>   strncmp
> cc1: some warnings being treated as errors
>
>
> Le 25/01/2018 à 10:45, Balaji, Pavan a écrit :
>> Hello,
>>
>> hwloc-2.0rc1 build seems to fail on Solaris, with the following CFLAGS:
>>
>> CFLAGS="-Werror-implicit-function-declaration -std=c99"
>>
>> I'm using gcc-4.8.2
>>
>> Thanks,
>>
>>   -- Pavan
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] hwloc-2.0rc1 failure on Solaris

2018-01-25 Thread Brice Goglin
Like the error below?

This code hasn't changed recently. Did you ever build with these flags
before?

I am not sure I'll have time to fix yet another header craziness before rc2.

Brice



  CC   base64.lo
In file included from
/builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/private.h:29:0,
 from base64.c:128:
/builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/misc.h: In
function 'hwloc_strncasecmp':
/builds/hwloc-master-20180124.2347.gitf53fe3a/include/private/misc.h:370:10:
error: implicit declaration of function 'strncasecmp'; did you mean
'strncmp'? [-Werror=implicit-function-declaration]
   return strncasecmp(s1, s2, n);
  ^~~
  strncmp
cc1: some warnings being treated as errors


Le 25/01/2018 à 10:45, Balaji, Pavan a écrit :
> Hello,
>
> hwloc-2.0rc1 build seems to fail on Solaris, with the following CFLAGS:
>
> CFLAGS="-Werror-implicit-function-declaration -std=c99"
>
> I'm using gcc-4.8.2
>
> Thanks,
>
>   -- Pavan
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Brice Goglin
#2 and #3 are OK

About #1, we could rename pkgconfig and pkgdata prefixes into something
like hwlocpkgconfig and hwlocpkgdata. I don't think the actual prefix
value matters. I'll try that tomorrow.

Brice


Le 24/01/2018 à 23:46, Balaji, Pavan a écrit :
> Hi Brice,
>
> Here are the other patches that we are currently maintaining for hwloc.  Can 
> you see if these can be integrated upstream too:
>
> https://github.com/pmodels/hwloc/commit/44fe0a500e7828bcb2390fbd24656a7a26b450ed
> https://github.com/pmodels/hwloc/commit/5b6d776a1226148030dcf4e26bd13fe16cc885f9
> https://github.com/pmodels/hwloc/commit/9bf3ff256511ea4092928438f5718904875e65e1
>
> The first one is definitely not usable as-is, since that breaks standalone 
> builds.  But I'm interested in hearing about any better solution that you 
> might have.
>
> Thanks,
>
>   -- Pavan
>
>> On Jan 24, 2018, at 4:43 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>
>> Thanks, I am fixing this for rc2 tomorrow.
>>
>> Brice
>>
>>
>>
>> Le 24/01/2018 à 22:59, Balaji, Pavan a écrit :
>>> Folks,
>>>
>>> I'm seeing these warnings on the mac os when building hwloc-2.0rc1 with 
>>> clang:
>>>
>>> 8<
>>> CC   lstopo-lstopo.o
>>> lstopo.c: In function 'usage':
>>> lstopo.c:425:7: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates 
>>> to 0 [-Wundef]
>>> #elif CAIRO_HAS_XLIB_SURFACE && (defined HWLOC_HAVE_X11_KEYSYM)
>>>  ^~
>>> lstopo.c: In function 'main':
>>> lstopo.c:1041:5: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, 
>>> evaluates to 0 [-Wundef]
>>> #if CAIRO_HAS_XLIB_SURFACE && defined HWLOC_HAVE_X11_KEYSYM
>>> 8<
>>>
>>> 8<
>>> % clang --version
>>> Apple LLVM version 9.0.0 (clang-900.0.39.2)
>>> Target: x86_64-apple-darwin17.4.0
>>> Thread model: posix
>>> InstalledDir: 
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>>> 8<
>>>
>>> Thanks,
>>>
>>> -- Pavan
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] OFED requirements for netloc

2018-01-24 Thread Brice Goglin
OK

In the meantime, maybe you can diff the ibnetdiscover outputs to see if
anything obvious appears? You might need to sort the lines first if the
outputs aren't ordered the same.

Brice




Le 24/01/2018 à 23:33, Craig West a écrit :
> Brice,
>
> The output isn't big, just a pair of IB switches and a dozen hosts,
> some with single, some dual connections.
> However, we would need to sanitise the data, or at least look at it in
> detail first to see what it contains.
>
> I can say that the ibnetdiscover and ibroute commands report version
> 1.6.5 on the system that seq faults, and 1.6.6 on the one that succeeds. 
> And that the first looks to be the standard OFED release and the 1.6.6
> version a mellanox release of OFED.
>
> Craig.
>
> On Tue, 23 Jan 2018 at 17:10 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>> wrote:
>
> Hello,
>
> If the output isn't too big, could you put the files gathered by
> netloc_ib_gather_raw online so that we look at them and try to
> reproduce the crash?
>
> Thanks
>
> Brice
>
>
>
> Le 23/01/2018 à 03:54, Craig West a écrit :
>> Hi,
>>
>> I can't find the version requirements for netloc. I've tried it
>> on an older version of OFED and a newer version of Mellanox OFED.
>> The newer version worked, the older segfaults when running the
>> "netloc_ib_extract_dats" process.
>>
>> Thanks,
>> Craig.
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> <mailto:hwloc-users@lists.open-mpi.org>
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org <mailto:hwloc-users@lists.open-mpi.org>
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Tags for pre-releases

2018-01-23 Thread Brice Goglin
Hello

I didn't know you use submodules. I just pushed the tag "hwloc-2.0.0rc1" and
I'll try to remember to push one for each future rc. If I don't, please
remind me.

I am not going to push all the previous ones because there are just too
many of them. If you need some specific ones, please let me know.

Brice




Le 23/01/2018 à 22:39, Balaji, Pavan a écrit :
> Folks,
>
> [resending to this mailing list; my email to the devel list failed]
>
> There don't seem to be any tags associated with hwloc prereleases (such as 
> hwloc-2.0rc1).  As you know, we embed hwloc into mpich, and we tend to use 
> the git version (through git submodules) rather than release tarballs.  Not 
> having tags for prereleases is making it hard for us to pick a prerelease 
> version.
>
> Can you add tags for the previous prereleases of hwloc?  Or, can you at least 
> add them for 2.0rc1 and other prereleases going forward?
>
> Thanks,
>
>  -- Pavan
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] OFED requirements for netloc

2018-01-22 Thread Brice Goglin
Hello,

If the output isn't too big, could you put the files gathered by
netloc_ib_gather_raw online so that we can look at them and try to reproduce
the crash?

Thanks

Brice



Le 23/01/2018 à 03:54, Craig West a écrit :
> Hi,
>
> I can't find the version requirements for netloc. I've tried it on an
> older version of OFED and a newer version of Mellanox OFED. The newer
> version worked, the older segfaults when running the
> "netloc_ib_extract_dats" process.
>
> Thanks,
> Craig.
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] AMD EPYC topology

2017-12-29 Thread Brice Goglin


Le 29/12/2017 à 23:15, Bill Broadley a écrit :
>
>
> Very interesting, I was running parallel finite element code and was seeing
> great performance compared to Intel in most cases, but on larger runs it was 
> 20x
> slower.  This would explain it.
>
> Do you know which commit, or anything else that might help find any related
> discussion?  I tried a few google searches without luck.
>
> Is it specific to the 24-core?  The slowdown I described happened on a 32 core
> Epyc single socket as well as a dual socket 24 core AMD Epyc system.

Hello

Yes, it's 24-core specific (that's the only core count that doesn't have
8 cores per Zeppelin module).

The commit in Linux git master is 2b83809a5e6d619a780876fcaf68cdc42b50d28c

Brice


commit 2b83809a5e6d619a780876fcaf68cdc42b50d28c
Author: Suravee Suthikulpanit 
Date:   Mon Jul 31 10:51:59 2017 +0200

x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask

For systems with X86_FEATURE_TOPOEXT, current logic uses the APIC ID
to calculate shared_cpu_map. However, APIC IDs are not guaranteed to
be contiguous for cores across different L3s (e.g. family17h system
w/ downcore configuration). This breaks the logic, and results in an
incorrect L3 shared_cpu_map.

Instead, always use the previously calculated cpu_llc_shared_mask of
each CPU to derive the L3 shared_cpu_map.

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] AMD EPYC topology

2017-12-24 Thread Brice Goglin
Hello
Make sure you use a very recent Linux kernel. There was a bug regarding L3 
caches on 24-core Epyc processors which has been fixed in 4.14 and backported
to 4.13.x (and maybe in distro kernels too).
However, that would likely not cause a huge performance difference unless your
application heavily depends on the L3 cache.
Brice


Le 24 décembre 2017 12:46:01 GMT+01:00, Matthew Scutter 
 a écrit :
>I'm getting poor performance on OpenMPI tasks on a new AMD 7401P EPYC
>server. I suspect hwloc providing a poor topology may have something to
>do
>with it as I receive this warning below when creating a job.
>Requested data files available at http://static.skysight.io/out.tgz
>Cheers,
>Matthew
>
>
>
>
>* hwloc 1.11.8 has encountered what looks like an error from the
>operating
>system.
>
>*
>
>
>* L3 (cpuset 0x6060) intersects with NUMANode (P#0 cpuset
>0x3f3f
>nodeset 0x0001) without inclusion!
>
>
>* Error occurred in topology.c line 1088
>
>
>
>*
>
>
>
>
>* The following FAQ entry in the hwloc documentation may help:
>
>
>*   What should I do when hwloc reports "operating system" warnings?
>
>
>* Otherwise please report this error message to the hwloc user's
>mailing
>list,
>
>* along with the files generated by the hwloc-gather-topology script.
>
>
>
>
>
>
>depth 0:1 Machine (type #1)
>
>
> depth 1:   1 Package (type #3)
>
>
>  depth 2:  4 NUMANode (type #2)
>
>
>   depth 3: 10 L3Cache (type #4)
>
>
>depth 4:24 L2Cache (type #4)
>
>
> depth 5:   24 L1dCache (type #4)
>
>
>  depth 6:  24 L1iCache (type #4)
>
>
>   depth 7: 24 Core (type #5)
>
>
>
>depth 8:48 PU (type #6)
>
>
>
>Special depth -3:   12 Bridge (type #9)
>
>
>Special depth -4:   9 PCI Device (type #10)
>
>
>Special depth -5:   4 OS Device (type #11)
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] How are processor groups under Windows reported?

2017-11-29 Thread Brice Goglin
Hello

We only add hwloc Group objects when necessary. On your system, each
processor group contains a single NUMA node, so these Groups would not
really bring additional information about the hierarchy of resources.
If you had a bigger system with, let's say, 4 NUMA nodes, with 2 of them
in each processor group, hwloc would report those as hwloc Group objects.
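For example, a rough sketch against the 1.11 API (assuming "topology" is already loaded) that handles both cases by falling back to NUMA nodes when no Group exists:

  int n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_GROUP);
  if (n == 0)
      /* no Group objects were needed: here each Windows processor group
       * holds a single NUMA node, so count NUMA nodes instead */
      n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NUMANODE);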

Does this help? I can clarify the FAQ if needed.

Brice



Le 29/11/2017 14:25, David Creasy a écrit :
> Hello,
>
> Thank you to all contributors to hwloc - very useful.
>
> In the FAQ,  under the section "What are these Group objects in my
> topology?" it says that they are used for "Windows processor groups".
> However, I'm either not seeing this, or I'm looking in the wrong
> place. On a system with two processor groups, I get:
>
> C:\temp\hwloc-win64-build-1.11.8\bin>hwloc-info.exe
> depth 0:1 Machine (type #1)
>  depth 1:   2 NUMANode (type #2)
>   depth 2:  2 Package (type #3)
>depth 3: 2 L3Cache (type #4)
> depth 4:12 L2Cache (type #4)
>  depth 5:   12 L1dCache (type #4)
>   depth 6:  12 L1iCache (type #4)
>depth 7: 12 Core (type #5)
> depth 8:24 PU (type #6)
>
> C:\temp\hwloc-win64-build-1.11.8\bin>hwloc-ls.exe
> Machine (1506MB total)
>   NUMANode L#0 (P#0 346MB) + Package L#0 + L3 L#0 (12MB)
> L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
>   PU L#0 (P#0)
>   PU L#1 (P#1)
> L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
>   PU L#2 (P#2)
>   PU L#3 (P#3)
> L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
>   PU L#4 (P#4)
>   PU L#5 (P#5)
> L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
>   PU L#6 (P#6)
>   PU L#7 (P#7)
> L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
>   PU L#8 (P#8)
>   PU L#9 (P#9)
> L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
>   PU L#10 (P#10)
>   PU L#11 (P#11)
>   NUMANode L#1 (P#1 1160MB) + Package L#1 + L3 L#1 (12MB)
> L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
>   PU L#12 (P#64)
>   PU L#13 (P#65)
> L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
>   PU L#14 (P#66)
>   PU L#15 (P#67)
> L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
>   PU L#16 (P#68)
>   PU L#17 (P#69)
> L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
>   PU L#18 (P#70)
>   PU L#19 (P#71)
> L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
>   PU L#20 (P#72)
>   PU L#21 (P#73)
> L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
>   PU L#22 (P#74)
>   PU L#23 (P#75)
>
> I definitely have 2 processor groups:
> C:\Windows\system32>bcdedit /enum | find "group"
> groupsize   6
> maxgroupYes
>
> And you can see this because the processor numbers above in the second
> numa node start at 64. Also, calling GetActiveProcessorGroupCount()
> returns 2.
>
> I was expecting to get "2" back from:
> hwloc_get_nbobjs_by_type(hwlocTopology_, HWLOC_OBJ_GROUP)
>
> but that returns 0. Am I doing something wrong?
>
> Thank you!
>
> David
>

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

[hwloc-users] RFCs about latest API changes

2017-11-19 Thread Brice Goglin
Hello

Here are 4 pull requests about the likely-last significant API changes
for hwloc 2.0. You'll get more details by clicking on the links. I'll
merge these next week unless somebody complains.

Only maintain allowed_cpuset and allowed_nodeset for the entire topology
https://github.com/open-mpi/hwloc/pull/277

Make all depths *signed* ints
https://github.com/open-mpi/hwloc/pull/276

Remove the "System" object type
https://github.com/open-mpi/hwloc/pull/275

Move local_memory to NUMA node specific attrs
https://github.com/open-mpi/hwloc/pull/274

Brice





Le 26/10/2017 17:36, Brice Goglin a écrit :
> Hello
>
> I finally merged the new memory model in master (mainly for properly
> supporting KNL-like heterogeneous memory). This was the main and last
> big change for hwloc 2.0. I still need to fix some caveats (and lstopo
> needs to better display NUMA nodes) but that part of the API should be
> ready.
>
> Now we encourage people to start porting their code to the new hwloc 2.0
> API. Here's a guide that should answer most questions about the upgrade:
> https://github.com/open-mpi/hwloc/wiki/Upgrading-to-v2.0-API
>
> The final 2.0 release isn't planned before at least the end of november,
> but we need to fix API issues before releasing it. So please start
> testing it and report issues, missing docs, etc. If there's any existing
> function/feature (either new or old) that needs to be changed, please
> report it too. We're only breaking the ABI once for 2.0, we cannot break
> it again 2 months later.
>
>
> Tarballs of git master are already available from
> https://ci.inria.fr/hwloc/job/master-0-tarball/lastBuild/
> and in nightly snapshots on the website starting tomorrow.
>
>
> There are still a couple things that may or may not change before the
> final 2.0 API. If you have an opinion, please let us know.
> * (likely) Make all depths *signed* ints: some objects have a negative
> depth (meaning "special depth", not a normal depth in the main tree).
> You'll have to cast to (int) whenever you printf a depth while
> supporting both hwloc 1.x and 2.x.
> * (likely) Drop obj->allowed_cpuset (and allowed_nodeset) and just keep
> one for the entire topology: It is very rarely used (only when you set
> HWLOC_TOPOLOGY_FLAG_WHOLESYSTEM) and can be emulated by doing a binary
> "and" or "intersects" between obj->cpuset and topology->allowed_cpuset.
> * (likely) obj->memory becomes obj->attr->numanode since it's only used
> for numa nodes. But obj->total_memory should remain in obj because it's
> available in all objects (accumulated memory in all children).
> * (likely) Remove HWLOC_OBJ_SYSTEM: not used anymore (we don't support
> multinode topologies anymore). The root is always MACHINE now. I guess
> we'd #define SYSTEM MACHINE so that you don't have to change your code.
> * (unlikely) rename some info objects for consistency (examples below).
>   + GPUVendor and PCIVendor and CPUVendor -> Vendor.
>   + GPUModel and PCIDevice and CPUModel -> Model
>   + NVIDIASerial and MICSerialNumber -> SerialNumber
> But that will make your life harder for looking up attributes while
> supporting hwloc 1.x and 2.x. And XML import from 1.x would be more
> expensive since we'd have to rename these.
> * (unlikely) Share information between osdev (e.g. eth0 or cuda0) and
> pcidev: Lots of attributes are identical (Vendor, Model, kind of device
> etc). We could merge those objects into a single generic "I/O object".
> However a single PCI device can contain multiple OS devices (for
> instance "mlx5_0"+"ib0", or "cuda0"+"opencl0d0", etc).
>
>
> --
> Brice
>

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
The doc is wrong: flags are used, but only for BY_NODESET. I actually fixed
that in git very recently.

Brice



Le 13/11/2017 07:24, Biddiscombe, John A. a écrit :
> In the documentation for get_area_memlocation it says
> "If HWLOC_MEMBIND_BYNODESET is specified, set is considered a nodeset. 
> Otherwise it's a cpuset."
>
> but it also says "Flags are currently unused."
>
> so where should the BY_NODESET policy be used? Does it have to be used with 
> the original alloc call?
>
> thanks
>
> JB
>
> 
> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of 
> Biddiscombe, John A. [biddi...@cscs.ch]
> Sent: 13 November 2017 14:59
> To: Hardware locality user list
> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>
> Brice
>
> aha. thanks. I knew I'd seen a function for that, but couldn't remember what 
> it was.
>
> Cheers
>
> JB
> ____
> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of Brice 
> Goglin [brice.gog...@inria.fr]
> Sent: 13 November 2017 14:57
> To: Hardware locality user list
> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>
> Use get_area_memlocation()
>
> membind() returns where the pages are *allowed* to go (anywhere)
> memlocation() returns where the pages are actually allocated.
>
> Brice
>
>
>
>
> Le 13/11/2017 06:52, Biddiscombe, John A. a écrit :
>> Thank you to you both.
>>
>> I modified the allocator to allocate one large block using hwloc_alloc and 
>> then use one thread per numa domain to  touch each page according to the 
>> tiling pattern - unfortunately, I hadn't appreciated that now
>> hwloc_get_area_membind_nodeset
>> always returns the full machine numa mask - and not the numa domain that the 
>> page was touched by (I guess it only gives the expected answer when 
>> set_area_membind is used first)
>>
>> I had hoped to use a dynamic query of the pages (using the first one of a 
>> given tile) to schedule each task that operates on a given tile to run on 
>> the numa node that touched it.
>>
>> I can work around this by using a matrix offset calculation to get the numa 
>> node, but if there's a way of querying the page directly - then please let 
>> me know.
>>
>> Thanks
>>
>> JB
>> ________
>> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of 
>> Samuel Thibault [samuel.thiba...@inria.fr]
>> Sent: 12 November 2017 10:48
>> To: Hardware locality user list
>> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>>
>> Brice Goglin, on dim. 12 nov. 2017 05:19:37 +0100, wrote:
>>> That's likely what's happening. Each set_area() may be creating a new 
>>> "virtual
>>> memory area". The kernel tries to merge them with neighbors if they go to 
>>> the
>>> same NUMA node. Otherwise it creates a new VMA.
>> Mmmm, that sucks. Ideally we'd have a way to ask the kernel not to
>> strictly bind the memory, but just to allocate on a given memory
>> node, and just hope that the allocation will not go away (e.g. due to
>> swapping), which thus doesn't need a VMA to record the information. As
>> you describe below, first-touch achieves that but it's not necessarily
>> so convenient.
>>
>>> I can't find the exact limit but it's something like 64k so I guess
>>> you're exhausting that.
>> It's sysctl vm.max_map_count
>>
>>> Question 2 : Is there a better way of achieving the result I'm looking 
>>> for
>>> (such as a call to membind with a stride of some kind to say put N 
>>> pages in
>>> a row on each domain in alternation).
>>>
>>>
>>> Unfortunately, the interleave policy doesn't have a stride argument. It's 
>>> one
>>> page on node 0, one page on node 1, etc.
>>>
>>> The only idea I have is to use the first-touch policy: Make sure your buffer
>>> isn't in physical memory yet, and have a thread on node 0 read the "0" pages,
>>> pages,
>>> and another thread on node 1 read the "1" page.
>> Or "next-touch" if that was to ever get merged into mainline Linux :)
>>
>> Samuel
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
>> 

Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-13 Thread Brice Goglin
Use get_area_memlocation()

membind() returns where the pages are *allowed* to go (anywhere)
memlocation() returns where the pages are actually allocated.
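Something like this rough fragment (untested; "buf" and "len" describe the area you touched, and "topology" is already loaded):

  hwloc_bitmap_t set = hwloc_bitmap_alloc();
  if (!hwloc_get_area_memlocation(topology, buf, len, set,
                                  HWLOC_MEMBIND_BYNODESET)) {
      /* "set" is now the nodeset where these pages are physically
       * allocated, e.g. hwloc_bitmap_first(set) gives the first node */
      printf("first NUMA node holding these pages: %d\n",
             hwloc_bitmap_first(set));
  }
  hwloc_bitmap_free(set);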

Brice




Le 13/11/2017 06:52, Biddiscombe, John A. a écrit :
> Thank you to you both.
>
> I modified the allocator to allocate one large block using hwloc_alloc and 
> then use one thread per numa domain to  touch each page according to the 
> tiling pattern - unfortunately, I hadn't appreciated that now
> hwloc_get_area_membind_nodeset
> always returns the full machine numa mask - and not the numa domain that the 
> page was touched by (I guess it only gives the expected answer when 
> set_area_membind is used first)
>
> I had hoped to use a dynamic query of the pages (using the first one of a 
> given tile) to schedule each task that operates on a given tile to run on the 
> numa node that touched it.
>
> I can work around this by using a matrix offset calculation to get the numa 
> node, but if there's a way of querying the page directly - then please let me 
> know.
>
> Thanks
>
> JB 
> 
> From: hwloc-users [hwloc-users-boun...@lists.open-mpi.org] on behalf of 
> Samuel Thibault [samuel.thiba...@inria.fr]
> Sent: 12 November 2017 10:48
> To: Hardware locality user list
> Subject: Re: [hwloc-users] question about hwloc_set_area_membind_nodeset
>
> Brice Goglin, on dim. 12 nov. 2017 05:19:37 +0100, wrote:
>> That's likely what's happening. Each set_area() may be creating a new 
>> "virtual
>> memory area". The kernel tries to merge them with neighbors if they go to the
>> same NUMA node. Otherwise it creates a new VMA.
> Mmmm, that sucks. Ideally we'd have a way to ask the kernel not to
> strictly bind the memory, but just to allocate on a given memory
> node, and just hope that the allocation will not go away (e.g. due to
> swapping), which thus doesn't need a VMA to record the information. As
> you describe below, first-touch achieves that but it's not necessarily
> so convenient.
>
>> I can't find the exact limit but it's something like 64k so I guess
>> you're exhausting that.
> It's sysctl vm.max_map_count
>
>> Question 2 : Is there a better way of achieving the result I'm looking 
>> for
>> (such as a call to membind with a stride of some kind to say put N pages 
>> in
>> a row on each domain in alternation).
>>
>>
>> Unfortunately, the interleave policy doesn't have a stride argument. It's one
>> page on node 0, one page on node 1, etc.
>>
>> The only idea I have is to use the first-touch policy: Make sure your buffer
>> isn't in physical memory yet, and have a thread on node 0 read the "0" pages,
>> and another thread on node 1 read the "1" page.
> Or "next-touch" if that was to ever get merged into mainline Linux :)
>
> Samuel
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] question about hwloc_set_area_membind_nodeset

2017-11-11 Thread Brice Goglin


Le 12/11/2017 00:14, Biddiscombe, John A. a écrit :
> I'm allocating some large matrices, from 10k squared elements up to
> 40k squared per node.
> I'm also using membind to place pages of the matrix memory across numa
> nodes so that the matrix might be bound according to the kind of
> pattern at the end of this email - where each 1 or 0 corresponds to a
> 256x256 block of memory.
>
> The way I'm doing this is by calling hwloc_set_area_membind_nodeset
> many thousands of times after allocation, and I've found that as the
> matrices get bigger, then after some N calls to area_membind then I
> get a failure and it returns -1 (errno does not seem to be set to
> either ENOSYS or EXDEV) - but strerror report "Cannot allocate memory".
>
> Question 1 : by calling area_setmembind too many times, am I causing
> some resource usage in the memory tables that is being exhausted.
>

Hello

That's likely what's happening. Each set_area() may be creating a new
"virtual memory area". The kernel tries to merge them with neighbors if
they go to the same NUMA node. Otherwise it creates a new VMA. I can't
find the exact limit but it's something like 64k so I guess you're
exhausting that.

> Question 2 : Is there a better way of achieving the result I'm looking
> for (such as a call to membind with a stride of some kind to say put N
> pages in a row on each domain in alternation).

Unfortunately, the interleave policy doesn't have a stride argument.
It's one page on node 0, one page on node 1, etc.

The only idea I have is to use the first-touch policy: Make sure your
buffer isn't in physical memory yet, and have a thread on node 0 read
the "0" pages, and another thread on node 1 read the "1" pages.

Brice


>
> Many thanks
>
> JB
>
>
> [rows of alternating 0/1 values, one digit per 256x256 block, not preserved in the archive]
> ... etc
>
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] [WARNING: A/V UNSCANNABLE] Dual socket AMD Epyc error

2017-10-28 Thread Brice Goglin
Hello,
The Linux kernel reports incorrect L3 information.
Unfortunately, your old kernel seems to already contain patches for
supporting the L3 on this hardware. I found two candidate patches for
further fixing this: one is in 4.10 (a cleanup of the above patch) and the
other will only be in 4.14.
I am going to ask AMD about this.
Brice



Le 28/10/2017 06:29, Bill Broadley a écrit :
> Dual socket Epyc 7451 server running a linux 4.4.0 kernel.
>
> When I run OpenMPI jobs I get the message to email here:
>
>  L3 (cpuset 0x0060,0x0060) intersects with NUMANode (P#0 cpuset
> 0x003f,0x003f) without inclusion!
> * Error occurred in topology.c line 1048
>
> Here's the requested files:
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

[hwloc-users] new memory model and API

2017-10-26 Thread Brice Goglin
Hello

I finally merged the new memory model in master (mainly for properly
supporting KNL-like heterogeneous memory). This was the main and last
big change for hwloc 2.0. I still need to fix some caveats (and lstopo
needs to better display NUMA nodes) but that part of the API should be
ready.

Now we encourage people to start porting their code to the new hwloc 2.0
API. Here's a guide that should answer most questions about the upgrade:
https://github.com/open-mpi/hwloc/wiki/Upgrading-to-v2.0-API

The final 2.0 release isn't planned before at least the end of november,
but we need to fix API issues before releasing it. So please start
testing it and report issues, missing docs, etc. If there's any existing
function/feature (either new or old) that needs to be changed, please
report it too. We're only breaking the ABI once for 2.0, we cannot break
it again 2 months later.


Tarballs of git master are already available from
https://ci.inria.fr/hwloc/job/master-0-tarball/lastBuild/
and in nightly snapshots on the website starting tomorrow.


There are still a couple things that may or may not change before the
final 2.0 API. If you have an opinion, please let us know.
* (likely) Make all depths *signed* ints: some objects have a negative
depth (meaning "special depth", not a normal depth in the main tree).
You'll have to cast to (int) whenever you printf a depth while
supporting both hwloc 1.x and 2.x.
* (likely) Drop obj->allowed_cpuset (and allowed_nodeset) and just keep
one for the entire topology: It is very rarely used (only when you set
HWLOC_TOPOLOGY_FLAG_WHOLESYSTEM) and can be emulated by doing a binary
"and" or "intersects" between obj->cpuset and topology->allowed_cpuset.
* (likely) obj->memory becomes obj->attr->numanode since it's only used
for numa nodes. But obj->total_memory should remain in obj because it's
available in all objects (accumulated memory in all children).
* (likely) Remove HWLOC_OBJ_SYSTEM: not used anymore (we don't support
multinode topologies anymore). The root is always MACHINE now. I guess
we'd #define SYSTEM MACHINE so that you don't have to change your code.
* (unlikely) rename some info objects for consistency (examples below).
  + GPUVendor and PCIVendor and CPUVendor -> Vendor.
  + GPUModel and PCIDevice and CPUModel -> Model
  + NVIDIASerial and MICSerialNumber -> SerialNumber
But that will make your life harder for looking up attributes while
supporting hwloc 1.x and 2.x. And XML import from 1.x would be more
expensive since we'd have to rename these.
* (unlikely) Share information between osdev (e.g. eth0 or cuda0) and
pcidev: Lots of attributes are identical (Vendor, Model, kind of device
etc). We could merge those objects into a single generic "I/O object".
However a single PCI device can contain multiple OS devices (for
instance "mlx5_0"+"ib0", or "cuda0"+"opencl0d0", etc).


--
Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] linkspeed in hwloc_obj_attr_u::hwloc_pcidev_attr_s struct while traversing topology

2017-10-13 Thread Brice Goglin
Hello

On Linux, the PCI linkspeed requires root privileges unfortunately
(except for the uplink above NVIDIA GPUs where we have another way to
find it).
The only way to work around this is to dump the topology as XML as root
and then reload it at runtime (e.g. with HWLOC_XMLFILE) :/
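In code the workaround looks roughly like this (sketch only; the XML path is made up, and you may also want HWLOC_THISSYSTEM=1 in the environment so that binding still applies to the local machine):

  /* once, as root:  lstopo /etc/hwloc-full.xml */
  hwloc_topology_t topology;
  hwloc_topology_init(&topology);
  hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_IO_DEVICES
                                     | HWLOC_TOPOLOGY_FLAG_IO_BRIDGES);
  hwloc_topology_set_xml(topology, "/etc/hwloc-full.xml");
  hwloc_topology_load(topology);
  /* PCI objects now carry the link speeds discovered as root */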

Brice



Le 13/10/2017 10:53, TEJASWI k a écrit :
> I am trying to traverse the topology using hwloc APIs starting from a
> PCI device till Host Bridge
>
> My code snippet:
>
> unsigned long flags = HWLOC_TOPOLOGY_FLAG_IO_DEVICES |
> HWLOC_TOPOLOGY_FLAG_IO_BRIDGES;
>
> retval = hwloc_topology_init();
> retval = hwloc_topology_set_flags(topology, flags);
> retval = hwloc_topology_load(topology);
>
> pciObj = hwloc_get_pcidev_by_busidstring(topology, "");
> while(pciObj) {
> pciObj = pciObj->parent;
>
> if (pciObj->attr->bridge.upstream_type !=
> HWLOC_OBJ_BRIDGE_HOST)
> {
>   //Get all the required information about
> intermediate bridges like
>  
> // pciObj->attr->bridge.downstream.pci.secondary_bus, 
> pciObj->attr->bridge.upstream.pci.domain
>   // pciObj->attr->bridge.upstream.pci.bus,
> *_pciObj->attr->bridge.upstream.pci.linkspeed_*
> }
> }
>
> All the other details I am able to query but linkspeed
> (*_pciObj->attr->bridge.upstream.pci.linkspeed_*) is always 0.
> Do I need to enable any other flag to get linkspeed or am I going
> wrong somewhere?
>
> I want to get the PCI Bridges' generation or linkspeed for my usecase.
> Is there any other way to get this information?
>
>
> Thanks & Regards,
> Tejaswi K
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Why do I get such little information back about GPU's on my system

2017-07-07 Thread Brice Goglin
Le 07/07/2017 20:38, David Solt a écrit :
> We are using the hwloc api to identify GPUs on our cluster. While we
> are able to "discover" the GPUs, other information about them does not
> appear to be getting filled in. See below for example:

> (gdb) p *obj->attr
> $20 = {
>   cache = {
> size = 1,
> depth = 0,
> linesize = 0,
> associativity = 0,
> type = HWLOC_OBJ_CACHE_UNIFIED
>   },
>   group = {
> depth = 1
>   },
>   pcidev = {
> domain = 1,
> bus = 0 '\000',
> dev = 0 '\000',
> func = 0 '\000',
> class_id = 0,
> *vendor_id = 0,*
>*device_id = 0, *
> subvendor_id = 0,
> subdevice_id = 0,
> revision = 0 '\000',
> linkspeed = 0
>   },
>   bridge = {
> upstream = {
>   pci = {
> domain = 1,
> bus = 0 '\000',
> dev = 0 '\000',
> func = 0 '\000',
> class_id = 0,
> vendor_id = 0,
> device_id = 0,
> subvendor_id = 0,
> subdevice_id = 0,
> revision = 0 '\000',
> linkspeed = 0
>   }
> },
> upstream_type = HWLOC_OBJ_BRIDGE_HOST,
> downstream = {
>   pci = {
> domain = 0,
> secondary_bus = 0 '\000',
> subordinate_bus = 0 '\000'
>   }
> },
> downstream_type = HWLOC_OBJ_BRIDGE_HOST,
> depth = 0
>   },
>   osdev = {
> type = *HWLOC_OBJ_OSDEV_GPU*
>   }
> }

> The name is generally just "cardX". 


Hello

attr is a union so only the "osdev" portion above matters. "osdev" can
be a lot of different things. So instead of having all possible
attributes in a struct, we use info key/value pairs (hwloc_obj->infos).
But those "cardX" devices are the GPUs reported by the Linux kernel DRM
subsystem, and we don't have much information about them anyway.

If you're looking at a Power machine, I am going to assume you care about
CUDA devices. Those are "osdev" objects of type "COPROC" instead of
"GPU". They have many more attributes. Here's what I see on one of our
machines:

  PCI 10de:1094 (P#540672 busid=:84:00.0 class=0302(3D) PCIVendor="NVIDIA 
Corporation" PCIDevice="Tesla M2075 Dual-Slot Computing Processor Module") 
"NVIDIA Corporation Tesla M2075 Dual-Slot Computing Processor Module"
Co-Processor L#5 (CoProcType=CUDA Backend=CUDA GPUVendor="NVIDIA 
Corporation" GPUModel="Tesla M2075" CUDAGlobalMemorySize=5428224 
CUDAL2CacheSize=768 CUDAMultiProcessors=14 CUDACoresPerMP=32 
CUDASharedMemorySizePerMP=48) "cuda2"


On recent kernels, you would see both a "cardX" GPU osdev and a "cudaX"
COPROC osdev in the PCI device. There can even be "nvmlX" and ":0.0" if
you have the nvml and nvctrl libraries. Those are basically different
ways to talk to the GPU (Linux kernel DRM, CUDA, etc.).
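As a rough sketch (assuming hwloc was built with CUDA support and the topology was loaded with the I/O flags as in your code), picking up the COPROC osdevs and their info attributes looks like:

  hwloc_obj_t osdev = NULL;
  while ((osdev = hwloc_get_next_osdev(machine_topology, osdev)) != NULL) {
      if (osdev->attr->osdev.type == HWLOC_OBJ_OSDEV_COPROC) {
          const char *model = hwloc_obj_get_info_by_name(osdev, "GPUModel");
          printf("%s: %s\n", osdev->name, model ? model : "(unknown model)");
      }
  }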

Given that I have never seen anybody use "cardX" for placing task/data
near a GPU, I am wondering if we should disable those by default. Or
maybe rename "GPU" into something that wouldn't attract people as much,
maybe "DRM".

> Does this mean that the cards are not configured correctly? Or is
> there an additional flag that needs to be set to get this information?


Make sure "cuda" appears in the summary at the end of the configure.

> Currently the code does:

>   hwloc_topology_init(&machine_topology);
>   hwloc_topology_set_flags(machine_topology,
> HWLOC_TOPOLOGY_FLAG_IO_DEVICES);
>   hwloc_topology_load(machine_topology);

> And this is enough to identify the CPUs and GPUs, but any additional
> information - particularly the device and vendor id's - seem to not be
> there. 

> I tried this with the most recent release (1.11.7) and saw the same
> results.   

> We tried this on a variety of PowerPC machines and I think even some
> x86_64 machines with similar results.   

> Thoughts?
> Dave

BTW, it looks like you're not going to the OMPI dev meeting next week.
I'll be there if one of your colleagues wants to discuss this face to face.

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
Hello

We have seen _many_ reports like these. But there are different kinds of
errors. As far as I understand:

* Julio's error is caused by the Linux kernel improperly reporting L3
cache affinities. It's specific to multi-socket 12-core processors
because the kernel makes invalid assumptions about core APIC IDs in
these processors (because only 12 out of 16 cores are enabled).
HWLOC_COMPONENTS=x86 was designed to solve this issue until AMD fixed
the kernel, but it looks like they didn't.

* Your error looks like another issue where the BIOS reports invalid
NUMA affinity (likely in the SRAT table). A BIOS upgrade may help.
Fortunately, the x86 backend can also read NUMA affinity from CPUID
instructions on AMD. I didn't know/remember HWLOC_COMPONENTS=x86 could
help for this bug too.
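
If you want to force that workaround from inside a program using hwloc
(rather than exporting the variable in the shell), a minimal sketch,
assuming POSIX setenv(), is to set it before the topology is loaded:

#include <stdlib.h>
#include <hwloc.h>

int main(void)
{
  hwloc_topology_t topology;
  /* must be set before hwloc_topology_load() so the x86 backend is
     tried before the Linux one */
  setenv("HWLOC_COMPONENTS", "x86", 1);
  hwloc_topology_init(&topology);
  hwloc_topology_load(topology);
  /* ... use the topology ... */
  hwloc_topology_destroy(topology);
  return 0;
}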


I am going to add this workaround to the FAQ about these errors (this
FAQ is listed in the error below since 1.11).


By the way, you should upgrade. 1.10 is very old :)

Brice




Le 30/06/2017 21:59, Belgin, Mehmet a écrit :
> We (Georgia Tech) too have been observing this on 16-core AMD AbuDhabi
> machines (6378). We weren’t aware of HWLOC_COMPONENTS workaround,
> which seems to mitigate the issue. 
>
> Before:
>
> # ./lstopo
> 
> * hwloc has encountered what looks like an error from the operating
> system.
> *
> * Socket (P#2 cpuset 0x,0x0) intersects with NUMANode (P#3
> cpuset 0xff00,0xff00) without inclusion!
> * Error occurred in topology.c line 940
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with the output+tarball generated by the hwloc-gather-topology
> script.
> 
> Machine (128GB total)
>   Group0 L#0
> NUMANode L#0 (P#1 32GB)
> ...
>
> After:
>
> # export HWLOC_COMPONENTS=x86
> # ./lstopo
> Machine
>   Socket L#0
> NUMANode L#0 (P#0) + L3 L#0 (6144KB)
>   L2 L#0 (2048KB) + L1i L#0 (64KB)
> ...
>
> These nodes are the only one in our entire cluster to cause zombie
> processes using torque/moab. I have a feeling that they are related.
> We use hwloc/1.10.0.
>
> Not sure if this helps at all, but you are definitely not alone :)
>
> Thanks,
> -Mehmet
>
>
>
>> On Jun 29, 2017, at 1:24 AM, Brice Goglin <brice.gog...@inria.fr
>> <mailto:brice.gog...@inria.fr>> wrote:
>>
>> Hello
>>
>> We've seen this issue many times (it's specific to 12-core opterons),
>> but I am surprised it still occurs with such a recent kernel. AMD was
>> supposed to fix the kernel in early 2016 but I forgot to check
>> whether something was actually pushed.
>>
>> Anyway, you can likely ignore the issue as documented in the FAQ
>> https://www.open-mpi.org/projects/hwloc/doc/v1.11.7/a00305.php unless
>> you care about L3 affinity for binding. Otherwise, you can workaround
>> the issue by passing HWLOC_COMPONENTS=x86 in the environment so that
>> hwloc uses cpuid instead of Linux sysfs files to discover the topology.
>>
>> Brice
>>
>>
>>
>>
>> Le 29/06/2017 02:17, Julio Figueroa a écrit :
>>> Hi
>>>
>>> I am experincing the following issues when using pnetcdf version 1.8.1
>>> The machine is a Supermicro (H8DGi) dual socket AMD Opteron 6238
>>> (patch_level=0x0600063d)
>>> The BIOS is the lates from Supermicro (v3.5c 03/18/2016)
>>> OS: Debian 9.0 Kernel: 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u1
>>> (2017-06-18) x86_64 GNU/Linux
>>> 
>>> * hwloc 1.11.5 has encountered what looks like an error from the
>>> operating system.
>>> *
>>> * L3 (cpuset 0x03f0) intersects with NUMANode (P#0 cpuset
>>> 0x003f) without inclusion!
>>> * Error occurred in topology.c line 1074
>>> *
>>> * The following FAQ entry in the hwloc documentation may help:
>>> *   What should I do when hwloc reports "operating system" warnings?
>>> * Otherwise please report this error message to the hwloc user's
>>> mailing list,
>>> * along with the output+tarball generated by the
>>> hwloc-gather-topology script.
>>> 
>>>
>>> As suggested by the error message, here is the hwloc-gather-topology
>>> attached.
>>>
>>> Please let me know if you need more information.
>>>
>>> Julio Figueroa
>>> Oceanographer
>>>
>>>

Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-30 Thread Brice Goglin
Le 30/06/2017 22:08, fabricio a écrit :
> Em 30-06-2017 16:21, Brice Goglin escreveu:
>> Yes, it's possible but very easy. Before we go that way:
>> Can you also pass HWLOC_COMPONENTS_VERBOSE=1 in the environment and send
>> the verbose output?
>
> ///
> Registered cpu discovery component `no_os' with priority 40
> (statically build)
> Registered global discovery component `xml' with priority 30
> (statically build)
> Registered global discovery component `synthetic' with priority 30
> (statically build)
> Registered global discovery component `custom' with priority 30
> (statically build)
> Registered cpu discovery component `linux' with priority 50
> (statically build)
> Registered misc discovery component `linuxpci' with priority 19
> (statically build)
> Registered misc discovery component `pci' with priority 20 (statically
> build)
> Registered cpu discovery component `x86' with priority 45 (statically
> build)
> Enabling cpu discovery component `linux'
> Enabling cpu discovery component `x86'
> Enabling cpu discovery component `no_os'
> Excluding global discovery component `xml', conflicts with excludes 0x2
> Excluding global discovery component `synthetic', conflicts with
> excludes 0x2
> Excluding global discovery component `custom', conflicts with excludes
> 0x2
> Enabling misc discovery component `pci'
> Enabling misc discovery component `linuxpci'
> Final list of enabled discovery components: linux,x86,no_os,pci,linuxpci
> 
>
> * hwloc has encountered what looks like an error from the operating
> system.
> *
> * L3 (cpuset 0x03f0) intersects with NUMANode (P#0 cpuset
> 0x003f) without inclusion!
> * Error occurred in topology.c line 942
> *
> * The following FAQ entry in a recent hwloc documentation may help:
> *   What should I do when hwloc reports "operating system" warnings?
> * Otherwise please report this error message to the hwloc user's
> mailing list,
> * along with the output+tarball generated by the hwloc-gather-topology
> script.
> 
>
> Enabling global discovery component `xml'
> Excluding cpu discovery component `linux', conflicts with excludes
> 0x
> Excluding cpu discovery component `x86', conflicts with excludes
> 0x
> Excluding cpu discovery component `no_os', conflicts with excludes
> 0x
> Excluding global discovery component `xml', conflicts with excludes
> 0x
> Excluding global discovery component `synthetic', conflicts with
> excludes 0x
> Excluding global discovery component `custom', conflicts with excludes
> 0x
> Excluding misc discovery component `pci', conflicts with excludes
> 0x
> Excluding misc discovery component `linuxpci', conflicts with excludes
> 0x
> Final list of enabled discovery components: xml
> ///
>
>> I am wondering if the x86 backend was disabled somehow.
>> Please also send your config.log
>
> I'm using the embebbed hwloc in openmpi 1.10.7, whose version seems to
> be 1.9.1. I could not find a config.log file.

I thought you were using hwloc 1.11.5? HWLOC_COMPONENTS=x86 can help
there, but not in 1.9.1 from OMPI. Which one did you try?

>
>> Setting HWLOC_COMPONENTS=-linux could also work: It totally disables the
>> Linux backend. If the x86 is disabled as well, you would get an almost
>> empty topology.
>
> Will this leave the process allocation to the kernel, potentially
> diminishing performance?

This would basically ignore all topology information.
But it's not needed anymore here since the x86 backend is enabled above.

What you can do is one of these:
* tell OMPI to use an external hwloc >= 1.11.2
* use a more recent OMPI :)
* use an XML generated with hwloc >= 1.11.2 with HWLOC_COMPONENTS=x86,
and pass it to OMPI and/or hwloc with HWLOC_XMLFILE=/path/to/foo.xml and
HWLOC_THISSYSTEM=1 in the environment. If it doesn't work, I'll generate
the XML
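
For a program that uses hwloc directly, the equivalent of that last option
(a rough sketch, with a placeholder /path/to/foo.xml) is:

#include <hwloc.h>

int main(void)
{
  hwloc_topology_t topology;
  hwloc_topology_init(&topology);
  /* load the topology from the XML generated with HWLOC_COMPONENTS=x86 */
  hwloc_topology_set_xml(topology, "/path/to/foo.xml");
  /* declare that the XML really describes the machine we run on,
     so that binding calls are actually applied */
  hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM);
  hwloc_topology_load(topology);
  /* ... */
  hwloc_topology_destroy(topology);
  return 0;
}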

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] hwloc error in SuperMicro AMD Opteron 6238

2017-06-28 Thread Brice Goglin
Hello

We've seen this issue many times (it's specific to 12-core opterons),
but I am surprised it still occurs with such a recent kernel. AMD was
supposed to fix the kernel in early 2016 but I forgot to check whether
something was actually pushed.

Anyway, you can likely ignore the issue as documented in the FAQ
https://www.open-mpi.org/projects/hwloc/doc/v1.11.7/a00305.php unless
you care about L3 affinity for binding. Otherwise, you can workaround
the issue by passing HWLOC_COMPONENTS=x86 in the environment so that
hwloc uses cpuid instead of Linux sysfs files to discover the topology.

Brice




Le 29/06/2017 02:17, Julio Figueroa a écrit :
> Hi
>
> I am experincing the following issues when using pnetcdf version 1.8.1
> The machine is a Supermicro (H8DGi) dual socket AMD Opteron 6238
> (patch_level=0x0600063d)
> The BIOS is the lates from Supermicro (v3.5c 03/18/2016)
> OS: Debian 9.0 Kernel: 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u1
> (2017-06-18) x86_64 GNU/Linux
> 
> * hwloc 1.11.5 has encountered what looks like an error from the
> operating system.
> *
> * L3 (cpuset 0x03f0) intersects with NUMANode (P#0 cpuset
> 0x003f) without inclusion!
> * Error occurred in topology.c line 1074
> *
> * The following FAQ entry in the hwloc documentation may help:
> *   What should I do when hwloc reports "operating system" warnings?
> * Otherwise please report this error message to the hwloc user's
> mailing list,
> * along with the output+tarball generated by the hwloc-gather-topology
> script.
> 
>
> As suggested by the error message, here is the hwloc-gather-topology
> attached.
>
> Please let me know if you need more information.
>
> Julio Figueroa
> Oceanographer
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3

2017-06-09 Thread Brice Goglin
Thanks a lot for the input.
I opened https://github.com/open-mpi/hwloc/issues/243
I have access to a T5 but this will need investigation to actually find
where to get the info from.
Feel free to comment the issue if you find more. I am going to modify
Pg.pm to better understand where Caches come from.

Brice




Le 09/06/2017 09:11, Maureen Chew a écrit :
> Re: cache relationship… ah… so you’d need to parse both
> prtpicl(8) (to get sizes)  and something like pginfo(8) (perl script)
> to get
> relationship…..
>
> bash-4.3$ pginfo -v | more
> 0 (System [system]) CPUs: 0-511
> |-- 5 (Data_Pipe_to_memory [socket 0]) CPUs: 0-255
> |   |-- 4 (L3_Cache) CPUs: 0-31
> |   |   `-- 6 (CPU_PM_Active_Power_Domain) CPUs: 0-31
> |   |   |-- 3 (L2_Cache) CPUs: 0-15
> |   |   |   |-- 2 (Floating_Point_Unit [core 0]) CPUs: 0-7
> |   |   |   |   `-- 1 (Integer_Pipeline [core 0]) CPUs: 0-7
> |   |   |   `-- 8 (Floating_Point_Unit [core 1]) CPUs: 8-15
> |   |   |   `-- 7 (Integer_Pipeline [core 1]) CPUs: 8-15
> |   |   `-- 11 (L2_Cache) CPUs: 16-31
> |   |   |-- 10 (Floating_Point_Unit [core 2]) CPUs: 16-23
> |   |   |   `-- 9 (Integer_Pipeline [core 2]) CPUs: 16-23
> |   |   `-- 13 (Floating_Point_Unit [core 3]) CPUs: 24-31
> |   |   `-- 12 (Integer_Pipeline [core 3]) CPUs: 24-31
> |   |-- 17 (L3_Cache) CPUs: 32-63
> |   |   `-- 18 (CPU_PM_Active_Power_Domain) CPUs: 32-63
> |   |   |-- 16 (L2_Cache) CPUs: 32-47
> |   |   |   |-- 15 (Floating_Point_Unit [core 4]) CPUs: 32-39
> |   |   |   |   `-- 14 (Integer_Pipeline [core 4]) CPUs: 32-39
> |   |   |   `-- 20 (Floating_Point_Unit [core 5]) CPUs: 40-47
> |   |   |   `-- 19 (Integer_Pipeline [core 5]) CPUs: 40-47
> |   |   `-- 23 (L2_Cache) CPUs: 48-63
> |   |   |-- 22 (Floating_Point_Unit [core 6]) CPUs: 48-55
> |   |   |   `-- 21 (Integer_Pipeline [core 6]) CPUs: 48-55
> |   |   `-- 25 (Floating_Point_Unit [core 7]) CPUs: 56-63
> |   |   `-- 24 (Integer_Pipeline [core 7]) CPUs: 56-63
> |   |-- 29 (L3_Cache) CPUs: 64-95
> |   |   `-- 30 (CPU_PM_Active_Power_Domain) CPUs: 64-95
> |   |   |-- 28 (L2_Cache) CPUs: 64-79
> |   |   |   |-- 27 (Floating_Point_Unit [core 8]) CPUs: 64-71
> |   |   |   |   `-- 26 (Integer_Pipeline [core 8]) CPUs: 64-71
> |   |   |   `-- 32 (Floating_Point_Unit [core 9]) CPUs: 72-79
> |   |   |   `-- 31 (Integer_Pipeline [core 9]) CPUs: 72-79
> |   |   `-- 35 (L2_Cache) CPUs: 80-95
> |   |   |-- 34 (Floating_Point_Unit [core 10]) CPUs: 80-87
> |   |   |   `-- 33 (Integer_Pipeline [core 10]) CPUs: 80-87
> |   |   `-- 37 (Floating_Point_Unit [core 11]) CPUs: 88-95
> |   |   `-- 36 (Integer_Pipeline [core 11]) CPUs: 88-95
> |   |-- 41 (L3_Cache) CPUs: 96-127
> |   |   `-- 42 (CPU_PM_Active_Power_Domain) CPUs: 96-127
> |   |   |-- 40 (L2_Cache) CPUs: 96-111
> |   |   |   |-- 39 (Floating_Point_Unit [core 12]) CPUs: 96-103
> |   |   |   |   `-- 38 (Integer_Pipeline [core 12]) CPUs: 96-103
> |   |   |   `-- 44 (Floating_Point_Unit [core 13]) CPUs: 104-111
> |   |   |   `-- 43 (Integer_Pipeline [core 13]) CPUs: 104-111
> |   |   `-- 47 (L2_Cache) CPUs: 112-127
> |   |   |-- 46 (Floating_Point_Unit [core 14]) CPUs: 112-119
> |   |   |   `-- 45 (Integer_Pipeline [core 14]) CPUs: 112-119
> |   |   `-- 49 (Floating_Point_Unit [core 15]) CPUs: 120-127
> |   |   `-- 48 (Integer_Pipeline [core 15]) CPUs: 120-127
> |   |-- 53 (L3_Cache) CPUs: 128-159
> |   |   `-- 54 (CPU_PM_Active_Power_Domain) CPUs: 128-159
> |   |   |-- 52 (L2_Cache) CPUs: 128-143
> |   |   |   |-- 51 (Floating_Point_Unit [core 16]) CPUs: 128-135
> |   |   |   |   `-- 50 (Integer_Pipeline [core 16]) CPUs: 128-135
> |   |   |   `-- 56 (Floating_Point_Unit [core 17]) CPUs: 136-143
> |   |   |   `-- 55 (Integer_Pipeline [core 17]) CPUs: 136-143
> |   |   `-- 59 (L2_Cache) CPUs: 144-159
> |   |   |-- 58 (Floating_Point_Unit [core 18]) CPUs: 144-151
> |   |   |   `-- 57 (Integer_Pipeline [core 18]) CPUs: 144-151
> |   |   `-- 61 (Floating_Point_Unit [core 19]) CPUs: 152-159
> |   |   `-- 60 (Integer_Pipeline [core 19]) CPUs: 152-159
> |   |-- 65 (L3_Cache) CPUs: 160-191
> |   |   `-- 66 (CPU_PM_Active_Power_Domain) CPUs: 160-191
> |   |   |-- 64 (L2_Cache) CPUs: 160-175
> |   |   |   |-- 63 (Floating_Point_Unit [core 20]) CPUs: 160-167
> |   |   |   |   `-- 62 (Integer_Pipeline [core 20]) CPUs: 160-167
> |   |   |   `-- 68 (Floating_Point_Unit [core 21]) CPUs: 168-175
> |   |   |   `-- 67 (Integer_Pipeline [core 21]) CPUs: 168-175
> |   |   `-- 71 (L2_Cache) CPUs: 176-191
> |   |   |-- 70 (Floating_Point_Unit [core 22]) CPUs: 176-183
> |   |   |   `-- 69 (Integer_Pipeline [core 22]) CPUs: 176-183
> |   |   `-- 73 (Floating_Point_Unit 

Re: [hwloc-users] ? Finding cache & pci info on SPARC/Solaris 11.3

2017-06-08 Thread Brice Goglin


Le 08/06/2017 16:58, Samuel Thibault a écrit :
> Hello,
>
> Maureen Chew, on jeu. 08 juin 2017 10:51:56 -0400, wrote:
>> Should finding cache & pci info work?
> AFAWK, there is no user-available way to get cache information on
> Solaris, so it's not implemented in hwloc.

And even if prtpicl reports some information using the PICL API, I don't
think it says how caches are shared between cores.

> Concerning pci, you need libpciaccess to get PCI information.
>

And usually you need root access (I think it looks inside /devices/pci*
where files are root-only).

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] NetLoc subnets Problem

2017-02-22 Thread Brice Goglin
Cyril would know better but he's currently on vacation. I don't think
the required scotch release is officially available online yet.

I am not sure what problem you're trying to solve here. This feature
isn't really useful if you only have two nodes. It was rather designed
for process placement on large clusters (by the way, you'll need
something to build a communication pattern of your MPI application as a
communication matrix).

Regarding configure, once the right scotch release is installed, there are
no configure options if I remember correctly; you just need to point
your compiler to the scotch include and lib directories. Something like this
may help:
export C_INCLUDE_PATH=/opt/scotch-6/include
export LD_LIBRARY_PATH=/opt/scotch-6/lib
export LIBRARY_PATH=/opt/scotch-6/lib

Brice



Le 22/02/2017 13:38, Михаил Халилов a écrit :
> I tried to configure hwloc with scotch, but I still haven't success
> with that case. I read Chapter 18 in doxygen and chapters about Netloc
> and installation, but not found about anything about Scotch configure.
> So, Scotch installed in /opt/scotch-6/ folder, and I want to install
> hwloc with netloc in /opt/hwloc/ . Which options should I use to give
> ./configure script information about Scotch?
>
> Best regards,
> Mikhail
>
> 2017-02-20 11:50 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> Inside the tarball that you downloaded, there's a
> doc/doxygen-doc/hwloc-a4.pdf with chapter 18 about Netloc with Scotch.
> Beware that this code is still under development.
>
> Brice
>
>
>
>
> Le 19/02/2017 20:20, Михаил Халилов a écrit :
>> Okay, but what configure options for Scotch should I use? I
>> didn't found any information about it in docs and readme
>>
>>
>>
>> 2017-02-19 20:52 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
>> <mailto:brice.gog...@inria.fr>>:
>>
>> The only publicly-installed netloc API is currently specific
>> to the scotch partitioner for process placement. It takes a
>> network topology and a communication pattern between a set of
>> process and it generates a topology-aware placement for these
>> processes.
>> This API only gets installed if you have scotch installed
>> (and tell configure where it is). That's why you don't get
>> any netloc API installed for now.
>>
>> We initially exposed the entire graph that netloc uses
>> internally (it's still true in v0.5 but not anymore in hwloc
>> 2.0) but there wasn't a clear list of what users want to do
>> with it. We didn't want to expose a random API without much
>> user feedback first. There are many ways to expose a graph
>> API, it was too risky. So it's not publicly installed anymore.
>>
>> You can use internal headers such as private/netloc.h for now
>> (you'll see edges, nodes, etc) and we'll make things public
>> once we know what you and others would like to do.
>>
>> Brice
>>
>>
>>
>>
>> Le 19/02/2017 17:29, Михаил Халилов a écrit :
>>> Hi again!
>>>
>>> Can I ask you, how can I use netloc API for my C programs?
>>> I configured hwloc only with --prefix=/opt/hwloc option. So,
>>> there are no netloc header files in /opt/hwloc/include
>>> directory. Also, I didn't understand how to use
>>> netloc_draw.html, because I found it only in extracted
>>> tarball. May be i should configure netloc with some other
>>> options?
>>>
>>> Best regards,
>>> Mikhail Khalilov
>>>
>>>
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> <mailto:hwloc-users@lists.open-mpi.org>
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
>> ___ hwloc-users
>> mailing list hwloc-users@lists.open-mpi.org
>> <mailto:hwloc-users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
>>
>>
>> ___
>> 

Re: [hwloc-users] NetLoc subnets Problem

2017-02-20 Thread Brice Goglin
Inside the tarball that you downloaded, there's a
doc/doxygen-doc/hwloc-a4.pdf with chapter 18 about Netloc with Scotch.
Beware that this code is still under development.

Brice



Le 19/02/2017 20:20, Михаил Халилов a écrit :
> Okay, but what configure options for Scotch should I use? I didn't
> found any information about it in docs and readme
>
>
>
> 2017-02-19 20:52 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> The only publicly-installed netloc API is currently specific to
> the scotch partitioner for process placement. It takes a network
> topology and a communication pattern between a set of process and
> it generates a topology-aware placement for these processes.
> This API only gets installed if you have scotch installed (and
> tell configure where it is). That's why you don't get any netloc
> API installed for now.
>
> We initially exposed the entire graph that netloc uses internally
> (it's still true in v0.5 but not anymore in hwloc 2.0) but there
> wasn't a clear list of what users want to do with it. We didn't
> want to expose a random API without much user feedback first.
> There are many ways to expose a graph API, it was too risky. So
> it's not publicly installed anymore.
>
> You can use internal headers such as private/netloc.h for now
> (you'll see edges, nodes, etc) and we'll make things public once
> we know what you and others would like to do.
>
> Brice
>
>
>
>
> Le 19/02/2017 17:29, Михаил Халилов a écrit :
>> Hi again!
>>
>> Can I ask you, how can I use netloc API for my C programs?
>> I configured hwloc only with --prefix=/opt/hwloc option. So,
>> there are no netloc header files in /opt/hwloc/include directory.
>> Also, I didn't understand how to use netloc_draw.html, because I
>> found it only in extracted tarball. May be i should configure
>> netloc with some other options?
>>
>> Best regards,
>> Mikhail Khalilov
>>
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> <mailto:hwloc-users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
> ___ hwloc-users
> mailing list hwloc-users@lists.open-mpi.org
> <mailto:hwloc-users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users> 
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] NetLoc subnets Problem

2017-02-19 Thread Brice Goglin
The only publicly-installed netloc API is currently specific to the
scotch partitioner for process placement. It takes a network topology
and a communication pattern between a set of processes and it generates a
topology-aware placement for these processes.
This API only gets installed if you have scotch installed (and tell
configure where it is). That's why you don't get any netloc API
installed for now.

We initially exposed the entire graph that netloc uses internally (it's
still true in v0.5 but not anymore in hwloc 2.0) but there wasn't a
clear list of what users want to do with it. We didn't want to expose a
random API without much user feedback first. There are many ways to
expose a graph API, it was too risky. So it's not publicly installed
anymore.

You can use internal headers such as private/netloc.h for now (you'll
see edges, nodes, etc) and we'll make things public once we know what
you and others would like to do.

Brice



Le 19/02/2017 17:29, Михаил Халилов a écrit :
> Hi again!
>
> Can I ask you, how can I use netloc API for my C programs?
> I configured hwloc only with --prefix=/opt/hwloc option. So, there are
> no netloc header files in /opt/hwloc/include directory. Also, I didn't
> understand how to use netloc_draw.html, because I found it only in
> extracted tarball. May be i should configure netloc with some other
> options?
>
> Best regards,
> Mikhail Khalilov
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
Please run "hwloc-gather-topology --io foo" on the head node 'Use the
gather-topology script that comes with the hwloc nightly snapshot).

The output tarball will likely be large, so feel free to send it to me
directly.

My guess is that your very old kernel may miss some files that we need
in the hwloc development snapshot (the I/O discovery changed
significantly in hwloc 2.0).

Brice




Le 17/02/2017 10:26, Михаил Халилов a écrit :
> I ran ibstat on head node it gives information in attach.
>
> 2017-02-17 12:16 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> For some reason, lstopo didn't find any InfiniBand information on
> the head node. I guess running lstopo won't show any "mlx4_0" or
> "ib0" object. Is the InfiniBand service really running on that
> machine?
>
> Brice
>
>
>
>
>
> Le 17/02/2017 10:04, Михаил Халилов a écrit :
>> All files in attach. I run netloc_ib_gather_raw with this
>> parameters netloc_ib_gather_raw /home/halilov/mycluster-data/
>>     --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --verbose --sudo
>>
>> 2017-02-17 11:55 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
>> <mailto:brice.gog...@inria.fr>>:
>>
>> Please copy-paste the exact command line of your
>> "netloc_ib_gather_raw" and all the messages it printed. And
>> also send the output of the hwloc directory it created (it
>> will contain the lstopo XML output of the node where you ran
>> the command).
>>
>> Brice
>>
>>
>>
>> Le 17/02/2017 09:51, Михаил Халилов a écrit :
>>> I installed nightly tarball, but it still isn't working. In
>>> attach info of ibnetdiscover and ibroute. May be it wlii help...
>>> What could be the problem?
>>>
>>> Best regards,
>>> Mikhail Khalilov
>>>
>>> 2017-02-17 9:53 GMT+03:00 Brice Goglin
>>> <brice.gog...@inria.fr <mailto:brice.gog...@inria.fr>>:
>>>
>>> Hello
>>>
>>> As indicated on the netloc webpages, the netloc
>>> development now occurs
>>> inside the hwloc git tree. netloc v0.5 is obsolete even
>>> if hwloc 2.0
>>> isn't released yet.
>>>
>>> If you want to use a development snapshot, take hwloc
>>> nightly tarballs
>>> from https://ci.inria.fr/hwloc/job/master-0-tarball/
>>> <https://ci.inria.fr/hwloc/job/master-0-tarball/> or
>>> https://www.open-mpi.org/software/hwloc/nightly/master/
>>> <https://www.open-mpi.org/software/hwloc/nightly/master/>
>>>
>>> Regards
>>> Brice
>>>
>>>
>>>
>>>
>>>
>>> Le 16/02/2017 19:15, miharuli...@gmail.com
>>> <mailto:miharuli...@gmail.com> a écrit :
>>> > I downloaded gunzip from openmpi site here:
>>> https://www.open-mpi.org/software/netloc/v0.5/
>>> >
>>> > There are three identical machines in my cluster, but
>>> now third node is broken, and i tried on two machines.
>>> They all connected by InfiniBand switch, and when I try
>>> to use ibnetdiscovery or ibroute, it works perfectly...
>>> >
>>> >
>>> >
>>> > Отправлено с iPad
>>> >> 16 февр. 2017 г., в 18:40, Cyril Bordage
>>> <cyril.bord...@inria.fr <mailto:cyril.bord...@inria.fr>>
>>> написал(а):
>>> >>
>>> >> Hi,
>>> >>
>>> >> What version did you use?
>>> >>
>>> >> I pushed some commits on master on ompi repository.
>>> With this version it
>>> >> seems to work.
>>> >> You have two machines because you tried netloc on
>>> these two?
>>> >>
>>> >>
>>> >> Cyril.
>>> >>
>>> >>> Le 15/02/2017 à 22:44, miharulidze a écrit :

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
For some reason, lstopo didn't find any InfiniBand information on the
head node. I guess running lstopo won't show any "mlx4_0" or "ib0"
object. Is the InfiniBand service really running on that machine?

Brice




Le 17/02/2017 10:04, Михаил Халилов a écrit :
> All files in attach. I run netloc_ib_gather_raw with this parameters
> netloc_ib_gather_raw /home/halilov/mycluster-data/
> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --verbose --sudo
>
> 2017-02-17 11:55 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> Please copy-paste the exact command line of your
> "netloc_ib_gather_raw" and all the messages it printed. And also
> send the output of the hwloc directory it created (it will contain
> the lstopo XML output of the node where you ran the command).
>
> Brice
>
>
>
> Le 17/02/2017 09:51, Михаил Халилов a écrit :
>> I installed nightly tarball, but it still isn't working. In
>> attach info of ibnetdiscover and ibroute. May be it wlii help...
>> What could be the problem?
>>
>> Best regards,
>> Mikhail Khalilov
>>
>> 2017-02-17 9:53 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
>> <mailto:brice.gog...@inria.fr>>:
>>
>> Hello
>>
>> As indicated on the netloc webpages, the netloc development
>> now occurs
>> inside the hwloc git tree. netloc v0.5 is obsolete even if
>> hwloc 2.0
>> isn't released yet.
>>
>> If you want to use a development snapshot, take hwloc nightly
>> tarballs
>> from https://ci.inria.fr/hwloc/job/master-0-tarball/
>> <https://ci.inria.fr/hwloc/job/master-0-tarball/> or
>> https://www.open-mpi.org/software/hwloc/nightly/master/
>> <https://www.open-mpi.org/software/hwloc/nightly/master/>
>>
>> Regards
>> Brice
>>
>>
>>
>>
>>
>> Le 16/02/2017 19:15, miharuli...@gmail.com
>> <mailto:miharuli...@gmail.com> a écrit :
>> > I downloaded gunzip from openmpi site here:
>> https://www.open-mpi.org/software/netloc/v0.5/
>> >
>> > There are three identical machines in my cluster, but now
>> third node is broken, and i tried on two machines. They all
>> connected by InfiniBand switch, and when I try to use
>> ibnetdiscovery or ibroute, it works perfectly...
>> >
>> >
>> >
>> > Отправлено с iPad
>> >> 16 февр. 2017 г., в 18:40, Cyril Bordage
>> <cyril.bord...@inria.fr <mailto:cyril.bord...@inria.fr>>
>> написал(а):
>> >>
>> >> Hi,
>> >>
>> >> What version did you use?
>> >>
>> >> I pushed some commits on master on ompi repository. With
>> this version it
>> >> seems to work.
>> >> You have two machines because you tried netloc on these two?
>> >>
>> >>
>> >> Cyril.
>> >>
>> >>> Le 15/02/2017 à 22:44, miharulidze a écrit :
>> >>> Hi!
>> >>>
>> >>> I'm trying to use NetLoc tool for detecting my cluster
>> topology.
>> >>>
>> >>> I have 2 node cluster with AMD Processors, connected by
>> InfiniBand. Also
>> >>> I installed latest versions of hwloc and netloc tools.
>> >>>
>> >>> I followed the instruction of netloc and when I tried to use
>> >>> netloc_ib_gather_raw as root, i recieved this message
>> >>> root:$ netloc_ib_gather_raw
>> >>> --out-dir=/home/halilov/mycluster-data/result/
>> >>> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --sudo
>> >>>
>> >>> Found 0 subnets in hwloc directory:
>> >>>
>> >>>
>> >>> There are two files in
>> /home/halilov/mycluster-data/hwloc/ generated by
>> >>> hwloc: head.xml and node01.xml
>> >>>
>> >>> P.S. in attach archieve with .xml files
>> >>>
>> >>>

Re: [hwloc-users] NetLoc subnets Problem

2017-02-17 Thread Brice Goglin
Please copy-paste the exact command line of your "netloc_ib_gather_raw"
and all the messages it printed. And also send the output of the hwloc
directory it created (it will contain the lstopo XML output of the node
where you ran the command).

Brice


Le 17/02/2017 09:51, Михаил Халилов a écrit :
> I installed nightly tarball, but it still isn't working. In attach
> info of ibnetdiscover and ibroute. May be it wlii help...
> What could be the problem?
>
> Best regards,
> Mikhail Khalilov
>
> 2017-02-17 9:53 GMT+03:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> Hello
>
> As indicated on the netloc webpages, the netloc development now
> occurs
> inside the hwloc git tree. netloc v0.5 is obsolete even if hwloc 2.0
> isn't released yet.
>
> If you want to use a development snapshot, take hwloc nightly tarballs
> from https://ci.inria.fr/hwloc/job/master-0-tarball/
> <https://ci.inria.fr/hwloc/job/master-0-tarball/> or
> https://www.open-mpi.org/software/hwloc/nightly/master/
> <https://www.open-mpi.org/software/hwloc/nightly/master/>
>
> Regards
> Brice
>
>
>
>
>
> Le 16/02/2017 19:15, miharuli...@gmail.com
> <mailto:miharuli...@gmail.com> a écrit :
> > I downloaded gunzip from openmpi site here:
> https://www.open-mpi.org/software/netloc/v0.5/
> <https://www.open-mpi.org/software/netloc/v0.5/>
> >
> > There are three identical machines in my cluster, but now third
> node is broken, and i tried on two machines. They all connected by
> InfiniBand switch, and when I try to use ibnetdiscovery or
> ibroute, it works perfectly...
> >
> >
> >
> > Отправлено с iPad
> >> 16 февр. 2017 г., в 18:40, Cyril Bordage
> <cyril.bord...@inria.fr <mailto:cyril.bord...@inria.fr>> написал(а):
> >>
> >> Hi,
> >>
> >> What version did you use?
> >>
> >> I pushed some commits on master on ompi repository. With this
> version it
> >> seems to work.
> >> You have two machines because you tried netloc on these two?
> >>
> >>
> >> Cyril.
> >>
> >>> Le 15/02/2017 à 22:44, miharulidze a écrit :
> >>> Hi!
> >>>
> >>> I'm trying to use NetLoc tool for detecting my cluster topology.
> >>>
> >>> I have 2 node cluster with AMD Processors, connected by
> InfiniBand. Also
> >>> I installed latest versions of hwloc and netloc tools.
> >>>
> >>> I followed the instruction of netloc and when I tried to use
> >>> netloc_ib_gather_raw as root, i recieved this message
> >>> root:$ netloc_ib_gather_raw
> >>> --out-dir=/home/halilov/mycluster-data/result/
> >>> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --sudo
> >>>
> >>> Found 0 subnets in hwloc directory:
> >>>
> >>>
> >>> There are two files in /home/halilov/mycluster-data/hwloc/
> generated by
> >>> hwloc: head.xml and node01.xml
> >>>
> >>> P.S. in attach archieve with .xml files
> >>>
> >>>
> >>> Best regards,
> >>> Mikhail Khalilov
> >>>
> >>>
> >>>
> >>> ___
> >>> hwloc-users mailing list
> >>> hwloc-users@lists.open-mpi.org
> <mailto:hwloc-users@lists.open-mpi.org>
> >>>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
> >> ___
> >> hwloc-users mailing list
> >> hwloc-users@lists.open-mpi.org
> <mailto:hwloc-users@lists.open-mpi.org>
> >>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
> > ___
> > hwloc-users mailing list
> > hwloc-users@lists.open-mpi.org
> <mailto:hwloc-users@lists.open-mpi.org>
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org <mailto:hwloc-users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
>
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] NetLoc subnets Problem

2017-02-16 Thread Brice Goglin
Hello

As indicated on the netloc webpages, the netloc development now occurs
inside the hwloc git tree. netloc v0.5 is obsolete even if hwloc 2.0
isn't released yet.

If you want to use a development snapshot, take hwloc nightly tarballs
from https://ci.inria.fr/hwloc/job/master-0-tarball/ or
https://www.open-mpi.org/software/hwloc/nightly/master/

Regards
Brice





Le 16/02/2017 19:15, miharuli...@gmail.com a écrit :
> I downloaded gunzip from openmpi site here: 
> https://www.open-mpi.org/software/netloc/v0.5/
>
> There are three identical machines in my cluster, but now third node is 
> broken, and i tried on two machines. They all connected by InfiniBand switch, 
> and when I try to use ibnetdiscovery or ibroute, it works perfectly...
>
>
>
> Отправлено с iPad
>> 16 февр. 2017 г., в 18:40, Cyril Bordage  написал(а):
>>
>> Hi,
>>
>> What version did you use?
>>
>> I pushed some commits on master on ompi repository. With this version it
>> seems to work.
>> You have two machines because you tried netloc on these two?
>>
>>
>> Cyril.
>>
>>> Le 15/02/2017 à 22:44, miharulidze a écrit :
>>> Hi!
>>>
>>> I'm trying to use NetLoc tool for detecting my cluster topology.
>>>
>>> I have 2 node cluster with AMD Processors, connected by InfiniBand. Also
>>> I installed latest versions of hwloc and netloc tools.
>>>
>>> I followed the instruction of netloc and when I tried to use
>>> netloc_ib_gather_raw as root, i recieved this message
>>> root:$ netloc_ib_gather_raw
>>> --out-dir=/home/halilov/mycluster-data/result/
>>> --hwloc-dir=/home/halilov/mycluster-data/hwloc/ --sudo
>>>
>>> Found 0 subnets in hwloc directory:
>>>
>>>
>>> There are two files in /home/halilov/mycluster-data/hwloc/ generated by
>>> hwloc: head.xml and node01.xml
>>>
>>> P.S. in attach archieve with .xml files
>>>
>>>
>>> Best regards,
>>> Mikhail Khalilov
>>>
>>>
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-02-03 Thread Brice Goglin
Le 03/02/2017 23:01, James Elliott a écrit :
> On 2/3/17, Brice Goglin <brice.gog...@inria.fr> wrote:
>> What do you mean with shaded? Red or green? Red means unavailable.
>> Requires --whole-system everywhere. Green means that's where the
>> process is bound. But XML doesn't store the information about where
>> the process is bound, so you may only get Green in 2). 
> This is exactly what I am attempting to do (and finding it does not work).
> I would like to have a figure with green shadings so that I have a
> visual representation of where my MPI process lived on the machine.

Try this:

lstopo --whole-system --no-io -f hwloc-${rank}.xml
for pu in $(hwloc-calc --whole-system -H PU --sep " " $(hwloc-bind --get)); do
  hwloc-annotate hwloc-${rank}.xml hwloc-${rank}.xml $pu info lstopoStyle Background=#00ff00
done
...


How it works:
* hwloc-bind --get retrieves the current binding as a bitmask
* hwloc-calc converts this bitmask into a space-separated list of PU
indexes (there are other possible outputs if needed, such as cores, or
the largest object included in the binding, etc)
* the for loop iterates on these objects and hwloc-annotate adds an
attribute lstopoStyle Background=#00ff00 to each of them
* lstopo will use this attribute to change the background color of these
PU boxes in the graphical output

Make sure you have hwloc >= 1.11.1 for this to work.

Brice



>
> I currently have a function (in C) that I use in my codes that
> inspects affinities, but when I discuss app performance with others, I
> would like to be able to show (graphically) exactly how their app uses
> the resources.  I work mostly with hybrid MPI/OpenMP codes, developed
> by smart scientists who are not familiar with things like affinity.
>
>>> To test without MPI, you would just need to set a processes affinity
>>> and then use its PID instead.
>>>
>>> What I see, is that the XML generated in (1) is identical for all MPI
>>> processes, even though they have different PIDs and different CPUSETS.
>> Are you talking about different MPI runs, or different MPI ranks within
>> the same run?
>>
>> My feeling is that you think you should be seeing different cpusets for
>> each process, but they actually have the same cpuset but different
>> bindings. Cores outside the cpuset are red when --whole-system, or
>> totally ignored otherwise.
>>
>> In (2), you don't have --whole-system, no red cores. But you have --pid,
>> so you get one green core per process, it's its binding. That's why you
>> get different images for each process.
>> in (3), you inherit the missing --whole-system from (1) through XML, no
>> red cores either. But XML doesn't save the process binding, no green
>> cores either. Same image for each process.
>>
>>
>> Do you care about process binding (what mpirun applies to each rank?) or
>> about cpusets (what the batch scheduler applies to the entire job before
>> mpirun?)
>>
>> If cpuset, just add --whole-system everywhere, it should be enough.
>> If binding, there's no direct way with lstopo (but we have a way to save
>> custom colors for individual objects in the XML).
>>
>> Brice
>>
>>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-02-03 Thread Brice Goglin
Le 03/02/2017 21:57, James Elliott a écrit :
> Brice,
>  
> Thanks for you comments.  I have worked with this some, but this is
> not working.
>
> My goal is to generate images of the cpusets inuse when I run a
> parallel code using mpirun, aprun, srun, etc...  The compute nodes
> lack the mojo necessary to generate graphical formats, so I can only
> extract XML on the nodes.
>
> I am testing this locally on a 2 NUMA, dual socket workstation with 14
> cores per socket (so, 28 total cores).  I can use OpenMPI to easily
> spawn/bind processes.
>
> E.g.,
> mpirun --map-by ppr:2:NUMA:pe=7 ./hwloc_plot_mpi.sh
>
>
> hwloc_plot_mpi.sh is very simple:
>
> #!/bin/bash
>
> pid="$$"
>
> rank=${OMPI_COMM_WORLD_RANK}
>
> lstopo --pid ${pid} --no-io -f hwloc-${rank}.xml
>
> lstopo --pid ${pid} --no-io --append-legend "Rank: ${rank}" -f hwloc-${rank}-orig.png
>
> lstopo --append-legend "Rank: ${rank}" --whole-system --input hwloc-${rank}.xml -f hwloc-${rank}.png
>
>
>
> To test things,
> 1) write the XML
> 2) use the same command to write a PNG
> 3) use the generated XML to generate the PNG

Hello

You're missing --whole-system in 1) and 2)

Also --pid isn't very useful because you're basically looking at the
current process, and that's the default. The only difference is that the
process binding is reported in green when using --pid. Does it matter?
See below.

>
> (2) and (3) should produce the same image if I am doing things correctly.
>
> The image for (2) is unique for each process, showing 7 *different*
> cores shaded in each figure (4 images are generated since I spawn 4
> processes)
> The images from (3) are all identical (no shading)

What do you mean with shaded? Red or green?

Red means unavailable. Requires --whole-system everywhere.

Green means that's where the process is bound. But XML doesn't store the
information about where the process is bound, so you may only get Green
in 2).

>
> To test without MPI, you would just need to set a processes affinity
> and then use its PID instead.
>
> What I see, is that the XML generated in (1) is identical for all MPI
> processes, even though they have different PIDs and different CPUSETS.

Are you talking about different MPI runs, or different MPI ranks within
the same run?

My feeling is that you think you should be seeing different cpusets for
each process, but they actually have the same cpuset but different
bindings. Cores outside the cpuset are red when --whole-system, or
totally ignored otherwise.

In (2), you don't have --whole-system, no red cores. But you have --pid,
so you get one green core per process, it's its binding. That's why you
get different images for each process.
in (3), you inherit the missing --whole-system from (1) through XML, no
red cores either. But XML doesn't save the process binding, no green
cores either. Same image for each process.


Do you care about process binding (what mpirun applies to each rank?) or
about cpusets (what the batch scheduler applies to the entire job before
mpirun?)

If cpuset, just add --whole-system everywhere, it should be enough.
If binding, there's no direct way with lstopo (but we have a way to save
custom colors for individual objects in the XML).

Brice

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] CPUSET shading using xml output of lstopo

2017-01-31 Thread Brice Goglin
The shading/highlighting information is included in the "cpuset" and
"allowed_cpuset" fields inside the XML (even when not using --pid).

By default, only what's "available" is displayed. If you want
"disallowed" things to appear (in different colors), add --whole-system
when drawing (in the second command-line).

Brice



Le 01/02/2017 06:56, James a écrit :
> Thanks Brice,
>
> I believe I am rebuilding it as you say, but I can retry tomorrow at
> my desk.
> I looked in the XML and can see the taskset data, but since I cannot
> do --pid ###, it seems to not shade/highlight the tasksets.
>
> I'll drop the args that are redundant and try the exact form you list.
>
> James
>
> On 1/31/2017 10:52 PM, Brice Goglin wrote:
>> Le 01/02/2017 00:19, James Elliott a écrit :
>>> Hi,
>>>
>>> I seem to be stuck. What I would like to do, is us lstopo to generate
>>> files that I can plot on another system (the nodes lack the necessary
>>> libraries for graphical output).
>>>
>>> That is, I would like to see something like
>>> lstopo --only core --pid ${pid} --taskset --no-io --no-bridges
>>> --append-legend "PID: ${pid}" -f hwloc-${pid}.png
>>>
>>> But I need to output to XML instead, and plot on another machine, e.g.
>>>
>>> lstopo --only core --pid ${pid} --taskset --no-io --no-bridges
>>> --append-legend "PID: ${pid}" -f hwloc-${pid}.png
>>> ...
>>> Then on another machine,
>>> lstopo --input hwloc-.xml output.png
>>>
>>> Where, the --pid shading of cpusets is produced in the output.png.
>>> This does not seem to work. I am fairly new to lstopo, is it possible
>>> to achieve this functionality? (I would also like to preserve the
>>> append-legend  stuff, but I could work out a way to do that on the
>>> other host.)
>> Hello
>>
>> My guess is that you would need to export to XML like this:
>> lstopo --pid ${pid} --no-io -f foo.xml
>>
>> and reload/draw on the other host like this:
>> lstopo --input foo.xml --only-core --taskset --append-legend "PID:
>> ${pid}" -f output.png
>>
>> Random comments:
>> * --no-bridges is implied by --no-io
>> * --only and --taskset only apply to the textual output, while you seem
>> to want graphical output as png
>> * --append-legend only applies to the graphical output
>>
>> Brice
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] Building hwloc on Cray with /opt/cray/craype/2.5.4/bin/cc

2017-01-05 Thread Brice Goglin
Does this even build with the cray compiler?

int main(int argc, char *argv[])
{
  return __builtin_ffsl((unsigned long) argc);
}




Le 05/01/2017 14:56, Xavier LACOSTE a écrit :
>
> Indeed,
>
>  
>
> If I had a #undef __GNUC__ in misc.h the compilation finished (I still
> have the link complaining about recompiling with –fPIE and linking
> with –pie, but I should be able to handle that)
>
> I tried all available cray cc (2.2.1 and 2.5.6) and they behave the same.
>
> I’ll see how to report bug to Cray and may ask for a new compiler
> installation.
>
>  
>
> XL.
>
>  
>
> De : Brice Goglin [mailto:brice.gog...@inria.fr]
> Envoyé : jeudi 5 janvier 2017 14:39
> À : Xavier LACOSTE
> Cc : Hardware locality user list
> Objet : Re: [hwloc-users] Building hwloc on Cray with
> /opt/cray/craype/2.5.4/bin/cc
>
>  
>
> Ah ok now I remember we've seen the same issue 4 years ago.
> https://mail-archive.com/hwloc-users@lists.open-mpi.org/msg00816.html
> Basically the Cray compiler claimed to be GCC compatible but it was not.
> But that user explicitly requested the cray compiler to behave as GNU
> with "-h gnu". Did you pass any option to the compiler?
>
> We could workaround the issue by not using __builtin_ffsl() when we
> detect the cray compiler, but that would be overkill since earlier
> versions worked fine. Do you have ways to report bugs to Cray? Or
> install a different cray compiler version?
>
> Brice
>
>
> Le 05/01/2017 14:25, Xavier LACOSTE a écrit :
>
> It seems that the __GNU__ is defined so I don’t get into the
> HWLOC_HAVE_FFSL section.
>
>  
>
> Maybe because I already configured once with gcc before ? Do I
> have to do anything more than make clean an reconfigure to change
> compiler ?
>
>  
>
> De : Brice Goglin [mailto:brice.gog...@inria.fr]
> Envoyé : jeudi 5 janvier 2017 14:18
> À : Xavier LACOSTE
> Cc : Hardware locality user list
> Objet : Re: [hwloc-users] Building hwloc on Cray with
> /opt/cray/craype/2.5.4/bin/cc
>
>  
>
> configure seems to have detected ffsl() properly, but it looks
> like our ffsl() redefinition gets enabled anyway, and conflicts
> with the system-wide one.
>
> Does it help if you comment out line #66 of include/private/misc.h ?
> extern int ffsl(long) __hwloc_attribute_const;
>
> Brice
>
>
>
> Le 05/01/2017 13:52, Xavier LACOSTE a écrit :
>
> Hello Brice,
>
>  
>
> I attached the files.
>
>  
>
> I could build hwloc and link with it using the default (gcc)
> compiler.
>
> But If I use gcc to build the library I can link with
> gcc/icc/pgcc but not with cc from cray.
>
>  
>
> Thanks,
>
>  
>
> XL.
>
>  
>
>  
>
> De : Brice Goglin [mailto:brice.gog...@inria.fr]
> Envoyé : jeudi 5 janvier 2017 12:50
> À : Xavier LACOSTE
> Cc : Hardware locality user list
> Objet : Re: [hwloc-users] Building hwloc on Cray with
> /opt/cray/craype/2.5.4/bin/cc
>
>  
>
> Hello Xavier
> Can you send the /usr/include/string.h from that cray machine,
> your config.log and include/private/autogen/config.h from the
> hwloc build directory?
> Do you get the same error if building from the default
> compiler instead of /opt/cray/craype/2.5.4/bin/cc?
> thanks
> Brice
>
>
>
> Le 05/01/2017 12:31, Xavier LACOSTE a écrit :
>
>  
>
> Hello,
>
>  
>
> I’m trying to build hwloc on a cray machine with cray
> compiler :/opt/cray/craype/2.5.4/bin/cc
>
>  
>
> I get the following error :
>
> $> CC=cc ./configure --prefix=$PWD-cc-install
>
> $> make
>
> Making all in src
>
> make[1]: Entering directory
> `/home/j0306818/xavier/hwloc-1.11.5/src'
>
>   CC   bitmap.lo
>
> CC-147 craycc: ERROR
>
>   Declaration is incompatible with "int ffsl(long)"
> (declared at line 526 of
>
>   "/usr/include/string.h").
>
>  
>
>  
>
> Total errors detected in bitmap.c: 1
>
> make[1]: *** [bitmap.lo] Error 1
>
> make[1]: Leaving directory
>   

Re: [hwloc-users] Reporting an operating system warning

2017-01-03 Thread Brice Goglin
Thanks

Surprisingly, I don't see any L1i in the XML output either. Did you get
warnings when running "HWLOC_COMPONENTS=x86 lstopo foo.xml"?

Indeed, you (very likely) don't care about that warning in the AMD SDK.
Pass HWLOC_HIDE_ERRORS=1 in the environment to silence it.

Brice


Le 03/01/2017 07:59, Johannes Goller a écrit :
> Hi Brice,
>
> thank you very much for looking into this!
>
> I am attaching the generated foo.xml.
>
> I actually came across this error message when trying to play with
> OpenCL, using the AMD SDK API. My main interest is in getting that to
> work on the GPU, and that might still work on my current kernel (4.8),
> even if I get warnings like the one reported.
>
>
> johannes.
>
> 2017-01-03 15:15 GMT+09:00 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>:
>
> Hello Johannes
>
> I think there are two bugs here.
>
> First one is that each "dual-core compute unit" is reported as a
> single core with two hardware threads. That's a kernel bug that
> appeared in 4.6. There's a fix at
> https://lkml.org/lkml/2016/11/29/852
> <https://lkml.org/lkml/2016/11/29/852> but I don't think it has
> been applied yet.
>
> The second bug is a conflict between dual-core compute unit
> sharing and L1i. I am not sure which one is actually buggy. Can
> you run "HWLOC_COMPONENTS=x86 lstopo foo.xml" and send the
> generated foo.xml? (this is our raw detection that works around
> the kernel detection).
>
> Trying a Linux kernel <= 4.5 may help in the meantime.
>
> thanks
> Brice
>
>
>
>
> Le 03/01/2017 05:29, Johannes Goller a écrit :
>> As requested on
>> https://www.open-mpi.org/projects/hwloc/doc/v1.10.1/a00028.php
>> <https://www.open-mpi.org/projects/hwloc/doc/v1.10.1/a00028.php>
>> ("What should I do when hwloc reports 'operating system'
>> warnings?"), I am reporting the warning/error I received as follows
>>
>> 
>> 
>> * hwloc 1.11.0 has encountered what looks like an error from the
>> operating system.
>> *
>> * L1i (cpuset 0x0003) intersects with Core (P#0 cpuset
>> 0x0081) without inclusion!
>> * Error occurred in topology.c line 983
>> *
>> * The following FAQ entry in the hwloc documentation may help:
>> *   What should I do when hwloc reports "operating system" warnings?
>> * Otherwise please report this error message to the hwloc user's
>> mailing list,
>> * along with the output+tarball generated by the
>> hwloc-gather-topology script.
>> 
>> 
>>
>> Please find the tarball attached.
>>
>>
>>
>> regards,
>>
>> Johannes Goller.
>>
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-users@lists.open-mpi.org
>> <mailto:hwloc-users@lists.open-mpi.org>
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
>> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users>
> ___ hwloc-users
> mailing list hwloc-users@lists.open-mpi.org
> <mailto:hwloc-users@lists.open-mpi.org>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
> <https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users> 
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users
___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] Reporting an operating system warning

2017-01-02 Thread Brice Goglin
Hello Johannes

I think there are two bugs here.

First one is that each "dual-core compute unit" is reported as a single
core with two hardware threads. That's a kernel bug that appeared in
4.6. There's a fix at https://lkml.org/lkml/2016/11/29/852 but I don't
think it has been applied yet.

The second bug is a conflict between dual-core compute unit sharing and
L1i. I am not sure which one is actually buggy. Can you run
"HWLOC_COMPONENTS=x86 lstopo foo.xml" and send the generated foo.xml?
(this is our raw detection that works around the kernel detection).

Trying a Linux kernel <= 4.5 may help in the meantime.

thanks
Brice



Le 03/01/2017 05:29, Johannes Goller a écrit :
> As requested on
> https://www.open-mpi.org/projects/hwloc/doc/v1.10.1/a00028.php ("What
> should I do when hwloc reports 'operating system' warnings?"), I am
> reporting the warning/error I received as follows
>
> 
> * hwloc 1.11.0 has encountered what looks like an error from the
> operating system.
> *
> * L1i (cpuset 0x0003) intersects with Core (P#0 cpuset 0x0081)
> without inclusion!
> * Error occurred in topology.c line 983
> *
> * The following FAQ entry in the hwloc documentation may help:
> *   What should I do when hwloc reports "operating system" warnings?
> * Otherwise please report this error message to the hwloc user's
> mailing list,
> * along with the output+tarball generated by the hwloc-gather-topology
> script.
> 
>
> Please find the tarball attached.
>
>
>
> regards,
>
> Johannes Goller.
>
>
>
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

Re: [hwloc-users] memory binding on Knights Landing

2016-09-08 Thread Brice Goglin
Hello
It's not a feature. This should work fine.
Random guess: do you have NUMA headers on your build machine? (package
libnuma-dev or numactl-devel)
(hwloc-info --support also reports whether membinding is supported or not)
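
If it helps, the same support bits can be checked programmatically. Here is a
minimal sketch (assuming a default hwloc build; the file name is arbitrary,
build with "cc check_membind.c -lhwloc"):

/* sketch: print hwloc's memory-binding support bits,
 * the same information "hwloc-info --support" reports */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    const struct hwloc_topology_support *support;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);
    support = hwloc_topology_get_support(topology);

    printf("set_thisthread_membind: %d\n", support->membind->set_thisthread_membind);
    printf("set_area_membind:       %d\n", support->membind->set_area_membind);
    printf("alloc_membind:          %d\n", support->membind->alloc_membind);
    printf("bind policy:            %d\n", support->membind->bind_membind);

    hwloc_topology_destroy(topology);
    return 0;
}

If set_area_membind and alloc_membind are 0, the library was most likely built
without the NUMA headers mentioned above.
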
Brice



Le 08/09/2016 16:34, Dave Love a écrit :
> I'm somewhat confused by binding on Knights Landing -- which is probably
> a feature.
>
> I'm looking at a KNL box configured as "Cluster Mode: SNC4 Memory Mode:
> Cache" with hwloc 1.11.4; I've read the KNL hwloc FAQ entries.  I ran
> openmpi and it reported failure to bind memory (but binding to cores was
> OK).  So I tried hwloc-bind --membind and that seems to fail no matter
> what I do, reporting
>
>   hwloc_set_membind 0x0002 (policy 2 flags 0) failed (errno 38 Function 
> not implemented)
>
> Is that expected, and is there a recommendation on how to do binding in
> that configuration with things that use hwloc?  I'm particularly
> interested in OMPI, but I guess this is a better place to ask.  Thanks.
> ___
> hwloc-users mailing list
> hwloc-users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users

___
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] Topology Error

2016-05-09 Thread Brice Goglin
Le 09/05/2016 23:58, Mehmet Belgin a écrit :
> Greetings!
>
> We've been receiving this error for a while on our 64-core Interlagos
> AMD machines:
>
> 
>
> * hwloc has encountered what looks like an error from the operating
> system.
> *
> * Socket (P#2 cpuset 0x,0x0) intersects with NUMANode (P#3
> cpuset 0xff00,0xff00) without inclusion!
> * Error occurred in topology.c line 940
> *
> * Please report this error message to the hwloc user's mailing list,
> * along with the output+tarball generated by the hwloc-gather-topology
> script.
> 
>
>
> I've found some information in the hwloc list archives mentioning this
> is due to a buggy AMD platform and that the impact should be limited to
> hwloc missing L3 cache info (thanks Brice). If that's the case and the
> processor representation is correct, then I am sure we can live with
> this, but I still wanted to check with the list to confirm (1) that this
> is really harmless and (2) whether there are any known solutions other
> than upgrading the BIOS/kernel.

Hello

The L3 bug only applies to 12-core Opteron 62xx/63xx, while you have
16-core Opterons. Your L3 locality is correct, but your NUMA locality is
wrong:
$ cat sys/devices/system/node/node*/cpumap 
,00ff
ff00,ff00
00ff,
,
You should have something like this instead:
,
,
,
,

This bug is not harmless since memory buffers have a good chance of
being physically allocated far away from your cores.

This is more likely a BIOS bug. Try upgrading.
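
If you want to cross-check after a firmware update, here is a minimal sketch
(not taken from your report; it assumes hwloc 1.x headers on the node, the file
name is arbitrary, built with "cc list_numa.c -lhwloc") that prints each NUMA
node's cpuset as hwloc sees it, for comparison with the sysfs cpumaps above:

/* sketch: list NUMA nodes and their cpusets as seen by hwloc (1.x API) */
#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    int i, nb;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    nb = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE);
    for (i = 0; i < nb; i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, i);
        char *str;
        hwloc_bitmap_asprintf(&str, node->cpuset);
        printf("NUMANode P#%u cpuset %s\n", node->os_index, str);
        free(str);
    }

    hwloc_topology_destroy(topology);
    return 0;
}

With a correct BIOS, the four nodes should report disjoint cpusets of 16 cores
each, covering all 64 cores.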

Regards
Brice



Re: [hwloc-users] hwloc_alloc_membind with HWLOC_MEMBIND_BYNODESET

2016-05-09 Thread Brice Goglin
Hello Hugo,

Can you send your code and a description of the machine so that I can try to
reproduce it?

By the way, BYNODESET is also available in 1.11.3.
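
For reference, here is a minimal sketch of the intended usage (assuming a
machine with at least one NUMA node; this is not your code, just what I would
expect to work on >= 1.11.3 or current git):

/* sketch: allocate memory bound to NUMA node 0 through a nodeset
 * (HWLOC_MEMBIND_BYNODESET requires hwloc >= 1.11.3) */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t node;
    void *buf;
    size_t len = 1 << 20;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, 0);
    if (!node) {
        fprintf(stderr, "no NUMA node in the topology\n");
        return 1;
    }

    /* with BYNODESET, the given bitmap is interpreted as a nodeset */
    buf = hwloc_alloc_membind(topology, len, node->nodeset,
                              HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_BYNODESET);
    if (!buf)
        perror("hwloc_alloc_membind");
    else
        hwloc_free(topology, buf, len);

    hwloc_topology_destroy(topology);
    return 0;
}

If this minimal version also returns NULL with EINVAL on your machine, that
would already narrow things down.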

Brice



Le 09/05/2016 16:18, Hugo Brunie a écrit :
> Hello,
>
> When I try to use hwloc_alloc_membind with HWLOC_MEMBIND_BYNODESET,
> I obtain NULL as a pointer and the error message is: Invalid Argument.
>
> If I try without it, it works. It also works with HWLOC_MEMBIND_STRICT
> and/or HWLOC_MEMBIND_THREAD.
> My hwloc version is :
> ~/usr/bin/hwloc-bind --version
> hwloc-bind 2.0.0a1-git
>
> Best regards,
>
> Hugo BRUNIE
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
> Link to this post: 
> http://www.open-mpi.org/community/lists/hwloc-users/2016/05/1269.php



Re: [hwloc-users] HWLOC_get_membind: problem in getting right(specific) NODESET where data is allocated

2016-04-24 Thread Brice Goglin
Please find out which line is actually causing the segfault.
Run your program under gdb. Once it crashes, type "bt full" and report
the output here.
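
In the meantime, here is a sketch of the kind of defensive checks that usually
catch this (the helper and variable names below are placeholders, not taken
from your program): check the return value of hwloc_get_area_membind_nodeset()
before using the nodeset, and check hwloc_get_numanode_obj_by_os_index() for
NULL before dereferencing it.

/* sketch (hwloc 1.x API, placeholder names): query the binding of a memory
 * area and print the NUMA nodes it is bound to, with error checks */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <hwloc.h>

static void print_area_binding(hwloc_topology_t topology, const void *addr, size_t len)
{
    hwloc_nodeset_t nodeset = hwloc_bitmap_alloc();
    hwloc_membind_policy_t policy;
    unsigned i;

    if (hwloc_get_area_membind_nodeset(topology, addr, len, nodeset,
                                       &policy, HWLOC_MEMBIND_THREAD) < 0) {
        fprintf(stderr, "get_area_membind failed: %s\n", strerror(errno));
        hwloc_bitmap_free(nodeset);
        return; /* do not walk the nodeset if the call failed */
    }

    hwloc_bitmap_foreach_begin(i, nodeset) {
        hwloc_obj_t node = hwloc_get_numanode_obj_by_os_index(topology, i);
        if (node) /* may be NULL, so check before dereferencing */
            printf("node #%u: %llu bytes of memory\n", node->os_index,
                   (unsigned long long) node->memory.local_memory);
    } hwloc_bitmap_foreach_end();

    hwloc_bitmap_free(nodeset);
}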

By the way, what kind of machine are you using? (lstopo + uname -a)

Brice



Le 24/04/2016 23:46, Rezaul Karim Raju a écrit :
> Hi Brice,
>
> Thank you very much for your prompt care. 
>
> I am retrieving as below:
>
> nodeset_c = hwloc_bitmap_alloc();
>
> */* Find Location of a: 3rd QUARTER */*
> err = *hwloc_get_area_membind_nodeset(*topology, *array+ size/2,
> size/4,* nodeset_c, , HWLOC_MEMBIND_THREAD ); 
>
> /* print the corresponding NUMA nodes */
> hwloc_bitmap_asprintf(&s, nodeset_c);
> printf("Address:= %p  Variable:=  bound
> to*nodeset %s with contains:*\n", (array+size/2), s);
> free(s);
> hwloc_bitmap_foreach_begin(hw_i, nodeset_c) {
> *obj_c = hwloc_get_numanode_obj_by_os_index(topology, hw_i);*
> *printf("[3rd Q]  node #%u (OS index %u) with %lld bytes of memory\n",
> obj_c->logical_index, hw_i, (unsigned long long)
> obj_c->memory.local_memory)*;
> } hwloc_bitmap_foreach_end();
> hwloc_bitmap_free(nodeset_c);
>
> *It prints as below:*
>
> *error no:= -1 and segmentation fault*
> *my array size is = 262144 {data type long} and each Quarter = size/4 = 65536*
> Address of array:= 0x7f350e515000, tmp:= 0x7f34fe515000, tst_array:=
> 0x7f34ee515000
> Address of array:= 0x7f350e515000, array+size/4:= 0x7f352e515000,
> array+size/2:= 0x7f354e515000, array+3*size/4:= 0x7f356e515000
>
> Address:= 0x7f350e515000  Variable:=  bound
> to nodeset 0x0001 with contains:
>  [1st Q]  node #0 (OS index 0) with 8387047424 bytes of memory
> Address:= 0x7f352e515000  Variable:=  bound to
> nodeset 0x0004 with contains:
> [2nd Q]  node #2 (OS index 2) with 8471621632 bytes of memory
>
> In the case of the [3rd Q], an error occurred (error no:= -1) and a
> segmentation fault happened.
>
> Thanks.!
>
>
> On Sun, Apr 24, 2016 at 4:08 PM, Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>> wrote:
>
> Hello,
> What do you mean with " it can not bind the specified memory
> section (addr, len) to the desired NUMA node"?
> Did it fail? If so, what does errno contain?
> If it didn't fail, what did it do instead?
> thanks
> Brice
>
>
>
>
> Le 24/04/2016 23:02, Rezaul Karim Raju a écrit :
>> Hi ...
>>
>> I was trying to bind each quarter of an array to 4 different NUMA
>> nodes, and doing as below: 
>>
>> *//ALLOCATION *
>> *obj_a = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 0);*
>>
>> *array =* hwloc_alloc_membind_nodeset( topology, size,
>> obj_a->nodeset, HWLOC_MEMBIND_BIND, 1);
>> *tmp *= hwloc_alloc_membind_nodeset( topology, size,
>> obj_a->nodeset, HWLOC_MEMBIND_BIND, 1); 
>> *
>> *
>> *// DISTRIBUTED BINDING  [my system has 8 NUMA nodes (0-7)]*
>> printf("Address of array:= %p, array+size/4:= %p, array+size/2:=
>> %p, array+3*size/4:= %p \n", array, array+size/4, array+size/2,
>> array+3*size/4);
>> // bind 1st quarter to node (n-1)
>> hwloc_set_area_membind_nodeset(topology, (array), size/4,
>> obj_a->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> hwloc_set_area_membind_nodeset(topology, (tmp), size/4,
>> obj_a->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> // bind 2nd quarter to node (2)
>> *obj_b = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE,  2);*
>> hwloc_set_area_membind_nodeset(topology, (array+size/4), size/4,
>> obj_b->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> hwloc_set_area_membind_nodeset(topology, (tmp +size/4), size/4,
>> obj_b->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>>
>> // bind 3rd quarter to node (4)
>> *obj_c = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 4);*
>> hwloc_set_area_membind_nodeset(topology, array+size/2, size/4,
>> obj_c->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> hwloc_set_area_membind_nodeset(topology, tmp+size/2, size/4,
>> obj_c->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> // bind 4th quarter to node (6)
>> *obj_d = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 6);*
>> hwloc_set_area_membind_nodeset(topology, array+3*size/4, size/4,
>> obj_d->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_MIGRATE);
>> hwloc_set_area_membind_nodeset(topology, tmp+3*size/4, size/4,
>> obj_d->nodeset, HWLOC_MEMBIND_BIND, HWLOC_ME
