Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Max R. Dechantsreiter
Correction: I have an E5-2620 v4, which is 8-core Broadwell.
Please excuse my earlier error.


On Wed, Aug 02, 2023 at 01:23:18PM +, Max R. Dechantsreiter wrote:
> Hi Brice,
> 
> Well, the VPS gives me a 4-core slice of an Intel(R) Xeon(R)
> CPU E5-2620 node, which is Sandy Bridge EP, with 6 physical
> cores, so probably 12 cores on the node.  The numbering does
> seem wacky: it seems to describe a node with two 8-core CPUs.
> 
> This is the VPS on which I host my Web site; I use its shell
> account for sundry testing, mostly of build procedures.
> 
> Is there anything I could do to get hwloc to work?
> 
> Regards,
> 
> Max
> ---
> 
> 
> On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote:
> > Hello
> > 
> > There's something wrong with this machine. It exposes 4 cores (numbered 0 to 3)
> > and no NUMA node, but says the only allowed resources are cores 8-15,24-31
> > and NUMA node 1. That's why hwloc says the topology is empty (running lstopo
> > --disallowed shows NUMA 0 and cores 0-3 in red, which means they aren't
> > allowed). How did this get configured so badly?
> > 
> > Brice
> > 
> > 
> > 
> > Le 02/08/2023 à 14:54, Max R. Dechantsreiter a écrit :
> > > Hello,
> > > 
> > > On my VPS I tested my build of hwloc-2.9.2 by running lstopo:
> > > 
> > > ./lstopo
> > > hwloc: Topology became empty, aborting!
> > > Segmentation fault
> > > 
> > > On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:
> > > 
> > > ./lstopo
> > > hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
> > > Machine (7430MB total)
> > >   Package L#0
> > >     NUMANode L#0 (P#0 7430MB)
> > >     L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
> > >       PU L#0 (P#0)
> > >       PU L#1 (P#1)
> > >   HostBridge
> > >     PCI 00:03.0 (Other)
> > >       Block(Disk) "sda"
> > >     PCI 00:04.0 (Ethernet)
> > >       Net "ens4"
> > >     PCI 00:05.0 (Other)
> > > 
> > > (from which I conclude my build procedure is correct).
> > > 
> > > At the suggestion of Brice Goglin (in response to my post of the same
> > > issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
> > > then I also ran
> > > 
> > > hwloc-gather-topology hwloc-gather-topology
> > > 
> > > The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
> > > as I was unable to recognize the underlying problem, although I believe it
> > > could be a system issue, since my builds of Open MPI on the VPS used to work
> > > before a new OS image was installed.
> > > 
> > > Max
> 
> 
> 


Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Max R. Dechantsreiter
Hi Brice,

Setting HWLOC_ALLOW=all made hwloc usable on my oddly configured VPS:

./lstopo
Machine (4096MB total) + Package L#0
  NUMANode L#0 (P#0 4096MB)
  L3 L#0 (20MB)
    L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
    L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
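
For anyone else hitting this, a minimal sketch of the workaround for a single run,
and of checking the cgroup cpuset values Brice mentioned (the path below assumes a
cgroup v1 cpuset mount; on cgroup v2 the files live elsewhere under /sys/fs/cgroup/):

# run lstopo while ignoring the broken cgroup restrictions (Bourne-style shell)
HWLOC_ALLOW=all ./lstopo

# inspect what the cgroup claims is allowed
cat /sys/fs/cgroup/cpuset/cpuset.cpus
cat /sys/fs/cgroup/cpuset/cpuset.mems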

As I mentioned, I use this system only for testing builds, so this
"workaround" should enable me to build a working OpenMPI that I can
use as a component in applications.
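
When running the resulting Open MPI build, the variable can be exported for the
whole session or forwarded explicitly; a sketch (the program name is just a
placeholder):

export HWLOC_ALLOW=all
mpirun -np 4 ./my_app                        # inherits the exported variable
mpirun -x HWLOC_ALLOW=all -np 4 ./my_app     # or forward it with Open MPI's -x option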

Thank you!

Max
---


On Wed, Aug 02, 2023 at 03:38:46PM +0200, Brice Goglin wrote:
> The cgroup information under /sys/fs/cgroup/ should be fixed. cpuset.cpus
> should contain 0-3 and cpuset.mems should contain 0. In the meantime, you can
> make hwloc ignore this cgroup info by setting HWLOC_ALLOW=all in the environment.
> 
> The x86 CPUID information is also wrong on this machine. All 4 cores report
> the same "APIC id" (sort of hardware core ID), I guess all your 4 cores are
> virtualized over a single hardware core and the hypervisor doesn't care
> about emulating topology information correctly.
> 
> Brice
> 
> 
> 
> Le 02/08/2023 à 15:23, Max R. Dechantsreiter a écrit :
> > Hi Brice,
> > 
> > Well, the VPS gives me a 4-core slice of an Intel(R) Xeon(R)
> > CPU E5-2620 node, which is Sandy Bridge EP, with 6 physical
> > cores, so probably 12 cores on the node.  The numbering does
> > seem wacky: it seems to describe a node with two 8-core CPUs.
> > 
> > This is the VPS on which I host my Web site; I use its shell
> > account for sundry testing, mostly of build procedures.
> > 
> > Is there anything I could do to get hwloc to work?
> > 
> > Regards,
> > 
> > Max
> > ---
> > 
> > 
> > On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote:
> > > Hello
> > > 
> > > There's something wrong with this machine. It exposes 4 cores (numbered 0 to 3)
> > > and no NUMA node, but says the only allowed resources are cores 8-15,24-31
> > > and NUMA node 1. That's why hwloc says the topology is empty (running lstopo
> > > --disallowed shows NUMA 0 and cores 0-3 in red, which means they aren't
> > > allowed). How did this get configured so badly?
> > > 
> > > Brice
> > > 
> > > 
> > > 
> > > Le 02/08/2023 à 14:54, Max R. Dechantsreiter a écrit :
> > > > Hello,
> > > > 
> > > > On my VPS I tested my build of hwloc-2.9.2 by running lstopo:
> > > > 
> > > > ./lstopo
> > > > hwloc: Topology became empty, aborting!
> > > > Segmentation fault
> > > > 
> > > > On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:
> > > > 
> > > > ./lstopo
> > > > hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
> > > > Machine (7430MB total)
> > > >   Package L#0
> > > >     NUMANode L#0 (P#0 7430MB)
> > > >     L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
> > > >       PU L#0 (P#0)
> > > >       PU L#1 (P#1)
> > > >   HostBridge
> > > >     PCI 00:03.0 (Other)
> > > >       Block(Disk) "sda"
> > > >     PCI 00:04.0 (Ethernet)
> > > >       Net "ens4"
> > > >     PCI 00:05.0 (Other)
> > > > 
> > > > (from which I conclude my build procedure is correct).
> > > > 
> > > > At the suggestion of Brice Goglin (in response to my post of the same
> > > > issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
> > > > then I also ran
> > > > 
> > > > hwloc-gather-topology hwloc-gather-topology
> > > > 
> > > > The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
> > > > as I was unable to recognize the underlying problem, although I believe it
> > > > could be a system issue, since my builds of Open MPI on the VPS used to work
> > > > before a new OS image was installed.
> > > > 
> > > > Max
> > 
> > 





[hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Max R. Dechantsreiter
Hello,

On my VPS I tested my build of hwloc-2.9.2 by running lstopo:

./lstopo
hwloc: Topology became empty, aborting!
Segmentation fault

On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:

./lstopo
hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
Machine (7430MB total)
  Package L#0
    NUMANode L#0 (P#0 7430MB)
    L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#1)
  HostBridge
    PCI 00:03.0 (Other)
      Block(Disk) "sda"
    PCI 00:04.0 (Ethernet)
      Net "ens4"
    PCI 00:05.0 (Other)

(from which I conclude my build procedure is correct).

At the suggestion of Brice Goglin (in response to my post of the same
issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
then I also ran

hwloc-gather-topology hwloc-gather-topology

The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
as I was unable to recognize the underlying problem, although I believe it
could be a system issue, since my builds of Open MPI on the VPS used to work
before a new OS image was installed.
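
For reference, the '--enable-debug' rebuild followed the standard hwloc autotools
flow; roughly (the install prefix here is arbitrary):

./configure --enable-debug --prefix=$HOME/opt/hwloc-2.9.2-debug
make
make install
$HOME/opt/hwloc-2.9.2-debug/bin/lstopo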

Max


lstopo.tar.gz
Description: application/gzip


hwloc-gather-topology.tar.bz2
Description: Binary data

Re: [hwloc-users] hwloc: Topology became empty, aborting!

2023-08-02 Thread Max R. Dechantsreiter
Hi Brice,

Well, the VPS gives me a 4-core slice of an Intel(R) Xeon(R)
CPU E5-2620 node, which is Sandy Bridge EP, with 6 physical
cores, so probably 12 cores on the node.  The numbering does
seem wacky: it seems to describe a node with two 8-core CPUs.

This is the VPS on which I host my Web site; I use its shell
account for sundry testing, mostly of build procedures.

Is there anything I could do to get hwloc to work?

Regards,

Max
---


On Wed, Aug 02, 2023 at 03:12:27PM +0200, Brice Goglin wrote:
> Hello
> 
> There's something wrong with this machine. It exposes 4 cores (numbered 0 to 3)
> and no NUMA node, but says the only allowed resources are cores 8-15,24-31
> and NUMA node 1. That's why hwloc says the topology is empty (running lstopo
> --disallowed shows NUMA 0 and cores 0-3 in red, which means they aren't
> allowed). How did this get configured so badly?
> 
> Brice
> 
> 
> 
> Le 02/08/2023 à 14:54, Max R. Dechantsreiter a écrit :
> > Hello,
> > 
> > On my VPS I tested my build of hwloc-2.9.2 by running lstopo:
> > 
> > ./lstopo
> > hwloc: Topology became empty, aborting!
> > Segmentation fault
> > 
> > On a GCP n1-standard-2 a similar build (GCC 12.2 vs. 13.2) seemed to work:
> > 
> > ./lstopo
> > hwloc/nvml: Failed to initialize with nvmlInit(): Driver Not Loaded
> > Machine (7430MB total)
> >   Package L#0
> >     NUMANode L#0 (P#0 7430MB)
> >     L3 L#0 (45MB) + L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
> >       PU L#0 (P#0)
> >       PU L#1 (P#1)
> >   HostBridge
> >     PCI 00:03.0 (Other)
> >       Block(Disk) "sda"
> >     PCI 00:04.0 (Ethernet)
> >       Net "ens4"
> >     PCI 00:05.0 (Other)
> > 
> > (from which I conclude my build procedure is correct).
> > 
> > At the suggestion of Brice Goglin (in response to my post of the same
> > issue to Open MPI Users), I rebuilt with '--enable-debug' and ran lstopo;
> > then I also ran
> > 
> > hwloc-gather-topology hwloc-gather-topology
> > 
> > The resulting lstopo.tar.gz and hwloc-gather-topology.tar.gz are attached,
> > as I was unable to recognize the underlying problem, although I believe it
> > could be a system issue, since my builds of Open MPI on the VPS used to work
> > before a new OS image was installed.
> > 
> > Max


