It seems to me the FAQ item
http://www.open-mpi.org/faq/?category=large-clusters#fd-limits needs
updating. I'm willing to give this a try, but need some help first.
(I'm even more willing to let someone else do all this, but I'm not
holding my breath.)
For example, the text sounds dated --
I am playing with those aspects right now (it's planned for hwloc v1.4).
hwloc (even the 1.2 currently in OMPI) can already support topologies
containing different machines, but there's no easy/automatic way to
aggregate multiple machine topologies into a single global one. The
important thing to unde
This might get interesting. In "portable hardware locality" (hwloc) as
originating at the native cpuset, and I see "locality" working at the
machine level (machines in my world can have up to 8 CPUs, for example).
But from an ompi world view, the execution graph across myriad machines
might dicta
On Aug 29, 2011, at 10:08 AM, nadia.der...@bull.net wrote:
> devel-boun...@open-mpi.org wrote on 08/29/2011 05:57:59 PM:
>
> > From: Ralph Castain
> > To: Open MPI Developers
> > Date: 08/29/2011 05:58 PM
> > Subject: Re: [OMPI devel] known limitation or bug in hwloc?
> > Sent by: devel
devel-boun...@open-mpi.org wrote on 08/29/2011 05:57:59 PM:
> From: Ralph Castain
> To: Open MPI Developers
> Date: 08/29/2011 05:58 PM
> Subject: Re: [OMPI devel] known limitation or bug in hwloc?
> Sent by: devel-boun...@open-mpi.org
>
> On Aug 29, 2011, at 8:35 AM, nadia.der...@bull.net w
I guess the question might be over the integration point. If we are setting
CPU_MAX at the opal/paffinity level, how does hwloc pick up that value? Or does
hwloc just set its own bitmask sizes and loop limits, and your integration
handles any disagreement?
On Aug 29, 2011, at 10:04 AM, Jeff Squ
Or, if there's a specific problem in hwloc (i.e., hwloc proper -- not the
component in OMPI), post to hwloc-de...@open-mpi.org.
I *think* that hwloc handles CPU sets of any size. I bumped the version of
hwloc to 1.2.1 (the latest stable release) in both the trunk and v1.5. v1.4
doesn't have h
On Aug 29, 2011, at 8:35 AM, nadia.der...@bull.net wrote:
>
> devel-boun...@open-mpi.org wrote on 08/29/2011 04:20:30 PM:
>
> > From: Ralph Castain
> > To: Open MPI Developers
> > Date: 08/29/2011 04:26 PM
> > Subject: Re: [OMPI devel] known limitation or bug in hwloc?
> > Sent by: dev
devel-boun...@open-mpi.org wrote on 08/29/2011 04:20:30 PM:
> From: Ralph Castain
> To: Open MPI Developers
> Date: 08/29/2011 04:26 PM
> Subject: Re: [OMPI devel] known limitation or bug in hwloc?
> Sent by: devel-boun...@open-mpi.org
>
> Actually, I'll eat those words. I was looking at the
Actually, I'll eat those words. I was looking at the wrong place.
Yes, that is a bug in hwloc. It needs to loop over CPU_MAX for those cases
where the bit mask extends over multiple words.
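To illustrate the multi-word case mentioned above, here is a minimal sketch of a CPU mask stored as an array of words, with a loop that walks every supported CPU index rather than only the bits of the first word. All names here (demo_cpu_set_t, DEMO_CPU_MAX, and so on) are hypothetical and chosen for illustration; this is not the actual OMPI paffinity or hwloc code, and the 1024 limit is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names and sizes; not the real OMPI or hwloc definitions. */
#define DEMO_CPU_MAX   1024
#define DEMO_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NUM_WORDS ((DEMO_CPU_MAX + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

/* A CPU mask spanning several machine words. */
typedef struct {
    unsigned long words[DEMO_NUM_WORDS];
} demo_cpu_set_t;

/* Clear every bit in the mask. */
static void demo_zero(demo_cpu_set_t *set)
{
    memset(set, 0, sizeof(*set));
}

/* Mark one CPU as present: pick the word, then the bit inside it. */
static void demo_set(demo_cpu_set_t *set, size_t cpu)
{
    assert(cpu < DEMO_CPU_MAX);
    set->words[cpu / DEMO_WORD_BITS] |= 1UL << (cpu % DEMO_WORD_BITS);
}

/* Return 1 if the CPU is present in the mask, 0 otherwise. */
static int demo_isset(const demo_cpu_set_t *set, size_t cpu)
{
    assert(cpu < DEMO_CPU_MAX);
    return (int)((set->words[cpu / DEMO_WORD_BITS] >> (cpu % DEMO_WORD_BITS)) & 1UL);
}

/* Count the CPUs in the mask by looping over the full CPU range,
 * so bits stored in the second and later words are not skipped. */
static size_t demo_count(const demo_cpu_set_t *set)
{
    size_t n = 0;
    for (size_t cpu = 0; cpu < DEMO_CPU_MAX; ++cpu) {
        if (demo_isset(set, cpu)) {
            ++n;
        }
    }
    return n;
}
```

A loop bounded by the width of a single word would stop before CPU numbers such as 64 or 200 on a platform with 64-bit words; bounding it by the total CPU count, as demo_count() does, covers every word of the mask.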
On Aug 29, 2011, at 7:16 AM, Ralph Castain wrote:
> Actually, if you look closely at the definition of th
Is your interconnect Gigabit Ethernet? I'm very surprised to see that the TCP
BTL only reached a 33 MBytes/s peak bandwidth on your cluster. I did a similar
test on an AMD cluster with Gigabit Ethernet. As the following shows, the TCP
BTL's bandwidth is similar to your TIPC result (112 MBytes/s). Could you redo
the test with 2
proce
Actually, if you look closely at the definition of those two values, you'll see
that it really doesn't matter which one we loop over. The NUM_BITS value
defines the actual total number of bits in the mask. The CPU_MAX is the total
number of cpus we can support, which was set to a value such that
Nadia,
Interesting. I haven't tried pushing this to levels above 8 on a particular
machine. Do you think that the cpuset / paffinity / hwloc only applies at
the machine level, at which time you need to employ a graph with carto?
Regards,
Ken
-----Original Message-----
From: devel-boun...@open-m
Hi list,
I'm hitting a limitation with paffinity/hwloc with cpu numbers >= 64.
In opal/mca/paffinity/hwloc/paffinity_hwloc_module.c, module_set() is
the routine that sets the calling process's affinity to the mask given as a
parameter. Note that "mask" is an opal_paffinity_base_cpu_set_t (so we
allow
On 08/25/2011 03:14 PM, Jeff Squyres wrote:
On Aug 25, 2011, at 8:25 AM, Xin He wrote:
Can you edit your configure.m4 directly and test it and whatnot? I provided
the configure.m4 as a starting point for you. :-) It shouldn't be hard to
make it check linux/tipc.h instead of tipc.h. I'm