Creating nightly hwloc snapshot SVN tarball was a success.
Snapshot: hwloc 1.3.2rc2r4285
Start time: Thu Feb 9 21:04:15 EST 2012
End time: Thu Feb 9 21:07:14 EST 2012
Your friendly daemon,
Cyrador
Creating nightly hwloc snapshot SVN tarball was a success.
Snapshot: hwloc 1.5a1r4286
Start time: Thu Feb 9 21:01:01 EST 2012
End time: Thu Feb 9 21:04:14 EST 2012
Your friendly daemon,
Cyrador
That's pretty much what I had in mind too - will have to play with it a bit
until we find the best solution, but it shouldn't be all that hard.
On Feb 9, 2012, at 2:23 PM, Brice Goglin wrote:
> Here's what I would do:
> During init, walk the list of hwloc PCI devices (hwloc_get_next_pcidev())
On 2/9/2012 2:26 PM, Paul H. Hargrove wrote:
We then test if *either* set the variable.
Sort of a double-negative.
One of De Morgan's Laws:
NOT (A AND B) = (NOT A) OR (NOT B)
Applied to give:
NOT (TEST1_FAIL AND TEST2_FAIL)
= (NOT TEST1_FAIL) OR (NOT TEST2_FAIL)
=
On Feb 9, 2012, at 2:27 PM, Paul H. Hargrove wrote:
> What you have for the "Make sure..." is wrong in the same way as the one that
> was in rc1.
> The problem is that the AC_COMPILE_IFELSE code tests too-few and too-many
> args together.
> Since xlc makes too many an error by default, we don't
On 2/9/2012 1:19 PM, Brice Goglin wrote:
So you can find out that you are "bound" by a Linux cgroup (I am not
saying Linux "cpuset" to avoid confusion) by comparing root->cpuset and
root->online_cpuset.
If I understood the problem as stated earlier in this thread, the current
code was
Here's what I would do:
During init, walk the list of hwloc PCI devices
(hwloc_get_next_pcidev()) and keep an array of pointers to the
interesting ones + their locality (the hwloc cpuset of the parent
non-IO object).
When you want the I/O device near a core, walk the array and find one
whose
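The approach Brice sketches above could look roughly like this with hwloc's C API (a hedged sketch: the struct and function names, the fixed-capacity array, and the omitted error handling are all assumptions, not the eventual OMPI code):

```c
#include <hwloc.h>

/* Cache each interesting PCI device together with the cpuset of its
 * first non-I/O ancestor (its "locality"). Assumes a topology loaded
 * with I/O discovery enabled (hwloc >= 1.3). */
struct pcidev_cache {
    hwloc_obj_t dev;          /* the PCI device object */
    hwloc_cpuset_t locality;  /* cpuset of the parent non-I/O object */
};

static int build_cache(hwloc_topology_t topo,
                       struct pcidev_cache *cache, int max)
{
    int n = 0;
    hwloc_obj_t dev = NULL;
    while ((dev = hwloc_get_next_pcidev(topo, dev)) != NULL && n < max) {
        hwloc_obj_t anc = hwloc_get_non_io_ancestor_obj(topo, dev);
        if (anc == NULL || anc->cpuset == NULL)
            continue;           /* skip devices with no usable locality */
        cache[n].dev = dev;
        cache[n].locality = anc->cpuset;
        n++;
    }
    return n;
}

/* Later, given a core's cpuset, find a cached device near it. */
static hwloc_obj_t find_near(struct pcidev_cache *cache, int n,
                             hwloc_const_cpuset_t core_set)
{
    for (int i = 0; i < n; i++)
        if (hwloc_bitmap_intersects(cache[i].locality, core_set))
            return cache[i].dev;
    return NULL;
}
```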
On 09/02/2012 14:00, Ralph Castain wrote:
> There is another aspect, though - I had missed it in the thread, but the
> question Nadia was addressing is: how to tell I am bound? The way we
> currently do it is to compare our cpuset against the local cpuset - if we are
> on a subset, then we
Hmmm… guess we'll have to play with it. Our need is to start with a core or
some similar object, and quickly determine the closest IO device of a certain
type. We wound up having to write "summarizer" code to parse the hwloc tree
into a more OMPI-usable form, so we can always do that with the
That doesn't really work with the hwloc model unfortunately. Also, when
you get to smaller objects (cores, threads, ...) there are multiple
"closest" objects at each depth.
We have one "closest" object at some depth (usually Machine or NUMA
node). If you need something higher, you just walk the
Jeff,
What you have for the "Make sure..." is wrong in the same way as the one
that was in rc1.
The problem is that the AC_COMPILE_IFELSE code tests too-few and
too-many args together.
Since xlc makes too many an error by default, we don't notice its
MISbehavior when given too few.
So, one
On 2/9/2012 4:48 AM, Jeff Squyres wrote:
On Feb 8, 2012, at 6:02 PM, Paul H. Hargrove wrote:
The file config/hwloc_check_vendor.m4, present in trunk, is ABSENT in
the 1.3.2rc1 tarball.
There is, correspondingly, no call to _HWLOC_C_COMPILER_VENDOR in hwloc.m4.
Correct -- we hadn't
Nadia --
I committed the fix in the trunk to use HWLOC_WHOLE_SYSTEM and IO_DEVICES.
Do you want to revise your patch to use hwloc APIs with opal_hwloc_topology
(instead of paffinity)? We could use that as a basis for the other places you
identified that are doing similar things.
On Feb 9,
Thanks! I added the patch to the trunk and submitted it for the 1.6 update.
On Feb 8, 2012, at 10:20 PM, Y.MATSUMOTO wrote:
> Dear All,
>
> Next feedback is "MPI_Comm_spawn_multiple".
>
> When a function called from MPI_Comm_spawn_multiple fails,
> a segmentation fault occurs.
> In that
Ah, okay - in that case, having the I/O device attached to the "closest" object
at each depth would be ideal from an OMPI perspective.
On Feb 9, 2012, at 6:30 AM, Brice Goglin wrote:
> The bios usually tells you which numa location is close to each host-to-pci
> bridge. So the answer is yes.
>
The bios usually tells you which numa location is close to each host-to-pci
bridge. So the answer is yes.
Brice
Ralph Castain wrote:
I'm not sure I understand this comment. A PCI device is attached to the node,
not to any specific location within the node, isn't it? Can
I'm not sure I understand this comment. A PCI device is attached to the node,
not to any specific location within the node, isn't it? Can you really say that
a PCI device is "attached" to a specific NUMA location, for example?
On Feb 9, 2012, at 6:15 AM, Jeff Squyres wrote:
> That doesn't
Yeah, I think that's the right solution. We'll have to check the impact on the
rest of the code, but I -think- it will be okay - else we'll have to make some
tweaks here and there. Either way, it's still the right answer, I think.
On Feb 9, 2012, at 6:14 AM, Jeff Squyres wrote:
> Should we
On Feb 9, 2012, at 8:06 AM, Brice Goglin wrote:
>> What if my cpuset is only on Socket P#0? What exactly will be reported
>> via (WHOLE_SUBSYSTEM | HWLOC_TOPOLOGY_FLAG_WHOLE_IO)?
>
> I actually fixed something related to this case in 1.3.2. The device will be
> attached to the root object in
Should we just do this, then:
Index: mca/hwloc/base/hwloc_base_util.c
===
--- mca/hwloc/base/hwloc_base_util.c(revision 25885)
+++ mca/hwloc/base/hwloc_base_util.c(working copy)
@@ -173,6 +173,9 @@
Jeff Squyres wrote:
>On Feb 9, 2012, at 7:50 AM, Chris Samuel wrote:
>
>>> Just so that I understand this better -- if a process is bound in a
>>> cpuset, will tools like hwloc's lstopo only show the Linux
>>> processors *in that cpuset*? I.e., does it not have any
>>>
How's this patch (against v1.3, assuming
https://svn.open-mpi.org/trac/hwloc/changeset/4285)?
Is the test that checks to see if compilers error when the wrong number of
params are passed now moot?
Index: config/hwloc.m4
===
---
Yes, I missed that point before - too early in the morning :-/
As I said in my last note, it would be nice to either have a flag indicating we
are bound, or see all the cpu info so we can compute that we are bound. Either
way, we still need to have a complete picture of all I/O devices so you
There is another aspect, though - I had missed it in the thread, but the
question Nadia was addressing is: how to tell I am bound? The way we currently
do it is to compare our cpuset against the local cpuset - if we are on a
subset, then we know we are bound.
So if all hwloc returns to us is
devel-boun...@open-mpi.org wrote on 02/09/2012 01:32:31 PM:
> From: Ralph Castain
> To: Open MPI Developers
> Date: 02/09/2012 01:32 PM
> Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see
> processes as bound if the job has been
On Feb 9, 2012, at 7:50 AM, Chris Samuel wrote:
>> Just so that I understand this better -- if a process is bound in a
>> cpuset, will tools like hwloc's lstopo only show the Linux
>> processors *in that cpuset*? I.e., does it not have any
>> visibility of the processors outside of its cpuset?
>
On Thursday 09 February 2012 22:18:20 Jeff Squyres wrote:
> Just so that I understand this better -- if a process is bound in a
> cpuset, will tools like hwloc's lstopo only show the Linux
> processors *in that cpuset*? I.e., does it not have any
> visibility of the processors outside of its
On Feb 8, 2012, at 4:44 PM, Brice Goglin wrote:
> Jeff, can you fix everything reported in this thread locally and prepare
> a new tarball for Paul?
Yes, just catching up on this thread now...
--
Jeff Squyres
jsquy...@cisco.com
On Feb 9, 2012, at 7:15 AM, nadia.der...@bull.net wrote:
> > By default, hwloc only shows what's inside the current cpuset. There's
> > an option to show everything instead (topology flag).
>
> So may be using that flag inside opal_paffinity_base_get_processor_info()
> would be a better fix
Hi Nadia
I'm wondering what value there is in showing the full topology, or using it in
any of our components, if the process is restricted to a specific set of cpus?
Does it really help to know that there are other cpus out there that are
unreachable?
On Feb 9, 2012, at 5:15 AM,
devel-boun...@open-mpi.org wrote on 02/09/2012 12:20:41 PM:
> From: Brice Goglin
> To: Open MPI Developers
> Date: 02/09/2012 12:20 PM
> Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see
> processes as bound if the job has been
devel-boun...@open-mpi.org wrote on 02/09/2012 12:18:20 PM:
> From: Jeff Squyres
> To: Open MPI Developers
> Date: 02/09/2012 12:18 PM
> Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see
> processes as bound if the job has been
By default, hwloc only shows what's inside the current cpuset. There's
an option to show everything instead (topology flag).
Brice
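The flag Brice mentions is requested before loading the topology; a minimal sketch (flag names per hwloc 1.3, error checks omitted):

```c
#include <hwloc.h>

/* Sketch: request the full-machine view plus I/O objects, even when
 * the process runs inside a restricted cpuset. */
static void load_whole_topology(hwloc_topology_t *topo)
{
    hwloc_topology_init(topo);
    hwloc_topology_set_flags(*topo, HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM
                                  | HWLOC_TOPOLOGY_FLAG_IO_DEVICES);
    hwloc_topology_load(*topo);
}
```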
On 09/02/2012 12:18, Jeff Squyres wrote:
> Just so that I understand this better -- if a process is bound in a cpuset,
> will tools like hwloc's lstopo only
Just so that I understand this better -- if a process is bound in a cpuset,
will tools like hwloc's lstopo only show the Linux processors *in that cpuset*?
I.e., does it not have any visibility of the processors outside of its cpuset?
On Jan 27, 2012, at 11:38 AM, nadia.derbey wrote:
> Hi,
>
Dear All,
Next feedback is "MPI_Comm_spawn_multiple".
When a function called from MPI_Comm_spawn_multiple fails,
a segmentation fault occurs.
In that condition, "newcomp" is set to NULL.
But a member of "newcomp" is dereferenced at the following place.
(ompi/mpi/c/comm_spawn_multiple.c)
176 /* set array of