[OMPI devel] descriptor limits -- FAQ item

2011-08-29 Thread Eugene Loh
It seems to me the FAQ item http://www.open-mpi.org/faq/?category=large-clusters#fd-limits needs updating. I'm willing to give this a try, but need some help first. (I'm even more willing to let someone else do all this, but I'm not holding my breath.) For example, the text sounds dated --

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Brice Goglin
I am playing with those aspects right now (it's planned for hwloc v1.4). hwloc (even the 1.2 currently in OMPI) can already support topologies containing different machines, but there's no easy/automatic way to aggregate multiple machine topologies into a single global one. The important thing to unde

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Kenneth Lloyd
This might get interesting. I think of "portable hardware locality" (hwloc) as originating at the native cpuset, and I see "locality" working at the machine level (machines in my world can have up to 8 CPUs, for example). But from an ompi world view, the execution graph across myriad machines might dicta

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Ralph Castain
On Aug 29, 2011, at 10:08 AM, nadia.der...@bull.net wrote: > devel-boun...@open-mpi.org wrote on 08/29/2011 05:57:59 PM: > > > From: Ralph Castain > > To: Open MPI Developers > > Date: 08/29/2011 05:58 PM > > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > > Sent by: devel

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 08/29/2011 05:57:59 PM: > From: Ralph Castain > To: Open MPI Developers > Date: 08/29/2011 05:58 PM > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > Sent by: devel-boun...@open-mpi.org > > On Aug 29, 2011, at 8:35 AM, nadia.der...@bull.net w

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Ralph Castain
I guess the question might be over the integration point. If we are setting CPU_MAX at the opal/paffinity level, how does hwloc pick up that value? Or does hwloc just set its own bitmask sizes and loop limits, and your integration handles any disagreement? On Aug 29, 2011, at 10:04 AM, Jeff Squ

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Jeff Squyres
Or, if there's a specific problem in hwloc (i.e., hwloc proper -- not the component in OMPI), post to hwloc-de...@open-mpi.org. I *think* that hwloc handles CPU sets of any size. I bumped the version of hwloc to 1.2.1 (the latest stable release) in both the trunk and v1.5. v1.4 doesn't have h

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Ralph Castain
On Aug 29, 2011, at 8:35 AM, nadia.der...@bull.net wrote: > > devel-boun...@open-mpi.org wrote on 08/29/2011 04:20:30 PM: > > > From: Ralph Castain > > To: Open MPI Developers > > Date: 08/29/2011 04:26 PM > > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > > Sent by: dev

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 08/29/2011 04:20:30 PM: > From: Ralph Castain > To: Open MPI Developers > Date: 08/29/2011 04:26 PM > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > Sent by: devel-boun...@open-mpi.org > > Actually, I'll eat those words. I was looking at the

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Ralph Castain
Actually, I'll eat those words. I was looking at the wrong place. Yes, that is a bug in hwloc. It needs to loop over CPU_MAX for those cases where the bit mask extends over multiple words. On Aug 29, 2011, at 7:16 AM, Ralph Castain wrote: > Actually, if you look closely at the definition of th
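
The fix described above can be illustrated with a minimal sketch. It is not taken from the Open MPI or hwloc sources: the thread names CPU_MAX and NUM_BITS, but the values 1024 and 64, the list-of-words layout, and every function name below are assumptions made for illustration. The sketch shows why a loop bounded by the bits of a single word never sees CPUs numbered 64 or higher, while a loop bounded by CPU_MAX does.

```python
# Hypothetical sizes for illustration only; not read from the Open MPI source.
NUM_BITS = 64                    # bits held by one word of the mask (assumed)
CPU_MAX = 1024                   # total CPUs the mask can describe (assumed)
NUM_WORDS = CPU_MAX // NUM_BITS  # words needed to cover CPU_MAX bits


def new_mask():
    """Return an empty mask: one integer per word, all bits clear."""
    return [0] * NUM_WORDS


def set_cpu(mask, cpu):
    """Set the bit for `cpu`, selecting the word first and then the bit in it."""
    if not 0 <= cpu < CPU_MAX:
        raise ValueError("cpu %d is outside 0..%d" % (cpu, CPU_MAX - 1))
    mask[cpu // NUM_BITS] |= 1 << (cpu % NUM_BITS)


def is_set(mask, cpu):
    """Return True if the bit for `cpu` is set."""
    return bool((mask[cpu // NUM_BITS] >> (cpu % NUM_BITS)) & 1)


def cpus_first_word_only(mask):
    """Buggy variant: the loop stops at NUM_BITS, so only word 0 is examined."""
    return [cpu for cpu in range(NUM_BITS) if is_set(mask, cpu)]


def cpus_all_words(mask):
    """Fixed variant: the loop runs to CPU_MAX and so visits every word."""
    return [cpu for cpu in range(CPU_MAX) if is_set(mask, cpu)]
```

With CPUs 3 and 70 set in a mask, the first variant reports only CPU 3, while the second reports both. That matches the symptom reported at the start of the thread for cpu numbers >= 64.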

Re: [OMPI devel] TIPC BTL code ready for review

2011-08-29 Thread teng ma
Is your interconnect Gigabit Ethernet? It's very surprising to see that the TCP BTL reached only 33 MBytes/s peak BW on your cluster. I did a similar test on an AMD cluster with Gigabit Ethernet. As the following shows, the TCP BTL's BW is similar to that of your TIPC (112 MBytes/s). Could you redo the test with 2 proce
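
To put the two figures quoted above in context, here is a small arithmetic sketch. It assumes the link really is Gigabit Ethernet with a raw line rate of 1000 Mbit/s and ignores protocol overhead; the function and constant names are hypothetical.

```python
# Raw Gigabit Ethernet line rate: 1000 Mbit/s divided by 8 bits per byte.
GIGABIT_LINE_RATE_MBYTES = 1000 / 8  # 125 MBytes/s, before protocol overhead


def link_utilization(measured_mbytes_per_s, line_rate=GIGABIT_LINE_RATE_MBYTES):
    """Return the measured bandwidth as a fraction of the raw line rate."""
    return measured_mbytes_per_s / line_rate
```

Under that assumption, 112 MBytes/s is roughly 90% of the raw line rate, while 33 MBytes/s is roughly 26%, which is why the lower figure looks suspicious for this kind of interconnect.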

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Ralph Castain
Actually, if you look closely at the definition of those two values, you'll see that it really doesn't matter which one we loop over. The NUM_BITS value defines the actual total number of bits in the mask. The CPU_MAX is the total number of cpus we can support, which was set to a value such that

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread Kenneth Lloyd
Nadia, Interesting. I haven't tried pushing this to levels above 8 on a particular machine. Do you think that the cpuset / paffinity / hwloc only applies at the machine level, at which time you need to employ a graph with carto? Regards, Ken -Original Message- From: devel-boun...@open-m

[OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread nadia.derbey
Hi list, I'm hitting a limitation with paffinity/hwloc with cpu numbers >= 64. In opal/mca/paffinity/hwloc/paffinity_hwloc_module.c, module_set() is the routine that sets the calling process affinity to the mask given as parameter. Note that "mask" is an opal_paffinity_base_cpu_set_t (so we allow

Re: [OMPI devel] TIPC BTL code ready for review

2011-08-29 Thread Xin He
On 08/25/2011 03:14 PM, Jeff Squyres wrote: On Aug 25, 2011, at 8:25 AM, Xin He wrote: Can you edit your configure.m4 directly and test it and whatnot? I provided the configure.m4 as a starting point for you. :-) It shouldn't be hard to make it check linux/tipc.h instead of tipc.h. I'm