Re: [OMPI users] scaling issue beyond 1024 processes

2011-08-10 Thread Ralph Castain
When you say "stuck", what actually happens?
On Aug 10, 2011, at 2:09 PM, CB wrote:
> Now I was able to run MPI hello world example up to 3096 processes across 129
> nodes (24 cores per node).
> However, it seems to get stuck with 3097 processes.
>
> Any suggestions for troubleshooting?
>

Re: [OMPI users] scaling issue beyond 1024 processes

2011-08-10 Thread CB
Now I was able to run the MPI hello world example with up to 3096 processes across 129 nodes (24 cores per node). However, it seems to get stuck with 3097 processes.
Any suggestions for troubleshooting?
Thanks,
- Chansup
On Tue, Aug 9, 2011 at 2:02 PM, CB wrote:
> Hi Ralph,
>

Re: [OMPI users] CMAQ crashes with OpenMPI

2011-08-10 Thread Matthew Russell
Hmm, I didn't know that. Is OS X's small stack something that can be alleviated with "ulimit" in bash? Right now, I have my ulimit set to unlimited. Does this still work with OpenMPI? (I might be wrong, but doesn't MPI work over TCP, such that newly spawned processes on my host wouldn't be

Re: [OMPI users] CMAQ crashes with OpenMPI

2011-08-10 Thread Matthew Russell
I figured. I've been using Ubuntu so long that I never expect any issues with upgrades (until Unity came out...). From what I've read, however, ar on Lion seems to be a bit buggy (that seems to be the consistent complaint on the MacPorts support forums), but overall I have faith that it can work.

[OMPI users] MPI with dynamic arrays

2011-08-10 Thread Sylvestre Ledru
Hello, I would like to know the best way to send three variables with the following types:
double *data; (which can also be int * or float *, but that is a different issue)
int row; // number of rows
int col; // number of columns
On the nodes, I have no way to know a priori what is
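One possible approach, sketched here under assumptions and not taken from the thread: put row and col in front of the row*col values in a single contiguous buffer, so that the receiver can recover the dimensions from the message itself. The helpers pack_matrix and unpack_matrix are hypothetical names introduced only for this illustration, and the sketch assumes sender and receiver use the same data representation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy row, col and the row*col doubles into one
 * malloc'd buffer (header first, then the values). The buffer size is
 * stored in *nbytes. Returns NULL if the allocation fails; the caller
 * frees the buffer. */
unsigned char *pack_matrix(const double *data, int row, int col, size_t *nbytes)
{
    assert(data != NULL && nbytes != NULL && row > 0 && col > 0);
    size_t count = (size_t)row * (size_t)col;
    size_t header = 2 * sizeof(int);
    unsigned char *buf = malloc(header + count * sizeof(double));
    if (buf == NULL)
        return NULL;
    memcpy(buf, &row, sizeof(int));
    memcpy(buf + sizeof(int), &col, sizeof(int));
    memcpy(buf + header, data, count * sizeof(double));
    *nbytes = header + count * sizeof(double);
    return buf;
}

/* Hypothetical helper: read row and col back from the header, then copy
 * the values into a freshly malloc'd array (caller frees). Returns NULL
 * if the allocation fails. */
double *unpack_matrix(const unsigned char *buf, int *row, int *col)
{
    assert(buf != NULL && row != NULL && col != NULL);
    memcpy(row, buf, sizeof(int));
    memcpy(col, buf + sizeof(int), sizeof(int));
    size_t count = (size_t)*row * (size_t)*col;
    double *data = malloc(count * sizeof(double));
    if (data == NULL)
        return NULL;
    memcpy(data, buf + 2 * sizeof(int), count * sizeof(double));
    return data;
}
```

Such a buffer could be sent as nbytes elements of MPI_BYTE; a receiver that does not know the size in advance can call MPI_Probe and MPI_Get_count before allocating. A simpler alternative is to send the two ints first and the data in a second message.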

Re: [OMPI users] CMAQ crashes with OpenMPI

2011-08-10 Thread Matthew Russell
Ack, that's a very good point. I made sure to compile all my other dependencies (NetCDF, IOAPI) with PGI, but I overlooked that one. I'll admit that even after years of working with these models, I'm still never sure when I can and can't mix binaries compiled with different compilers. I used

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Gabriele Fatigati
Ok, thanks!
2011/8/10 Samuel Thibault
> Samuel Thibault wrote on Wed 10 Aug 2011 16:24:39 +0200:
> > Gabriele Fatigati wrote on Wed 10 Aug 2011 16:13:27 +0200:
> > > there is something wrong. I'm using two thread, the first one is bound on
> > >

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Samuel Thibault
Samuel Thibault wrote on Wed 10 Aug 2011 16:24:39 +0200:
> Gabriele Fatigati wrote on Wed 10 Aug 2011 16:13:27 +0200:
> > there is something wrong. I'm using two thread, the first one is bound on
> > HWLOC_OBJ_PU number 2, the second one on HWLOC_OBJ_PU number 10,
>
> It seems that

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Samuel Thibault
Gabriele Fatigati wrote on Wed 10 Aug 2011 16:13:27 +0200:
> there is something wrong. I'm using two thread, the first one is bound on
> HWLOC_OBJ_PU number 2, the second one on HWLOC_OBJ_PU number 10,
It seems that hwloc_linux_get_tid_last_cpu_location erroneously assumes that

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Gabriele Fatigati
Mm, there is something wrong. I'm using two threads: the first one is bound to HWLOC_OBJ_PU number 2, the second one to HWLOC_OBJ_PU number 10, and hwloc_get_last_cpu_location() gives me the same CPU index for each thread (the machine is not SMT). But from the Linux "top" command I see CPU 2 and 10

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Samuel Thibault
Gabriele Fatigati wrote on Wed 10 Aug 2011 15:41:19 +0200:
> hwloc_cpuset_t set = hwloc_bitmap_alloc();
>
> int return_value = hwloc_get_last_cpu_location(topology, set, HWLOC_CPUBIND_THREAD);
>
> printf( " bitmap_string: %s \n", bitmap_string[0]);
>
> give me:
>
> 0x0800
>

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Gabriele Fatigati
Yes of course:
char *bitmap_string[256];
hwloc_cpuset_t set = hwloc_bitmap_alloc();
int return_value = hwloc_get_last_cpu_location(topology, set, HWLOC_CPUBIND_THREAD);
printf(" bitmap_string: %s \n", bitmap_string[0]);
gives me:
0x0800
converted to binary: 1000
So, CPU 0 I
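A note on reading the value above, assuming the usual convention that bit i of the mask stands for PU index i: 0x0800 is hexadecimal, so it equals 2048 = 2^11 and has only bit 11 set, which would point at PU 11, not CPU 0. The minimal sketch below does that decoding in plain C. The helper set_bit_indexes is a hypothetical name and is not part of hwloc; with a real hwloc bitmap, hwloc_bitmap_first() and hwloc_bitmap_asprintf() are the calls to look at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of hwloc: store the indexes of the set
 * bits of mask in out (at most max entries, lowest bit first) and return
 * how many bits are set in total. */
size_t set_bit_indexes(unsigned long mask, int *out, size_t max)
{
    assert(out != NULL || max == 0);
    size_t count = 0;
    for (int bit = 0; mask != 0; bit++, mask >>= 1) {
        if (mask & 1UL) {
            if (count < max)
                out[count] = bit;
            count++;
        }
    }
    return count;
}
```

Calling set_bit_indexes(0x0800UL, idx, 4) reports a single set bit at index 11, while a mask covering PUs 2 and 10 would be 0x0404.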

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Samuel Thibault
Gabriele Fatigati wrote on Wed 10 Aug 2011 15:29:43 +0200:
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_MACHINE, 0);
>
> int return_value = hwloc_get_last_cpu_location(topology, core->cpuset, HWLOC_CPUBIND_THREAD);
>
> and now in "core->cpuset" I get the new cpuset

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Gabriele Fatigati
Hi Samuel, please look at this small example:
hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_MACHINE, 0);
int return_value = hwloc_get_last_cpu_location(topology, core->cpuset, HWLOC_CPUBIND_THREAD);
and now in "core->cpuset" I get the new cpuset bitmap, where the process/threads run.

Re: [OMPI users] How to setup and use nodes for OpenMPI on Windows

2011-08-10 Thread Shiqing Fan
Hi Clinton, I suggest that you build Open MPI directly on the Windows Server, so that the system dependencies are resolved correctly. If you just copy the binaries around, there will be problems: your local PC (I guess it's Windows Vista or 7) has inet_pton, but Windows Server 2003 doesn't

Re: [OMPI users] Open MPI via SSH noob issue

2011-08-10 Thread Jeff Squyres
Have you set up your shell startup files such that they point to the new OMPI installation (/opt/local/openmpi/) even for non-interactive logins?
On Aug 10, 2011, at 6:14 AM, Christopher Jones wrote:
> Hi,
>
> Thanks for the quick response. I managed to compile 1.5.3 on both
> computers

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Samuel Thibault
Gabriele Fatigati wrote on Wed 10 Aug 2011 09:35:19 +0200:
> these lines, doesn't works:
>
> set = hwloc_bitmap_alloc();
> hwloc_get_cpubind(topology, , 0);
>
> hwloc_get_cpubind() crash, because I have to pass set, not i suppose.
Right, of course.
> I think hwloc_get_last_cpu_location()

Re: [hwloc-users] hwloc get cpubind function

2011-08-10 Thread Gabriele Fatigati
Hi Samuel, these lines don't work:
set = hwloc_bitmap_alloc();
hwloc_get_cpubind(topology, , 0);
hwloc_get_cpubind() crashes, because I have to pass set, not i suppose. I think hwloc_get_last_cpu_location() is used together with hwloc_get_cpubind()? hwloc_get_cpubind() gives me the cpuset,