Re: [OMPI users] trouble_MPI

2012-09-19 Thread David Warren
Segfaults in FORTRAN generally mean either an array is out of bounds, or you can't get the memory you are requesting. Check your array sizes (particularly the ones in subroutines). You can compile with -C, but that only tells you if you exceed an array declaration, not the actual size. It is po

Re: [OMPI users] some mpi processes "disappear" on a cluster of servers

2012-09-04 Thread David Warren
Which FORTRAN compiler are you using? I believe that most of them allow you to compile with -g and optimization and then force a stack dump on crash. I have found this to work on code that seems to vanish on random processors. Also, you might look at the FORTRAN options and see if it lets you a

Re: [OMPI users] MPI/FORTRAN on a cluster system

2012-08-20 Thread David Warren
-- David Warren University of Washington 206 543-0954

Re: [OMPI users] compiling openMPI 1.6 with Intel compilers on Ubuntu, getting error

2012-07-24 Thread David Warren
g so that I can get libtool to see where icpc is? > > Thanks and best regards, > Stephen > -- David Warren University of Washington 206 543-0954

Re: [OMPI users] Bad parallel scaling using Code Saturne with openmpi

2012-07-10 Thread David Warren
Your problem may not be related to bandwidth. It may be latency, or how the problem is divided. We found significant improvements running WRF and other atmospheric (CFD) codes over IB. The problem was not so much the amount of data communicated, but how long it takes to send it. Also, is your model
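The latency point above can be made concrete with a simple cost model: transfer time is a fixed latency plus message size divided by bandwidth. The sketch below is only an illustration; the function names and both sets of link figures are hypothetical placeholders, not measurements from this thread.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple cost model: fixed latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def latency_share(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the transfer time spent on latency rather than payload."""
    total = transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)
    return latency_s / total


# Hypothetical link figures, chosen only for illustration (not measurements).
links = {
    "ethernet-like": {"latency_s": 50e-6, "bandwidth_bytes_per_s": 125e6},
    "infiniband-like": {"latency_s": 2e-6, "bandwidth_bytes_per_s": 1e9},
}

if __name__ == "__main__":
    for name, link in links.items():
        for size in (1_000, 1_000_000):
            share = latency_share(size, **link)
            print(f"{name}: {size} bytes, latency is {share:.1%} of transfer time")
```

Under these assumed figures, small messages are dominated by latency and large ones by bandwidth, which is why a code that exchanges many small messages can scale poorly even on a link with plenty of bandwidth.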

Re: [OMPI users] fortran program with integer kind=8 using openmpi

2012-06-28 Thread David Warren
You should not have to recompile Open MPI, but you do have to use the correct type. You can check the size of integers in your Fortran and use MPI_INTEGER4 or MPI_INTEGER8 depending on what you get. In gfortran, use integer i if(sizeof(i) .eq. 8) then mpi_int_type=MPI_INTEGER8 else mpi_int

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-14 Thread David Warren
Actually, subarray passing is part of the F90 standard (at least according to every document I can find), and not an Intel extension. So if it doesn't work you should complain to the compiler company. One of the reasons for using it is that the compiler should be optimized for whatever method

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-12 Thread David Warren
What FORTRAN compiler are you using? This should not really be an issue with the MPI implementation, but with the FORTRAN compiler. This is legitimate usage in FORTRAN 90 and the compiler should deal with it. I do similar things using ifort and it creates temporary arrays when necessary and it all works

Re: [OMPI users] Open MPI via SSH noob issue

2011-08-09 Thread David Warren
I don't know if this is it, but if you use the name localhost, won't processes on both machines try to talk to 127.0.0.1? I believe you need to use the real hostname in your host file. I think that your two tests work because there is no interprocess communication, just stdout. On 08/08/11 23:4
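A quick way to check for the localhost problem described above is to scan the host file for loopback names before launching. This is a minimal sketch, assuming the usual layout of one host per line followed by optional settings such as slots=N, with # starting a comment; the function name loopback_entries is made up for illustration.

```python
import ipaddress


def loopback_entries(hostfile_text):
    """Return the host entries that refer to the loopback interface."""
    flagged = []
    for raw_line in hostfile_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        host = line.split()[0]  # first token is the host; the rest are options
        if host.lower() in ("localhost", "localhost.localdomain"):
            flagged.append(host)
            continue
        try:
            if ipaddress.ip_address(host).is_loopback:
                flagged.append(host)
        except ValueError:
            pass  # not an IP literal, so treat it as a real host name
    return flagged


if __name__ == "__main__":
    sample = "localhost slots=2\nnode02 slots=2\n"
    print(loopback_entries(sample))
```

Any entry this reports would resolve to the local machine on every node, so it should be replaced with the real hostname.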

Re: [OMPI users] OpenMPI causing WRF to crash

2011-08-05 Thread David Warren
That error is from one of the processes that was working when another one died. It is not an indication that MPI had problems, but that one of the WRF processes (#45) crashed. You need to look at what happened to process 45. What do the rsl.out and rsl.error files for #45 say? On 08/04/
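To look at the rsl files for the failing process, a small helper can build the file names and print the last lines of each. This is a sketch under one loud assumption: that the files are named rsl.out.NNNN and rsl.error.NNNN with a four-digit, zero-padded rank. Check the run directory to confirm. The helper names are hypothetical.

```python
from pathlib import Path


def rsl_files_for_rank(rank, run_dir="."):
    """Build the expected rsl.out / rsl.error paths for one MPI rank."""
    suffix = f"{rank:04d}"  # assumed naming: four-digit, zero-padded rank
    base = Path(run_dir)
    return base / f"rsl.out.{suffix}", base / f"rsl.error.{suffix}"


def tail(path, n=20):
    """Return the last n lines of a text file, or an empty list if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines()[-n:]


if __name__ == "__main__":
    for path in rsl_files_for_rank(45):
        print(f"==> {path} <==")
        print("\n".join(tail(path)))
```

The last few lines of the rsl.error file for the crashed rank are usually the place to start.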

Re: [OMPI users] Mixed Mellanox and Qlogic problems

2011-07-27 Thread David Warren
/native performance of the network between the devices reflects the same dichotomy. (e.g., ibv_rc_pingpong)
On Jul 15, 2011, at 7:58 PM, David Warren wrote:
All OFED 1.4 and 2.6.32 (that's what I can get to today)
qib to qib:
# OSU MPI Latency Test v3.3
# Size Latency (

Re: [OMPI users] Mixed Mellanox and Qlogic problems

2011-07-15 Thread David Warren
have done combined QLogic + Mellanox runs, so this probably isn't a well-explored space. Can you run some microbenchmarks to see what kind of latency / bandwidth you're getting between nodes of the same type and nodes of different types?
On Jul 14, 2011, at 8:21 PM, David Warren wro

Re: [OMPI users] Mixed Mellanox and Qlogic problems

2011-07-14 Thread David Warren
some longer tests as well before I went to ofed 1.6.
On 07/14/11 05:55, Jeff Squyres wrote:
On Jul 13, 2011, at 7:46 PM, David Warren wrote:
I finally got access to the systems again (the original ones are part of our real time system). I thought I would try one other test I had set up

Re: [OMPI users] Mixed Mellanox and Qlogic problems

2011-07-13 Thread David Warren
n attach a debugger to one of the still-live processes after the error message is printed. Can you send the stack trace? It would be interesting to know what is going on here -- I can't think of a reason that would happen offhand.
On Jun 30, 2011, at 5:03 PM, David Warren wrote:
I

[OMPI users] Mixed Mellanox and Qlogic problems

2011-06-30 Thread David Warren
I have a cluster with mostly Mellanox ConnectX hardware and a few nodes with QLogic QLE7340s. After looking through the web, FAQs etc. I built openmpi-1.5.3 with psm and openib. If I run within the same hardware it is fast and works fine. If I run between the two types without specifying an MTL (e.g. mpirun -np 2