t here on the mailing list.
--
Jeff Squyres
jsquy...@cisco.com
From: users on behalf of Jeff Squyres
(jsquyres) via users
Sent: Thursday, May 5, 2022 3:31 PM
To: George Bosilca; Open MPI Users
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
Sent: Thursday, May 5, 2022 3:19 PM
To: Open MPI Users
Cc: Jeff Squyres (jsquyres); Scott Sayres
Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
That is weird, but maybe it is not a deadlock but very slow progress. In the
child, can you print fdmax and i in the do_child frame?
George.
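For context, the loop George is asking about typically looks like this (a
sketch of the usual launcher pattern, not Open MPI's actual do_child; fdmax
and i match the variables he names):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* fdmax is the upper bound a do_child-style loop iterates to. */
    long fdmax = sysconf(_SC_OPEN_MAX);
    printf("fdmax = %ld\n", fdmax);

    /* The forked child typically closes every inherited descriptor
     * above stderr before calling execve().  If fdmax is huge, this
     * loop is very slow progress rather than a deadlock -- which would
     * match a backtrace stuck in close(). */
    for (long i = 3; i < fdmax; i++)
        close((int)i);
    return 0;
}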
On Thu, May 5, 2022 at 11:50 AM Scott Sayres via users <
users@lists.open-mpi.org> wrote:
Jeff, thanks.
from 1:
(lldb) process attach --pid 95083
Process 95083 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x0001bde25628 libsystem_kernel.dylib`close + 8
libsystem_kernel.dylib`close:
-> 0x1bde25628 <+8>: b.lo 0x1bde25648
You can use "lldb -p PID" to attach to a running process.
--
Jeff Squyres
jsquy...@cisco.com
From: Scott Sayres
Sent: Thursday, May 5, 2022 11:22 AM
To: Jeff Squyres (jsquyres)
Cc: Open MPI Users
Subject: Re: [OMPI users] mpirun hangs on m1 mac
Jeff,
It does launch two mpirun processes (when hung from another terminal window)
scottsayres 95083 99.0 0.0 408918416 1472 s002 R 8:20AM 0:04.48 mpirun -np 4 foo.sh
scottsayres 95085 0.0 0.0 408628368 1632 s006 S+ 8:20AM 0:00.00 egrep mpirun|foo.sh
scottsayres
happens immediately after forking the child process... which is weird).
--
Jeff Squyres
jsquy...@cisco.com
From: Scott Sayres
Sent: Wednesday, May 4, 2022 4:02 PM
To: Jeff Squyres (jsquyres)
Cc: Open MPI Users
Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
foo.sh is executable.
>> Run it via:
>>
>> mpirun -np 1 foo.sh
>>
>> If you start seeing output, good! If it completes, better!
>>
>> If it hangs, and/or if you don't see any output at all, do this:
>>
>> ps auxwww | egrep 'mpirun|foo.sh'
>>
>> It should show mpirun and 2 copies of foo.sh (and probably a grep). Does it?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ________________
From: Scott Sayres
Sent: Wednesday, May 4, 2022 2:47 PM
To: Open MPI Users
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
Following Jeff's advice, I have rebuilt open-mpi by hand using the -g
option. This shows more information, as below. I am attempting George's
advice on how to track the child, but notice that gdb does not support
arm64. Attempting to update lldb.
scottsayres@scotts-mbp openmpi-4.1.3 % lldb mpir
Sent: Wednesday, May 4, 2022 12:35 PM
To: Open MPI Users
Cc: George Bosilca
Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3
Scott,
This shows the deadlock arrives during the local spawn. Here is how things
are supposed to work: the mpirun process (parent) will fork (the child),
and these 2 processes are connected through a pipe. The child will then
execve the desired command (hostname in your case), and this will close
the pipe, signaling the parent that the child has successfully started.
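A minimal sketch of that handshake (illustrative only, not Open MPI's actual
code), assuming the usual close-on-exec trick: a successful execve closes the
pipe and the parent reads EOF; a failed exec writes the errno back instead:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0) { perror("pipe"); return 1; }
    fcntl(fds[1], F_SETFD, FD_CLOEXEC);   /* write end vanishes on exec */

    pid_t pid = fork();
    if (pid == 0) {                       /* child */
        close(fds[0]);
        execlp("hostname", "hostname", (char *)NULL);
        int err = errno;                  /* only reached if exec failed */
        write(fds[1], &err, sizeof(err));
        _exit(1);
    }

    close(fds[1]);                        /* parent */
    int err;
    if (read(fds[0], &err, sizeof(err)) == 0)
        printf("child exec'd fine: pipe closed on exec\n");
    else
        printf("child exec failed: errno %d\n", err);
    close(fds[0]);
    return 0;
}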
Hi George, Thanks! You have just taught me a new trick. Although I do not
yet understand the output, it is below:
scottsayres@scotts-mbp ~ % lldb mpirun -- -np 1 hostname
(lldb) target create "mpirun"
Current executable set to 'mpirun' (arm64).
(lldb) settings set -- target.run-args "-np" "1" "hostname"
I compiled a fresh copy of the 4.1.3 branch on my M1 laptop, and I can run
both MPI and non-MPI apps without any issues.
Try running `lldb mpirun -- -np 1 hostname` and once it deadlocks, do a
CTRL+C to get back on the debugger and then `backtrace` to see where it is
waiting.
George.
On Wed, Ma
Thanks for looking at this Jeff.
No, I cannot use mpirun to launch a non-MPI application. The command
"mpirun -np 2 hostname" also hangs.
I get the following output if I add the -d option (I've replaced the
server name with hashtags):
[scotts-mbp.3500.dhcp.###:05469] procdir:
/var/fol
Are you able to use mpirun to launch a non-MPI application? E.g.:
mpirun -np 2 hostname
And if that works, can you run the simple example MPI apps in the "examples"
directory of the MPI source tarball (the "hello world" and "ring" programs)?
E.g.:
cd examples
make
mpirun -np 4 hello_c
mpirun -np 4 ring_c
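If you do not have the examples tree handy, a minimal stand-in for hello_c is
the classic (a sketch, not the exact program shipped in the tarball):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Compile it with "mpicc hello_c.c -o hello_c" and launch it with the mpirun
lines above.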
There can be lots of reasons that this happens. Can you send all the
information listed here?
https://www.open-mpi.org/community/help/
> On Aug 15, 2018, at 10:55 AM, Mota, Thyago wrote:
>
> Hello.
>
> I have openmpi 2.0.4 installed on CentOS 7. When I try to run "mpirun" it
> hangs
Thank you for your help. Now it works.
Klara Hornisova
On Thu, Jan 15, 2015 at 5:54 PM, Marco Atzeri
wrote:
>
>
> On 1/15/2015 5:39 PM, Klara Hornisova wrote:
>
>> I have installed OpenMPI 1.6.5 under cygwin. When trying the test example
>>
>> $mpirun hello
>>
>
> current cygwin package is 1.8.4-1, could you test it?
or, e.g., more complex examples from scalapack, such as
$mpirun -np 4 xslu
everything works fine when t
I solved the issue by accepting incoming TCP traffic as long as the
packets are sent "from" and "to" the local machine. Here is the line I
added to iptables:
/sbin/iptables -A INPUT --source --destination
--protocol tcp -j ACCEPT
Just an observation, I
FWIW: I'm working on a rewrite of our out-of-band comm system (it does the
wireup that is hanging on your system) that will include a shared memory
module. Once that is in place, this problem will go away when running on a
single node (still need sockets for multi-node, of course).
On Apr 11,
You were right, Ralph. I made a short test turning off the firewall and
MPI ran as predicted. I am taking a look at the firewall rules to
figure out how to set them up properly, so that they do not interfere with
OpenMPI's functionality. I will post the required changes to those
settings as so
In fact we should have restrictive firewall settings, as far as I
remember. I will check the rules again tomorrow morning. That's very
interesting; I would expect this kind of problem if I were working with
a cluster, but I hadn't thought that it might also lead to problems for
the internal communication on a single machine.
Best guess is that there is some issue with getting TCP sockets on the system -
once the procs are launched, they need to open a TCP socket and communicate
back to mpirun. If the socket is "stuck" waiting to complete the open, things
will hang.
You might check to ensure there isn't some security setting blocking local TCP
connections.
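One way to test that theory outside of Open MPI is a bare TCP connect, the
same kind of open a launched proc performs back to mpirun (a sketch; the
127.0.0.1 address and port 12345 are placeholders, not anything mpirun
actually uses):

#include <arpa/inet.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *host = argc > 1 ? argv[1] : "127.0.0.1";
    int port = argc > 2 ? atoi(argv[2]) : 12345;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    inet_pton(AF_INET, host, &addr.sin_addr);

    /* A firewall that silently drops local packets shows up here as a
     * long stall followed by ETIMEDOUT or EHOSTUNREACH. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        printf("connect to %s:%d failed: %s\n", host, port, strerror(errno));
    else
        printf("connect to %s:%d succeeded\n", host, port);
    close(fd);
    return 0;
}

Run it against a port something is listening on (e.g. sshd's port 22) and
against an unused one; both should return promptly if local TCP is healthy.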
On Jan 18, 2012, at 4:15 AM, Theiner, Andre wrote:
> I also have requested the user to run the following adaptation of his original
> command "mpirun -np 9 interFoam -parallel". I hoped to get a kind of debug
> output which points me in the right direction. The new command did not work
> and I am a
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Jeff Squyres
Sent: Tuesday, 17 January 2012 22:53
To: Open MPI Users
Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs
You should probably also run the ompi_info command; it tells you details about
your installation and how it was configured.
multiple processors?
> Is there a special flag which tells the compiler to care for multiple CPUs?
>
> Andre
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of devendra rai
Sent: Monday, 16 January 2012 13:25
To: Open MPI Users
Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs
Hello Andre,
It may be possible that your openmpi does not support threaded MPI calls (if
these are happening). I had a similar problem, and it was traced to this cause.
If you installed your openmpi from available repositories, chances are that you
do not have thread support.
Here's a small check:
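A check of that kind can be written with MPI_Init_thread, which reports the
thread level the library actually grants (a minimal sketch; the behavior,
not the exact snippet from this thread):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided;
    /* Ask for the highest level and see what we really get. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    printf("requested MPI_THREAD_MULTIPLE, provided %s\n",
           provided == MPI_THREAD_MULTIPLE   ? "MPI_THREAD_MULTIPLE" :
           provided == MPI_THREAD_SERIALIZED ? "MPI_THREAD_SERIALIZED" :
           provided == MPI_THREAD_FUNNELED   ? "MPI_THREAD_FUNNELED" :
                                               "MPI_THREAD_SINGLE");
    MPI_Finalize();
    return 0;
}

If it prints anything below MPI_THREAD_MULTIPLE, threaded MPI calls are not
fully supported by that build.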
Cryptic enough :-)
Best I can tell, your TCP comm isn't working. All your procs are failing
because they can't talk to each other.
I'm also seeing something I don't understand:
*** The MPI_Init() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
You
There is a bug in that tarball which was fixed as of yesterday. However, the
patch that you need was the cause of the bug, so the fix for your problem is no
longer in the 1.4 branch.
As you probably recall, I had cautioned that the fix might not make it to the
1.4 series. At the time, I was con
Hi Bogdan,
Thanks for the information and looking forward to the new OpenMPI feature of
port restriction...
About Debian, I was wondering about that... I've had no problems with it and I
was thinking everything was just done for me; of course, another possibility is
that there was no firewall
On Wed, 18 Mar 2009, Raymond Wan wrote:
Perhaps it has something to do with RH's defaults for the firewall settings?
If your sysadmin uses kickstart to configure the systems, (s)he has to
add 'firewall --disabled'; similar for SELinux which seems to have
caused problems to another person on
Hi Ray,
Thanks for your response. I had noticed your thread, which is why I'm
embarrassed (but happy) to say that it looks like my problem was the same
as yours. I mentioned in my original email that there was no firewall
running, which it turns out was a lie. I think that when I checked
b
Hi Ron,
Ron Babich wrote:
Hi Everyone,
I'm having a very basic problem getting an MPI job to run on multiple
nodes. My setup consists of two identically configured nodes, called
node01 and node02, connected via Ethernet and InfiniBand. They are
running CentOS 5.2 and the bundled OMPI, ver
On Jan 5, 2009, at 5:01 PM, Maciej Kazulak wrote:
Interesting though. I thought in such a simple scenario shared
memory would be used for IPC (or whatever's fastest) . But nope.
Even with one process still it wants to use TCP/IP to communicate
between mpirun and orted.
Correct -- we only use shared memory for MPI communication between processes;
the control channel between mpirun and the orted uses TCP.
2009/1/3 Maciej Kazulak
> Hi,
>
> I have a weird problem. After a fresh install mpirun refuses to work:
>
> box% ./hello
> Process 0 on box out of 1
> box% mpirun -np 1 ./hello
> # hangs here, no output, nothing at all; on another terminal:
> box% ps axl | egrep 'mpirun|orted'
> 0 1000 24162 76
Hi Tim
Just a quick update about my ssh/LD_LIBRARY_PATH problem.
Apparently on my system the sshd was configured not to permit
user-defined environment variables (security reasons?).
To fix that I had to change the file
/etc/ssh/sshd_config
by changing the entry
#PermitUserEnvironment no
to
PermitUserEnvironment yes
Hi Tim
thanks for the suggestions.
I now set both paths in .zshenv but it seems that LD_LIBRARY_PATH
still does not get set.
The ldd experiment shows that the openmpi libraries are not found,
and indeed printenv shows that PATH is there but LD_LIBRARY_PATH is not.
It is rather unclear why thi
Hi Jody,
jody wrote:
Hi
I installed openmpi 1.2.2 on a quad core intel machine running fedora 6
(hostname plankton)
I set PATH and LD_LIBRARY_PATH in the .zshrc file:
Note that .zshrc is only used for interactive logins. You need to set up
your system so that LD_LIBRARY_PATH and PATH are also set for
non-interactive (e.g. ssh) logins.
> So, the question from the mpirun_debug.out-file is, what IP-addresses do
> node01 and node02 have, is the local 10.0.0.1 node01, while 10.1.0.1 is
> node02?
> Maybe the route on node01 is not correct to node02?
Ok, I figured out the problem, but didn't solve it completely.
node01 and node02 b
On Fri, 24 Feb 2006, Emanuel Ziegler wrote:
So "No route to host" means that the TCP packet could not be sent
(usually host down, broken routing table, network interface down,
...). But it's 'ping'able and even rsh works fine.
... or some packet filtering is enabled. Check with 'iptables -L -n'.
Hello Emanuel,
can you actually log in using rsh without supplying a password?
I would rather use ssh-based login with public keys. This is
definitely more secure, but in your first mail you said ssh wouldn't work
either?
So, the question from the mpirun_debug.out file is: what IP addresses do
node01 and node02 have? Is the local 10.0.0.1 node01, while 10.1.0.1 is
node02? Maybe the route on node01 is not correct to node02?
> From /usr/include/asm/errno.h:
>
> #define EHOSTUNREACH 113 /* No route to host */
Ah, I thought it was an internal openMPI error number and 'grep'ed the
source code without success. So "No route to host" means that the TCP
packet could not be sent (usually host down, broken routing table,
network interface down, ...).
On Thu, 23 Feb 2006, Emanuel Ziegler wrote:
Unfortunately, I don't know what errno=113 means, but obviously it's a
TCP problem.
From /usr/include/asm/errno.h:
#define EHOSTUNREACH 113 /* No route to host */
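Rather than grepping headers, a few lines of C will translate any errno
value (a minimal sketch):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* On Linux, errno 113 is EHOSTUNREACH. */
    printf("errno 113: %s\n", strerror(113));
    return 0;
}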
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliche