Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
On Jul 8, 2011, at 7:34 PM, Steve Kargl wrote: >> We unfortunately don't have access to any BSD machines to test this >> on, ourselves. It works on other OS's, so I'm curious as to why it >> doesn't seem to work for you. :-( > > I can arrange access on the cluster in question. ;-) Actually,

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote: > Sorry -- I got distracted all afternoon... No problem. We all have obligations that we prioritize. > In addition to what Ralph said (i.e., I'm not sure if the > CIDR notation stuff made it over to the v1.5 branch or not, > but it

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
Sorry -- I got distracted all afternoon... In addition to what Ralph said (i.e., I'm not sure if the CIDR notation stuff made it over to the v1.5 branch or not, but it is available from the nightly SVN trunk tarballs: http://www.open-mpi.org/nightly/trunk/), here's a few points from other

Re: [OMPI users] Error using hostfile

2011-07-08 Thread Mohan, Ashwin
Thanks Ralph. I have emailed the network admin about the firewall issue. Regarding the PATH and LD_LIBRARY_PATH issue: is it sufficient evidence that the paths are set correctly if I am able to compile and run successfully on the individual nodes listed in the machine file? Thanks, Ashwin.

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Ralph Castain
We've been moving to provide support for including values as CIDR notation instead of names - e.g., 192.168.0/16 instead of bge0 or bge1 - but I don't think that has been put into the 1.4 release series. If you need it now, you might try using the developer's trunk - I know it works there. On
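To illustrate what selecting by CIDR block rather than by interface name means, here is a minimal sketch using Python's standard ipaddress module. It is not Open MPI code. The two addresses are the ones from the /etc/hosts excerpt quoted in this thread, and the helper name in_subnet is made up for this example. Python requires all four octets, so the block is written 192.168.0.0/16 here.

```python
import ipaddress

# Addresses from the /etc/hosts excerpt quoted in this thread.
PUBLIC_ADDR = "10.208.78.111"
PRIVATE_ADDR = "192.168.0.10"

# Python's ipaddress module needs all four octets in the network address.
CLUSTER_NET = "192.168.0.0/16"


def in_subnet(address: str, cidr: str) -> bool:
    """Return True if the address falls inside the given CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    for addr in (PUBLIC_ADDR, PRIVATE_ADDR):
        print(addr, "in", CLUSTER_NET, "->", in_subnet(addr, CLUSTER_NET))
```

Only the private cluster address matches the block, which is the property that makes a CIDR value usable in place of a name such as bge0 or bge1.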

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 04:26:35PM -0400, Gus Correa wrote: > Steve Kargl wrote: > >On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > >>The easiest way to fix this is likely to use the btl_tcp_if_include > >>or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > >>which

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Gus Correa
Steve Kargl wrote: On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: The easiest way to fix this is likely to use the btl_tcp_if_include or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly which interfaces to use:
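As a rough sketch of the suggestion above, the following Python helper assembles an mpirun command line that passes one of the two MCA parameters. The parameter names btl_tcp_if_include and btl_tcp_if_exclude come from the thread, and bge0 is one of the interface names mentioned in it. The helper name mpirun_command and the program name ./my_app are placeholders invented for this example. The helper only builds the command; it does not launch anything.

```python
import shlex


def mpirun_command(np, program, include=None, exclude=None):
    """Build an mpirun argument list restricting the TCP BTL to given interfaces.

    include and exclude are lists of interface names; only one may be given,
    since the two parameters are meant to be used one at a time.
    """
    if include and exclude:
        raise ValueError("use either include or exclude, not both")
    cmd = ["mpirun"]
    if include:
        cmd += ["--mca", "btl_tcp_if_include", ",".join(include)]
    elif exclude:
        cmd += ["--mca", "btl_tcp_if_exclude", ",".join(exclude)]
    cmd += ["-np", str(np), program]
    return cmd


if __name__ == "__main__":
    # ./my_app is a hypothetical MPI program used only for illustration.
    cmd = mpirun_command(4, "./my_app", include=["bge0"])
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess.run without shell quoting issues.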

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 12:09:09PM -0700, Steve Kargl wrote: > On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > > > > The easiest way to fix this is likely to use the btl_tcp_if_include > > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > > which interfaces to use:

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > > The easiest way to fix this is likely to use the btl_tcp_if_include > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly > which interfaces to use: > > http://www.open-mpi.org/faq/?category=tcp#tcp-selection >

Re: [OMPI users] Error using hostfile

2011-07-08 Thread Ralph Castain
Is there a firewall in the way? The error indicates that daemons were launched on the remote machines, but failed to communicate back. Also, check that your remote PATH and LD_LIBRARY_PATH are being set to the right place to pick up this version of OMPI. Lots of systems deploy with default
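One possible way to check the PATH and LD_LIBRARY_PATH point is a small script like the sketch below, run on each remote node through a non-interactive ssh command so that it sees the same environment the launched daemons would. The install prefix /opt/openmpi and the helper name dir_in_search_path are assumptions for illustration only; substitute the real install location.

```python
import os

# Hypothetical install prefix; replace with the actual Open MPI location.
OMPI_PREFIX = "/opt/openmpi"


def dir_in_search_path(directory, search_path):
    """Return True if directory is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(directory)
    entries = [e for e in search_path.split(os.pathsep) if e]
    return any(os.path.normpath(e) == wanted for e in entries)


if __name__ == "__main__":
    checks = {
        "PATH": os.path.join(OMPI_PREFIX, "bin"),
        "LD_LIBRARY_PATH": os.path.join(OMPI_PREFIX, "lib"),
    }
    for var, directory in checks.items():
        found = dir_in_search_path(directory, os.environ.get(var, ""))
        print(f"{var}: {directory} {'found' if found else 'MISSING'}")
```

Being able to compile and run on a node from an interactive login does not by itself show that a non-interactive shell gets the same settings, which is why the check is worth running over ssh.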

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote: > On Jul 8, 2011, at 1:31 PM, Steve Kargl wrote: > > > It seems that openmpi-1.4.4 compiled code is trying to use the > > wrong nic. My /etc/hosts file has > > > > 10.208.78.111 hpc.apl.washington.edu hpc > > 192.168.0.10

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Jeff Squyres
On Jul 8, 2011, at 1:31 PM, Steve Kargl wrote: > It seems that openmpi-1.4.4 compiled code is trying to use the > wrong nic. My /etc/hosts file has > > 10.208.78.111 hpc.apl.washington.edu hpc > 192.168.0.10 node10.cimu.org node10 n10 master > 192.168.0.11

Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD

2011-07-08 Thread Steve Kargl
On Thu, Jul 07, 2011 at 08:38:56PM -0400, Jeff Squyres wrote: > On Jul 5, 2011, at 4:24 PM, Steve Kargl wrote: > > On Tue, Jul 05, 2011 at 01:14:06PM -0700, Steve Kargl wrote: > >> I have an application that appears to function as I expect > >> when compiled with openmpi-1.4.2 on FreeBSD 9.0.

Re: [OMPI users] Pinning of openmpi to certain defined cores possible

2011-07-08 Thread Ralph Castain
Look at "mpirun -h" or "man mpirun" - you'll see options for binding processes to cores etc. On Jul 8, 2011, at 10:13 AM, Vlad Popa wrote: > Hello! > > We have a shared memory system based on 4 CPUs of 12-core Opteron with a > total of 256 GB RAM. > > Are there any switches, which we could

Re: [OMPI users] InfiniBand, different OpenFabrics transport types

2011-07-08 Thread Bill Johnstone
Hello, and thanks for the reply. - Original Message - > From: Jeff Squyres > Sent: Thursday, July 7, 2011 5:14 PM > Subject: Re: [OMPI users] InfiniBand, different OpenFabrics transport types > > On Jun 28, 2011, at 1:46 PM, Bill Johnstone wrote: > >> I have a

Re: [OMPI users] Error using hostfile

2011-07-08 Thread Mohan, Ashwin
Hi, I am following up on an error I posted previously. Based on the earlier recommendation, I set up passwordless SSH login. I created a hostfile comprising 4 nodes (each node having 4 slots). I tried to run my job on 4 slots but got no output, so I ended up killing the job. I am
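For reference, a hostfile of the shape described (4 nodes with 4 slots each) can be sketched as follows, using Open MPI's "hostname slots=N" line format. The node names node1 through node4 and the helper name make_hostfile are placeholders, since the actual hostnames are not given in the message.

```python
def make_hostfile(nodes, slots):
    """Return hostfile text with one 'hostname slots=N' line per node."""
    return "".join(f"{node} slots={slots}\n" for node in nodes)


if __name__ == "__main__":
    # Placeholder hostnames; the real ones are not stated in the post.
    nodes = [f"node{i}" for i in range(1, 5)]
    print(make_hostfile(nodes, 4), end="")
```

The resulting text can be written to a file and passed to mpirun with the -machinefile or --hostfile option.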

[OMPI users] pinning processes

2011-07-08 Thread Vlad Popa
Hi! I forgot to mention that we're using openmpi 1.5.3 on Ubuntu Server 11.04 64-bit. Greetings from Salzburg/Austria, Vlad Popa

[OMPI users] Pinning of openmpi to certain defined cores possible

2011-07-08 Thread Vlad Popa
Hello! We have a shared memory system based on 4 CPUs of 12-core Opteron with a total of 256 GB RAM. Are there any switches that we could set so that, when running mpirun jobs, we'd be able to pin them to a certain defined memory area and defined processor cores so that they stay

[OMPI users] Error-Open MPI over Infiniband: polling LP CQ with status LOCAL LENGTH ERROR

2011-07-08 Thread yanyg
Hi all, The message says: [[17549,1],0][btl_openib_component.c:3224:handle_wc] from gulftown to: gulftown error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 492359816 opcode 32767 vendor error 105 qp_idx 3 This is very arcane to me; the same test ran when only one

Re: [OMPI users] [WARNING: SPOOFED E-MAIL--Non-Aerospace Sender] Re: Problem with prebuilt ver 1.5.3 for windows

2011-07-08 Thread Shiqing Fan
Hi Jeff, Sorry for answering late. These emails were hidden in another thread in my email client, so I didn't catch them until now. The prebuilt version of Open MPI was based on Windows 2008, which has InterlockedCompareExchange64 natively. Windows XP doesn't support this function,

Re: [OMPI users] [WARNING: SPOOFED E-MAIL--Non-Aerospace Sender] Re: Problem with prebuilt ver 1.5.3 for windows

2011-07-08 Thread Jeffrey A Cummings
I've been following this list for several months now and have been quite impressed by the helpfulness of the list experts in response to most questions. So how come the pregnant silence in response to my question? I could really use some help here. - Jeff From: Jeffrey A Cummings

[OMPI users] a question about network connection of open-mpi

2011-07-08 Thread zhuangchao
Hello all: I run the following command: /data1/cluster/openmpi/bin/mpirun -d -machinefile /tmp/nodes.10515.txt -np 3 /data1/cluster/mpiblast-pio-1.6/bin/mpiblast -p blastn -i /data1/cluster/sequences/seq_4.txt -d Baculo_Nucleotide -o