Re: [OMPI devel] Build failure on FreeBSD 7

2008-05-01 Thread Jeff Squyres

George -- did you get to make this fix?

What header file is openpty declared in on FreeBSD 7?  It should be  
easy enough to add the right #include to that file.
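
For reference, the FreeBSD manual page for openpty() lists <libutil.h> (after <sys/types.h>, <sys/ioctl.h> and <termios.h>), while glibc declares it in <pty.h> and Mac OS X in <util.h>. A minimal sketch of a guarded include follows; the OPENPTY_HEADER macro and the openpty_header() function are hypothetical names used only for illustration and are not part of the Open MPI tree:

```c
#include <stddef.h>

/* Pick the header that declares openpty() on the current platform. */
#if defined(__FreeBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/ioctl.h>
#include <termios.h>
#include <libutil.h>
#define OPENPTY_HEADER "libutil.h"
#elif defined(__APPLE__) || defined(__NetBSD__) || defined(__OpenBSD__)
#include <sys/types.h>
#include <termios.h>
#include <util.h>
#define OPENPTY_HEADER "util.h"
#elif defined(__linux__)
#include <pty.h>
#define OPENPTY_HEADER "pty.h"
#else
/* Unknown platform: no header selected (assumption: caller handles this). */
#define OPENPTY_HEADER ""
#endif

/* Hypothetical helper: report which header was selected at compile time. */
const char *openpty_header(void)
{
    return OPENPTY_HEADER;
}
```

In a real tree the same choice would more likely be made by configure checks for each header than by platform macros.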


On Apr 29, 2008, at 7:45 PM, Brad Penoff wrote:


hey all,

I was just configuring MTT to run some multihost tests on FreeBSD 7
and I came across this same error you guys were, using the
openmpi-1.3a1r18325.tar.gz trunk nightly tarball :

kqueue.c:165: error: implicit declaration of function 'openpty'

However, this error seems to only come up if I use --enable-picky to
configure.  Getting rid of --enable-picky results in a successful
compilation.  Any idea why that is?  Should this be fixed in the long
term?

For now, I'm just adjusting my MTT runs to not have --enable-picky in
the ompi_configure_arguments...

brad


2008/4/11 George Bosilca :
It's good that you guys revived this thread; I had almost forgotten
about it.

The code you're referring to is not part of libevent. It was one of my
"fixes" for a problem on OS X (where kevent is not able to work nicely
with ptys). It works on the Mac because the code triggers an error, so
there is no need for the timeout ... I'll make the corrections over the
weekend.

 Thanks,
   george.



On Apr 11, 2008, at 7:39 PM, Karol Mroz wrote:


Hi, Jeff...

This test was performed locally, yes. I'm short on machines at the
moment to perform any proper distributed tests.


--
Karol

-----Original Message-----
From: Jeff Squyres 

Date: Fri, 11 Apr 2008 16:36:33
To:Open MPI Developers 
Subject: Re: [OMPI devel] Build failure on FreeBSD 7


This may depend on how you ran the app on FreeBSD -- did you run on
the localhost only?

We have/had a problem when running locally with regards to kevent --
I'm not 100% sure if we've fixed it yet.  Let me check...


On Apr 5, 2008, at 1:53 AM, Karol Mroz wrote:

After digging a little deeper, it turns out that the kevent() call in
opal/event/kqueue.c:

  if (kevent(kq,
             kqueueop->changes, 1, kqueueop->events, NEVENT, NULL) != 1 ||
      (int)kqueueop->events[0].ident != master ||
      kqueueop->events[0].flags != EV_ERROR) {

seems to hang on FreeBSD 7. Changing the NULL parameter to, let's say,
1000 causes the function to return and print out the error message:

  event_warn("%s: detected broken kqueue (failed delete); not using error %d (%s)",
             __func__, errno, strerror(errno));

The simple non-blocking send/recv app used to test this then runs to
completion. Compiling Open MPI on Linux and running this same app
produces no errors.

Any ideas?

Thanks.
--
Karol
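
The final argument of kevent() is declared as a pointer to a struct timespec, with NULL meaning "wait indefinitely", so a bounded wait is usually written by filling in a timespec first. A minimal sketch under that assumption follows; ms_to_timespec() and kevent_bounded() are hypothetical names, not functions from Open MPI or libevent, and the kevent() part is only compiled on platforms that provide kqueue:

```c
#include <assert.h>
#include <time.h>

#if defined(__FreeBSD__) || defined(__APPLE__) || \
    defined(__NetBSD__) || defined(__OpenBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE 1
#endif

/* Hypothetical helper: convert a millisecond count into the
 * struct timespec that kevent() takes as its timeout argument. */
struct timespec ms_to_timespec(long ms)
{
    struct timespec ts;

    assert(ms >= 0);                       /* negative waits are not meaningful */
    ts.tv_sec  = ms / 1000;                /* whole seconds */
    ts.tv_nsec = (ms % 1000) * 1000000L;   /* remainder in nanoseconds */
    return ts;
}

#ifdef HAVE_KQUEUE
/* Hypothetical wrapper: like kevent(), but gives up after timeout_ms.
 * kevent() returns the number of events placed in `events`, 0 if the
 * timeout expired first, and -1 on error. */
int kevent_bounded(int kq, const struct kevent *changes, int nchanges,
                   struct kevent *events, int nevents, long timeout_ms)
{
    struct timespec ts = ms_to_timespec(timeout_ms);

    return kevent(kq, changes, nchanges, events, nevents, &ts);
}
#endif
```

With a wrapper of this kind, a return value of 0 would indicate that the wait timed out, which the caller can treat as "kqueue is not usable here" instead of hanging.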



___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Build failure on FreeBSD 7

2008-05-01 Thread Jeff Squyres

We don't really test/play with FreeBSD at all.  :-\

George -- were you able to look into this?


On Apr 29, 2008, at 10:14 PM, Brad Penoff wrote:


hey again,

One quick follow-up, as things are still misbehaving...

While removing --enable-picky in the ompi_configure_arguments got MTT
to compile the nightly tarball, no MPI program succeeds (both in MTT
and outside) for any BTL included in the nightly tarball when run on
FreeBSD 7.

I did a quick investigation and it appears as if I'm arriving at the
same thing that Karol did in the Apr 5th email in this thread...

Things hang on the call to kevent on line 177 of opal/event/kqueue.c .

Jeff had replied asking if Karol had only run locally, citing past
problems he'd seen with kevent running locally.  I also tried running
on a remote machine, and it hung in the same way.  George mentioned he
had recently done a fix for an OS X issue.  Just curious: did you
guys (or anyone else) ever get a chance to cycle back to this?

Thanks!
brad


--
Jeff Squyres
Cisco Systems



[OMPI devel] merging cpc3 -> trunk

2008-05-01 Thread Jeff Squyres
Ok, we've fixed the problems that Pasha was seeing, and seem to be  
clear to bring the cpc3 work back to the trunk tonight.


We will still have oob/xoob be the default -- ibcm and rdmacm will  
[currently] fire only if specifically requested because there's still  
a little work to do in these two cpc's before they're done.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Intel MPI Benchmark(IMB) using OpenMPI - Segmentation-fault error message.

2008-05-01 Thread Lenny Verkhovsky
On 5/1/08, Mukesh K Srivastava  wrote:
>
> Hi Lenny.
>
> Thanks for responding. To clarify further, I would like to know a few things.
>
> (a) I did modify the make_mpich makefile in the IMB-3.1/src folder to give
> the path for openmpi. Here I am using the same mpirun as built from
> openmpi (v1.2.5), which is also listed in PATH & LD_LIBRARY_PATH.
>
> (b) What is the console command to run an additional file that makes MPI
> API calls? Do I need to add it to Makefile.base in the IMB-3.1/src folder,
> or is it handled by a command on the console along with "$mpirun
> IMB-MPI1"?
>
> (c) Does IMB-3.1 need IB (InfiniBand) or TCP support to complete its
> benchmark routine calls? That is, do I need to configure and build OpenMPI
> with the InfiniBand stack too?
>

IMB is a set of benchmarks that can be run on one or more machines.
It calls the MPI API, which does all the communication.
MPI decides how to communicate (IB, TCP, or shared memory) according to
priorities and the available ways of connecting to the other host.

You can write your own benchmark or test program, compile it with mpicc,
and run it, e.g.:
#mpicc -o hello_world hello_world.c
#mpirun -np 2 -H host1,host2 ./hello_world


#cat hello_world.c
/*
 * Hewlett-Packard Co., High Performance Systems Division
 *
 * Function: - example: simple "hello world"
 *
 * $Revision: 1.1.2.1 $
 */

#include <stdio.h>
#include <stdlib.h>   /* atoi, exit */
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];
    int to_wait = 0, sleep_diff = 0, max_limit = 0;
    double sleep_start = 0.0, sleep_now = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Get_processor_name(name, &len);

    /* optional first argument: number of seconds to wait */
    if (argc > 1) {
        to_wait = atoi(argv[1]);
    }

    /* busy loop for debugging needs */
    if (to_wait) {
        sleep_start = MPI_Wtime();
        while (1) {
            /* iteration cap: as written, the loop is left on its second pass */
            max_limit++;
            if (max_limit > 1) {
                fprintf(stdout, " exit loop, to_wait: %d, \n", to_wait);
                break;
            }

            sleep_now = MPI_Wtime();
            sleep_diff = (int)(sleep_now - sleep_start);
            if (sleep_diff >= to_wait) {
                break;
            }
        }
    }

    if (rank == 0) {   /* only the first rank prints this message */
        printf("Hello world! I'm %d of %d on %s\n", rank, size, name);
    }

    MPI_Finalize();
    exit(0);
}






> (d) I don't see any README in IMB-3.1 or any user guide that explains how to
> run it; the documentation only describes each of the 17 benchmarks and the
> flags to be used.
>
> BR
>
>
> On 4/30/08, Lenny Verkhovsky  wrote:
> >
> >
> >
> >
> > On 4/30/08, Mukesh K Srivastava  wrote:
> > >
> > > Hi.
> > >
> > > I am using IMB-3.1, the Intel MPI Benchmark tool, with OpenMPI (v1.2.5).
> > > In the /IMB-3.1/src/make_mpich file, I only set the declaration for
> > > MPI_HOME, which takes care of CC, OPTFLAGS & CLINKER. Building IMB-MPI1,
> > > IMB-EXT & IMB-IO succeeds.
> > >
> > > I get proper results from the IMB benchmark when running mpirun
> > > IMB-MPI1 with "-np 1", but with "-np 2" I get the errors below -
> > >
> > > -
> > > [mukesh@n161 src]$ mpirun -np 2 IMB-MPI1
> > > [n161:13390] *** Process received signal ***
> > > [n161:13390] Signal: Segmentation fault (11)
> > > [n161:13390] Signal code: Address not mapped (1)
> > > [n161:13390] Failing at address: (nil)
> > > [n161:13390] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> > > [n161:13390] [ 1]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so [0x2a9830f8b4]
> > > [n161:13390] [ 2]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so [0x2a983109e3]
> > > [n161:13390] [ 3]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> > > [0x2a9830fc50]
> > > [n161:13390] [ 4]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> > > [0x2a97fce447]
> > > [n161:13390] [ 5]
> > > /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> > > [0x2a958fc343]
> > > [n161:13390] [ 6]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> > > [0x2a962e9e22]
> > > [n161:13390] [ 7]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> > > [0x2a962f1aab]
> > > [n161:13390] [ 8]
> > > /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> > > [0x2a9579d243]
> > > [n161:13390] [ 9]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x2f3)
> > > [0x2a96508c8f]
> > > [n161:13390] [10]
> > > /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x425)
> > > [0x2a957c391d]
> > > [n161:13390] [11]
> > > /home/mukesh/openmpi/prefix/lib/libmpi.so.0(ompi_mpi_init+0xa1e)
> > > [0x2a9559f042]
> > > [n161:13390] [12]
> > > /home/mukesh/openmpi/prefix/lib/libmpi.so.0(PMPI_Init_thread+0xcb)
> > > [0x2a955e1c5b]
> > > [n161:13390] [13] IMB-MPI1(main+0x33) [0x403543]
> > > [n161:13390] [14] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> > > [0x399e11c3fb]
> > > [n161:13390] [15] IMB-MPI1 [0x40347a]
> > > [n161:13390] *** End of error message ***
> > > [n161:13391] *** Process received signal ***
> >

Re: [OMPI devel] forgetting to run ./autogen.sh should not be fatal

2008-05-01 Thread Jeff Squyres

Done -- thanks!

(config commit coming tonight to avoid US workday hours)


On Apr 29, 2008, at 2:45 PM, Ralf Wildenhues wrote:


Hello,

I just forgot to run ./autogen.sh after svn update.  It caused aclocal
to warn about missing libtool macros, and automake to fail later.  The
following change to Makefile.am fixes this by allowing aclocal to find
config/libtool.m4 and the other libtool macro files.

The ompi_functions.m4 change fixes a trivial unnecessary escaping.

Cheers,
Ralf

Index: Makefile.am
===
--- Makefile.am (revision 18324)
+++ Makefile.am (working copy)
@@ -24,3 +24,5 @@

dist-hook:
	csh "$(top_srcdir)/config/distscript.csh" "$(top_srcdir)" "$(distdir)" "$(OMPI_VERSION)" "$(OMPI_SVN_R)"

+
+ACLOCAL_AMFLAGS = -I config
Index: config/ompi_functions.m4
===
--- config/ompi_functions.m4(revision 18324)
+++ config/ompi_functions.m4(working copy)
@@ -132,7 +132,7 @@
echo installing to directory \"$prefix\"
;;
  *)
-AC_MSG_ERROR(prefix \"$prefix\" must be an absolute directory path)
+AC_MSG_ERROR(prefix "$prefix" must be an absolute directory path)
;;
esac

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Build failure on FreeBSD 7

2008-05-01 Thread Brad Penoff
I believe Karol's patch in the original mail in this thread adds the
appropriate headers for openpty to be resolved when --enable-picky is
supplied.  Without --enable-picky, it's able to resolve it too, as the
code is.  However, even if it compiles, the call to kevent (line 177
of opal/event/kqueue.c) still hangs, so this is more of the mystery...

Would giving you access to a FreeBSD 7 machine be useful?  Contact me
off the list, if so and we'll try to sort something out.  Or if you
have any patches/suggestions you'd like to try to fix this, I could
run them myself and let you know.

Thanks,
brad
