Re: [OMPI devel] Mercurial demo OMPI repository

2008-04-29 Thread Ralph Castain
And if you're looking for my stuff, it's at:

http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/hg/

Ralph


On 4/28/08 8:58 PM, "Jeff Squyres"  wrote:

> I believe that the correct URL should not have an extra "/hg/" in
> there after /jsquyres/
> 
> 
> 
> On Apr 28, 2008, at 9:37 PM, Josh Hursey wrote:
> 
>> Hum. So I just tried this and I got:
>> 
>> shell$ hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/hg/ompi-svn-conversion-r17921/
>> destination directory: ompi-svn-conversion-r17921
>> abort: 'http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/hg/ompi-svn-conversion-r17921/' does not appear to be an hg repository!
>> 
>> Any thoughts on why?
>> 
>> Cheers,
>> Josh
>> 
>> On Apr 2, 2008, at 7:26 PM, Jeff Squyres wrote:
>> 
>>> Thanks to the sysadmins at IU, I put up a sample Mercurial OMPI
>>> repository here:
>>> 
>>>    http://www.open-mpi.org/hg/hgwebdir.cgi/
>>> 
>>> I converted the entire SVN ompi repository history (/trunk, /tags,
>>> and /branches only) as of r17921.  Note that it shows some commits on
>>> the 0.9 branch as the most recent activity only because it converts
>>> the branches in reverse order -- the entire trunk is there as of
>>> r17921.
>>> 
>>> You can clone this repository with the following:
>>> 
>>>    hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/hg/ompi-svn-conversion-r17921/
>>> 
>>> Enjoy.
>>> 
>>> -- 
>>> Jeff Squyres
>>> Cisco Systems
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 




Re: [OMPI devel] Mercurial demo OMPI repository

2008-04-29 Thread Josh Hursey

Yay. Looks like that is working.

Thanks,
Josh




Re: [OMPI devel] Mercurial demo OMPI repository

2008-04-29 Thread Pak Lui
I just ran into the same problem as Josh did. I had to take out 'hg' 
from Ralph's hg trunk. Otherwise hg clone wouldn't take it.


http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/trunk



--

- Pak Lui
pak@sun.com


[OMPI devel] forgetting to run ./autogen.sh should not be fatal

2008-04-29 Thread Ralf Wildenhues
Hello,

I just forgot to run ./autogen.sh after svn update.  It caused aclocal
to warn about missing libtool macros, and automake to fail later.  The
following change to Makefile.am fixes this by allowing aclocal to find
config/libtool.m4 and the other libtool macro files.

The ompi_functions.m4 change fixes a trivial unnecessary escaping.

Cheers,
Ralf

Index: Makefile.am
===================================================================
--- Makefile.am	(revision 18324)
+++ Makefile.am	(working copy)
@@ -24,3 +24,5 @@

 dist-hook:
 	csh "$(top_srcdir)/config/distscript.csh" "$(top_srcdir)" "$(distdir)" "$(OMPI_VERSION)" "$(OMPI_SVN_R)"
+
+ACLOCAL_AMFLAGS = -I config
Index: config/ompi_functions.m4
===================================================================
--- config/ompi_functions.m4	(revision 18324)
+++ config/ompi_functions.m4	(working copy)
@@ -132,7 +132,7 @@
 echo installing to directory \"$prefix\" 
 ;;
   *) 
-AC_MSG_ERROR(prefix \"$prefix\" must be an absolute directory path) 
+AC_MSG_ERROR(prefix "$prefix" must be an absolute directory path) 
 ;;
 esac



[OMPI devel] [RFC] mca_base_select()

2008-04-29 Thread Josh Hursey
What:    Add mca_base_select() and adjust frameworks & components to use it.
Why:     Consolidation of code for general goodness.
Where:   https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
When:    Code ready now. Documentation ready soon.
Timeout: May 6, 2008 (After teleconf) [1 week]

Discussion:
---
For a number of years a few developers have been talking about
creating an MCA base component selection function. For various reasons
this was never implemented. Recently I decided to give it a try.

A base select function will allow Open MPI to provide completely
consistent selection behavior for many of its frameworks (18 of 31 to
be exact at the moment). The primary goal of this work is to improve
code maintainability through code reuse. Other benefits also result,
such as a slightly smaller memory footprint.


The mca_base_select() function represents the most commonly used
logic for component selection: Select the one component with the
highest priority and close all of the unselected components. This
function can be found at the path below in the branch:

 opal/mca/base/mca_base_components_select.c

To support this I had to formalize a query() function in the
mca_base_component_t of the form:
  int mca_base_query_component_fn(mca_base_module_t **module, int *priority);


This function is specified after the open and close component
functions in this structure so as to allow compatibility with frameworks
that do not use the base selection logic. Frameworks that do *not* use
this function are *not* affected by this commit. However, every
component in the frameworks that use the mca_base_select function must
adjust its component query function to fit the form specified above.
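
As a rough illustration of the selection logic described above (this is
not the code on the branch), a minimal sketch might look like the
following. The demo_* names, the "closed" flag, the demo components, and
the tie-breaking rule (first component wins on equal priority) are all
assumptions made for the example; only the shape of the query function
follows the form given above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real MCA types; names are illustrative. */
typedef struct demo_module { const char *label; } demo_module_t;

typedef struct demo_component {
    const char *name;
    /* Same shape as the query form above: returns 0 on success and fills
     * in a module and a priority; nonzero means "do not select me". */
    int (*query)(demo_module_t **module, int *priority);
    int (*close)(void);
    int closed;               /* bookkeeping for this demo only */
} demo_component_t;

/* Pick the component reporting the highest priority and close all others.
 * Returns the winner's index, or -1 if no component could be selected. */
static int demo_select(demo_component_t *comps, size_t n,
                       demo_module_t **best_module)
{
    int best = -1;
    int best_pri = 0;

    assert(comps != NULL && best_module != NULL);
    *best_module = NULL;

    /* Pass 1: query every component and remember the best answer. */
    for (size_t i = 0; i < n; ++i) {
        demo_module_t *m = NULL;
        int pri = 0;
        if (comps[i].query == NULL || comps[i].query(&m, &pri) != 0 || m == NULL) {
            continue;         /* component declined to be selected */
        }
        if (best < 0 || pri > best_pri) {   /* strict '>': first one wins ties */
            best = (int)i;
            best_pri = pri;
            *best_module = m;
        }
    }

    /* Pass 2: close everything that was not selected. */
    for (size_t i = 0; i < n; ++i) {
        if ((int)i != best) {
            if (comps[i].close != NULL) {
                comps[i].close();
            }
            comps[i].closed = 1;
        }
    }
    return best;
}

/* Three made-up components: priorities 10 and 50, and one that declines. */
static demo_module_t mod_a = { "a" };
static demo_module_t mod_b = { "b" };
static int query_a(demo_module_t **m, int *p) { *m = &mod_a; *p = 10; return 0; }
static int query_b(demo_module_t **m, int *p) { *m = &mod_b; *p = 50; return 0; }
static int query_none(demo_module_t **m, int *p) { (void)m; (void)p; return -1; }
static int close_noop(void) { return 0; }
```

Calling demo_select() on an array holding those three components would
pick the second one (priority 50) and mark the other two as closed.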


18 frameworks in Open MPI have been changed. On my branch I have updated
all of the components in those 18 frameworks that are available in the
trunk. The affected frameworks are:

 - OPAL carto
 - OPAL crs
 - OPAL maffinity
 - OPAL memchecker
 - OPAL paffinity
 - ORTE errmgr
 - ORTE ess
 - ORTE filem
 - ORTE grpcomm
 - ORTE odls
 - ORTE plm
 - ORTE ras
 - ORTE rmaps
 - ORTE routed
 - ORTE snapc
 - OMPI crcp
 - OMPI dpm
 - OMPI pubsub

There was a question of the memory footprint change as a result of  
this commit. I used 'pmap' to determine process memory footprint of a  
hello world MPI program. Static and Shared build numbers are below  
along with variations on launching locally and to a single node  
allocated by SLURM. All of this was on Indiana University's Odin  
machine. We compare against the trunk (r18276) representing the last  
SVN sync point of the branch.


   Process (shared) | Trunk    | Branch  | Diff (Improvement)
   -----------------+----------+---------+-------------------
   mpirun (orted)   |   39976K |  36828K | 3148K
   hello (0)        |  229288K | 229268K |   20K
   hello (1)        |  229288K | 229268K |   20K
   -----------------+----------+---------+-------------------
   mpirun           |   40032K |  37924K | 2108K
   orted            |   34720K |  34660K |   60K
   hello (0)        |  228404K | 228384K |   20K
   hello (1)        |  228404K | 228384K |   20K

   Process (static) | Trunk    | Branch  | Diff (Improvement)
   -----------------+----------+---------+-------------------
   mpirun (orted)   |   21384K |  21372K |   12K
   hello (0)        |  194000K | 193980K |   20K
   hello (1)        |  194000K | 193980K |   20K
   -----------------+----------+---------+-------------------
   mpirun           |   21384K |  21372K |   12K
   orted            |   21208K |  21196K |   12K
   hello (0)        |  193116K | 193096K |   20K
   hello (1)        |  193116K | 193096K |   20K

As you can see there are some small memory footprint improvements on
my branch that result from this work. The size of the Open MPI project
shrinks a bit as well. This commit cuts between 2,000 and 3,500 lines
of code (depending on how you count), so roughly a 1% code shrink.


The branch is stable in all of the testing I have done, but there are  
some platforms on which I cannot test. So please give this branch a  
try and let me know if you find any problems.


Cheers,
Josh



Re: [OMPI devel] Build failure on FreeBSD 7

2008-04-29 Thread Brad Penoff
hey all,

I was just configuring MTT to run some multihost tests on FreeBSD 7
and I came across the same error you guys did, using the
openmpi-1.3a1r18325.tar.gz trunk nightly tarball:

kqueue.c:165: error: implicit declaration of function 'openpty'

However, this error seems to only come up if I use --enable-picky to
configure.  Getting rid of --enable-picky results in a successful
compilation.  Any idea why that is?  Should this be fixed in the long
term?

For now, I'm just adjusting my MTT runs to not have --enable-picky in
the ompi_configure_arguments...

brad


2008/4/11 George Bosilca :
> It's good that you guys revived this thread; I almost forgot about it.
>
>   The code you're referring to is not part of libevent. It was one of my
> "fixes" for a problem on OS X (where kevent is not able to work nicely
> with pty). It works on Mac because the code triggers an error, so there is
> no need for the timeout ... I'll make the corrections over the weekend.
>
>   Thanks,
> george.
>
>
>
>  On Apr 11, 2008, at 7:39 PM, Karol Mroz wrote:
>
> > Hi, Jeff...
> >
> > This test was performed locally, yes. I'm short on machines at the moment
> > to perform any proper distributed tests.
> >
> > --
> > Karol
> >
> > -----Original Message-----
> > From: Jeff Squyres 
> >
> > Date: Fri, 11 Apr 2008 16:36:33
> > To:Open MPI Developers 
> > Subject: Re: [OMPI devel] Build failure on FreeBSD 7
> >
> >
> > This may depend on how you ran the app on FreeBSD -- did you run on
> > the localhost only?
> >
> > We have/had a problem when running locally with regards to kevent --
> > I'm not 100% sure if we've fixed it yet.  Let me check...
> >
> >
> > On Apr 5, 2008, at 1:53 AM, Karol Mroz wrote:
> >
> > > After digging a little deeper, it turns out that the kevent() call in
> > > opal/event/kqueue.c:
> > >
> > >     if (kevent(kq,
> > >         kqueueop->changes, 1, kqueueop->events, NEVENT, NULL) != 1 ||
> > >         (int)kqueueop->events[0].ident != master ||
> > >         kqueueop->events[0].flags != EV_ERROR) {
> > >
> > > seems to hang in FreeBSD 7. Changing the NULL parameter to, let's say,
> > > 1000, causes the function to return and print out the error message:
> > >
> > >     event_warn("%s: detected broken kqueue (failed delete); not using error %d (%s)",
> > >         __func__, errno, strerror(errno));
> > >
> > > The simple non-blocking send/recv app used to test this then runs to
> > > completion. Compiling Open MPI on Linux and running this same app
> > > produces no errors.
> > >
> > > Any ideas?
> > >
> > > Thanks.
> > > --
> > > Karol
> >
> >
> > --
> > Jeff Squyres
> > Cisco Systems
> >
>


Re: [OMPI devel] Build failure on FreeBSD 7

2008-04-29 Thread Brad Penoff
hey again,

One quick follow-up, as things are still misbehaving...

While removing --enable-picky from the ompi_configure_arguments got MTT
to compile the nightly tarball, no MPI program succeeds (both in MTT
and outside) for any BTL included in the nightly tarball when run on
FreeBSD 7.

I did a quick investigation and it appears as if I'm arriving at the
same thing that Karol did in the Apr 5th email in this thread...

Things hang on the call to kevent on line 177 of opal/event/kqueue.c.

Jeff had replied asking if Karol had only run locally, citing past
problems he'd seen with kevent running locally.  I also tried running
on a remote machine, and it hung in the same way.  George mentioned he
had recently done a fix for an OS X issue.  Just curious, but did you
guys (or anyone else) ever get a chance to cycle back to this?

Thanks!
brad

