[OMPI devel] rankfile questions

2008-03-18 Thread Jeff Squyres
I notice that rankfile didn't compile properly on some platforms and  
issued warnings on other platforms.  Thanks to Ralph for cleaning it  
up...


1. I see a getenv("slot_list") in the MPI side of the code; it looks  
like $slot_list is set by the odls for the MPI process.  Why isn't it  
an MCA parameter?  That's what all other values passed by the orted to  
the MPI process appear to be.


2. I see that ompi_mpi_params.c is now registering 2 rmaps-level MCA  
parameters.  Why?  Shouldn't these be in ORTE somewhere?


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] rankfile questions

2008-03-18 Thread Jeff Squyres

On Mar 18, 2008, at 9:32 AM, Jeff Squyres wrote:


I notice that rankfile didn't compile properly on some platforms and
issued warnings on other platforms.  Thanks to Ralph for cleaning it
up...

1. I see a getenv("slot_list") in the MPI side of the code; it looks
like $slot_list is set by the odls for the MPI process.  Why isn't it
an MCA parameter?  That's what all other values passed by the orted to
the MPI process appear to be.

2. I see that ompi_mpi_params.c is now registering 2 rmaps-level MCA
parameters.  Why?  Shouldn't these be in ORTE somewhere?



A few more notes:

3. Most of the files in orte/mca/rmaps/rankfile do not obey the prefix  
rule.  I think that they should be renamed.


4. A quick look through rankfile_lex.l seems to show that there are  
global variables that are not protected by the prefix rule (or  
static).  Ditto in rmaps_rf.c.  These should be fixed.


5. rank_file_done was instantiated in both rankfile_lex.l and
rmaps_rf.c (causing a duplicate symbol linker error on OS X).  I
removed it from rmaps_rf.c (it was declared "extern" in
rankfile_lex.h, presumably to indicate that it is "owned" by the lex.l
file...?).


6. svn:ignore was not set in the new rankfile directory.
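
To illustrate items 3 through 5, here is a minimal single-file sketch of the usual C pattern: the header carries only an "extern" declaration, exactly one source file defines the symbol, and file-local state is marked "static". The prefixed names and the line counter are assumptions made up for this example; the thread does not say what prefix the rankfile component should use.

```c
#include <stdbool.h>

/* What a shared header would carry: a declaration only, no storage.
 * The orte_rmaps_ prefix is a hypothetical choice for this sketch. */
extern bool orte_rmaps_rank_file_done;

/* Exactly one source file provides the definition. A second definition
 * in another file is what produces a duplicate symbol error at link time. */
bool orte_rmaps_rank_file_done = false;

/* File-local state stays out of the global namespace via static.
 * This counter is hypothetical, not taken from rankfile_lex.l. */
static int rank_file_line = 1;

/* Accessor so that other files never touch the static variable directly.
 * Returns the current line number, then advances it. */
int orte_rmaps_rank_file_next_line(void)
{
    return rank_file_line++;
}
```

With this layout, another source file can include the header and read orte_rmaps_rank_file_done without defining it again.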

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] xensocket btl and migration

2008-03-18 Thread Josh Hursey

Muhammad,

With regard to your question on migration you will likely have to  
reload the BTL components when a migration occurs. Open MPI currently  
assumes that once the set of BTLs are decided upon in a process they  
are to be used until the application completes. There is some limited  
support for failover in which if one BTL 'fails' then it is  
disregarded and a previously defined alternative path is used. For  
example if between two peers Open MPI has the choice of using tcp or  
openib then it will use openib. If openib were to fail during the  
running of the job then it may be possible for Open MPI to fail over  
and use just tcp. I'm not sure how well tested this ability is, others  
can comment if you are interested in this.


However, failover is not really what you are looking for. What it seems
you are looking for is the ability to tell two processes that they
should no longer communicate over tcp, but continue communication over
xensockets, or vice versa. One technique would be, upon migration, to
unload the BTLs (component_close), then reopen (component_open) and
reselect (component_select) them, and then re-exchange the modex; the
processes should then settle into the new configuration. You will have
to make sure that any state Open MPI has cached, such as network
addresses and node name data, is refreshed upon restart. Take a look at
the checkpoint/restart logic for how I do this in the code base
([opal|orte|ompi]/runtime/*_cr.c).
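
The sequence described above can be sketched as a small driver that runs the steps in order and stops at the first failure. This is only an illustration under stated assumptions: the hook table, the function names, and the position of the cache refresh in the sequence are all hypothetical, and none of this is Open MPI API. The real logic lives in the *_cr.c files mentioned above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical hook table for this sketch; it is not an Open MPI type. */
typedef struct {
    int (*component_close)(void);      /* unload the current BTLs */
    int (*refresh_cached_state)(void); /* drop stale addresses / node names */
    int (*component_open)(void);       /* reopen the BTL components */
    int (*component_select)(void);     /* pick BTLs for the new location */
    int (*modex_exchange)(void);       /* re-exchange addresses with peers */
} migration_hooks_t;

/* Run the steps in order and stop at the first failure.
 * Returns 0 on success, or the 1-based index of the step that failed. */
int reload_btls_after_migration(const migration_hooks_t *h)
{
    int (*steps[])(void) = {
        h->component_close,
        h->refresh_cached_state,
        h->component_open,
        h->component_select,
        h->modex_exchange,
    };
    for (size_t i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
        if (steps[i] == NULL || steps[i]() != 0) {
            return (int)(i + 1);
        }
    }
    return 0;
}

/* Stub hooks that only record the call order, to exercise the driver. */
static char demo_trace[128];
static int demo_close(void)   { strcat(demo_trace, "close ");   return 0; }
static int demo_refresh(void) { strcat(demo_trace, "refresh "); return 0; }
static int demo_open(void)    { strcat(demo_trace, "open ");    return 0; }
static int demo_select(void)  { strcat(demo_trace, "select ");  return 0; }
static int demo_modex(void)   { strcat(demo_trace, "modex ");   return 0; }

const migration_hooks_t demo_hooks = {
    demo_close, demo_refresh, demo_open, demo_select, demo_modex
};
```

Calling reload_btls_after_migration(&demo_hooks) walks all five stubs. A real implementation would also have to deal with in-flight messages and the race conditions between peers.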


It is likely that there is another, more efficient method but I don't  
have anything to point you to at the moment. One idea would be to add  
a refresh function to the modex which would force the reexchange of a
single process's address set. There are a slew of problems with this
that you will have to overcome including race conditions, but I think  
they can be surmounted.


I'd be interested in hearing your experiences implementing this in  
Open MPI. Let me know if I can be of any more help.


Cheers,
Josh

On Mar 9, 2008, at 6:13 AM, Muhammad Atif wrote:

Okay guys.. with all your support and help in understanding ompi  
architecture, I was able to get Xensocket to work.  Only minor  
changes to the xensocket kernel module made it compatible with  
libevent. I am getting results which are bad but I am sure, I have  
to cleanup the code. At least my results have improved over native  
netfront-netback of xen for messages of size larger than 1 MB.


I started with making minor changes in the TCP btl, but it seems it  
is not the best way, as changes are quite huge and it is better to  
have separate dedicated btl for xensockets. As you guys might be  
aware Xen supports live migration, now I have one stupid question.  
My knowledge so far suggests that btl component is initialized only  
once. The scenario here is that my guest OS is migrated from one
physical node to another, and it realizes that the communicating
processes are now on one physical host, so they should abandon use
of the TCP btl and make use of the Xensocket btl. I am sure it would not
happen out of the box, but is it possible without making heavy  
changes in the openmpi architecture?
With the current design, i am running a mix of tcp and xensocket  
btls, and endpoints check periodically if they are on same physical  
host or not. This has quite a big penalty in terms of time.


Another question (good thing I am using email, otherwise you guys
would beat the hell outta me; it's such a basic question): I am not
able to track the MPI_Recv(...) API call and similar calls. Once in
the code of MPI_Recv(..) we give a call to rc =  
MCA_PML_CALL(recv(buf, count ... ). This call goes to the macro, and  
pml.recv(..) gets invoked (mca_pml_base_module_recv_fn_t  
pml_recv;) . Where can I find the actual function? I get totally  
lost when trying to pinpoint what exactly is happening. Basically, I  
am looking for a place where tcp btl recv is getting called with all  
the goodies and  parameters which were passed by the MPI programmer.  
I hope I have made my question understandable.
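
The dispatch pattern described in this question can be sketched in plain C: a global module struct holds function pointers, a macro pastes the operation name onto the field prefix, and a selection step decides which function the pointer refers to. Everything below uses a demo_ or DEMO_ prefix because these are simplified stand-ins written for this example, not the real Open MPI declarations.

```c
#include <stddef.h>

/* Simplified stand-in for a PML receive function pointer type. */
typedef int (*demo_pml_recv_fn_t)(void *buf, size_t count, int src, int tag);

typedef struct {
    demo_pml_recv_fn_t pml_recv;  /* filled in by whichever PML is selected */
} demo_pml_module_t;

demo_pml_module_t demo_pml;       /* one global module, set at selection time */

/* The macro pastes the operation name onto the struct field prefix, so
 * DEMO_PML_CALL(recv(a, b, c, d)) expands to demo_pml.pml_recv(a, b, c, d). */
#define DEMO_PML_CALL(a) demo_pml.pml_ ## a

/* A hypothetical implementation of recv. A real one would hand the request
 * down to the BTLs; this one only reports how many elements were requested. */
static int demo_impl_recv(void *buf, size_t count, int src, int tag)
{
    (void)buf; (void)src; (void)tag;
    return (int)count;
}

/* Selection step: point the global module at the chosen implementation. */
void demo_pml_select(void)
{
    demo_pml.pml_recv = demo_impl_recv;
}

/* What an MPI_Recv-like wrapper would do with the macro. */
int demo_recv(void *buf, size_t count, int src, int tag)
{
    return DEMO_PML_CALL(recv(buf, count, src, tag));
}
```

Under this pattern, the "actual function" is whatever the selected component assigned to the pointer, so the place to look is the component's module initialization rather than the macro.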


Best Regards,
Muhammad Atif


----- Original Message -----
From: Brian W. Barrett 
To: Open MPI Developers 
Sent: Wednesday, February 6, 2008 2:57:31 AM
Subject: Re: [OMPI devel] xensocket - callbacks through OPAL/libevent

On Mon, 4 Feb 2008, Muhammad Atif wrote:

> I am trying to port xensockets to openmpi. In principle, I have the
> framework and everything, but there seems to be a small issue, I cannot
> get libevent (or OPAL) to give callbacks for receive (or send) for
> xensockets. I have tried to implement native code for xensockets with
> libevent library, again the same issue.  No call backs! . With normal
> sockets, callbacks do come easily.
>
> So question is, do the socket/file descriptors have to have some special
> mechanism attached to them to support callbacks for libevent/opal? i.e
> some structure/magic?. i.e. maybe the developers of xensockets did not
> add that callback

[OMPI devel] 1.2.6 man page fixes: done

2008-03-18 Thread Jeff Squyres

Terry --

Per the teleconf today (I wanted to ensure that some man page fixes  
were included in 1.2.6): I checked SVN; the man pages fixes submitted  
by the Debian OMPI package maintainers were committed to the 1.2  
branch almost a month ago.


So I think we're clear for 1.2.6rc3.

--
Jeff Squyres
Cisco Systems



[OMPI devel] libevent-merge tarball

2008-03-18 Thread Jeff Squyres
Per the RFC posted yesterday, we plan to merge in the new libevent
over this upcoming weekend.  Please test the /tmp-public/libevent-merge
SVN branch!


For convenience, I have posted a tarball from this branch if it would  
make it easier for you to test:


http://www.open-mpi.org/~jsquyres/unofficial/

The SVN branch is:

http://svn.open-mpi.org/svn/ompi/tmp-public/libevent-merge

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Josh Hursey
I'm testing with checkpoint/restart and the new libevent seems to be  
messing up the checkpoints generated by BLCR. I'll be taking a look  
at it over the next couple of days, but just thought I'd let people  
know. Unfortunately I don't have any more details at the moment.


-- Josh

On Mar 17, 2008, at 2:50 PM, Jeff Squyres wrote:


WHAT: Bring new version of libevent to the trunk.

WHY: Newer version, slightly better performance (lower overheads /
lighter weight), properly integrate the use of epoll and other
scalable fd monitoring mechanisms.

WHERE: 98% of the changes are in opal/event; there's a few changes to
configury and one change to the orted.

TIMEOUT: COB, Friday, 21 March 2008

DESCRIPTION:

George/UTK has done the bulk of the work to integrate a new version of
libevent on the following tmp branch:

 https://svn.open-mpi.org/svn/ompi/tmp-public/libevent-merge

** WE WOULD VERY MUCH APPRECIATE IF PEOPLE COULD MTT TEST THIS BRANCH!
**

Cisco ran MTT on this branch on Friday and everything checked out
(i.e., no more failures than on the trunk).  We just made a few more
minor changes today and I'm running MTT again now, but I'm not
expecting any new failures (MTT will take several hours).  We would
like to bring the new libevent in over this upcoming weekend, but
would very much appreciate if others could test on their platforms
(Cisco tests mainly 64 bit RHEL4U4).  This new libevent *should* be a
fairly side-effect free change, but it is possible that since we're
now using epoll and other scalable fd monitoring tools, we'll run into
some unanticipated issues on some platforms.

Here's a consolidated diff if you want to see the changes:

https://svn.open-mpi.org/trac/ompi/changeset?old_path=tmp-public%2Flibevent-merge&old=17846&new_path=trunk&new=17842


Thanks.

--
Jeff Squyres
Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




[OMPI devel] Switching away from SVN?

2008-03-18 Thread Jeff Squyres
It's been loosely proposed that we switch away from SVN into a  
different system.  This probably warrants some discussion to a) figure  
out if we want to move, and b) *if* we want to move, which system  
should we move to?  One system has been proposed: Mercurial -- several  
OMPI developers are using it with good success.  I know that some OMPI  
developers use Git, too.  Are there other systems that we should  
consider?


Primary reasons for doing the switch are:

- distributed repositories are attractive/useful
- git/Mercurial branching and merging are *way* better than SVN
  --> note that SVN v1.5 is supposed to be *much* better than v1.4

Primary reasons for staying with SVN are:

- aside from branching/merging, SVN works pretty well
- branching/merging is not "bad" in SVN (but if you used git/hg, you  
know it can be much, much better)


This is likely not a near-term issue, but we might as well start some  
low-frequency discussions about it.  Several issues would need to be  
figured out if we decide to switch away from SVN:


- integration with trac
- integration with user/account management
- how to import all the SVN history to the new system
- ...and probably others

This might make a good topic for the next post-MPI-Forum meeting in  
Chicago: have someone stand up and give a 30 min overview of each  
system (Mercurial, Git, ...?) and we can have developer-level  
discussions (and hands-on testing) of the various systems to see what  
we like / don't like.


If this sounds like a reasonable idea, let's figure out who wants to  
speak about the systems, etc.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Jeff Squyres

Crud, ok.  Keep us posted.

On Mar 18, 2008, at 4:16 PM, Josh Hursey wrote:


I'm testing with checkpoint/restart and the new libevent seems to be
messing up the checkpoints generated by BLCR. I'll be taking a look
at it over the next couple of days, but just thought I'd let people
know. Unfortunately I don't have any more details at the moment.

-- Josh


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Brian W. Barrett

Jeff / George -

Did you add a way to specify which event modules are used?  Because epoll
pushes the socket list into the kernel, I can see how it would screw up
BLCR.  I bet everything would work if we forced the use of poll / select.
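
The difference being pointed at can be sketched as follows (Linux-specific, since epoll is a Linux interface). With poll(), the set of watched descriptors is an ordinary array in the process's own memory that is handed to the kernel on every call. With epoll, the set is registered once into a kernel object that the process only reaches through another file descriptor. The helper names are made up for this example, and whether this difference is what actually trips up BLCR is the hypothesis under discussion, not something the sketch demonstrates.

```c
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/* poll(): the interest set is a user-space array passed in on each call.
 * Returns 1 if fd is readable right now, 0 otherwise. */
int readable_via_poll(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n = poll(&p, 1, 0);
    return n == 1 && (p.revents & POLLIN) != 0;
}

/* epoll: the interest set is registered into a kernel-side object.
 * Returns 1 if fd is readable right now, 0 if not, -1 on setup failure. */
int readable_via_epoll(int fd)
{
    int ep = epoll_create1(0);
    if (ep < 0) {
        return -1;
    }
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    int ok = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0) {
        struct epoll_event out;
        ok = (epoll_wait(ep, &out, 1, 0) == 1);
    }
    close(ep);  /* the registration disappears with the epoll descriptor */
    return ok;
}
```

Both helpers give the same answer for a pipe; what differs is where the list of watched descriptors lives.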


Brian

On Tue, 18 Mar 2008, Jeff Squyres wrote:


Crud, ok.  Keep us posted.


Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Paul H. Hargrove
If avoiding epoll() makes Josh's problems go away, PLEASE let me know
because that might indicate a deficiency in BLCR that I would want to
address.

-Paul

Brian W. Barrett wrote:
> Jeff / George -
> 
> Did you add a way to specify which event modules are used?  Because epoll 
> pushes the socket list into the kernel, I can see how it would screw up 
> BLCR.  I bet everything would work if we forced the use of poll / select.
> 
> Brian
> 


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Jeff Squyres
George added an MCA parameter for it (opal_event_include is a string  
that can be set to "select" or "poll"), but it has to be set before  
opal_init().


Josh: could you try running with the MCA parameter opal_event_include  
set to "select"?  This would confirm Brian's hypothesis...


Given that opal_init() is the first thing that happens in  
ompi_mpi_init(), this may not be enough -- you could *detect* that we  
can't do BLCR, but this mechanism doesn't allow libmpi to set  
something saying "reset libevent to be able to only use select()."


George -- is that hard to add?  I would imagine that it could be kinda  
difficult to reset libevent after there are already users of it, fd's  
and other events that may have been added, etc...?



On Mar 18, 2008, at 4:29 PM, Brian W. Barrett wrote:


Jeff / George -

Did you add a way to specify which event modules are used?  Because epoll
pushes the socket list into the kernel, I can see how it would screw up
BLCR.  I bet everything would work if we forced the use of poll / select.


Brian


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread George Bosilca
It's like rewriting libevent from scratch. I guess it can be done, but
it will be a long and painful process. How about the following solution:


- the daemons are aware that checkpointing is enabled. They can
set the environment variable that forces opal_event_include
to be set to select.


- as environment variables have a higher priority than the
configuration file, this will work in most cases (except when the user
adds the MCA parameter by hand).


- in the checkpoint/restart code, we can add a test that checks the
value of opal_event_include, prints a message if the value is not
select, and disables the checkpoint/restart functionality.
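
A minimal sketch of the three steps above, assuming the parameter is carried in the environment under the OMPI_MCA_ prefix that Open MPI uses for MCA parameters. The two function names are hypothetical and are not existing Open MPI routines. Passing 0 as the overwrite flag to setenv() is a design choice made for this example, so that a value the user set by hand is left alone and then caught by the check.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Environment form of the opal_event_include MCA parameter. */
#define EVENT_INCLUDE_ENV "OMPI_MCA_opal_event_include"

/* Hypothetical daemon-side step: request a backend for the launched
 * process unless a value is already present in the environment. */
int force_event_backend(const char *backend)
{
    return setenv(EVENT_INCLUDE_ENV, backend, 0);
}

/* Hypothetical check for the checkpoint/restart code: returns 1 if the
 * required backend is in effect, otherwise prints a message and returns 0
 * so that the caller can disable checkpoint/restart. */
int cr_backend_ok(const char *required)
{
    const char *v = getenv(EVENT_INCLUDE_ENV);
    if (v != NULL && strcmp(v, required) == 0) {
        return 1;
    }
    fprintf(stderr,
            "checkpoint/restart disabled: opal_event_include is \"%s\", "
            "need \"%s\"\n", v != NULL ? v : "(unset)", required);
    return 0;
}
```

The required value is passed in as an argument rather than hard-coded, since which backend is safe for checkpointing still has to be confirmed by testing.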


  george.

On Mar 18, 2008, at 4:59 PM, Jeff Squyres wrote:


George added an MCA parameter for it (opal_event_include is a string
that can be set to "select" or "poll"), but it has to be set before
opal_init().

Josh: could you try running with the MCA parameter opal_event_include
set to "select"?  This would confirm Brian's hypothesis...

Given that opal_init() is the first thing that happens in
ompi_mpi_init(), this may not be enough -- you could *detect* that we
can't do BLCR, but this mechanism doesn't allow libmpi to set
something saying "reset libevent to be able to only use select()."

George -- is that hard to add?  I would imagine that it could be kinda
difficult to reset libevent after there are already users of it, fd's
and other events that may have been added, etc...?




Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Josh Hursey

I have some more data from the field.

With "opal_event_include" left unset (the default), BLCR gives me the
following error when trying to restart a 2-process 'noop' MPI
application:


shell$ ompi-restart ompi_global_snapshot_8587.ckpt
Restart failed: Bad file descriptor
Restart failed: Bad file descriptor
shell$


If I set "opal_event_include" to "select" then I get a different  
message, this one from Open MPI:


shell$  ompi-restart ompi_global_snapshot_8543.ckpt
[warn] select: Bad file descriptor
[odin001.cs.indiana.edu:18027] opal_event_base_loop: ompi_evesel->dispatch() failed.
[warn] select: Bad file descriptor
[odin001.cs.indiana.edu:18027] opal_event_base_loop: ompi_evesel->dispatch() failed.
[warn] select: Bad file descriptor
...

This repeats until I kill the restarted job. I've figured out what is
outputting the error message, but I can't say exactly why at the
moment. Still digging.


If I set "opal_event_include" to "poll" then everything is fine. The  
restart works as expected in all scenarios. :)


I'm currently using BLCR 0.6.0 Beta 6 on this machine. I've requested  
that BLCR be upgraded on this machine so I can test the latest  
version to see if the poll/epoll problem persists. I'll work with  
Paul if this turns up anything.


As far as what Open MPI needs to do, I don't think we need to do
anything at the moment. I can add the MCA parameter to the
'ft-enable-cr' AMCA file, which will work as a temporary fix.


Thanks for all your help in tracking this problem.

Cheers,
Josh

On Mar 18, 2008, at 5:19 PM, George Bosilca wrote:

It's like rewriting libevent from scratch. I guess it can be done,
but it will be a long and painful process. How about the following
solution:


- the daemons are aware that checkpointing is enabled. They can
set the environment variable that forces opal_event_include
to be set to select.


- as environment variables have a higher priority than the
configuration file, this will work in most cases (except when the
user adds the MCA parameter by hand).


- in the checkpoint/restart code, we can add a test that checks the
value of opal_event_include, prints a message if the value is not
select, and disables the checkpoint/restart functionality.


  george.


Re: [OMPI devel] Switching away from SVN?

2008-03-18 Thread Roland Dreier
 > It's been loosely proposed that we switch away from SVN into a  
 > different system.  This probably warrants some discussion to a) figure  
 > out if we want to move, and b) *if* we want to move, which system  
 > should we move to?  One system has been proposed: Mercurial -- several
 > OMPI developers are using it with good success.  I know that some OMPI  
 > developers use Git, too.  Are there other systems that we should  
 > consider?

As an ompi bystander, I would strongly endorse a switch away from svn.
I think that git, hg and bzr are all roughly equivalent -- they each
have their enthusiastic partisans, but in reality they're all probably
fine.  And the difference between svn and any of the newer distributed
systems, especially for a big codebase like ompi, is pretty huge.

 > Primary reasons for doing the switch are:
 > 
 > - distributed repositories are attractive/useful
 > - git/Mercurial branching and merging are *way* better than SVN
 >   --> note that SVN v1.5 is supposed to be *much* better than v1.4

Also, svn is much slower for lots of things, to the point where it
becomes a usability issue.  And supporting disconnected operation (aka
"working on a plane") is another really nice bonus.

 > - how to import all the SVN history to the new system

Should not be a big problem -- since svn at least has atomic
changesets, you avoid all the pain of parsing cvs repositories, and
there are fairly mature svn importers for distributed systems.

 - R.


Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Josh Hursey

I found another problem with the libevent branch.

If I set "-mca btl tcp,self" on the command line then I get a segfault
when sending messages > 16 KB. I can try to make a smaller reproducer,
but for now you can use the "progress" or "simple" tests in ompi-tests below:

  https://svn.open-mpi.org/svn/ompi-tests/trunk/iu/ft/correctness

To build:
  shell$ make
To run with failure:
  shell$ mpirun  -np 2 -mca btl tcp,self progress  -s 16 -v 1
To run without failure:
  shell$ mpirun  -np 2 -mca btl tcp,self progress  -s 15 -v 1

This program will display the message "Checkpoint at any time...". If
you send mpirun SIGUSR2 it will progress to the next stage of the
test. However, the failure occurs on the first message, before any of
this becomes an issue.


I was using Odin, and if I do not specify the btls then the test will  
pass as normal.


The backtrace is below:
--
...
Core was generated by `progress -s 16 -v 1'.
Program terminated with signal 11, Segmentation fault.
#0  0x002a9793318b in mca_bml_base_free (bml_btl=0x736275705f61636d, des=0x559700)
    at ../../../../ompi/mca/bml/bml.h:267
267 bml_btl->btl_free( bml_btl->btl, des );
(gdb) bt
#0  0x002a9793318b in mca_bml_base_free (bml_btl=0x736275705f61636d, des=0x559700)
    at ../../../../ompi/mca/bml/bml.h:267
#1  0x002a9793304d in mca_pml_ob1_put_completion (btl=0x5598c0, ep=0x0, des=0x559700, status=0)
    at pml_ob1_recvreq.c:190
#2  0x002a97930069 in mca_pml_ob1_recv_frag_callback (btl=0x5598c0, tag=64 '@', des=0x2a989d2b00, cbdata=0x0)
    at pml_ob1_recvfrag.c:149
#3  0x002a97d5f3e0 in mca_btl_tcp_endpoint_recv_handler (sd=10, flags=2, user=0x5a5df0)
    at btl_tcp_endpoint.c:696
#4  0x002a95a0ab93 in event_process_active (base=0x508c80) at event.c:591
#5  0x002a95a0af59 in opal_event_base_loop (base=0x508c80, flags=2) at event.c:763
#6  0x002a95a0ad2b in opal_event_loop (flags=2) at event.c:670
#7  0x002a959fadf8 in opal_progress () at runtime/opal_progress.c:169
#8  0x002a9792caae in opal_condition_wait (c=0x2a9587d940, m=0x2a9587d9c0)
    at ../../../../opal/threads/condition.h:93
#9  0x002a9792c9dd in ompi_request_wait_completion (req=0x5a5380)
    at ../../../../ompi/request/request.h:381
#10 0x002a9792c920 in mca_pml_ob1_recv (addr=0x5baf70, count=16384, datatype=0x503770, src=1, tag=1001, comm=0x5039a0, status=0x0)
    at pml_ob1_irecv.c:104
#11 0x002a956f1f00 in PMPI_Recv (buf=0x5baf70, count=16384, type=0x503770, source=1, tag=1001, comm=0x5039a0, status=0x0)
    at precv.c:75
#12 0x0040211f in exchange_stage1 (ckpt_num=1) at progress.c:414
#13 0x00401295 in main (argc=5, argv=0x7fbfffe668) at progress.c:131

(gdb) p bml_btl
$1 = (mca_bml_base_btl_t *) 0x736275705f61636d
(gdb) p *bml_btl
Cannot access memory at address 0x736275705f61636d
--
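
A side note on the bad bml_btl value in the backtrace: every byte of
0x736275705f61636d falls in the printable ASCII range, which can be a hint
that string data ended up where a pointer was expected. The sketch below
decodes it, assuming a little-endian 64-bit machine (an assumption; the
thread does not state the architecture). The helper name pointer_as_ascii
is made up for illustration. Nothing in this thread confirms where the
string came from, so treat the result as a lead, not a diagnosis.

```python
# The faulting pointer value printed by gdb for bml_btl.
BAD_PTR = 0x736275705f61636d


def pointer_as_ascii(value: int, width: int = 8) -> str:
    """Lay out a pointer-sized integer as little-endian bytes and render
    printable ASCII characters; anything else is shown as '.'.

    Little-endian byte order is an assumption about the crashing host.
    """
    raw = value.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


decoded = pointer_as_ascii(BAD_PTR)
print(decoded)  # prints "mca_pubs"
```

The decoded bytes read like the start of an MCA-style identifier, which
would be consistent with (but does not prove) a string overwriting the
field that bml_btl was loaded from.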

-- Josh

On Mar 17, 2008, at 2:50 PM, Jeff Squyres wrote:


WHAT: Bring new version of libevent to the trunk.

WHY: Newer version, slightly better performance (lower overheads /
lighter weight), properly integrate the use of epoll and other
scalable fd monitoring mechanisms.

WHERE: 98% of the changes are in opal/event; there are a few changes to
configury and one change to the orted.

TIMEOUT: COB, Friday, 21 March 2008

DESCRIPTION:

George/UTK has done the bulk of the work to integrate a new version of
libevent on the following tmp branch:

 https://svn.open-mpi.org/svn/ompi/tmp-public/libevent-merge

** WE WOULD VERY MUCH APPRECIATE IF PEOPLE COULD MTT TEST THIS BRANCH! **

Cisco ran MTT on this branch on Friday and everything checked out
(i.e., no more failures than on the trunk).  We just made a few more
minor changes today and I'm running MTT again now, but I'm not
expecting any new failures (MTT will take several hours).  We would
like to bring the new libevent in over this upcoming weekend, but
would very much appreciate if others could test on their platforms
(Cisco tests mainly 64 bit RHEL4U4).  This new libevent *should* be a
fairly side-effect free change, but it is possible that since we're
now using epoll and other scalable fd monitoring tools, we'll run into
some unanticipated issues on some platforms.

Here's a consolidated diff if you want to see the changes:

https://svn.open-mpi.org/trac/ompi/changeset?old_path=tmp-public%2Flibevent-merge&old=17846&new_path=trunk&new=17842


Thanks.

--
Jeff Squyres
Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread George Bosilca

This has been fixed in the trunk, but not yet merged in the branch.

  george.

On Mar 18, 2008, at 7:17 PM, Josh Hursey wrote:


[snip]





Re: [OMPI devel] Switching away from SVN?

2008-03-18 Thread Jeff Squyres

On Mar 18, 2008, at 7:02 PM, Roland Dreier wrote:


>> Primary reasons for doing the switch are:
>>
>> - distributed repositories are attractive/useful
>> - git/Mercurial branching and merging are *way* better than SVN
>>   --> note that SVN v1.5 is supposed to be *much* better than v1.4
>
> Also, svn is much slower for lots of things, to the point where it
> becomes a usability issue.  And supporting disconnected operation (aka
> "working on a plane") is another really nice bonus.

This is a good point - I've [briefly] used both git and Mercurial; part
of their "*way* better support for branching and merging" is speed.  A
goodly-sized merge in SVN can take an hour or more.  I've done
goodly-sized merges in git and hg in seconds (or minutes).

>> - how to import all the SVN history to the new system
>
> Should not be a big problem -- since svn at least has atomic
> changesets, you avoid all the pain of parsing cvs repositories, and
> there are fairly mature svn importers for distributed systems.



Agreed -- I'm sure it *can* be done; we just have to spend a few  
cycles to figure out how to do it properly.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Paul H. Hargrove
After taking a look at how epoll is implemented in the Linux kernel, I
can say with 100% certainty that BLCR will not restore the epoll fd
correctly.  I hope to fix that eventually, but have too many other
things on my plate to address it now.

Since I cannot promise how soon BLCR may be able to resolve this
problem, I suggest that Josh continue exploring the alternatives.  At
least "opal_event_include" set to "poll" appears to work.  It is not
clear to me if the "select" problem is related to BLCR or not.
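
For anyone who wants to see why epoll is the odd one out here, the
following is a rough sketch and not anything taken from libevent or BLCR.
It uses Python's standard select module only to show that an epoll
instance is itself a file descriptor backed by kernel-side state, while a
poll object has no descriptor of its own (the interest list is handed to
the kernel on each call). The function name describe_backends is made up
for illustration, and the epoll part only runs where the platform
provides it.

```python
import os
import select


def describe_backends():
    """Report whether poll and epoll objects own a descriptor of their own."""
    read_fd, write_fd = os.pipe()
    info = {}
    try:
        if hasattr(select, "poll"):
            # poll: the registered fds are kept in user space and passed
            # to the kernel on every poll() call; no extra fd exists.
            poller = select.poll()
            poller.register(read_fd, select.POLLIN)
            info["poll_has_fd"] = hasattr(poller, "fileno")

        if hasattr(select, "epoll"):
            # epoll: creating the instance allocates a new fd, and the
            # interest list lives behind that fd inside the kernel.
            ep = select.epoll()
            try:
                ep.register(read_fd, select.EPOLLIN)
                info["epoll_fd"] = ep.fileno()
            finally:
                ep.close()
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return info


info = describe_backends()
print(info)
```

A checkpointer therefore has to save and rebuild the state behind the
epoll fd, whereas with poll there is nothing extra to restore. Selecting
poll should just be a matter of passing "-mca opal_event_include poll"
on the mpirun command line.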

I am guessing that I don't get a say as to whether the BLCR/epoll
problems should delay the libevent merge, but I trust the rest of you to
determine what is in the best interest of OMPI.

-Paul

Josh Hursey wrote:
> I have some more data from the field.
> 
> Leaving "opal_event_include" unset (Default) BLCR would give me the  
> following error when trying to restart a 2 process 'noop' MPI  
> application:
> 
> shell$ ompi-restart ompi_global_snapshot_8587.ckpt
> Restart failed: Bad file descriptor
> Restart failed: Bad file descriptor
> shell$
> 
[snip]

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread Jeff Squyres
When did you fix it?  I merged the trunk down to the libevent-merge  
branch late this afternoon (r17869).



On Mar 18, 2008, at 7:29 PM, George Bosilca wrote:


This has been fixed in the trunk, but not yet merged in the branch.

 george.

On Mar 18, 2008, at 7:17 PM, Josh Hursey wrote:


[snip]

Re: [OMPI devel] rankfile questions

2008-03-18 Thread Ralph Castain
Not trying to pile on here...but I do have a question.

This commit inserted a bunch of affinity-specific code in ompi_mpi_init.c.
Was this truly necessary?

It seems to me this violates our code architecture. Affinity-specific code
belongs in the opal_p[m]affinity functions. Why aren't we just calling an
"opal_paffinity_set_my_processor" function (or whatever name you like) in
mpi_init, and doing all this paffinity stuff there?

It would make mpi_init a lot cleaner, and preserve the code standards we
have had since the beginning.

In addition, the code that has been added returns ORTE error and success
codes. Given the location, it should be OMPI error and success codes - if we
move it to where I think it belongs (in OPAL), then those codes should
obviously be OPAL codes.

If I'm missing some reason why these things can't be done, please enlighten
me. Otherwise, it would be nice if this could be cleaned up.

Thanks
Ralph

On 3/18/08 8:39 AM, "Jeff Squyres"  wrote:

> On Mar 18, 2008, at 9:32 AM, Jeff Squyres wrote:
> 
>> I notice that rankfile didn't compile properly on some platforms and
>> issued warnings on other platforms.  Thanks to Ralph for cleaning it
>> up...
>> 
>> 1. I see a getenv("slot_list") in the MPI side of the code; it looks
>> like $slot_list is set by the odls for the MPI process.  Why isn't it
>> an MCA parameter?  That's what all other values passed by the orted to
>> the MPI process appear to be.
>> 
>> 2. I see that ompi_mpi_params.c is now registering 2 rmaps-level MCA
>> parameters.  Why?  Shouldn't these be in ORTE somewhere?
> 
> 
> A few more notes:
> 
> 3. Most of the files in orte/mca/rmaps/rankfile do not obey the prefix
> rule.  I think that they should be renamed.
> 
> 4. A quick look through rankfile_lex.l seems to show that there are
> global variables that are not protected by the prefix rule (or
> static).  Ditto in rmaps_rf.c.  These should be fixed.
> 
> 5. rank_file_done was instantiated in both rankfile_lex.l and
> rmaps_rf.c (causing a duplicate symbol linker error on OS X).  I
> removed it from rmaps_rf.c (it was declared "extern" in
> rankfile_lex.h, assumedly to indicate that it is "owned" by the lex.l
> file...?).
> 
> 6. svn:ignore was not set in the new rankfile directory.




Re: [OMPI devel] RFC: libevent update

2008-03-18 Thread George Bosilca

Commit 17872 is the one you're looking for.

https://svn.open-mpi.org/trac/ompi/changeset/17872

george.

On Mar 18, 2008, at 9:12 PM, Jeff Squyres wrote:


When did you fix it?  I merged the trunk down to the libevent-merge
branch late this afternoon (r17869).


On Mar 18, 2008, at 7:29 PM, George Bosilca wrote:


This has been fixed in the trunk, but not yet merged in the branch.

george.

On Mar 18, 2008, at 7:17 PM, Josh Hursey wrote:


[snip]