[OMPI devel] Return code and error message problems

2008-03-25 Thread Tim Prins
Hi, Something went wrong last night and all our MTT tests had the following output: [odin005.cs.indiana.edu:28167] [[46567,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 161 -- mpirun was unable

Re: [OMPI devel] Return code and error message problems

2008-03-25 Thread Ralph H Castain
Interesting! I was running it on odin last night until around 11pm your time without problems. I'll take a look On 3/25/08 6:35 AM, "Tim Prins" wrote: > Hi, > > Something went wrong last night and all our MTT tests had the following > output: > [odin005.cs.indiana.edu:28167] [[46567,0],0]

[OMPI devel] iof/libevent failures?

2008-03-25 Thread Tim Prins
Hi everyone, For the last couple nights ALL of our mtt runs have been failing (although the failure is masked because mpirun is returning the wrong error code) with: [odin005.cs.indiana.edu:28167] [[46567,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 161 -

Re: [OMPI devel] iof/libevent failures?

2008-03-25 Thread Jeff Squyres
We're chasing down a problem that we're having on OSX w.r.t. libevent, too -- can you try running with: --mca opal_event_include select and see if that fixes the problem for you? On Mar 25, 2008, at 8:49 AM, Tim Prins wrote: Hi everyone, For the last couple nights ALL of our mtt runs ha

Re: [OMPI devel] iof/libevent failures?

2008-03-25 Thread Tim Prins
I was able to replicate the failure with a debug build by running mpirun through a batch job. I then added the parameter you gave me, and it worked fine with the parameter. Thanks, Tim Jeff Squyres wrote: We're chasing down a problem that we're having on OSX w.r.t. libevent, too -- can you

Re: [OMPI devel] iof/libevent failures?

2008-03-25 Thread Jeff Squyres
Crud, ok. I added this info to https://svn.open-mpi.org/trac/ompi/ticket/1253 ; hopefully we'll resolve it today. I guess people didn't test the libevent-merge branch before we brought it to the trunk. :-( On Mar 25, 2008, at 9:22 AM, Tim Prins wrote: I was able to replicate the failure w

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r17951

2008-03-25 Thread Jeff Squyres
Note: you do *not* need to re-autogen for this commit. On Mar 25, 2008, at 9:41 AM, gship...@osl.iu.edu wrote: Author: gshipman Date: 2008-03-25 09:41:09 EDT (Tue, 25 Mar 2008) New Revision: 17951 URL: https://svn.open-mpi.org/trac/ompi/changeset/17951 Log: need orted_LDFLAGS as a placeholder

[OMPI devel] Coverity results

2008-03-25 Thread Jeff Squyres
Heads up to those examining Coverity results (may not be many at the moment, but we'll likely use them quite a bit during the 1.3 release process)... David Maxwell was finally able to track down a long-standing issue that we've had with their scanner: sometimes, the number of OMPI source

Re: [OMPI devel] iof/libevent failures?

2008-03-25 Thread Ralph H Castain
> I cannot replicate this with a debug build I was doing all my work in a debug build, Tim - may be why I didn't see the problem. There is an issue with libevent right now and pty's/select/event registration. George, Jeff, and I have been chatting about it as it breaks the Mac completely. Will l

[OMPI devel] Open MPI v1.2.6rc3 has been posted

2008-03-25 Thread Tim Mattox
Hi All, The next release candidate (rc3) of Open MPI v1.2.6 is now up: http://www.open-mpi.org/software/ompi/v1.2/ Please run it through its paces as best you can. -- Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/ tmat...@gmail.com || timat...@open-mpi.org I'm a bright... http://www.th

Re: [OMPI devel] iof/libevent failures?

2008-03-25 Thread Jeff Squyres
Further discussion on: https://svn.open-mpi.org/trac/ompi/ticket/1253 On Mar 25, 2008, at 8:54 AM, Ralph H Castain wrote: I cannot replicate this with a debug build I was doing all my work in a debug build, Tim - may be why I didn't see the problem. There is an issue with libevent right n

[OMPI devel] Using coverity results

2008-03-25 Thread Jeff Squyres
I have started checking the Coverity results again. There's a lot of good stuff in there! But there are some false positives, as well. I *strongly* encourage developers to start checking the Coverity results; it takes a little time to dig through the issues that it finds and decide whether

[OMPI devel] 1.2.6 testing

2008-03-25 Thread Jeff Squyres
I mentioned on the call today that there were some bsend errors showing up in MTT on the v1.2 branch. I am now pretty sure that these are some weird artifact of my test environment. I cannot reproduce these errors manually. I did some sanity testing on 1.2.6rc3 today and it looks good to m

Re: [OMPI devel] Using coverity results

2008-03-25 Thread Jeff Squyres
I put up some notes about using the Coverity web tool on the wiki (there's a link from the front page, too): https://svn.open-mpi.org/trac/ompi/wiki/Coverity I also asked Coverity if they could install libibverbs so that we can get scanning of the openib BTL. Let's see what they say; if

Re: [OMPI devel] Proc modex change

2008-03-25 Thread Jeff Squyres
Sounds reasonable to me. Anyone else care? On Mar 20, 2008, at 2:03 PM, Brian W. Barrett wrote: Hi all - Does anyone know why we go through the modex receive and for the local process in ompi_proc_get_info()? It doesn't seem like it's necessary, and it causes some problems on platforms th

Re: [OMPI devel] FreeBSD timer_base_open error?

2008-03-25 Thread Jeff Squyres
"linux" is the name of the component. It looks like opal/mca/timer/linux/timer_linux_component.c is doing some checks during component open() and returning an error if it can't be used (e.g., if it's not on linux). The timer components are a little different than normal MCA frameworks; t

Re: [OMPI devel] FreeBSD timer_base_open error?

2008-03-25 Thread Brian Barrett
On Mar 25, 2008, at 6:16 PM, Jeff Squyres wrote: "linux" is the name of the component. It looks like opal/mca/timer/linux/timer_linux_component.c is doing some checks during component open() and returning an error if it can't be used (e.g., if it's not on linux). The timer components are a lit