Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Ralph Castain
On 6/19/08 3:31 PM, "Jeff Squyres" wrote: > Yo Ralph -- > > Is the "bad" grpcomm component both new and the default? Further, is > the old "basic" grpcomm component now the non-default / testing > component? Yes to both > > If so, I wonder if what happened was that

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Jeff Squyres
Yo Ralph -- Is the "bad" grpcomm component both new and the default? Further, is the old "basic" grpcomm component now the non-default / testing component? If so, I wonder if what happened was that Pasha did an "svn up", but without re-running autogen/configure, he wouldn't have seen

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Pavel Shamis (Pasha)
I did a fresh checkout and everything works well. So it looks like some svn up screwed up my svn checkout. Ralph, thanks for the help! Ralph H Castain wrote: Hmmm...something isn't right, Pasha. There is simply no way you should be encountering this error. You are picking up the wrong grpcomm module. I went ahead

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Ralph H Castain
Hmmm...something isn't right, Pasha. There is simply no way you should be encountering this error. You are picking up the wrong grpcomm module. I went ahead and fixed the grpcomm/basic module, but as I note in the commit message, that is now an experimental area. The grpcomm/bad module is the

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Pavel Shamis (Pasha)
Ralph H Castain wrote: Ha! I found it - you left out one very important detail. You are specifying the use of the grpcomm basic module instead of the default "bad" one. Hmm, I did not specify any "grpcomm" module. I just checked and that module is indeed showing a problem. I'll see what

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Pavel Shamis (Pasha)
Ralph H Castain wrote: I can't find anything wrong so far. I'm waiting in a queue on Odin to try there since Jeff indicated you are using rsh as a launcher, and that's the only access I have to such an environment. Guess Odin is being pounded because the queue isn't going anywhere. I use

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Ralph H Castain
Ha! I found it - you left out one very important detail. You are specifying the use of the grpcomm basic module instead of the default "bad" one. I just checked and that module is indeed showing a problem. I'll see what I can do. For now, though, just use the default grpcomm and it will work

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Ralph H Castain
I can't find anything wrong so far. I'm waiting in a queue on Odin to try there since Jeff indicated you are using rsh as a launcher, and that's the only access I have to such an environment. Guess Odin is being pounded because the queue isn't going anywhere. Meantime, I'm building on RoadRunner

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Pavel Shamis (Pasha)
You'll have to tell us something more than that, Pasha. What kind of environment, what rev level were you at, etc. Ahh, sorry :) I run on Linux x86_64, SLES 10 SP1, (Open MPI) 1.3a1r18682M, OFED 1.3.1. Pasha. So far as I know, the trunk is fine. On 6/19/08 12:01 PM, "Pavel Shamis (Pasha)"

Re: [OMPI devel] RML Send

2008-06-19 Thread Ralph H Castain
Okay, I've traced this down. The problem is that a DSS-internal function has been exposed via the API, so now people can mistakenly call the wrong one. You should -never- be using opal_dss.pack_buffer or opal_dss.unpack_buffer. Those were supposed to be internal to the DSS only, and will

Re: [OMPI devel] Is trunk broken ?

2008-06-19 Thread Ralph H Castain
You'll have to tell us something more than that, Pasha. What kind of environment, what rev level were you at, etc. So far as I know, the trunk is fine. On 6/19/08 12:01 PM, "Pavel Shamis (Pasha)" wrote: > I tried to run trunk on my machines and I got the following error: >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r18677

2008-06-19 Thread Ralph H Castain
I would argue that this behavior is in fact consistent - the returned state is that all required connections have been opened and is independent of the selected routed module. How that is done is irrelevant to the caller. Each routed module knows precisely what connections are used for its

[OMPI devel] Is trunk broken ?

2008-06-19 Thread Pavel Shamis (Pasha)
I tried to run trunk on my machines and I got the following error: [sw214:04367] [[16563,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file base/grpcomm_base_modex.c at line 451 [sw214:04367] [[16563,1],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file

Re: [OMPI devel] autogen error

2008-06-19 Thread Jeff Squyres
Will do. And with some off-list mails to Leonardo, it seems that the env variable GREP_COLORS was the culprit. On Jun 19, 2008, at 12:01 PM, Ralf Wildenhues wrote: * Jeff Squyres wrote on Thu, Jun 19, 2008 at 05:50:43PM CEST: Ralf: if it's more correct to also quote the m4_define first
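A minimal sketch of a workaround for the case reported above, assuming the generated mca_no_configure_components.m4 was corrupted by color escape sequences that grep emitted because of settings in the user's environment. The helper name run_without_grep_colors is hypothetical and is not part of autogen.sh; it only runs a command in a subshell with grep's color-related variables unset, so the caller's environment is left unchanged.

```shell
# Hypothetical helper (not part of Open MPI): run a command with grep's
# color-related environment variables removed, so that any files the
# command generates contain no terminal escape codes.
run_without_grep_colors() {
    (
        # The subshell keeps the unset local to this one invocation.
        unset GREP_COLORS GREP_COLOR GREP_OPTIONS
        "$@"
    )
}

# Example use, assuming the current directory is the top of a checkout:
#   run_without_grep_colors ./autogen.sh
```

If the error disappears when autogen is run this way, the environment is the likely cause rather than the tool versions.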

Re: [OMPI devel] autogen error

2008-06-19 Thread Jeff Squyres
Ah! Looks like your "ls" must be aliased to include colors or somesuch. So I think the real culprit here is that we need to ensure to use an unaliased "ls" when getting the list of components. I can fix up autogen to do this. Ralf: if it's more correct to also quote the m4_define first

Re: [OMPI devel] autogen error

2008-06-19 Thread Leonardo Fialho
Hi Ralf, $ aclocal -I config /usr/local/bin/m4:config/mca_no_configure_components.m4:9: ERROR: end of file in string autom4te: /usr/local/bin/m4 failed with exit status: 1 aclocal: autom4te failed with exit status: 1 $ My line 9 has some extra characters (I'm not an m4 expert...):

Re: [OMPI devel] autogen error

2008-06-19 Thread Jeff Squyres
Interesting! I'm happy to make the change, but can you guess as to why this is only biting Leonardo, and only now (after literally years of being underquoted)? On Jun 19, 2008, at 11:29 AM, Ralf Wildenhues wrote: Hello Leonardo, * Leonardo Fialho wrote on Thu, Jun 19, 2008 at

Re: [OMPI devel] autogen error

2008-06-19 Thread Ralf Wildenhues
Hello Leonardo, * Leonardo Fialho wrote on Thu, Jun 19, 2008 at 04:29:30PM CEST: > [Running] aclocal -I config > /usr/local/bin/m4:config/mca_no_configure_components.m4:9: ERROR: end of > file in string > autom4te: /usr/local/bin/m4 failed with exit status: 1 > aclocal: autom4te failed with

Re: [OMPI devel] autogen error

2008-06-19 Thread Leonardo Fialho
These are the versions that I'm using: $ aclocal --version aclocal (GNU automake) 1.10.1 ... $ autoheader --version autoheader (GNU Autoconf) 2.62 ... $ autoconf --version autoconf (GNU Autoconf) 2.62 ... $ autom4te --version autom4te (GNU Autoconf) 2.62 ... $ libtoolize --version libtoolize (GNU

Re: [OMPI devel] autogen error

2008-06-19 Thread Leonardo Fialho
Hi Jeff, Yes, with a fresh checkout... well, it could be some error in my aclocal files; I just updated them today, but I think I did it correctly. Leonardo Jeff Squyres wrote: That's a weird one -- that file (mca_no_configure_components.m4) is automatically generated by autogen.sh. I can't

Re: [OMPI devel] autogen error

2008-06-19 Thread Jeff Squyres
That's a weird one -- that file (mca_no_configure_components.m4) is automatically generated by autogen.sh. I can't think offhand of how it could be bogus. If you have a fresh tree checkout and run autogen, is the problem repeatable? On Jun 19, 2008, at 10:29 AM, Leonardo Fialho wrote:

Re: [OMPI devel] RML Send

2008-06-19 Thread Leonardo Fialho
Hi Ralph, My mistake, I'm actually using ORTE_PROC_MY_DAEMON->jobid. I had success using pack_buffer()/unpack_buffer() with the OPAL_BYTE type; something strange occurred when I was using pack()/unpack(). The value of num_bytes increases, for example: I tried to read num_bytes=5, and after an unpack this

[OMPI devel] autogen error

2008-06-19 Thread Leonardo Fialho
Hi All, Does anybody know what this error is? Yes, I think I'm using the latest versions of M4, autoconf, automake and libtool... *** Running GNU tools [Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh [Running] libtoolize --automake --copy --ltdl **

Re: [OMPI devel] MPI_Iprobe and mca_btl_sm_component_progress

2008-06-19 Thread Brian W. Barrett
On Thu, 19 Jun 2008, Terry Dontje wrote: But my concern is not the raw performance of MPI_Iprobe in this case but more of an interaction between MPI and an application. The concern is if it takes 2 MPI_Iprobes to get to the real message (instead of one) then could this induce a

Re: [OMPI devel] MPI_Iprobe and mca_btl_sm_component_progress

2008-06-19 Thread Terry Dontje
George Bosilca wrote: Terry, We had a discussion about this a few weeks ago. I have a version that modifies this behavior (SM progress will not return as long as there are pending acks). There was no benefit from doing so (even if one might think that fewer calls to opal_progress might improve