Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Ralph Castain
Probably should have injected my earlier thoughts here. IMO, Brian is correct - we do have a basic design problem, and it does need to be corrected. It's a bigger discussion than just this commit, though, and we probably should add it to the Tues telecon so we at least get some initial ideas on

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Ralph Castain
On Aug 8, 2011, at 1:23 PM, Thomas Herault wrote: > proc_get_epoch( proc ); > calls ORTE_NAME_PRINT(proc) > when in debug mode. > > When this is the case, ORTE_NAME_PRINT prints all fields of proc, including > epoch. So, proc->epoch = proc_get_epoch(proc) triggers the valgrind warning, > becau

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Ralph Castain
I will interject at this point in the thread. IMO, Brian has raised a valid point. We are contorting around with the epoch values, as I commented in a separate thread. The problem really stems from something Wes mentioned a while back in a different set of communications. When the epoch code wa

Re: [OMPI devel] bug report-

2011-08-08 Thread Shiqing Fan
Hi, I've never tried this Boost.MPI with Open MPI on Windows. Does it work without the Boost.MPI library? Did you run your test under MinGW? Regards, Shiqing On 2011-08-08 5:31 PM, renyong.yang wrote: Run time environment of mine is Windows 7, with the unstable OpenMPI_v1.5.3-2_win32.exe

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Aurélien Bouteiller
I see no problem with the latest code from Wesley. The warning is silenced, and was not indicative of any buggy behavior. So for me the problem is closed. Aurelien Le 8 août 2011 à 15:30, Barrett, Brian W a écrit : > 4) Don't design an interface where you're going to have the circular > depe

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Barrett, Brian W
4) Don't design an interface where you're going to have the circular dependency. Perhaps it's a sign that the epoch is in the wrong place, or handled in the wrong way, or the whole naming scheme is wrong. I'm not sure. I've relented that it doesn't need to be fixed now (since it is clearly bigge

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Thomas Herault
proc_get_epoch( proc ); calls ORTE_NAME_PRINT(proc) when in debug mode. When this is the case, ORTE_NAME_PRINT prints all fields of proc, including epoch. So, proc->epoch = proc_get_epoch(proc) triggers the valgrind warning, because of the print, in debug mode. This is the only reason why proc->

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Barrett, Brian W
Ok, that makes some sense. It's a really weird way of doing initialization, though. Some of it is probably how Ralph chose to share nidpid map code, so perhaps it's unavoidable. But since the name isn't use in proc_get_epoch() (or, better not be used in there), I'm not entirely sure why that mad

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Wesley Bland
It's not just copying the value from one to the other. The value is initialized on the first line. The proc_name is passed into the function where the jobid and vpid are used (not the epoch). The function returns a new (the correct) value for the epoch based on the passed in jobid and vpid which i

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Barrett, Brian W
On 8/8/11 9:34 AM, "Jeff Squyres" wrote: >On Aug 8, 2011, at 11:30 AM, Wesley Bland wrote: > >> The reason is because valgrind was complaining about uninitialized >>values that were passed into proc_get_epoch. I saw the same warnings >>from valgrind when I ran it. I added the code to initialize t

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Jeff Squyres
On Aug 8, 2011, at 11:30 AM, Wesley Bland wrote: > The reason is because valgrind was complaining about uninitialized values > that were passed into proc_get_epoch. I saw the same warnings from valgrind > when I ran it. I added the code to initialize the values to what really > should be the de

[OMPI devel] bug report-

2011-08-08 Thread renyong.yang
Run time environment of mine is Windows 7, with the unstable OpenMPI_v1.5.3-2_win32.exe release for Windows, together with Microsoft Compute Cluster Pack. Additionally I'm using Boost.MPI library v1.47 compiled by min

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Wesley Bland
The reason is because valgrind was complaining about uninitialized values that were passed into proc_get_epoch. I saw the same warnings from valgrind when I ran it. I added the code to initialize the values to what really should be the default value and the warnings went away. Since the process_nam

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25005

2011-08-08 Thread Jeff Squyres
Mike -- Does mxm_init() do Reasonable Things to check to see if the local OpenFabrics-capable devices are unsuitable for MXM? E.g., does it check to see if the local OpenFabrics devices are MXM-capable, and if not, fail gracefully? Also, I would suggest NOT showing a show_help message if there

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25015

2011-08-08 Thread Jeff Squyres
FYI: Ralph's out today. He'll be back tomorrow. I'm not really part of this ORTE discussion, but I am curious about a code style that I see in this commit: assigning ORTE_EPOCH_INVALID to a field, and then immediately overwriting that field with another value. E.g.: > peer_name.jobid = O

Re: [OMPI devel] Uninitialized ORTE epoch values

2011-08-08 Thread Wesley Bland
Fixed in r25015. On Fri, Aug 5, 2011 at 4:52 PM, Ralph Castain wrote: > Thanks Wes - it isn't the print that's the issue, it's the fact that we > have epochs that aren't being initialized, and what else that may be causing > to have problems. > > > On Aug 5, 2011, at 2:45 PM, Wesley Bland wrote: