Re: [OMPI devel] Adding error/verbose messages to the TCP BTL

2010-03-07 Thread George Bosilca
Then let's just be patient until OPAL_SOS makes it into the trunk, and save 
ourselves the burden of making a large effort twice.

  george.

On Mar 5, 2010, at 22:35 , Ralph Castain wrote:

> 
> On Mar 5, 2010, at 7:22 PM, Jeff Squyres wrote:
> 
>> On Mar 5, 2010, at 6:10 PM, Ralph Castain wrote:
>> 
>>>> I agree with Jeff's comments about the BTL_ERROR. How about a middle 
>>>> ground here? We let the BTLs use BTL_ERROR, eventually with some 
>>>> modifications, and we redirect the BTL_ERROR to a more advanced macro 
>>>> including support for orte_show_help? This will require going over all the 
>>>> BTLs, but on the bright side it will give us 100% consistency in 
>>>> reporting errors.
>>> 
>>> Sounds reasonable to me - I'm happy to help do it, assuming Jeff also 
>>> concurs. I assume we would then replace all the show_help calls as well? 
>>> Otherwise, I'm not sure what we gain as the direct orte_show_help 
>>> dependency will remain. Or are those calls too specialized to be replaced 
>>> with BTL_ERROR?
>> 
>> Should this kind of thing wait for OPAL_SOS?
>> 
>> (I mention this because the OPAL_SOS RFC will be sent to devel Real Soon 
>> Now...)
> 
> Sure - OPAL_SOS will supersede all this anyway.
> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 




Re: [OMPI devel] RFC: Rename --enable-*-threads and ENABLE*THREAD* (take 2)

2010-03-07 Thread George Bosilca
Quick question about this. We now have an OPAL level progress thread, which 
enables the machinery at the OPAL level. Unfortunately, this doesn't say 
anything about what the MPI level will do? Moreover, this is quite confusing as 
there are no communications layers in OPAL so one can ask what an OPAL level 
--enable-progress-thread means.

This raises several related questions. Do you expect to have an ORTE-level 
progress thread even if the MPI level does not have one? I didn't look at the 
code, but I have strong doubts about such a mix-up between thread requirements.

How do we know when MPI needs a progress thread? There is no option for this. 
Or should we define that if MPI_THREAD_MULTIPLE is supported and 
OPAL_PROGRESS_THREAD is enabled this means the BTLs can register their own 
progress thread?

george.


On Mar 4, 2010, at 16:17 , Jeff Squyres wrote:

> WHAT: Rename the --enable-*-threads configure switches and ENABLE*THREAD* 
> macros.
>  (see previous RFC: 
> http://www.open-mpi.org/community/lists/devel/2010/01/7366.php)
> 
> WHY: The fact that thread safety in OPAL and ORTE requires a configure switch 
> with "mpi" in the name is very non-intuitive.  Additionally, 
> MPI_THREAD_MULTIPLE support is not necessarily the same thing as OPAL thread 
> support (MTM needs OPAL thread support, but not the other way around), and we 
> are seeing a growing advantage/need for ORTE to utilize threads in mpirun and 
> orted irrespective of the MPI layer's threading abilities.
> 
> WHERE: Mostly opal/config/opal_config_threads.m4, something new in 
> ompi/config/*.m4, and wherever the current ENABLE*THREAD* macros are 
> currently used in the current code base.
> 
> WHEN: Next Friday COB
> 
> TIMEOUT: COB, Friday, Feb 5, 2010
> 
> 
> 
> More details:
> 
> Cisco is starting to investigate using ORTE and OPAL in various threading 
> scenarios.  The fact that you need to enable thread safety in ORTE/OPAL with 
> a configure switch that has the word "mpi" in it is extremely 
> counter-intuitive (it bit some of our engineers very badly, and they were 
> mighty annoyed!!).  In addition, we ran into problems where it was 
> advantageous to have threads in ORTE, but we couldn't do it without forcing 
> thread support into the MPI layer because the switch is universal.
> 
> Since this functionality actually has nothing to do with MPI (it's actually 
> the other way around -- MPI_THREAD_MULTIPLE needs this functionality), we 
> really should divorce MPI threading functionality from whether threading 
> machinery is enabled in OPAL or not. 
> 
> These names were proposed at the end of the previous RFC and no one objected, 
> so I'm sending this around as a new RFC to ensure we're all on the same sheet 
> of music:
> 
> --enable-opal-progress-threads: enables progress thread machinery in opal
> --> this is just a renaming from --enable-progress-threads
> --> the corresponding #define stays the same: OPAL_ENABLE_PROGRESS_THREADS
> 
> --enable-opal-multi-threads: enables multi threaded machinery in opal
> --> this is just a renaming from --enable-mpi-threads
> --> the corresponding #define also renames; from OPAL_ENABLE_MPI_THREADS to 
> OPAL_ENABLE_MULTI_THREADS
> 
> --enable-mpi-thread-multiple: enables the use of MPI_THREAD_MULTIPLE; *ONLY* 
> affects the MPI layer
> --> use of this switch explicitly implies --enable-opal-multi-threads
> --> new #define: OMPI_ENABLE_THREAD_MULTIPLE
> 
> We can keep and deprecate the old configure options if desired:
> 
> --enable-mpi-threads: deprecated synonym for --enable-mpi-thread-multiple
> --enable-progress-threads: deprecated synonym for 
> --enable-opal-progress-threads
> 
> ..although I'm somewhat inclined to ditch them unless someone has strong 
> feelings about keeping them.
> 
> Doing the name change in OPAL and ORTE is fairly straightforward -- it's 
> essentially an s/foo/bar/g kind of operation.  It'll likely take a little 
> more effort in the MPI layer because the places where the current #defines 
> are used may need to switch to the new name or to the new 
> OMPI_ENABLE_THREAD_MULTIPLE name (and maybe some new logic?  I am not sure 
> without looking into it closer).
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 




Re: [OMPI devel] RFC: Rename --enable-*-threads and ENABLE*THREAD* (take 2)

2010-03-07 Thread Ralph Castain
Those are excellent questions that I have asked as well at various times :-)

Some thoughts below

On Mar 7, 2010, at 1:20 PM, George Bosilca wrote:

> Quick question about this. We now have an OPAL level progress thread, which 
> enables the machinery at the OPAL level. Unfortunately, this doesn't say 
> anything about what the MPI level will do?

That is correct and has always been the case. The OPAL progress thread only 
indicates that opal_progress is being called via a separate thread. Currently, 
turning "on" the opal progress thread automatically turns "on" opal thread 
support and enables MPI thread multiple. However, the BTLs may or may not be 
involved (see below).

With this change, you can turn "on" the opal progress thread and/or the opal 
thread support but not enable MPI thread multiple if you choose not to do so. 
Similarly, if you enable MPI thread multiple you will automatically turn "on" 
the opal thread support, but you will -not- turn "on" the opal progress thread. 
This is required behavior as some (most?) of the BTL's are not safe when opal 
progress thread is active.
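
The option dependencies described above can be sketched as a small C model. This is only an illustration: the struct and function names are made up for this example and are not Open MPI code. The only rule encoded is the one stated above, that enabling MPI thread multiple turns on opal thread support but does not turn on the opal progress thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three switches discussed in the RFC.
 * The struct and function names are illustrative, not Open MPI code. */
struct thread_config {
    bool opal_progress_threads; /* --enable-opal-progress-threads */
    bool opal_multi_threads;    /* --enable-opal-multi-threads */
    bool mpi_thread_multiple;   /* --enable-mpi-thread-multiple */
};

/* Apply the single implication described above:
 * MPI thread multiple requires OPAL thread support.
 * Nothing here ever turns the OPAL progress thread on. */
static struct thread_config resolve_thread_config(struct thread_config requested)
{
    struct thread_config resolved = requested;
    if (resolved.mpi_thread_multiple) {
        resolved.opal_multi_threads = true;
    }
    /* Sanity check: the invariant must hold after resolution. */
    assert(!resolved.mpi_thread_multiple || resolved.opal_multi_threads);
    return resolved;
}
```

Requesting only MPI thread multiple yields a configuration with opal thread support enabled and the progress thread still off, while requesting only the progress thread leaves the MPI layer untouched.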


> Moreover, this is quite confusing as there are no communications layers in 
> OPAL so one can ask what an OPAL level --enable-progress-thread means.

It means that anything involving OPAL events will be progressed. So async 
messages coming into ORTE, for example, can be supported without waiting for 
someone to call into the OMPI library.

My understanding is that the design decision to have a "central" progress 
thread at the OPAL layer was intended to help avoid thread-lock and unnecessary 
overhead caused by having multiple progress threads throughout the code. I'm 
content to let the original design stand for now and address that question 
(i.e., OPAL vs ORTE progress thread) as a separate issue for the future.

This RFC is solely to change the configure option names to remove the badly 
overloaded and confusing --enable-mpi-threads


> 
> This raises several related questions. Do you expect to have an ORTE-level 
> progress thread even if the MPI level does not have one? I didn't look at the 
> code, but I have strong doubts about such a mix-up between thread 
> requirements.

Yep - but as the original RFC discussion explained, not inside MPI apps. The 
desire is to allow mpirun and orted to utilize threads without stipulating that 
they can -only- do so if MPI apps are thread-enabled. The two situations are 
completely orthogonal and should not be connected via the configure options.

> 
> How do we know when MPI needs a progress thread? There is no option for this. 
> Or should we define that if MPI_THREAD_MULTIPLE is supported and 
> OPAL_PROGRESS_THREAD is enabled this means the BTLs can register their own 
> progress thread?

At the moment, the BTLs already use their own progress threads and do -not- 
utilize the OPAL progress thread. Why the various BTL developers chose to do 
this is unknown to me and essentially irrelevant to this RFC. What the BTL 
developers may want to do is review the reasons behind this design decision. As 
I understand it, there was consideration of this question, and it was a 
deliberate decision (as opposed to a simple oversight) to have BTL-specific 
progress threads instead of relying on the OPAL progress thread.



Re: [OMPI devel] Adding error/verbose messages to the TCP BTL

2010-03-07 Thread Jeff Squyres
I'm not sure about that -- OPAL_SOS will take some time to propagate  
throughout the code base, even after the infrastructure is added to  
the trunk.


My point was that it might not be worth it to revamp BTL_ERROR if  
OPAL_SOS is coming.  But I'd still like to get the new TCP BTL  
messages in.  :-)
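
For reference, the "middle ground" discussed in this thread, keeping a single macro at the call sites and redirecting it to a richer reporting backend, might look roughly like the sketch below. All names here (SKETCH_BTL_ERROR, sketch_report, sketch_reporter) are hypothetical. This is not the real BTL_ERROR macro, and the stderr backend is only a stand-in for a show_help-style reporter.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical reporter signature; a stand-in for a show_help-style backend. */
typedef void (*sketch_reporter_fn)(const char *file, int line,
                                   const char *fmt, va_list ap);

/* Default backend: plain stderr output prefixed with the source location. */
static void sketch_stderr_reporter(const char *file, int line,
                                   const char *fmt, va_list ap)
{
    fprintf(stderr, "[%s:%d] ", file, line);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
}

/* Swappable backend and a counter of reported errors. */
static sketch_reporter_fn sketch_reporter = sketch_stderr_reporter;
static int sketch_error_count = 0;

/* Single funnel that every error report passes through. */
static void sketch_report(const char *file, int line, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    sketch_error_count++;
    sketch_reporter(file, line, fmt, ap);
    va_end(ap);
}

/* Call sites keep one macro; only the backend behind it changes. */
#define SKETCH_BTL_ERROR(...) sketch_report(__FILE__, __LINE__, __VA_ARGS__)
```

A call site would then read SKETCH_BTL_ERROR("connect to %s failed", peer_name), and switching the backend means assigning a different function to sketch_reporter rather than touching every BTL.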




On Mar 7, 2010, at 11:13 AM, George Bosilca wrote:

> Then let's just be patient until OPAL_SOS makes it into the trunk, and
> save ourselves the burden of making a large effort twice.







Re: [OMPI devel] RFC: Rename --enable-*-threads and ENABLE*THREAD* (take 2)

2010-03-07 Thread Jeff Squyres

On Mar 7, 2010, at 12:59 PM, Ralph Castain wrote:

>> Quick question about this. We now have an OPAL level progress
>> thread, which enables the machinery at the OPAL level.
>> Unfortunately, this doesn't say anything about what the MPI level
>> will do?
>
> That is correct and has always been the case. The OPAL progress
> thread only indicates that opal_progress is being called via a
> separate thread. Currently, turning "on" the opal progress thread
> automatically turns "on" opal thread support and enables MPI thread
> multiple. However, the BTLs may or may not be involved (see below).

How about calling it --enable-opal-event-progress-thread, or even
--enable-open-libevent-progress-thread?

> This RFC is solely to change the configure option names to remove
> the badly overloaded and confusing --enable-mpi-threads

+1

> At the moment, the BTLs already use their own progress threads and
> do -not- utilize the OPAL progress thread. Why the various BTL
> developers chose to do this is unknown to me and essentially
> irrelevant to this RFC. What the BTL developers may want to do is
> review the reasons behind this design decision. As I understand it,
> there was consideration of this question, and it was a deliberate
> decision (as opposed to a simple oversight) to have BTL-specific
> progress threads instead of relying on the OPAL progress thread.

The openib BTL can have up to 2 progress threads (!) -- the async
verbs event notifier and the RDMA CM agent.  They really should be
consolidated.  If there's infrastructure to consolidate them via opal
or something else, then so much the better...


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] valgrind problem with 1.4.1 and MPI_Allgather()

2010-03-07 Thread Barry Smith



Begin forwarded message:


From: Barry Smith 
Date: March 7, 2010 9:17:10 PM CST
To: de...@open-mpi.org
Cc: Satish Balay 
Subject: valgrind problem with 1.4.1 and MPI_Allgather()


> ==9066== Source and destination overlap in memcpy(0xa571694, 0xa571698, 8)
> ==9066==    at 0xC5B224: memcpy (mc_replace_strmem.c:482)
> ==9066==    by 0x91FC39: ompi_ddt_copy_content_same_ddt (in ./ex3)
> ==9066==    by 0x949DFA: ompi_coll_tuned_allgather_intra_bruck (in ./ex3)
> ==9066==    by 0x9287AF: MPI_Allgather (in ./ex3)
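
Reading the trace: the two addresses are 4 bytes apart and the copy length is 8, so the source and destination ranges overlap, which memcpy does not permit. A minimal sketch of the usual remedy is to detect the overlap and fall back to memmove. The helper names below are hypothetical (copy_maybe_overlapping is not an Open MPI function), and this is not a claim about how ompi_ddt_copy_content_same_ddt should be fixed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when [dst, dst+n) and [src, src+n) share at least one byte. */
static bool ranges_overlap(uintptr_t dst, uintptr_t src, size_t n)
{
    if (n == 0) {
        return false;
    }
    return (dst < src) ? (src - dst < n) : (dst - src < n);
}

/* Hypothetical helper: memmove is defined for overlapping ranges,
 * memcpy is not, so pick memmove whenever the ranges overlap. */
static void *copy_maybe_overlapping(void *dst, const void *src, size_t n)
{
    if (ranges_overlap((uintptr_t)dst, (uintptr_t)src, n)) {
        return memmove(dst, src, n);
    }
    return memcpy(dst, src, n);
}
```

With the values from the report, ranges_overlap(0xa571694, 0xa571698, 8) is true, so the helper would take the memmove path.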






Re: [OMPI devel] RFC: Rename --enable-*-threads and ENABLE*THREAD* (take 2)

2010-03-07 Thread Ralph Castain

On Mar 7, 2010, at 5:13 PM, Jeff Squyres wrote:

> On Mar 7, 2010, at 12:59 PM, Ralph Castain wrote:
> 
>> > Quick question about this. We now have an OPAL level progress thread, 
>> > which enables the machinery at the OPAL level. Unfortunately, this doesn't 
>> > say anything about what the MPI level will do?
>> 
>> That is correct and has always been the case. The OPAL progress thread only 
>> indicates that opal_progress is being called via a separate thread. 
>> Currently, turning "on" the opal progress thread automatically turns "on" 
>> opal thread support and enables MPI thread multiple. However, the BTLs may 
>> or may not be involved (see below).
>> 
> 
> How about calling it --enable-opal-event-progress-thread, or even 
> --enable-open-libevent-progress-thread?

Why not add another 100+ characters to the name while we are at it? :-/

enable-opal-progress-thread accurately reflects what it does, IMHO

> 
>> This RFC is solely to change the configure option names to remove the badly 
>> overloaded and confusing --enable-mpi-threads
>> 
> 
> +1
> 
>> At the moment, the BTLs already use their own progress threads and do -not- 
>> utilize the OPAL progress thread. Why the various BTL developers chose to do 
>> this is unknown to me and essentially irrelevant to this RFC. What the BTL 
>> developers may want to do is review the reasons behind this design decision. 
>> As I understand it, there was consideration of this question, and it was a 
>> deliberate decision (as opposed to a simple oversight) to have BTL-specific 
>> progress threads instead of relying on the OPAL progress thread.
>> 
> 
> 
> The openib BTL can have up to 2 progress threads (!) -- the async verbs event 
> notifier and the RDMA CM agent.  They really should be consolidated.  If 
> there's infrastructure to consolidate them via opal or something else, then 
> so much the better...

Agreed, though I think that is best done as a separate effort from this RFC. I 
believe there was a concern over latency if all the BTLs are driven by one 
progress thread that sequentially runs across their respective file 
descriptors, but I may be remembering it incorrectly...
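
The latency concern can be illustrated with a minimal sketch of a single progress pass over several descriptors using POSIX poll(). The function name progress_once and the drain-by-read behavior are assumptions for illustration only, not how any BTL or opal_progress is implemented.

```c
#include <poll.h>
#include <unistd.h>

/* One pass of a hypothetical central progress loop: poll every descriptor
 * once, then service the ready ones strictly in array order.
 * Returns the number of descriptors serviced, 0 on timeout, -1 on error. */
static int progress_once(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int serviced = 0;
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0) {
        return ready; /* 0 = timeout, -1 = poll() failure */
    }
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN) {
            char buf[256];
            /* A later descriptor is not touched until every earlier one
             * has been handled: this is where added latency comes from. */
            ssize_t n = read(fds[i].fd, buf, sizeof(buf));
            if (n > 0) {
                serviced++;
            }
        }
    }
    return serviced;
}
```

If the handler for an early descriptor is slow, every descriptor after it in the array waits, which is one argument for per-BTL progress threads and one reason consolidation would need care.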

> 