On 20.09.2011 at 16:50, Blosch, Edwin L wrote:

> Thank you all for the replies.
> 
> Certainly optimization flags can be useful to address differences between 
> compilers, etc. And I acknowledge that differences in MPI_ALLREDUCE are possible. 
> But I don't think either is quite relevant here, because:
> 
> - It was the exact same compiler, with identical compilation flags.  So whatever 
> optimizations are applied, we should have the same instructions; 

I'm not sure about this. When you compile a program with mpicc, mpif77, ... you 
automatically include the header files of the MPI version in question. Hence 
you get a different set of variables to be stored (although you are not 
accessing them directly), as the internal representation is unique to each MPI 
implementation. If you compare the mpi.h files of the two implementations, they 
are far from looking similar. As a result, different operations might be used to 
transfer your application data into the internal representation inside the MPI 
implementation.
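
If you want to see exactly what each wrapper adds, the wrapper compilers can 
print it (a quick sketch; the option names below are those of Open MPI and of 
MPICH-derived MVAPICH, and the output depends on your installation):

  mpicc --showme:compile   # Open MPI: compile flags added by the wrapper
  mpicc --showme:link      # Open MPI: libraries and link flags added by the wrapper
  mpicc -show              # MVAPICH (MPICH-derived): the full underlying command line

Comparing these outputs for the two builds shows which include paths and 
libraries actually differ.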


> - I'm looking at inputs and outputs to a compute-only routine - there are no 
> MPI calls within the routine

So this is a serial part of your application?

You can compile with the option -S to get the assembler output.
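
For example (hypothetical file name; use the same optimization flags as in your 
real build), you could generate the assembler for just that routine under both 
MPI stacks and diff it:

  mpif90 -O2 -S compute_routine.f90 -o compute_openmpi.s   # wrappers from the Open MPI build
  mpif90 -O2 -S compute_routine.f90 -o compute_mvapich.s   # same command with the MVAPICH wrappers
  diff compute_openmpi.s compute_mvapich.s

If the assembler is identical, the compiled routine itself is the same in both 
cases, and the difference must enter through whatever is linked or loaded at 
run time.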

-- Reuti


> Again, most numbers going into the routine were checked, and there were no 
> differences in the numbers out to 18 digits (i.e. beyond the precision of the 
> FP representation).  Yet, coming out of the routine, the results differ.  I am 
> quite sure that no MPI routines were actually involved in the calculations, and 
> that the compiler options given were also the same.
> 
> It appears that some side effect of the linking is able to change a 
> compute-only routine's answers.
> 
> I have assumed that intrinsics such as max/sqrt/tiny/abs might be getting 
> replaced, but some other kind of corruption may be going on.
> 
> I also could be mistaken about the inputs to the routine, i.e. they may not be 
> truly identical, as I have so far only presumed and (partially) checked.
> 
> It is interesting that the whole of the calculation runs fine with MVAPICH 
> and blows up with OpenMPI.
> 
> Another diagnostic step I am taking: see if the observation can be repeated with 
> a newer version of OpenMPI (currently using 1.4.3).
> 
> Ed
> 
>       
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Reuti
> Sent: Tuesday, September 20, 2011 7:25 AM
> To: tpri...@computer.org; Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] How could OpenMPI (or MVAPICH) affect 
> floating-point results?
> 
> On 20.09.2011 at 13:52, Tim Prince wrote:
> 
>> On 9/20/2011 7:25 AM, Reuti wrote:
>>> Hi,
>>> 
>>> On 20.09.2011 at 00:41, Blosch, Edwin L wrote:
>>> 
>>>> I am observing differences in floating-point results from an application 
>>>> program that appear to be related to whether I link with OpenMPI 1.4.3 or 
>>>> MVAPICH 1.2.0.  Both packages were built with the same installation of 
>>>> Intel 11.1, as well as the application program; identical flags passed to 
>>>> the compiler in each case.
>>>> 
>>>> I've tracked down some differences in a compute-only routine where I've 
>>>> printed out the inputs to the routine (to 18 digits) ; the inputs are 
>>>> identical.  The output numbers are different in the 16th place (perhaps a 
>>>> few in the 15th place).  These differences only show up for optimized 
>>>> code, not for -O0.
>>>> 
>>>> My assumption is that some optimized math intrinsic is being replaced 
>>>> dynamically, but I do not know how to confirm this.  Anyone have guidance 
>>>> to offer? Or similar experience?
>>> 
>>> Yes, I face it often, but always at a magnitude where it's not of any 
>>> concern (and not related to any MPI). Due to the limited precision in 
>>> computers, a simple reordering of operations (although equivalent in a 
>>> mathematical sense) can lead to different results. Removing the anomalies 
>>> with -O0 could prove that.
>>> 
>>> The other point I have heard, especially for the x86 instruction set, is that 
>>> the internal FPU still has 80 bits, while the representation in memory is only 
>>> 64 bits. Hence, when everything can be done in the registers, the result can be 
>>> different compared to the case when some interim results need to be stored 
>>> to RAM. For the Portland compiler there are the switches -Kieee -pc64 to force 
>>> it to stay in 64 bits throughout; the similar options for Intel are -mp (now 
>>> -fltconsistency) and -mp1.
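
To illustrate the reordering point, here is a minimal, self-contained C sketch 
(hypothetical values; the exact last-bit outcome depends on compiler, flags, and 
hardware):

  #include <stdio.h>

  int main(void)
  {
      /* Hypothetical values chosen so the sum is sensitive to ordering. */
      double a = 1.0e16, b = -1.0e16, c = 1.0;

      /* Mathematically both expressions equal 1.  In pure 64-bit IEEE
         arithmetic, (a + b) + c yields 1.0, while in a + (b + c) the 1.0
         is lost when b + c is rounded, so the result is 0.0.  If the
         compiler keeps the intermediates in 80-bit x87 registers, both
         lines may print 1. */
      double left  = (a + b) + c;
      double right = a + (b + c);

      printf("(a + b) + c = %.17g\n", left);
      printf("a + (b + c) = %.17g\n", right);
      return 0;
  }

Compiling this with x87 instead of SSE arithmetic, or letting an optimizer 
reorder a longer sum, can change the second result by exactly this mechanism, 
which is how differences in the 15th or 16th digit appear.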
>>> 
>> Diagnostics below indicate that ifort 11.1 64-bit is in use.  The options 
>> aren't the same as Reuti's "now" version (a 32-bit compiler which hasn't 
>> been supported for 3 years or more?).
> 
> In the 11.1 documentation they are also still listed:
> 
> http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/index.htm
> 
> I read it as meaning that -mp is deprecated syntax (therefore listed under 
> "Alternate Options"), but that -fltconsistency is still a valid and supported 
> option.
> 
> -- Reuti
> 
> 
>> With ifort 10.1 and more recent, you would set at least
>> -assume protect_parens -prec-div -prec-sqrt
>> if you are interested in numerical consistency.  If you don't want 
>> auto-vectorization of sum reductions, you would use instead
>> -fp-model source -ftz
>> (-ftz sets the underflow mode back to abrupt, while "source" sets it to gradual).
>> It may be possible to expose 80-bit x87 by setting the ancient -mp option, 
>> but such a course can't be recommended without additional cautions.
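
As a concrete (hypothetical) example of applying the two alternatives to a 
single source file, with the wrapper name depending on your MPI installation:

  mpif90 -O2 -assume protect_parens -prec-div -prec-sqrt -c compute_routine.f90
  mpif90 -O2 -fp-model source -ftz -c compute_routine.f90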
>> 
>> The quoted comments from the OP seem to raise a somewhat different question: does 
>> OpenMPI implement any operations in a different way from MVAPICH?  I would 
>> think it probable that the answer could be affirmative for operations such 
>> as allreduce, but this leads well outside my expertise with respect to 
>> specific MPI implementations.  It isn't out of the question to suspect that 
>> such differences might be aggravated when using excessively aggressive ifort 
>> options such as -fast.
>> 
>> 
>>>>        libifport.so.5 =>  
>>>> /opt/intel/Compiler/11.1/072/lib/intel64/libifport.so.5 
>>>> (0x00002b6e7e081000)
>>>>        libifcoremt.so.5 =>  
>>>> /opt/intel/Compiler/11.1/072/lib/intel64/libifcoremt.so.5 
>>>> (0x00002b6e7e1ba000)
>>>>        libimf.so =>  /opt/intel/Compiler/11.1/072/lib/intel64/libimf.so 
>>>> (0x00002b6e7e45f000)
>>>>        libsvml.so =>  /opt/intel/Compiler/11.1/072/lib/intel64/libsvml.so 
>>>> (0x00002b6e7e7f4000)
>>>>        libintlc.so.5 =>  
>>>> /opt/intel/Compiler/11.1/072/lib/intel64/libintlc.so.5 (0x00002b6e7ea0a000)
>>>> 
>> 
>> -- 
>> Tim Prince
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

