On Oct 15 2012, Iliev, Hristo wrote:
Numeric differences are to be expected with parallel applications. The basic reason for that is that on many architectures floating-point operations are performed using higher internal precision than that of the arguments and only the final result is rounded back to the lower output precision. When performing the same operation in parallel, intermediate results are communicated using the lower precision and thus the final result could differ. ...
Not quite. That's ONE reason.
You could try to "cure" this (non-problem) by telling your compiler not to use higher precision for intermediate results.
But it wouldn't help if the problem is the other reason, which is that floating-point arithmetic is not associative. That means that the actual order of the operations makes a difference to the final result, and that order is (correctly) unspecified for MPI_Reduce.

I have had long arguments with people who believe in deterministic floating-point (i.e. that consistency implies correctness), but the fact is that this is an unavoidable consequence of using floating-point in parallel, or indeed of any serious numeric optimisation. So the summary is that anyone doing floating-point work has to learn to live with it. Any traditional book on numerical programming (i.e. from before 1980) will take that for granted.

Regards,
Nick Maclaren.