On 01/14/2013 04:06 PM, Paul Sandoz wrote:
On Jan 14, 2013, at 3:38 PM, Peter Levart <peter.lev...@gmail.com> wrote:

I think these classes are targeted at use cases such as gathering real-time 
statistics of profiling or business data, where data comes in from various 
sources in real-time and statistics are sampled in real-time too...

For bulk processing, the new streams API seems more appropriate. I think the 
user might be able to control the order of operations applied 
(the j.u.stream.Spliterator API indicates that the splitting of work among FJP 
threads could be controlled, and we can hope that the order of reduction of 
intermediate results would also be controllable by the user, or at least 
defined).

Can streams API developers shed some light on that?

DoubleStream (when added) will have a sum method that will defer to a reduce, 
so elements will be processed in order, but the grouping of elements depends on 
how the input is split and to what depth, and the user will have no control 
over that.
Unless the user implements their own Spliterator, right?


It is similar in concept to the IntStream.sum method, but I expect that for 
DoubleStream the collectors API will be used with a double sum collector impl 
that compensates for errors and supports merging (in order) of intermediate sum 
values.
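
A minimal sketch of what such an error-compensating, mergeable accumulator could look like, using Kahan summation. This is an illustration only, not the collector that was or will be added to the streams API; the class name KahanSum and its methods are hypothetical.

```java
// Hypothetical accumulator: Kahan (compensated) summation with an in-order merge.
final class KahanSum {
    private double sum;          // running total
    private double compensation; // low-order bits lost so far (to be subtracted)

    // Add one value, carrying the rounding error of the addition forward.
    void add(double value) {
        double y = value - compensation; // correct the input by the known error
        double t = sum + y;              // low-order bits of y may be lost here
        compensation = (t - sum) - y;    // recover what was lost (negated)
        sum = t;
    }

    // Fold another partial sum into this one. The other accumulator
    // represents approximately (other.sum - other.compensation).
    void merge(KahanSum other) {
        add(other.sum);
        add(-other.compensation);
    }

    double value() {
        return sum;
    }

    public static void main(String[] args) {
        KahanSum k = new KahanSum();
        double naive = 1.0;
        k.add(1.0);
        for (int i = 0; i < 10; i++) {
            naive += 1e-16; // each term is below half an ulp of 1.0 and is dropped
            k.add(1e-16);   // the compensation term keeps track of it
        }
        System.out.println("naive = " + naive);
        System.out.println("kahan = " + k.value());
    }
}
```

Note that the two fields must be updated together, so an accumulator like this is only safe when each instance is confined to one thread and partial results are merged afterwards.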

Paul.

Regards, Peter

On 01/14/2013 07:18 AM, Howard Lovatt wrote:
If you make a binary tree and sum it, the rounding errors aren't that bad and 
this algorithm is easy to parallelise.

Higham, Nicholas J. (1993). "The Accuracy of Floating Point Summation." 
SIAM Journal on Scientific Computing 14(4), 783-799.

Also see Wikipedia for a description of Kahan summation and a general 
discussion of this topic.

Why not commit to binary tree reductions? That would allow everyone to 
understand what is going on and design lambdas accordingly.
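
The binary tree (pairwise) summation described above can be sketched as follows. This is a plain sequential illustration with a hypothetical class name, PairwiseSum; it is not how the streams API splits work. The two recursive calls are independent, which is what makes the scheme easy to parallelise.

```java
// Hypothetical helper: pairwise (binary tree) summation of a double array.
final class PairwiseSum {
    // Sum a[from..to) by splitting the range in half and adding the two halves.
    static double sum(double[] a, int from, int to) {
        int n = to - from;
        if (n == 0) {
            return 0.0;
        }
        if (n == 1) {
            return a[from];
        }
        int mid = from + n / 2;
        // The halves could be computed by separate tasks and combined here.
        return sum(a, from, mid) + sum(a, mid, to);
    }

    static double sum(double[] a) {
        return sum(a, 0, a.length);
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};
        System.out.println(sum(data)); // small integers add exactly: 36.0
    }
}
```

Because the split points depend only on the array length, the same input always produces the same reduction tree and therefore the same result.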

  -- Howard.


On 13/01/2013, at 2:04 AM, Doug Lea <d...@cs.oswego.edu> wrote:

On 01/11/13 21:37, Joe Darcy wrote:

I would prefer a cautionary note along the lines of "if you want numerical
accuracy, take care to use a summation algorithm with defined worst-case 
behavior."

(Varying magnitude is not so much of a problem if you add things up in the right
order.)
Thanks. I do not believe such an algorithm exists, because
no ordering control is possible, and all other known accuracy
improvements (like Kahan) require multiword atomicity, which we
explicitly do not provide.

Which leaves me thinking that the current disclaimer (below)
is the best we can do.

-Doug

"The order of accumulation within or across threads is not guaranteed.
Thus, this class may not be applicable if numerical stability is
required, especially when combining values of substantially different
orders of magnitude."
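
A small illustration of why the disclaimer matters: double addition is not associative, so the same three values can sum to different results depending on the order of accumulation. The class name OrderDemo and its helper are hypothetical and exist only for this example.

```java
// Hypothetical demo: the same values summed in two different orders.
final class OrderDemo {
    // Plain left-to-right accumulation.
    static double sumInOrder(double... values) {
        double s = 0.0;
        for (double v : values) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        // The spacing between doubles near 1e16 is 2, so adding 1.0 is lost.
        System.out.println(sumInOrder(1e16, 1.0, -1e16)); // prints 0.0
        // Cancelling the large values first preserves the small one.
        System.out.println(sumInOrder(1e16, -1e16, 1.0)); // prints 1.0
    }
}
```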

