dsimcha wrote:
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
Robert Jacques wrote:
On Sat, 07 Nov 2009 12:56:35 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

Robert Jacques wrote:
I'd recommend rolling that into a basic statistics struct containing
common single pass metrics: i.e. sum, mean, variance, min, max, etc.
Well the problem is that if you want to compute several one-pass
statistics in one pass, you'd have to invent means to combine these
functions. That ability is already present in reduce, e.g. reduce(min,
max)(range) yields a pair containing the min and the max element after
exactly one pass through range.

Andrei
Yes, but reduce(mean, std)(range) doesn't work.
From std.algorithm's doc:
// Compute sum and sum of squares in one pass
r = reduce!("a + b", "a + b * b")(tuple(0.0, 0.0), a);
// Compute average and standard deviation from the above
auto avg = r.field[0] / a.length;
auto stdev = sqrt(r.field[1] / a.length - avg * avg);
I'm not saying there's no need for a more specialized library, just that
I purposely designed reduce to be no slouch either.

Don't get me wrong, I love reduce and it's definitely the right tool for some
jobs.  It's just that computing standard deviations isn't one of them.  Finding
the sum of the squares explicitly is an absolutely **horrible** way to find the
standard deviation because it's numerically unstable.  What if you have a few
billion numbers being read in lazily from a file and you want to find the standard
deviation of them?  Heck, summing explicitly isn't even a very good way to find
the mean.
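
To make the stability point concrete, here is a minimal sketch of one well-known single-pass alternative, Welford's online update, which tracks a running mean and a running sum of squared deviations instead of the raw sum of squares. The struct name RunningStats and its members are hypothetical and are not part of Phobos; the sample data is the array from the std.algorithm doc example.

```d
import std.math : sqrt;
import std.stdio : writeln;

// Hypothetical helper, not a Phobos type: Welford's online mean/variance.
struct RunningStats
{
    size_t n;        // number of elements seen so far
    double mean = 0; // running mean
    double m2 = 0;   // running sum of squared deviations from the mean

    // Fold one element into the running state.
    void put(double x)
    {
        ++n;
        immutable delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Population variance, i.e. the same divide-by-length convention
    // as the sum-of-squares formula quoted above.
    double variance() const
    {
        return n ? m2 / n : double.nan;
    }

    // Population standard deviation.
    double stdev() const
    {
        return sqrt(variance());
    }
}

void main()
{
    RunningStats s;
    // Any input range of doubles would do, including one read lazily.
    foreach (x; [3.0, 4, 7, 11, 3, 2, 5])
        s.put(x);
    writeln("mean = ", s.mean, ", stdev = ", s.stdev());
}
```

Because each step only ever subtracts the current mean from the incoming value, the intermediate quantities stay on the scale of the data's spread rather than growing with the sum of squares, which is where the explicit formula loses precision.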

I'm sure you could implement a proper algorithm for this using reduce, but it
would be really awkward.  IMHO reduce's place is as a convenience for simple
things like finding the max and min of a range.  Once you're trying to shoehorn
something into reduce that doesn't fit nicely, it's time to give up using reduce
and just write a "real" function.

I agree.

Andrei
