Please remember that stats can be more general.  I frequently compute stats
over complex types.  In that case the mean is also complex, but the variance
is a real scalar.  The proposed implementation doesn't address this: it
accumulates x*x rather than |x|^2, and var() returns value_type.
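
To make the point concrete, here is a minimal sketch of an accumulator that
keeps the sum in value_type but the sum of squared magnitudes in a real scalar.
The names general_stats_accum, scalar_of and abs_sqr are hypothetical and are
not part of the proposal; only std::complex and std::norm come from the
standard library.  It uses the same sum / sum-of-squares formula as the quoted
code, so it shares that formula's sensitivity to cancellation.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical trait: the real scalar type underlying T.
template <typename T> struct scalar_of { typedef T type; };
template <typename T> struct scalar_of<std::complex<T> > { typedef T type; };

// Hypothetical helper: squared magnitude |x|^2 as a real scalar.
template <typename T>
T abs_sqr(const T& x) { return x * x; }

template <typename T>
T abs_sqr(const std::complex<T>& z) { return std::norm(z); }

// Accumulates n, sum(x) and sum(|x|^2); usable with std::for_each.
template <typename value_type>
class general_stats_accum
{
public:
    typedef typename scalar_of<value_type>::type scalar_type;

    general_stats_accum(): m_n(0), m_sum(), m_sum_sqr()
    {}

    void operator()(const value_type& x)
    {
        ++m_n;
        m_sum += x;
        m_sum_sqr += abs_sqr(x);  // real even when x is complex
    }

    std::size_t count() const
    { return m_n; }

    // The mean has the same type as the data (complex for complex input).
    value_type mean() const
    {
        assert(m_n > 0);
        return m_sum / static_cast<scalar_type>(m_n);
    }

    // Sample variance: (sum|x|^2 - |sum x|^2 / n) / (n - 1), always real.
    scalar_type var() const
    {
        assert(m_n > 1);
        const scalar_type n = static_cast<scalar_type>(m_n);
        return (m_sum_sqr - abs_sqr(m_sum) / n) / (n - 1);
    }

private:
    std::size_t m_n;
    value_type m_sum;
    scalar_type m_sum_sqr;
};
```

Fed the two values 1+i and 3-i, mean() is 2+0i and var() is the real number 4.
For a plain float or double, scalar_type is the value type itself, so the real
case is unchanged.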

On Tuesday 25 February 2003 12:29 am, Jason D Schmidt wrote:
> I know this is well after the discussion on the stats class has ended,
> but I think I have a good idea here.
>
> Scott Kirkwood proposed a class that behaves something like this:
>
>     stats myStats;
>     for (int i = 0; i < 100; ++i) {
>         myStats.add(i);
>     }
>     cout << "Average: " << myStats.getAverage() << "\n";
>     cout << "Max: " << myStats.getMax() << "\n";
>     cout << "Standard deviation: " << myStats.getStd() << "\n";
>
> In one of my classes in grad school, I found it quite useful and
> efficient to do statistics on the fly like this, so this stats class
> interests me.  Anyway, Scott has already alluded to the point I'm about
> to make.  I think it's important and useful for this stats class to
> integrate with the STL well.  This example code was inspired by the
> PointAverage example from "Effective STL" p. 161:
>
> #include <algorithm>  // std::for_each
> #include <cmath>      // std::sqrt
> #include <cstddef>    // std::size_t
>
> // this class reports statistics
> template <typename value_type>
> class stats
> {
> public:
>     stats(const std::size_t n, const value_type sum,
>           const value_type sum_sqr):
>     m_n(n), m_sum(sum), m_sum_sqr(sum_sqr)
>     {}
>     value_type sum() const
>     { return m_sum; }
>     value_type mean() const
>     { return m_sum/m_n; }
>     value_type var() const  // sample variance; requires n > 1
>     { return (m_sum_sqr - m_sum*m_sum/m_n) / (m_n-1); }
>     value_type delta() const  // aka, standard dev
>     { return std::sqrt(var()); }
> private:
>     std::size_t m_n;  // a count, not a value_type
>     value_type m_sum, m_sum_sqr;
> };
>
> // this class accumulates results that can be used to
> // compute meaningful statistics
> template <typename value_type>
> class stats_accum
> {
> public:
>     // typedefs that std::unary_function used to provide
>     typedef value_type argument_type;
>     typedef void result_type;
>
>     stats_accum(): n(0), sum(0), sum_sqr(0)
>     {}
>     // use this to operate on each value in a range
>     void operator()(const value_type& x)
>     {
>         ++n;
>         sum += x;
>         sum_sqr += x*x;
>     }
>     stats<value_type> result() const
>     { return stats<value_type>(n, sum, sum_sqr); }
> private:
>     std::size_t n;
>     value_type sum, sum_sqr;
> };
>
> int main()
> {
>     typedef float value_type;
>     const std::size_t n(10);
>
>     value_type f[n] = {0, 2, 3, 4, 5, 6, 7, 8, 9, 8};
>
>     // accumulate stats over a range of iterators
>     const stats<value_type> my_stats = std::for_each(f, f+n,
>         stats_accum<value_type>()).result();
>
>     const value_type m = my_stats.mean();
>     const value_type d = my_stats.delta();  // aka, standard deviation
>
>     return 0;
> }
>
> This seems to be pretty similar to what Scott has proposed, and it turns
> out that this method is very fast.  In my tests it has been nearly as
> fast as if we got rid of the classes and used a hand-written loop.  It's
> certainly much faster than storing the data in a std::valarray object,
> and using functions that calculate the mean & standard deviation
> separately.  This is just a neat application of Scott's idea.
>
> I think this stats class could be pretty useful for scientific computing, and
> in this example it works very well with the STL and has great
> performance.  I'd like to see more code like this in Boost, but most of
> my work is numerical.  Take my opinion or leave it.
>
> Jason Schmidt
