Please remember that stats need to be more general than this. I frequently compute statistics on complex-valued data. In that case the mean is also complex, but the variance is a real scalar. The proposed implementation returns both as the same value_type, so it doesn't address this.
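To make the complex case concrete, here is a rough sketch of one way an accumulator could keep the two types apart. It is only an illustration under my own assumptions, not part of the proposal: the names real_of and sq_norm are hypothetical helpers I made up, and the variance uses the same sum-of-squares formula as the quoted code (with |x|^2 in place of x*x), which can lose precision when the mean is large relative to the spread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical trait: maps a value type to the real scalar type
// that its variance should have.
template <typename T>
struct real_of { typedef T type; };

template <typename T>
struct real_of<std::complex<T> > { typedef T type; };

// Hypothetical helper: squared magnitude |x|^2 for real and complex values.
inline float  sq_norm(float x)  { return x * x; }
inline double sq_norm(double x) { return x * x; }
template <typename T>
T sq_norm(const std::complex<T>& x) { return std::norm(x); }

// Accumulates n, sum(x) and sum(|x|^2) on the fly.
// The sum keeps the value type; the sum of squares is always real.
template <typename T>
class stats_accum
{
public:
    typedef typename real_of<T>::type real_type;

    stats_accum() : n_(0), sum_(), sum_sqr_() {}

    // Call once per value, e.g. through std::for_each.
    void operator()(const T& x)
    {
        ++n_;
        sum_ += x;
        sum_sqr_ += sq_norm(x);
    }

    std::size_t count() const { return n_; }

    // Same type as the data: complex for complex input.
    T mean() const
    {
        assert(n_ > 0);
        return sum_ / static_cast<real_type>(n_);
    }

    // Unbiased sample variance: always a real scalar.
    real_type var() const
    {
        assert(n_ >= 2);
        const real_type n = static_cast<real_type>(n_);
        return (sum_sqr_ - sq_norm(sum_) / n) / (n - 1);
    }

    real_type stddev() const { return std::sqrt(var()); }

private:
    std::size_t n_;
    T sum_;
    real_type sum_sqr_;
};
```

It is used the same way as in the quoted example: std::for_each(first, last, stats_accum<std::complex<double> >()) returns the accumulator, and then mean() is complex while var() and stddev() are double.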
On Tuesday 25 February 2003 12:29 am, Jason D Schmidt wrote:
> I know this is well after the discussion on the stats class has ended,
> but I think I have a good idea here.
>
> Scott Kirkwood proposed a class that behaves something like this:
>
>     stats myStats;
>     for (int i = 0; i < 100; ++i) {
>         myStats.add(i);
>     }
>     cout << "Average: " << myStats.getAverage() << "\n";
>     cout << "Max: " << myStats.getMax() << "\n";
>     cout << "Standard deviation: " << myStats.getStd() << "\n";
>
> In one of my classes in grad school, I found it quite useful and
> efficient to do statistics on the fly like this, so this stats class
> interests me. Anyway, Scott has already alluded to the point I'm about
> to make. I think it's important and useful for this stats class to
> integrate with the STL well. This example code was inspired by the
> PointAverage example from "Effective STL" p. 161:
>
> // this class reports statistics
> template <typename value_type>
> class stats
> {
> public:
>     stats(const size_t n, const value_type sum, const value_type sum_sqr):
>         m_n(n), m_sum(sum), m_sum_sqr(sum_sqr)
>     {}
>     value_type sum() const
>     { return m_sum; }
>     value_type mean() const
>     { return m_sum/m_n; }
>     value_type var() const
>     { return m_sum_sqr - m_sum*m_sum/m_n; }
>     value_type delta() const // aka, standard dev
>     { return sqrt(var() / (m_n-1)); }
> private:
>     value_type m_n, m_sum, m_sum_sqr;
> };
>
> // this class accumulates results that can be used to
> // compute meaningful statistics
> template <typename value_type>
> class stats_accum: public std::unary_function<const value_type, void>
> {
> public:
>     stats_accum(): n(0), sum(0), sum_sqr(0)
>     {}
>     // use this to operate on each value in a range
>     void operator()(argument_type x)
>     {
>         ++n;
>         sum += x;
>         sum_sqr += x*x;
>     }
>     stats<value_type> result() const
>     { return stats<value_type>(n, sum, sum_sqr); }
> private:
>     size_t n;
>     value_type sum, sum_sqr;
> };
>
> int main(int argc, char *argv[])
> {
>     typedef float value_type;
>     const size_t n(10);
>
>     float f[n] = {0, 2, 3, 4, 5, 6, 7, 8, 9, 8};
>
>     // accumulate stats over a range of iterators
>     my_stats = std::for_each(f, f+n,
>         stats_accum<value_type>()).result();
>
>     m = my_stats.mean();
>     m = my_stats.delta(); // aka, standard deviation
>
>     return 0;
> }
>
> This seems to be pretty similar to what Scott has proposed, and it turns
> out that this method is very fast. In my tests it has been nearly as
> fast as if we got rid of the classes and used a hand-written loop. It's
> certainly much faster than storing the data in a std::valarray object,
> and using functions that calculate the mean & standard deviation
> separately. This is just a neat application of Scott's idea.
>
> I think this stats could be pretty useful for scientific computing, and
> in this example it works very well with the STL and has great
> performance. I'd like to see more code like this in Boost, but most of
> my work is numerical. Take my opinion or leave it.
>
> Jason Schmidt