On Fri, Nov 27, 2009 at 9:25 PM, Wayne Watson <sierra_mtnv...@sbcglobal.net> wrote:
> I actually wrote my own several days ago. When I began getting myself
> more familiar with numpy, I was hoping there would be an easy to use
> version in it for this frequency approach. If not, then I'll just stick
> with what I have. It seems something like this should be common.
>
> A simple way to do it with the present capabilities would be to "unwind"
> the frequencies. For example, given [2,1,3] for some corresponding set
> of x, say, [1,2,3], produce [1, 1, 2, 3, 3, 3]. I have no idea if numpy
> does anything like that, but, if so, the typical mean, std, ... could be
> used. In my case, it's sort of pointless. It would produce an array of
> 307,200 items for 256 x (0,1,2,...,255), and just slow down the
> computations "unwinding" it in software. The sub-processor hardware
> already produced the 256 frequencies.
>
> Basically, this amounts to having a pdf, and values of x.
> Mathematically, the statistics are produced directly from it.
>
> josef.p...@gmail.com wrote:
> > On Fri, Nov 27, 2009 at 9:47 PM, Wayne Watson
> > <sierra_mtnv...@sbcglobal.net> wrote:
> >
> >> How do I compute avg, std dev, min, max and other simple stats if I only
> >> know the frequency distribution?
> >>
> >
> > If you are willing to assign to all observations in a bin the value at
> > the bin midpoint, then you could do it with weights in the statistics
> > calculations. However, numpy.average is, I think, the only statistic
> > that takes weights. min max are independent of weight, but std and var
> > need to be calculated indirectly.
> >
> > If you need more stats with weights, then the attachment in
> > http://projects.scipy.org/scipy/ticket/604 is a good start.
> >
> > Josef
>

Wayne: there is no need to "unwind". If Y(X) is the (unnormalized)
frequency distribution of the random variable/data X, start by computing
y = Y/Y.sum() (skip this step if Y is already normalized).
Then av(X) = np.dot(X, y) and sd(X) = np.sqrt(np.dot(X*X, y) - av(X)**2)
(note ** rather than ^, which is bitwise XOR in Python); higher-moment
statistics can be calculated with similar formulae.

DG
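
The weighted-moment formulas above can be sketched roughly as follows.
This is only an illustration under a few assumptions: freq_stats is a
hypothetical helper name (not a numpy function), the std is the
population value (ddof=0), and the variance is computed in the centered
form sum((x - mean)**2 * y), which is algebraically the same as
E[X^2] - mean^2 but tends to lose less precision. The small [1,2,3] /
[2,1,3] example from earlier in the thread is used as a cross-check
against the "unwound" array.

```python
import numpy as np

def freq_stats(x, freq):
    """Mean, std, min and max of data given only as values x and counts freq.

    Hypothetical helper for illustration; assumes freq is non-negative
    and has at least one non-zero entry.
    """
    x = np.asarray(x, dtype=float)
    freq = np.asarray(freq, dtype=float)
    y = freq / freq.sum()              # normalize counts to relative frequencies
    mean = np.dot(x, y)                # first moment
    var = np.dot((x - mean) ** 2, y)   # central second moment (population variance)
    observed = x[freq > 0]             # only values that actually occur
    return mean, np.sqrt(var), observed.min(), observed.max()

x = np.array([1, 2, 3])
freq = np.array([2, 1, 3])
mean, std, lo, hi = freq_stats(x, freq)

# Cross-check against the explicit "unwound" sample [1, 1, 2, 3, 3, 3].
unwound = np.repeat(x, freq)
assert np.isclose(mean, unwound.mean())
assert np.isclose(std, unwound.std())                 # ddof=0 by default
assert np.isclose(mean, np.average(x, weights=freq))  # numpy's weighted mean
```

For the 256-bin case in the thread, x would be np.arange(256) and freq
the 256 hardware counts, so nothing of length 307,200 is ever built.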
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion