On Mon, Jun 9, 2008 at 4:45 PM, Robert Kern <[EMAIL PROTECTED]> wrote:
> On Mon, Jun 9, 2008 at 18:34, Keith Goodman <[EMAIL PROTECTED]> wrote:
>> Does anyone have a function that converts ranks into a Gaussian?
>>
>> I have an array x:
>>
>>>> import numpy as np
>>>> x = np.random.rand(5)
>>
>> I rank it:
>>
>>>> x = x.argsort().argsort()
>>>> x_ranked = x.argsort().argsort()
>>>> x_ranked
>> array([3, 1, 4, 2, 0])
>
> There are subtleties in computing ranks when ties are involved. Take a
> look at the implementation of scipy.stats.rankdata().

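To illustrate the tie subtlety mentioned above, here is a minimal sketch comparing the double-argsort ranking with scipy.stats.rankdata(). The sample array is made up for illustration, and kind="stable" is an assumption added so that the argsort result is reproducible when values tie.

```python
import numpy as np
from scipy import stats

# Hypothetical sample with one tie (the two 0.3 values).
x = np.array([0.3, 0.1, 0.3, 0.2])

# Double argsort gives zero-based ranks, but tied values receive
# different ranks, decided only by their position in the array.
ranks_argsort = x.argsort(kind="stable").argsort(kind="stable")

# rankdata gives one-based ranks and, by default, assigns tied
# values the average of the ranks they span.
ranks_avg = stats.rankdata(x)

print(ranks_argsort)
print(ranks_avg)
```

With the double argsort, the two equal values end up with distinct ranks; rankdata gives them the same (averaged) rank.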

Good point. I had to deal with ties and missing data. I bet
scipy.stats.rankdata() is faster than my implementation.

>> I would like to convert the ranks to a Gaussian without using scipy.
>
> No dice. You are going to have to use scipy.special.ndtri somewhere. A
> basic transformation (off the top of my head, I have no idea if this
> is statistically meaningful):
>
>   scipy.special.ndtri((ranks + 1.0) / (len(ranks) + 1.0))
>
> Barring tied first or last items, this should give equal weight to
> each of the tails outside of the range of your data.

Nice. Thank you. It passes the never-wrong chi-by-eye test:

    r = np.arange(1000)
    g = special.ndtri((r + 1.0) / (len(r) + 1.0))
    pylab.hist(g, 50)
    pylab.show()

I wasn't able to use scipy.special.ndtri (after import scipy) like you
did. I had to do this instead (but I'm new to scipy):

    from scipy import special
    special.ndtri

scipy.__version__ is '0.6.0', from Debian Lenny.
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
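
Putting the pieces of this thread together, the rank-to-Gaussian transformation could be sketched roughly as below. The helper name ranks_to_gaussian is hypothetical, and using rankdata() for the ranking step is an assumption (the thread only suggests looking at it). Because rankdata() returns one-based ranks, the formula becomes ranks / (n + 1) rather than (ranks + 1) / (n + 1). Missing data is not handled here.

```python
import numpy as np
from scipy import special, stats  # "import scipy" alone may not load submodules


def ranks_to_gaussian(x):
    """Map values to Gaussian quantiles via their ranks (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # One-based ranks; tied values share their average rank.
    ranks = stats.rankdata(x)
    # Dividing by n + 1 keeps every probability strictly inside (0, 1),
    # so ndtri (the inverse of the standard normal CDF) stays finite.
    return special.ndtri(ranks / (len(x) + 1.0))


g = ranks_to_gaussian([0.2, 0.9, 0.5])
print(g)
```

For three distinct values the probabilities are 0.25, 0.5 and 0.75, so the middle value maps to zero and the outer two are symmetric about it.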