On 04-Mar-09 16:56:14, Wacek Kusnierczyk wrote:
> (Ted Harding) wrote:
> <snip>
>> So, with reference to your original question
>> "Excel has percentile() function. R function quantile() does the
>> same thing. Is there any significant difference btw percentile
>> and quantile?"
>> the answer is that they in effect give the same results, though
>> differ with respect to how they are to be fed (quantile eats
>> probabilities, percentile eats percentages). [Though (since I am
>> not familiar with Excel) I cannot rule out that Excel's percentile()
>> function also eats probabilities; in which case its name would be
>> an example of sloppy nomenclature on Excel's part; which I cannot
>> rule out on general grounds either].
>
> i am not familiar enough with excel to prove or disprove what you say
> above, but in general such claims should be grounded in the respective
> documentations.
>
> there are a number of ways to compute empirical quantiles (see, e.g.,
> [1]), and it's possible that the one used by r's quantile by default
> (see ?quantile) is not the one used by excel (where you probably have
> no choice; help in oocalc does not specify the method, and i guess
> that excel's does not either).
>
> have you actually confirmed that excel's percentile() does the same as
> r's quantile() (modulo the scaling)?
>
> vQ

I have now googled around a bit. All references to the Excel
percentile() function say that you feed it the fractional value
corresponding to the percentage. So, for example, to get the
80th percentile you would give it 0.8. Hence Excel should call
it "quantile"!

As to the algorithm, Wikipedia states the following (translated
into R syntax):

  Many software packages, such as Microsoft Excel, use the
  following method recommended by NIST[4] to estimate the
  value, vp, of the pth percentile of an ascending ordered
  dataset containing N elements with values v[1],v[2],...,v[N]:

    n = (p/100)*(N-1) + 1

  n is then split into its integer component, k and decimal
  component, d, such that n = k + d. If k = 1, then the value
  for that percentile, vp, is the first member of the ordered
  dataset, v[1]. If k = N, then the value for that percentile,
  vp, is the Nth member of the ordered dataset, v[N]. Otherwise,
  1 < k < N and

    vp = v[k] + d*(v[k + 1] - v[k]).

Note that the Wikipedia article uses the "%" interpretation of
"p-th percentile", i.e. the point which is (p/100) of the way
along the distribution.

Comparing this with ?quantile: R's type=4 is explained as "linear
interpolation of the empirical cdf", which sounds similar, but it
places the quantile at position p*N in the ordered data rather
than at p*(N-1) + 1. The position p*(N-1) + 1 is the one used by
type=7, which is R's default. So, modulo the scaling of p,
quantile() with its default type should give the same results as
Excel's method as described above.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Mar-09                                       Time: 17:29:50
------------------------------ XFMail ------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
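
P.S. The interpolation rule quoted above can be sketched in R and set against quantile() for a spot check. The helper name excel_like_percentile is made up for illustration (it is neither an R nor an Excel function), p is taken as a fraction rather than a percentage, and the interpolation step is applied for every k < N, including k = 1, which is my reading of the formula rather than the literal wording of the quote. A minimal sketch under those assumptions:

```r
# Hypothetical helper (not part of R or Excel): the interpolation rule
# quoted above, with p given as a fraction in [0, 1].
excel_like_percentile <- function(v, p) {
  v <- sort(v)              # the rule assumes ascending order
  N <- length(v)
  n <- p * (N - 1) + 1      # fractional position in the ordered data
  k <- floor(n)             # integer component
  d <- n - k                # decimal component
  if (k >= N) return(v[N])  # top of the data: nothing to interpolate towards
  v[k] + d * (v[k + 1] - v[k])
}

x <- c(1, 2, 4, 8, 16)      # made-up sample data
p <- 0.8

excel_like_percentile(x, p)              # 9.6
quantile(x, p, type = 7, names = FALSE)  # 9.6 (type 7 is R's default)
quantile(x, p, type = 4, names = FALSE)  # 8
```

On this sample the helper and type=7 agree, while type=4 gives a different value. A single data set is only a spot check, not a proof, and it says nothing about what Excel itself returns.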