Thanks for bringing this to attention.
In https://mathworld.wolfram.com/Quantile.html Eric Weisstein lists 9
different quantile implementations.
Back in the 90s there was a project to establish a library of stats
functions in APL. One result was a useful Maximum Likelihood model
workspace, but it also produced a suite of APL functions for
distributions, summary statistics, etc. I think Norman Thomson was one
of the participants; Adrian Smith (not a J user as far as I know) was
another.
I've had a look at two relevant functions in the latter suite,
ASL∆PCTILE and ASL∆SEMIIQR .
ASL stood for APL Statistics Library.
These two listings, stripped of header comments, should be readable in
some email handlers.
RES←PCT ASL∆PCTILE DATA;T;B;∆elx;n
[8] T←T-B←⌊T←0.5+0.01×PCT×n←⍴DATA←DATA[(⍋DATA)]
[9] RES←(DATA[1⌈B]×1-T)+DATA[n⌊B+1]×T
RES←ASL∆SEMIIQR DATA;T;B;IND;∆elx
[8] B←⌊T←0.5+(0.75,0.25)×⍴DATA←DATA[(⍋DATA)]
[9] RES←(-/(DATA[B]×1-T-B)+DATA[B+1]×T-B)÷2
These are my approximate renderings in a mixture of tacit and explicit J.
The remaining comment line is an attempt to render the result more like
the APL original. Note that the APL functions used index origin = 1.
NB. larg is required percentile, rarg is vector of data
aslpct =: 4 : 0
pct =. x
t =. t - b =. <. t =. 0.5 + 0.01 * pct * n =. # data =. /:~ y
((,~-.)t) +/ . * data{~<: (1>.b), n<.>:b
NB. ((1-t), t) +/ . * data{~ _1 + (1>.b), n <. 1 + b NB. clearer?
)
asliqr =: 3 : 0
tmb =. t - b =. <. t =. 0.5 + 0.75 0.25 * # data =. /:~ y
((,:~-.)tmb) -/@:+/@:* data{~ (,:~<:) b
NB. -/ ((1 - tmb) * data{~ b-1) + tmb * data{~ b NB. clearer?
)
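As a small hand-checkable illustration of the interpolation rule in the
APL listings above, here is a step-by-step sketch. The name pctile and
the four-item test vector are my own and not part of ASL; it assumes the
same conventions as the listing (1-origin position 0.5 + 0.01 x PCT x n,
clamped at both ends).

```j
NB. pctile is a hypothetical name: x is the percentile, y the data vector
pctile =: 4 : 0
 data =. /:~ y
 n    =. # data
 t    =. 0.5 + 0.01 * x * n      NB. fractional 1-origin position
 b    =. <. t                    NB. lower neighbour (1-origin)
 f    =. t - b                   NB. weight given to the upper neighbour
 lo   =. data {~ <: 1 >. b       NB. clamp to first item, then 0-origin
 hi   =. data {~ <: n <. >: b    NB. clamp to last item, then 0-origin
 (lo * 1 - f) + hi * f
)
   25 50 75 pctile"0 1 ] 10 20 30 40
15 25 35
```

For the 25th percentile of four items the position is 1.5, so the result
is halfway between the first and second sorted values.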
With a data set, d, of 40 random integers in the range 0 to 9
(generated with ? ), I get:
d =: 1 5 0 2 2 7 3 8 3 1 3 8 5 9 2 4 9 3 8 2 8 0 4 3 9 6 3 1 8 2 6 8 1 2 5 9 4 0 5 0
(q, q2, q3, iqr, asliqr) d NB. q,q2,q3,iqr from Emir U, xash, Brian Schott
5 5 5 5.5 5.5
(q, q2, q3, iqr, asliqr) }.d NB. different approaches differ for odd and even size.
6 6 6 6 5.75
(q, q2, q3, iqr, asliqr) }:d
6 6 6 6 5.75
These discrepancies are not necessarily indications of error, bearing in
mind Weisstein's observations.
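To see where the 5.75 comes from with 39 items, the fractional positions
at which the ASL rule interpolates can be computed directly. This is
only a sketch using the 0.5 + p x n formula from the listing above:

```j
   0.5 + 0.75 0.25 * 39     NB. 1-origin positions for n = 39
29.75 10.25
   0.5 + 0.75 0.25 * 40     NB. 1-origin positions for n = 40
30.5 10.5
```

For n = 39 the upper quartile lies three quarters of the way from the
29th to the 30th sorted value (7 and 8 here, giving 7.75), whereas a rule
that simply selects an item must return one of the data values.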
I used to prefer percentiles which were stable under reversal of sign;
i.e., I felt it was useful for the negation of the 25th percentile of -d
to be the same as the 75th percentile of +d . This is a nice property in
certain circumstances, but not essential for descriptive stats.
I've tweaked q to qq:
qq=: 3 : ' (((1+ i. #y) % #y) I. 0.75 0.25) {/:~y'
and observe, fwiw,
(qq; 25&aslpct, 75&aslpct) d
+---+-----+
|7 2|2 7.5|
+---+-----+
-@|.each (qq; 25&aslpct, 75&aslpct) -d
+---+-----+
|8 2|2 7.5|
+---+-----+
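The aslpct results above are unchanged by the sign reversal while those
of qq are not. A possible explanation (my own reasoning, not taken from
the ASL documentation): negating the data reverses the sort order, so
1-origin position k becomes n+1-k, and the ASL position 0.5 + p x n for
fraction p is mapped exactly onto the position for 1-p, apart from
clamping at the ends. A quick numeric check for n = 40:

```j
   n =: 40
   t =: 0.5 + 0.25 * n      NB. ASL position of the 25th percentile
   (>: n) - t               NB. its mirror image in the reversed ordering
30.5
   0.5 + 0.75 * n           NB. ASL position of the 75th percentile
30.5
```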
If there's any interest, I could have a shot at translating more of the
APL stats library into J -
probably not necessary for distributions, though some might be missing
from addons/stats.
I can share lists of functions from both sub-projects, but they're too
long for this thread.
Mike
On 21/02/2021 21:19, Emir U wrote:
I couldn't find this in the archive so here is an attempt at a fairly simple
interquartile range function for those who might be looking for it.
iqr=: 3 : 0
q=: 3 : '-/(((1+ i. #y) % #y) I. 0.75 0.25) {/:~y'
q "1 |: y
)
Pretty certain it can be written much more concisely: just not by me right now 🙂
Emir
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm