Thanks for bringing this to attention.
In https://mathworld.wolfram.com/Quantile.html Eric Weisstein lists 9 different quantile implementations.

Back in the 90s there was a project to establish a library of stats functions in APL. One result was a useful Maximum Likelihood model workspace, but it also produced a suite of APL functions for distributions, summary statistics, etc. I think Norman Thomson was one of the participants; Adrian Smith, not a J user as far as I know, was another.

I've had a look at two relevant functions in the latter suite, ASL∆PCTILE and ASL∆SEMIIQR .
ASL stood for APL Statistics Library.
These two listings, stripped of header comments,  should be readable in some email handlers.

     RES←PCT ASL∆PCTILE DATA;T;B;∆elx;n
[8]    T←T-B←⌊T←0.5+0.01×PCT×n←⍴DATA←DATA[(⍋DATA)]
[9]    RES←(DATA[1⌈B]×1-T)+DATA[n⌊B+1]×T

     RES←ASL∆SEMIIQR DATA;T;B;IND;∆elx
[8]    B←⌊T←0.5+(0.75,0.25)×⍴DATA←DATA[(⍋DATA)]
[9]    RES←(-/(DATA[B]×1-T-B)+DATA[B+1]×T-B)÷2
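
To make line [8] of both listings concrete, here is a small J sketch of the position arithmetic alone. The helper name pos is my own, purely for illustration, and is not part of the ASL suite.

```j
   pos =: 4 : '0.5 + 0.01 * x * y'   NB. hypothetical helper: x percentile(s), y sample size
   ] t =: 25 75 pos 40               NB. fractional 1-origin positions for n = 40
10.5 30.5
   ] b =: <. t                       NB. B, the lower neighbour
10 30
   t - b                             NB. weight given to DATA[B+1]
0.5 0.5
   25 75 pos 39                      NB. odd n: the weights are no longer equal
10.25 29.75
```

So with 40 values both quartiles fall midway between two order statistics, while with 39 values they sit a quarter and three quarters of the way between neighbours.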

These are my approximate renderings in a mixture of tacit and explicit J.
The commented-out last line of each is an attempt at a rendering closer to
the APL original.  Note that the APL functions used index origin = 1, and
that asliqr returns the full interquartile range: it omits the final ÷2
of ASL∆SEMIIQR.

NB. larg is required percentile,  rarg is vector of data
aslpct =: 4 : 0
pct =. x
t   =. t - b =. <. t =. 0.5 + 0.01 * pct * n =. # data =. /:~ y
NB. weights (1-t), t apply to DATA[B], DATA[B+1], as in line [9] of the APL
((-.,])t) +/ . * data{~<: (1>.b), n<.>:b
NB. ((1-t), t) +/ . * data{~ _1 + (1>.b), n <. 1 + b  NB. clearer?
)

asliqr =: 3 : 0
tmb =. t - b =. <. t =. 0.5 + 0.75 0.25 * # data =. /:~ y
((,:~-.)tmb) -/@:+/@:* data{~ (,:~<:) b
NB. -/ ((1 - tmb) * data{~ b-1) + tmb * data{~ b  NB. clearer?
)

With a data set, d, of 40 random integers below 10 (generated with ?), I get:

   d =: 1 5 0 2 2 7 3 8 3 1 3 8 5 9 2 4 9 3 8 2 8 0 4 3 9 6 3 1 8 2 6 8 1 2 5 9 4 0 5 0
   (q, q2, q3, iqr, asliqr) d NB. q,q2,q3,iqr from Emir U, xash, Brian Schott
5 5 5 5.5 5.5
   (q, q2, q3, iqr, asliqr) }.d  NB. different approaches differ for odd and even size.
6 6 6 6 5.75
   (q, q2, q3, iqr, asliqr) }:d
6 6 6 6 5.75

These discrepancies are not necessarily indications of error, bearing in mind
Weisstein's list of differing quantile definitions.

I used to prefer percentiles which were stable under reversal of sign; i.e., I felt it was useful for the negation of the 25th percentile of -d to be the same as the 75th percentile of +d. This is a nice property in certain circumstances, but not essential for descriptive stats.

I've tweaked q to qq:
   qq=: 3 : ' (((1+ i. #y) % #y) I. 0.75 0.25) {/:~y'
and observe, fwiw,
   (qq; 25&aslpct, 75&aslpct) d
+---+-----+
|7 2|2 7.5|
+---+-----+
   -@|.each (qq; 25&aslpct, 75&aslpct) -d
+---+-----+
|8 2|2 7.5|
+---+-----+
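
The same sign-reversal check can be tried on a small odd-length sample, where the interpolation weights are unequal. This is only a sketch: pctl is a hypothetical stand-alone helper that follows lines [8] and [9] of ASL∆PCTILE as listed above, and v is made-up data.

```j
NB. hypothetical helper: x is the percentile (0-100), y is the data vector
pctl =: 4 : 0
s  =. /:~ y                  NB. sorted data
n  =. # s
t  =. 0.5 + 0.01 * x * n     NB. fractional 1-origin position
b  =. <. t                   NB. lower neighbour (1-origin)
f  =. t - b                  NB. weight on the upper neighbour
lo =. s {~ <: 1 >. b         NB. DATA[1⌈B], shifted to index origin 0
hi =. s {~ <: n <. >: b      NB. DATA[n⌊B+1], shifted to index origin 0
(lo * 1 - f) + hi * f
)

   v =: 3 1 4 1 5 9 2
   (25 pctl v), 75 pctl v
1.25 4.75
   - (75 pctl - v), 25 pctl - v
1.25 4.75
```

Both lines agree, so this interpolating definition keeps the symmetry on v as well.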

If there's any interest,  I could have a shot at translating more of the APL stats library into J - probably not necessary for distributions,  though some might be missing from addons/stats. I can share lists of functions from both sub-projects,  but they're too long for this thread.

Mike


On 21/02/2021 21:19, Emir U wrote:
I couldn't find this in the archive so here is an attempt at a fairly simple 
interquartile range function for those who might be looking for it.

iqr=: 3 : 0
q=: 3 : '-/(((1+ i. #y) % #y) I. 0.75 0.25) {/:~y'
q "1 |: y
)

Pretty certain it can be written much more concisely: just not by me right now 🙂

Emir
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

