-------- Original Message --------
Date: Fri, 29 Feb 2008 11:05:27 -0800 (PST)
From: Elsa et Stéphane BOUEE <[EMAIL PROTECTED]>
To: <[email protected]>
Speaking about the Mahalanobis distance (D), I have a question/remark.
Due to random fluctuation in a finite number of observations, D is not
zero even when the groups do not truly differ, and it increases with the
number of variables.
Markus has proposed a formula that takes this fact into account (I did
not find the mathematical derivation of this formula):
Corrected(D) = [(n1+n2-p-3)*D/(n1+n2-2)] - [(n1+n2)*p/(n1*n2)]
With: D = Mahalanobis distance
n1 and n2: number of observations in the 2 groups
p: number of variables
I applied this formula to a dataset and obtained negative results (even
with a small number of variables (5)), which is embarrassing for a distance…
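A minimal sketch of the formula above in Python may help show where the negative values come from. It assumes that D is the squared Mahalanobis distance (the message does not say which form is meant) and that the last term is divided by the product n1*n2. The function name and the example numbers are hypothetical.

```python
def corrected_d(d, n1, n2, p):
    """Bias-corrected D as given by the formula quoted above.

    d  : Mahalanobis distance (assumed here to be the squared distance)
    n1 : number of observations in group 1
    n2 : number of observations in group 2
    p  : number of variables
    """
    shrink = (n1 + n2 - p - 3) / (n1 + n2 - 2)   # multiplicative term
    offset = (n1 + n2) * p / (n1 * n2)           # subtracted term
    return shrink * d - offset


# Hypothetical example: 20 observations per group, 5 variables.
# The subtracted term is 40 * 5 / 400 = 0.5, so any observed D below
# roughly 0.59 gives a negative corrected value.
small = corrected_d(0.5, 20, 20, 5)
large = corrected_d(4.0, 20, 20, 5)
print(small, large)
```

Because the second term is subtracted regardless of the size of D, the corrected value is negative whenever the observed D is small relative to (n1+n2)*p/(n1*n2).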
Therefore, I used another method to account for this bias. I randomly
permuted the variables with the observations (I cannot use my hands
either, but I hope everyone can understand) and calculated 10000 random D
by using this method. Then, I subtracted the mean of those random D from
the true D calculated on my dataset.
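The permutation procedure described above could be sketched roughly as follows. This is only one possible reading: it assumes the permutation reassigns whole observations to the two groups at random, that D is the squared distance between group means computed with a pooled covariance matrix, and that there are at least as many observations as needed for that matrix to be invertible. All function names are hypothetical.

```python
import numpy as np


def mahalanobis_d2(x1, x2):
    """Squared Mahalanobis distance between two group means (pooled covariance)."""
    n1, n2 = len(x1), len(x2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    c1 = np.atleast_2d(np.cov(x1, rowvar=False))
    c2 = np.atleast_2d(np.cov(x2, rowvar=False))
    pooled = ((n1 - 1) * c1 + (n2 - 1) * c2) / (n1 + n2 - 2)
    return float(diff @ np.linalg.solve(pooled, diff))


def permutation_corrected_d2(x1, x2, n_perm=10000, seed=0):
    """Observed D minus the mean D obtained under random group assignment."""
    rng = np.random.default_rng(seed)
    data = np.vstack([x1, x2])
    n1 = len(x1)
    observed = mahalanobis_d2(x1, x2)
    null = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(len(data))          # shuffle group membership
        null[i] = mahalanobis_d2(data[idx[:n1]], data[idx[n1:]])
    return observed - null.mean(), observed, null
```

Note that subtracting the mean of the permuted D values does not guarantee a non-negative result either: if the observed D happens to fall below that mean, the difference is negative.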
Am I correct in doing so?
Does anyone know of a better (exact mathematical) way to correct D
without obtaining negative values?
Thank you for your answers
Stéphane BOUEE
--
Replies will be sent to the list.
For more information visit http://www.morphometrics.org