Yes, my suggestion was not aimed at Elahep's problem as such but at the one you mentioned regarding the D2 distance and sample sizes.
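
The sample-size constraint at issue in this exchange can be checked numerically. What follows is a minimal sketch and not something taken from any message in the thread: the data are simulated integers, and the helper names (`pooled_within_scatter`, `matrix_rank`, `simulate_groups`) are hypothetical. It only illustrates that the pooled within-group matrix has rank at most N - g (total specimens minus number of groups), so it cannot be inverted when that number is smaller than the number of shape variables p. It makes no claim about how PAST or any other package computes the distance.

```python
from fractions import Fraction
import random


def pooled_within_scatter(groups):
    """Sum of within-group cross-product matrices, in exact arithmetic.

    `groups` is a list of groups, each a list of specimens, each a list
    of p integer-valued variables. Returns the p x p scatter matrix and
    its degrees of freedom N - g. Dividing the scatter by the df gives
    the pooled covariance matrix, which has the same rank.
    """
    p = len(groups[0][0])
    scatter = [[Fraction(0)] * p for _ in range(p)]
    for group in groups:
        n = len(group)
        means = [Fraction(sum(row[j] for row in group), n) for j in range(p)]
        for row in group:
            dev = [row[j] - means[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    scatter[a][b] += dev[a] * dev[b]
    df = sum(len(group) for group in groups) - len(groups)
    return scatter, df


def matrix_rank(matrix):
    """Rank by Gaussian elimination on exact fractions (no tolerance needed)."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def simulate_groups(n_groups, n_per_group, p, seed=0):
    """Made-up integer 'shape variables'; purely illustrative data."""
    rng = random.Random(seed)
    return [[[rng.randint(0, 100) for _ in range(p)]
             for _ in range(n_per_group)]
            for _ in range(n_groups)]


P = 6  # number of shape variables (assumed for the example)

# Two groups of three specimens: within-group df = 6 - 2 = 4, less than P.
scatter_small, df_small = pooled_within_scatter(simulate_groups(2, 3, P))
rank_small = matrix_rank(scatter_small)
print(f"p={P}, df={df_small}, rank={rank_small}, singular={rank_small < P}")

# Two groups of ten specimens: within-group df = 18, no longer below P.
scatter_large, df_large = pooled_within_scatter(simulate_groups(2, 10, P, seed=1))
rank_large = matrix_rank(scatter_large)
print(f"p={P}, df={df_large}, rank={rank_large}, singular={rank_large < P}")
```

In the first case the matrix is singular whatever the data are, because the rank can never exceed the 4 degrees of freedom. In the second case full rank becomes possible, although, as noted in the quoted discussion, an invertible matrix is still not the same thing as a reliable estimate.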
On Sun, Jan 31, 2016 at 3:14 PM, F. James Rohlf <[email protected]> wrote:

> I am sorry but that does not fix the problem. The problem is that
> Mahalanobis distance is not defined (and thus cannot even be
> calculated) if the within-group covariance matrix is singular, which
> it must be if its number of degrees of freedom is less than the number
> of shape variables. Even if the sample sizes were somewhat larger
> there would still be a problem, as the coefficient is very sensitive
> to chance unless the sample sizes are much larger.
>
> Note that one uses the within-group covariance matrix, not the overall
> covariance matrix. This also reveals the problem that for the distance
> to be very meaningful one assumes that the covariance matrices are
> homogeneous across groups. That is unlikely to be true in many
> studies.
>
> Rather disappointing, as there are many situations in which one would
> like to use that coefficient. An ad hoc solution that is often used is
> to just use the first few PCA axes as the shape variables. Of course
> one might then miss more subtle differences among groups if they do
> not account for a relatively large proportion of the total variance.
>
> ____________________________________________
> F. James Rohlf, Distinguished Professor, Emeritus. Ecology & Evolution
> Research Professor, Anthropology
> Stony Brook University
>
> From: Miguel Eduardo Delgado Burbano [mailto:[email protected]]
> Sent: Sunday, January 31, 2016 3:35 AM
> To: [email protected]
> Cc: Elahep <[email protected]>; MORPHMET <[email protected]>; [email protected]
> Subject: Re: [MORPHMET] Mahalanobis distance in cluster analysis of
> shape variables
>
> Usually researchers use small sample sizes for distinct reasons, in my
> case because I study archaeological and paleontological derived
> samples.
> The practical problem mentioned by James could be partially solved by
> correcting the D2 distances for small sample size, that is, by
> calculating an unbiased Mahalanobis distance ∆2 following Marcus L.
> 1993 (Some aspects of multivariate statistics for morphometrics. In:
> Marcus LF, Bello E, García-Valdecasas A, editors. Contributions to
> morphometrics. Museo Nacional de Ciencias Naturales, Madrid. p
> 99-130).
>
> On Sat, Jan 30, 2016 at 4:51 PM, F. James Rohlf <[email protected]> wrote:
>
> The distinction is that Mahalanobis distance should be thought of as a
> statistical distance. For a single variable it is like a z-score (a
> difference divided by a standard deviation). It is not a measure of
> the absolute amount of difference. In the multivariate case
> Mahalanobis distance is relative to the amount of variation in the
> direction of the difference (that is what taking into account
> within-group covariation gives you).
>
> Both Mahalanobis and Euclidean distances are valid. It depends on what
> you wish “distance” to mean. In morphometrics, do you want to cluster
> based on how similar shapes are (in terms of distance in Kendall shape
> space) or based on the degree of statistical overlap in population
> samples (e.g., the degree to which specimens from the two groups might
> be misidentified)?
>
> A practical problem with Mahalanobis distance in many morphometric
> studies is that it requires large sample sizes within groups: landmark
> data is usually high dimensional, and thus very large samples are
> needed for reliable results.
>
> ____________________________________________
> F. James Rohlf, Distinguished Professor, Emeritus.
> Ecology & Evolution
> Research Professor, Anthropology
> Stony Brook University
>
> From: Elahep [mailto:[email protected]]
> Sent: Saturday, January 30, 2016 7:14 AM
> To: MORPHMET <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [MORPHMET] Mahalanobis distance in cluster analysis of
> shape variables
>
> Dear Joseph,
>
> Thanks for your detailed explanation. As recommended by Claude in
> "Morphometrics with R" (2008), it is better to use the Mahalanobis
> distance for clustering group means, because it is scaled by the
> within-group variance-covariance. In my analysis, I calculated the
> mean value of the relative warp scores for each population and then
> carried out a UPGMA cluster analysis based on the Euclidean distance;
> the results were satisfying to me and congruent with my other results.
> Following the book and other articles, I ran the same analysis based
> on the Mahalanobis distance in the PAST software, but unfortunately
> whenever I ran the analysis the software error "Invalid floating point
> operation" appeared, so I couldn't see the Mahalanobis clustering! (I
> couldn't work out why this error happens.)
>
> Euclidean distance worked for me, but I was just curious to understand
> whether my analysis is statistically meaningful.
>
> Thanks again for your answer,
> Elahe
>
> On Saturday, January 30, 2016 at 5:12:46 PM UTC+3:30, Joseph Kunkel wrote:
>
> I cannot speak directly to why it is frequently used in GM cluster
> analysis, but I would like to mention how I look at Mahalanobis
> distance based on its calculation.
>
> Mahalanobis distance is not a pure distance metric like Euclidean or
> Manhattan distance; as you have stated, it is ‘standardized’. What
> does that really mean? It sounds superficially good.
>
> One way of computing it is to rotate the k-landmark data set to
> simplest form, treating the landmarks as factors.
> This way would consider all landmarks to have a common covariance
> structure in XY, or XYZ in three dimensions. That is already a
> stretch, since not all landmarks can be assumed to have the same
> covariance structure. In addition, the landmarks have all already been
> centered about their centroid and rotated to coincide, which has
> eliminated a degree of freedom of variability, and that can have
> consequences.
>
> Furthermore, not all species' landmarks can be expected to have the
> same covariance structure, which is an assumption made in the ordinary
> Mahalanobis distance application to strut analysis between populations
> or species. The assumption of similar data structure of course applies
> to the null hypothesis where there is no difference. The typical
> statistical test explodes when the null hypothesis is falsified, so
> just when you want the Mahalanobis distance metric to be accurate it
> starts misbehaving.
>
> After rotation to simplest axes one does a 1 df F-test between each of
> the landmarks. These tests are all independent, so they can be summed
> together to produce a k df F-test, which is Mahalanobis D squared. So
> Mahalanobis D is the square root of the sum of independent F-tests,
> but those F-tests are based on all sorts of assumptions about the
> variance of the landmarks. I imagine one could modify the calculation
> of D by limiting the sum to the principal components that account for
> the top 95 or 99% of the variance.
>
> Many times applications of analytical techniques are judged by whether
> they ‘work’ or not. If a clustering method works for you, use it(?). I
> am of the opinion that I use statistics to convince myself rather than
> the audience. A confluence of many arguments is used to make a case.
>
> Joe
>
> -·. .· ·. .><((((º>·. .· ·. .><((((º>·. .· ·. .><((((º> .··.· >=-
> =º}}}}}><
> Joseph G.
> Kunkel, Research Professor
> UNE Biddeford ME 04005
> http://www.bio.umass.edu/biology/kunkel/
>
> > On Jan 30, 2016, at 7:11 AM, Elahep <[email protected]> wrote:
> >
> > Hello all,
> >
> > I have seen in many GM articles that people use Mahalanobis distance
> > for cluster analysis. What is the advantage of using Mahalanobis
> > distance over Euclidean distance as a similarity measure in cluster
> > analysis of shape variables?
> >
> > As far as I know, Mahalanobis distance is the standardized form of
> > Euclidean distance: it standardizes the data, with adjustments made
> > for correlation between variables, and weights all variables
> > equally.
> >
> > Why is this distance measure so frequently used in GM cluster
> > analysis?
> >
> > Thanks in advance
> > Elahe
> >
> > --
> > MORPHMET may be accessed via its webpage at http://www.morphometrics.org
> > ---
> > You received this message because you are subscribed to the Google
> > Groups "MORPHMET" group.
> > To unsubscribe from this group and stop receiving emails from it,
> > send an email to [email protected].
>
> --
> *************************************************
> Miguel Delgado PhD
> CONICET-División Antropología.
> Facultad de Ciencias Naturales y Museo.
> Universidad Nacional de La Plata
> Paseo del Bosque s/n. La Plata 1900. Argentina
> Cel: 5492216795916.
> Fax: 54 221 4257527
> https://unlp.academia.edu/DelgadoMiguel
> http://www.cearqueologia.com.ar/
> E-mail: [email protected]
> *************************************************
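
For readers of this thread, the description given earlier of Mahalanobis distance as a statistical distance, which for a single variable is like a z-score (a difference divided by a standard deviation), can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the two pairs of samples are invented, the function names are hypothetical, and the pooled within-group standard deviation is used as the divisor, in line with the remark that the within-group rather than the overall variation is what matters.

```python
import math
import statistics


def pooled_sd(a, b):
    """Pooled within-group standard deviation of two samples."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return math.sqrt(ss / (len(a) + len(b) - 2))


def euclidean_1d(a, b):
    """Absolute difference between the two group means."""
    return abs(statistics.fmean(a) - statistics.fmean(b))


def mahalanobis_1d(a, b):
    """Mean difference scaled by the pooled within-group SD."""
    return euclidean_1d(a, b) / pooled_sd(a, b)


# Invented data: both pairs of groups have means 10 and 13, but the
# second pair varies three times as much within each group.
tight_a, tight_b = [9, 10, 11], [12, 13, 14]
loose_a, loose_b = [7, 10, 13], [10, 13, 16]

for label, a, b in (("tight", tight_a, tight_b), ("loose", loose_a, loose_b)):
    print(label, euclidean_1d(a, b), mahalanobis_1d(a, b))
```

Both pairs are the same Euclidean distance apart (3 units between means), yet the statistical distance drops from 3 to 1 when the within-group spread triples, because the groups then overlap far more. Which of the two behaviours is wanted is exactly the choice discussed above.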
