I have implemented the Mahalanobis distance calculation by hand, and I have also computed it with the mahalanobis() function. Both approaches give the same result, but that result differs from the expected one.
I believe my problem is in the formulation of 'S' (see below), but I am not sure how to correct it. Would anybody who has successfully used mahalanobis() please provide some guidance on what I am doing wrong?

#Both of the following two methods for computing this distance give the same result
#To calculate Mahalanobis distance for populations 1 and 2
m.x1<-mean(subset(my.data, pop==1))
m.x2<-mean(subset(my.data, pop==2))
s1<-cov(subset(my.data, pop==1))
s2<-cov(subset(my.data, pop==2))

#I believe I am doing something wrong with the calculation of 'S'
S<-((((n.rows[1]-1)*s1) + ((n.rows[2]-1)*s2)) / ((n.rows[1]+n.rows[2])-1))
Si<-ginv(S)
d2<-t(m.x1-m.x2) %*% Si %*% (m.x1-m.x2)
d2

#or using the mahalanobis() function
mahalanobis(m.x1,m.x2,S)

> If the goal is to *use* the Mahalanobis distance, rather than to learn
> how to write your own code, there are several existing implementations.
> rseek.org is a good place to find functions.
>
> Sarah
>
> On Fri, Jan 29, 2010 at 9:48 PM, Robert Lonsinger
> <rob.lonsin...@gmail.com> wrote:
>> Hello,
>> I am a new R user and trying to learn how to implement the mahalanobis
>> function to measure the distance between 2 population centroids. I
>> have used STATISTICA to calculate these differences, but was hoping to
>> learn to do the analysis in R. I have implemented the code as below, but
>> my results are very different from those of STATISTICA, and I believe I
>> may not have interpreted the help correctly and may have implemented the
>> code incorrectly.
>>
>> Though I am not certain, I believe that my error may be in calculating
>> the common covariance matrix (the third argument supplied to the
>> mahalanobis function).
>>
>> Any help or guidance would be greatly appreciated.
>>
>> Thank you!
>> RL
>>
>> CODE
>>
>> fit<-lda(pop~v1 + v2 + v3 +...+vn, data=my.data)
>>
>> x1<-subset(my.data, pop==1)
>> x2<-subset(my.data, pop==2)
>>
>> #Save Covariance Matrices for each group
>> cov1<-cov(x1)
>> cov2<-cov(x2)
>>
>> #Determine number of rows in each matrix
>> n1<-nrow(x1); n2<-nrow(x2);
>> n.rows<-c(n1,n2)
>>
>> #store mean vectors from lda object
>> mu1<-fit$means[1,]
>> mu2<-fit$means[2,]
>>
>> #Calculate the common Covariance Matrix
>> S<-(((n.rows[1]-1)*cov1)+((n.rows[2]-1)*cov2)/ (sum(n.rows[1:2])-1))
>>
>> #Calculate the Mahalanobis distance
>> mahalanobis(mu1, mu2, S)
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
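
For reference, here is a minimal sketch of how the pooled (common) covariance matrix is usually formed, under some assumptions that are not in the posts above: the data frame below is simulated, the variable names in `vars` are hypothetical, and base `solve()` is swapped in for `ginv()`. Three points in the posted code look like possible sources of the discrepancy. First, the textbook pooled covariance divides by n1 + n2 - 2, while both posted versions divide by n1 + n2 - 1. Second, in the quoted version the division applies only to the second term, because of where the parentheses fall. Third, `subset(my.data, pop==1)` keeps the `pop` column, so it ends up in the means and covariances unless it is dropped. Note also that `mahalanobis()` returns the squared distance. Whether this reproduces the STATISTICA figure is not something that can be confirmed from the thread.

```r
# Hypothetical example data: two populations, three numeric variables.
set.seed(1)
my.data <- data.frame(
  pop = rep(1:2, times = c(30, 40)),
  v1  = rnorm(70),
  v2  = rnorm(70),
  v3  = rnorm(70)
)
# Shift population 2 on v1 so the centroids differ.
my.data$v1[my.data$pop == 2] <- my.data$v1[my.data$pop == 2] + 1

# Keep only the measurement columns; the grouping column is excluded.
vars <- c("v1", "v2", "v3")
x1 <- my.data[my.data$pop == 1, vars]
x2 <- my.data[my.data$pop == 2, vars]

n1 <- nrow(x1)
n2 <- nrow(x2)
m1 <- colMeans(x1)   # centroid of population 1
m2 <- colMeans(x2)   # centroid of population 2

# Pooled covariance: the whole weighted sum is divided by (n1 + n2 - 2).
S <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / (n1 + n2 - 2)

# Squared Mahalanobis distance between the centroids, computed two ways.
d2.manual <- drop(t(m1 - m2) %*% solve(S) %*% (m1 - m2))
d2.fun    <- mahalanobis(m1, m2, S)

d2.manual
d2.fun
sqrt(d2.fun)   # unsquared distance, if that is what is being compared
```

If the comparison software reports the unsquared distance, the last line is the value to compare against.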