On Oct 25, 2012, at 4:41 PM, Bert Gunter wrote:

> 1. I don't know what StatMatch is. Try using stats::mahalanobis.
>
> 2. It's the covariance matrix that is **numerically** singular and
> can't be inverted. Why do you claim that there's "no way" this could
> be true when there are hundreds of variables (= dimensions)?
>
> 3. Try calculating the svd of your matrix and see what you get if you
> haven't already done so.

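A minimal sketch of what point 3 might look like in practice. The matrix
`mat` here is simulated and is NOT the poster's file: one column is built
as an exact linear combination of two others and one column is identically
zero, purely as an assumption for illustration. The tolerance used for the
numerical rank is one common convention, not the only possible choice.

```r
# Simulated data (an assumption, not the poster's file): 50 rows, 6 columns.
# Column 5 is an exact linear combination of columns 1 and 2, and
# column 6 is identically zero.
set.seed(1)
x <- matrix(rnorm(50 * 4), nrow = 50)
mat <- cbind(x, x[, 1] + 2 * x[, 2], 0)

w <- cov(mat)

# Singular values of the covariance matrix, largest first.
d <- svd(w)$d

# Numerical rank: count the singular values above a relative tolerance.
tol <- max(dim(w)) * max(d) * .Machine$double.eps
num_rank <- sum(d > tol)

# Reciprocal condition number; solve() gives up when this is below its
# default tolerance of .Machine$double.eps.
rc <- rcond(w)

# Check whether solve() fails on this matrix, without stopping the script.
inv_failed <- inherits(try(solve(w), silent = TRUE), "try-error")

cat("dimension:", ncol(w), " numerical rank:", num_rank, "\n")
cat("reciprocal condition number:", rc, "\n")
cat("solve(w) failed:", inv_failed, "\n")
```

A numerical rank smaller than the dimension means the covariance matrix
cannot be inverted in floating point, whatever the data look like by eye.
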
This was crossposted to StackOverflow, where Josh O'Brien has responded that
his code using svd() shows the matrix to be highly collinear. This is the
upper left corner of the correlation matrix:

           V1         V2         V3          V4         V5
V1 1.00000000 0.97250825 0.93390424 0.918813118 0.89705917
V2 0.97250825 1.00000000 0.97118079 0.954020724 0.93992361
V3 0.93390424 0.97118079 1.00000000 0.991508026 0.97602188
V4 0.91881312 0.95402072 0.99150803 1.000000000 0.98837387
V5 0.89705917 0.93992361 0.97602188 0.988373865 1.00000000

> length( which(cor(mat)==1) )
[1] 374

Just looking at it should give a good idea why. I can see bands of columns
that are identically zero.

-- 
david.

> Cheers,
> Bert
>
> On Thu, Oct 25, 2012 at 4:14 PM, langvince <la...@purdue.edu> wrote:
>> Hi folks,
>>
>> I know, this is a fairly common question and I am really disappointed that I
>> could not find a solution.
>> I am trying to calculate Mahalanobis distances in a data frame, where I have
>> several hundred groups and several hundred variables.
>>
>> Whatever I do, however I subset it, I get the "system is computationally
>> singular: reciprocal condition number" error.
>> I know what it means and I know what should be the problem, but there is no
>> way this is a singular matrix.
>>
>> I have uploaded the input file to my ftp:
>> http://mkk.szie.hu/dep/talt/lv/CentInpDuplNoHeader.txt
>> It is a tab delimited txt file with no headers.
>>
>> I tried the StatMatch Mahalanobis function and also this function:
>>
>> mahal_dist <-function (data, nclass, nvariable) {
>>   dist <- matrix(0, nclass, nclass)
>>   n=0
>>   w <- cov(data)
>>   print(w)
>>   for(i in 1:nclass) {
>>
>>     for(c in 1:nclass){
>>       diffl <- vector(length = nvariable)
>>       for(l in 1:nvariable){
>>         diffl[l]=abs(data[i,l]-data[c,l])
>>
>>       }
>>       ### matrixes
>>       print(diffl)
>>       dist[i,c]= (t(diffl))%*%(solve(w))%*%(diffl)
>>     }
>>
>>     n=n+1
>>     print(n)
>>   }
>>   return(dist)
>>   sqrt_dist <- sqrt(dist)
>>   print(sqrt_dist)
>> }
>>
>>
>> I have a deadline for this project (not a homework:)), and I could always
>> use this code, so I thought I will be able to quit the calculations short,
>> but now I am just lost.
>>
>> I would really appreciate any help.
>>
>> Thanks for any help
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/system-is-computationally-singular-reciprocal-condition-number-tp4647472.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

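For comparison with the quoted function, a minimal sketch of one way the
pairwise distances could be computed when the covariance matrix is singular.
The helper name `pairwise_mahal` and its `tol` argument are hypothetical and
do not come from the thread. It swaps solve(w) for a Moore-Penrose
pseudo-inverse built from svd(), which nobody in the thread proposed; it is
shown only as one possible workaround, and it measures distance only within
the directions the data actually span. It also uses signed differences
rather than abs(), and takes the square root before returning.

```r
# Hypothetical helper (not from the thread): pairwise Mahalanobis distances
# between the rows of `data`, using a pseudo-inverse of the covariance
# matrix so that a singular covariance does not stop the computation.
pairwise_mahal <- function(data, tol = sqrt(.Machine$double.eps)) {
  data <- as.matrix(data)
  w <- cov(data)

  # Moore-Penrose pseudo-inverse via the SVD: invert only the singular
  # values that are clearly non-zero and drop the rest.
  s <- svd(w)
  keep <- s$d > tol * max(s$d)
  w_inv <- s$v[, keep, drop = FALSE] %*%
    (t(s$u[, keep, drop = FALSE]) / s$d[keep])

  n <- nrow(data)
  dist <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # Signed differences between every row and row i (no abs()).
    diffs <- sweep(data, 2, data[i, ])
    # Quadratic form diff' W^+ diff for each row, then the square root.
    # pmax() guards against tiny negative values from rounding.
    dist[i, ] <- sqrt(pmax(rowSums((diffs %*% w_inv) * diffs), 0))
  }
  dist
}

# Toy data (an assumption): a collinear column plus an all-zero column.
set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20)
toy <- cbind(x, x[, 1] + x[, 2], 0)
d_toy <- pairwise_mahal(toy)
dim(d_toy)
```

When the covariance matrix is well conditioned, row i of the result should
agree with sqrt(mahalanobis(data, data[i, ], cov(data))) from the stats
package. Dropping the zero and duplicated columns before calling cov() is
another option and may be easier to justify.
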
David Winsemius, MD
Alameda, CA, USA