As Petr suggests, the dist() function will do much of the work for you. For example ...

# example matrix of data
nsamples <- 40
nreadings <- 460000
dat <- matrix(runif(nsamples*nreadings), nrow=nsamples)
# Euclidean distance between the ROWS of dat
distance <- dist(dat)

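If you want the result to match your corrected distance (maximum possible value of 1) and to see every pairing as a square table, a minimal sketch might look like the following. It assumes the samples are the rows of a numeric matrix; the small dimensions, the seed, and the sample names are made up for illustration only.

```r
# Hypothetical small stand-in for the real data; 40 x 460000 works the same way.
# Rows are samples, columns are readings between 0 and 1.
set.seed(1)
nsamples <- 5
nreadings <- 1000
dat <- matrix(runif(nsamples * nreadings), nrow = nsamples)
rownames(dat) <- paste0("sample", seq_len(nsamples))  # assumed sample names

# dist() returns Euclidean distances between rows. Dividing by
# sqrt(number of readings) applies the same correction as DistanceCalc.
distance <- dist(dat) / sqrt(ncol(dat))

# Full symmetric table (1vs1, 1vs2, ..., 2vs1, ...) rather than the
# lower triangle that a dist object prints.
distancetable <- as.matrix(distance)

# Cross-check one pair against the hand-written formula
# (R's power operator is ^).
DistanceCalc <- function(x, y) sqrt(sum((x - y)^2)) / sqrt(length(x))
check <- DistanceCalc(dat[1, ], dat[2, ])
all.equal(distancetable[1, 2], check)
```

If your samples are the columns of a data frame, as the headers in your message suggest, the analogous call would probably be dist(t(as.matrix(DataWithoutBlanks))), assuming every column is numeric. dist() works on rows, hence the t().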
Jean

JenniferH <jenacho...@gmail.com> wrote on 08/01/2012 04:42:48 AM:
>
> Hello everyone. Like others on this list, I'm new to R, and really not much
> of a programmer, so please excuse any obtuse questions! I'm trying to
> repeat a function across all possible combinations of vectors in a data
> frame. I'd hugely appreciate any advice!
>
> Here's what I'm doing:
>
> I have some data: 40 samples, ~460 000 different readings between 1 and 0
> for each sample. I would like to make R spit out a matrix of distances
> between the samples. So far, I have made a function to calculate the
> distance between any two samples:
>
> DistanceCalc<-function(x,y){
> #x and y are both vectors - the entire reading set for sample x and
> #sample y, respectively
> distance<-sqrt(sum((x-y)^2))
> #to force the maximum possible value to =1
> distanceCorrected<-distance/sqrt(length(x))
> print(distanceCorrected)
> }
>
> The next thing I want to do is to make this function run to compare all
> possible combinations of my samples (1vs1, 1vs2, 1vs3...2vs1, 2vs2 etc). In
> python, the only other programming language I have ever used, I would just
> use a "for" loop. I have asked the internet how to do this, but the
> overwhelming response seems to be "you don't want to do it like that - use
> the 'apply' functions". I've tried to use the apply functions, but I tend
> to find that I can only give my DistanceCalc function a single vector (I can
> tell it where to find x, but not where to find y, or vice versa). I've also
> found the 'by' and the 'outer' functions, but I'm likewise failing at making
> those work, e.g.
>
> > distancetable<-outer(DataWithoutBlanks,DataWithoutBlanks,FUN=DistanceCalc)
> Error in x - y : non-numeric argument to binary operator
>
> I think this may be because my data has headers and the function is trying
> to calculate the difference between the names of my samples, but I don't
> know how to correct this.
>
> Would really appreciate your help!
>
> Jen

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.