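Here's a function Josh Wiley provided in another thread:

spec.cor <- function(dat, r, ...) {
    x <- cor(dat, ...)
    # Blank out the upper triangle and diagonal so each pair appears once
    x[upper.tri(x, TRUE)] <- NA
    # Row/column indices of correlations at least r in absolute value
    i <- which(abs(x) >= r, arr.ind = TRUE)
    # One row per pair: the two variable names and their correlation
    data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}
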
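As a rough sketch of how the function above might be applied to data shaped like the sample quoted further down in this thread (genes in rows, arrays in columns): the CSV text, the `row.names = 1` import, and the 0.8 cutoff are assumptions for illustration, not the poster's actual script.

```r
# Function quoted above, repeated so the example runs on its own
spec.cor <- function(dat, r, ...) {
    x <- cor(dat, ...)
    x[upper.tri(x, TRUE)] <- NA
    i <- which(abs(x) >= r, arr.ind = TRUE)
    data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

# Sample rows from the thread, written out as CSV text (assumed layout)
csv_text <- "Gene,Array1,Array2,Array3,Array4,Array5,Array6,Array7,Array8,Array9,Array10,Array11
Fth1,26016.01,23134.66,17445.71,39856.04,27245.45,23622.98,37887.75,49857.46,25864.73,21852.51,29198.4
B2m,7573.64,7768.52,6608.24,8571.65,6380.78,6242.76,6903.92,7330.63,7256.18,5678.21,10937.05
Tmsb4x,6192.44,4277.22,5024.59,4851.51,3062.55,4562.43,7948.1,5018.58,3200.17,2855.77,6139.23
H2-D1,3141.41,3986.06,3328.62,4726.6,3589.89,2885.95,7509.88,5257.62,4742.26,3431.33,5300.72
Prdx5,3935.7,3938.9,3401.68,4193.14,4028.95,3438.19,6640.15,5486.61,4424.57,3368.83,5265.92
"

# row.names = 1 stores the gene names as row names, so the remaining
# columns are purely numeric and the names survive the call to cor()
expr <- read.csv(text = csv_text, row.names = 1)

# Genes are rows here, so transpose to correlate genes rather than arrays
hits <- spec.cor(t(as.matrix(expr)), r = 0.8)
print(hits)
```

Each row of `hits` names one gene pair taken from the lower triangle of the correlation matrix, so no pair is reported twice and no gene is paired with itself.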
Michael

On Thu, Nov 17, 2011 at 4:08 PM, Musa Hassan <musah...@gmail.com> wrote:
> Hi Michael,
> I was able to solve this. I just used the WGCNA library which allows for
> stringsAsFactors to be defined in the work space making everything stored as
> strings remain strings. My problem now is parsing through the results to
> pull out only significant correlations defined by a certain Pearson
> correlation value say 0.8.
>
> On 17 November 2011 15:32, R. Michael Weylandt <michael.weyla...@gmail.com>
> wrote:
>>
>> I can't see how it's stored like that and the email servers garble it
>> up. Use dput() to create a plain text representation and paste that
>> back in.
>>
>> Thanks,
>> Michael
>>
>> On Thu, Nov 17, 2011 at 9:37 AM, muzz56 <musah...@gmail.com> wrote:
>> > Hi Michael,
>> > Here is a sample of the data.
>> >
>> > Gene   Array1   Array2   Array3   Array4   Array5   Array6   Array7   Array8   Array9   Array10  Array11
>> > Fth1   26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51 29198.4
>> > B2m    7573.64  7768.52  6608.24  8571.65  6380.78  6242.76  6903.92  7330.63  7256.18  5678.21  10937.05
>> > Tmsb4x 6192.44  4277.22  5024.59  4851.51  3062.55  4562.43  7948.1   5018.58  3200.17  2855.77  6139.23
>> > H2-D1  3141.41  3986.06  3328.62  4726.6   3589.89  2885.95  7509.88  5257.62  4742.26  3431.33  5300.72
>> > Prdx5  3935.7   3938.9   3401.68  4193.14  4028.95  3438.19  6640.15  5486.61  4424.57  3368.83  5265.92
>> >
>> > I want to retain the gene names in the data.
>> > What you've proposed will take them out and I'll have to append them
>> > back to the results after the cor()
>> >
>> > On 17 November 2011 09:33, Michael Weylandt [via R] <
>> > ml-node+s789695n4080177...@n4.nabble.com> wrote:
>> >
>> >> I think something like this should do it, but I can't test without
>> >> data:
>> >>
>> >> rownames(mydata) <- mydata[,1] # Put the elements in the first column as rownames
>> >> mydata <- mydata[,-1] # drop the things that are now rownames
>> >>
>> >> Michael
>> >>
>> >> On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan <[hidden email]> wrote:
>> >>
>> >> > Hi Michael,
>> >> > Thanks for the response. I have noticed that the error occurred during
>> >> > my data read. It appears that the rownames (which when the data is
>> >> > transposed become my colnames) were converted to numbers instead of
>> >> > strings as they should be. The original header names don't change, just
>> >> > the rownames. I have to figure out how to import the data and have the
>> >> > strings not converted.
>> >> > Right now I am using:
>> >> > mydata = read.csv("mydata.csv", header=TRUE, stringsAsFactors=FALSE)
>> >> >
>> >> > then to convert the data frame to matrix
>> >> > mydata = data.matrix(mydata)
>> >> >
>> >> > Then I just do the correlation as Peter suggested.
>> >> >
>> >> > expression = cor(t(mydata))
>> >> >
>> >> > Thanks.
>> >> >
>> >> > On 17 November 2011 08:51, R. Michael Weylandt <[hidden email]> wrote:
>> >> >>
>> >> >> On Wed, Nov 16, 2011 at 11:22 PM, muzz56 <[hidden email]> wrote:
>> >> >> > Thanks to everyone who replied to my post, I finally got it to
>> >> >> > work.
>> >> >> > I am however not sure how well it worked since it ran so quickly,
>> >> >> > but it seems like I have a 2000 x 2000 data set.
>> >> >>
>> >> >> Behold the great and mighty power that is R! Don't worry -- on a
>> >> >> decent machine the correlation of a 2k x 2k data set should be pretty
>> >> >> fast. (It's about 9 seconds on my old-ish laptop with a bunch of other
>> >> >> junk running)
>> >> >>
>> >> >> > My followup questions would be, how do I get only pairs with say a
>> >> >> > certain Pearson correlation value? Additionally, it seems like my
>> >> >> > output didn't retain the headers but instead replaced them with
>> >> >> > numbers, making it hard to know which gene pairs correlate.
>> >> >>
>> >> >> This is a little worrisome: R carries column names through cor() so
>> >> >> this would suggest you weren't using them. Were your headers listed as
>> >> >> part of your data (instead of being names)? If so, they would have
>> >> >> been taken as numbers.
>> >> >>
>> >> >> Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there,
>> >> >> then they are being treated as data instead of names. If they are,
>> >> >> can you provide some reproducible code and we can debug more fully.
>> >> >> The easiest way to send data is to use the dput() function to get a
>> >> >> copy-pasteable plain text representation. It would also be great if
>> >> >> you could restrict it to a subset of your data rather than the full
>> >> >> 4M data points, but if that's hard to do, don't worry.
>> >> >>
>> >> >> You should have expected behavior like
>> >> >>
>> >> >> X <- matrix(1:9,3)
>> >> >> colnames(X) <- c("A","B","C")
>> >> >> cor(X) # Prints with labels
>> >> >>
>> >> >> Michael
>> >> >>
>> >> >> > On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] <[hidden email]> wrote:
>> >> >> >
>> >> >> >> > -----Original Message-----
>> >> >> >> > From: [hidden email] [mailto:r-help-bounces@r-project.org] On Behalf Of muzz56
>> >> >> >> > Sent: Wednesday, November 16, 2011 12:28 PM
>> >> >> >> > To: [hidden email]
>> >> >> >> > Subject: Re: [R] Pairwise correlation
>> >> >> >> >
>> >> >> >> > Thanks Peter. I tried this after reading in the csv (read.csv) and
>> >> >> >> > converted the data to matrix (as.matrix). But when I tried the
>> >> >> >> > correlation, I keep getting the error (x must be numeric) yet when
>> >> >> >> > I view the data, it's numeric.
>> >> >> >> >
>> >> >> >>
>> >> >> >> What does R tell you if you execute the following?
>> >> >> >>
>> >> >> >> str(x)
>> >> >> >>
>> >> >> >> Just because the data looks like it is numeric when it prints doesn't
>> >> >> >> mean it is.
>> >> >> >>
>> >> >> >> Dan
>> >> >> >>
>> >> >> >> Daniel J. Nordlund
>> >> >> >> Washington State Department of Social and Health Services
>> >> >> >> Planning, Performance, and Accountability
>> >> >> >> Research and Data Analysis Division
>> >> >> >> Olympia, WA 98504-5204
>> >> >> >>
>> >> >> >> ______________________________________________
>> >> >> >> R-help@r-project.org mailing list
>> >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> >> PLEASE do read the posting guide
>> >> >> >> http://www.R-project.org/posting-guide.html
>> >> >> >> and provide commented, minimal, self-contained, reproducible code.