Re: [R] Removing "NA" from matrix
On Jun 14, 2013, at 7:03 AM, Katherine Gobin wrote:

> Dear R forum,
>
> I have a data frame
>
> dat = data.frame(
> ABC = c(25.28000732,48.33857234,19.8013245,10.68361461),
> DEF = c(14.02722251,10.57985168,11.81890316,21.40171514),
> GHI = c(1,1,1,1),
> JKL = c(45.96423231,44.52986236,16.56514176,32.14545122),
> MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))
>
> > dat
>        ABC      DEF GHI      JKL      MNO
> 1 25.28001 14.02722   1 45.96423 45.38438
> 2 48.33857 10.57985   1 44.52986 15.54338
> 3 19.80132 11.81890   1 16.56514 18.78445
> 4 10.68361 21.40172   1 32.14545 24.29487
>
> When I try to find the correlation I get (which is obvious as my one column
> shows no variation)

Perhaps:

dat_cor = cor(dat[ , sapply(dat, function(col) sd(col) != 0 ) ] )

> Warning message:
> In cor(dat) : the standard deviation is zero
> > dat_cor
>            ABC         DEF GHI         JKL        MNO
> ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
> DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
> GHI         NA          NA   1          NA         NA
> JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
> MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000
>
> In reality I am dealing with about 300 variables and don't know which
> variables don't vary.
>
> My query is how do I remove the columns and rows with NA's.
>
> So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.
>
> Kindly guide.
>
> Thanking in advance.
>
> Regards
>
> Katherine
>
> [[alternative HTML version deleted]]

Please post in plain text.

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
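The sd()-based filter suggested above can be sketched as follows. This is only an illustration, assuming every column is numeric. The name `varies`, the use of vapply(), and the na.rm = TRUE argument are additions that are not part of the original suggestion. Without na.rm = TRUE, a column containing missing values would make sd() return NA, and the logical index would then contain NA.

```r
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# TRUE for columns whose standard deviation is strictly positive.
# na.rm = TRUE and isTRUE() are assumptions added here so that missing
# values (or an NA standard deviation) mark the column as "does not vary"
# instead of putting NA into the index.
varies <- vapply(dat,
                 function(col) isTRUE(sd(col, na.rm = TRUE) > 0),
                 logical(1))

# drop = FALSE keeps a data frame even if only one column survives
dat_cor <- cor(dat[, varies, drop = FALSE])
colnames(dat_cor)
```

For the example data this keeps ABC, DEF, JKL and MNO and drops GHI.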
Re: [R] Removing "NA" from matrix
Probably, this also works:

dat2<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat2)

dat$NewCol<-5
dat3<-dat[,(colSums(dat)/dat[1,])!=nrow(dat)]
cor(dat3)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000

A.K.

----- Original Message -----
From: arun
To: Katherine Gobin
Cc: R help
Sent: Friday, June 14, 2013 10:34 AM
Subject: Re: [R] Removing "NA" from matrix

Hi,
Try:

dat1<-dat[sapply(dat,function(x) length(unique(x)))>1]
cor(dat1)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000

A.K.
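One caveat about the colSums() test in this message: it treats a column as constant whenever its sum equals nrow(dat) times its first value. That condition can also hold for a column that does vary, and the division breaks down when the first value is 0. The sketch below uses made-up data to show the difference. The data frame `toy` and the helper `is_constant` are hypothetical names introduced only for illustration, and the range-based check is a swapped-in alternative, not something proposed in the thread.

```r
toy <- data.frame(
  A = c(2, 1, 3, 2),   # varies, yet sum(A) / A[1] is 4, which equals nrow(toy)
  B = c(5, 5, 5, 5),   # constant
  C = c(1, 4, 2, 8))   # varies

# Same arithmetic as the sum-based test above; TRUE means "keep the column"
sum_test <- (colSums(toy) / unlist(toy[1, ])) != nrow(toy)

# Range-based alternative: a column is constant when its minimum equals
# its maximum (missing values are ignored, which is an assumption here)
is_constant <- function(x) {
  r <- range(x, na.rm = TRUE)
  r[1] == r[2]
}
range_test <- !vapply(toy, is_constant, logical(1))

sum_test    # column A is wrongly flagged for removal
range_test  # column A is kept, only B is removed
```

On real-valued data such as the example in this thread a coincidence like column A is unlikely, but with integer-coded variables it can happen.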
Re: [R] Removing "NA" from matrix
Hi,
Try:

dat1<-dat[sapply(dat,function(x) length(unique(x)))>1]
cor(dat1)
#           ABC         DEF         JKL        MNO
#ABC  1.0000000 -0.75600764  0.55245223 -0.2735585
#DEF -0.7560076  1.00000000 -0.06479082  0.2020781
#JKL  0.5524522 -0.06479082  1.00000000  0.4564568
#MNO -0.2735585  0.20207810  0.45645683  1.0000000

A.K.

From: Katherine Gobin
To: "r-help@r-project.org"
Sent: Friday, June 14, 2013 10:03 AM
Subject: [R] Removing "NA" from matrix

In reality I am dealing with about 300 variables and don't know which variables don't vary.

My query is how do I remove the columns and rows with NA's.

So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.
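With about 300 variables it may also help to see which columns were dropped. The following is a minimal sketch building on the unique()-based filter above. The names `n_values` and `dropped` are illustrative and do not appear in the original reply. Note that unique() counts NA as a value of its own, so a column such as c(1, 1, NA) would be kept by this test.

```r
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# Number of distinct values in each column
n_values <- vapply(dat, function(x) length(unique(x)), integer(1))

# Names of the columns that do not vary, for reporting
dropped <- names(dat)[n_values <= 1]

# Keep only the columns with more than one distinct value
dat1 <- dat[n_values > 1]

dropped
dim(cor(dat1))
```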
[R] Removing "NA" from matrix
Dear R forum,

I have a data frame

dat = data.frame(
ABC = c(25.28000732,48.33857234,19.8013245,10.68361461),
DEF = c(14.02722251,10.57985168,11.81890316,21.40171514),
GHI = c(1,1,1,1),
JKL = c(45.96423231,44.52986236,16.56514176,32.14545122),
MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))

> dat
       ABC      DEF GHI      JKL      MNO
1 25.28001 14.02722   1 45.96423 45.38438
2 48.33857 10.57985   1 44.52986 15.54338
3 19.80132 11.81890   1 16.56514 18.78445
4 10.68361 21.40172   1 32.14545 24.29487

When I try to find the correlation I get (which is obvious as my one column shows no variation)

dat_cor = cor(dat)
Warning message:
In cor(dat) : the standard deviation is zero

> dat_cor
           ABC         DEF GHI         JKL        MNO
ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
GHI         NA          NA   1          NA         NA
JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000

In reality I am dealing with about 300 variables and don't know which variables don't vary.

My query is how do I remove the columns and rows with NA's.

So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.

Kindly guide.

Thanking in advance.

Regards

Katherine
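Read literally, the question asks how to drop the NA rows and columns from the correlation matrix itself, not from the data. A minimal sketch of that approach follows. It assumes a variable should be dropped when it has no non-NA correlation with any other variable, so that at most its diagonal entry is non-NA. The names `keep` and `dat_cor_clean` are illustrative.

```r
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984))

# The zero-standard-deviation warning is expected here, so it is silenced
dat_cor <- suppressWarnings(cor(dat))

# A row with more than one non-NA entry has at least one usable
# correlation besides its own diagonal element
keep <- rowSums(!is.na(dat_cor)) > 1

# Apply the same index to rows and columns; drop = FALSE keeps a matrix
dat_cor_clean <- dat_cor[keep, keep, drop = FALSE]
rownames(dat_cor_clean)
```

Filtering the matrix avoids recomputing the correlations, while filtering the data frame first avoids the warning altogether.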