Re: [R] Drop column from a data frame
On Dec 26, 2010, at 8:22 PM, John Sorkin wrote:

> I am trying to drop a column of a data frame. The code below attempts to drop a numeric column (which does not work but gives no error or warning) and a factor column (which does not work but gives an error).
> I would appreciate someone telling me why my code does not work, and suggesting code that will work.

You are misusing the syntax of the "[" operation. When using negative indices you can only use numeric or logical values (see ?"["). Character indices always need to be "positive".

    dfxyz[ , -2]         # works
    dfxyz[ , c(T,F,T)]   # works

    > dfxyz[ , -"y"]
    Error in -"y" : invalid argument to unary operator

This next mechanism also works and is especially useful on dataframes with lots of columns:

    dfxyz[ , -grep("y", names(dfxyz))]

But you need to be careful to make sure you know which columns will match, and it's good practice to test the grepping expression first:

    > grep("y", names(dfxyz))
    [1] 2

If you only wanted to remove "y" and not "y2" you would need to add qualifiers to the pattern.

> Thanks,
> John
>
> rm(dfxyz,dfxz,dfxy)
> # create the data frame.
> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
> names(dfxyz)
> # try to drop y column
> # does not work, does not produce error message
> dfxz <- dfxyz[,-(dfxyz$y)]

Well, dfxyz$y does evaluate to a numeric vector with values 11:20, and there were no columns in that range. So it behaved as documented. You asked for the dataframe without some non-existent (numbered) columns and it obliged.

> dfxz
> # try to drop z column
> # does not work, produces error message:
> # In Ops.factor(df$z) : - not meaningful for factors
> dfxy <- dfxyz[,-dfxyz$z]

Right, you cannot subtract (or negate) factors. As Phil suggests, subset()-ting is often safer.

-- 
David.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
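As a follow-up to David's reply, here is a minimal sketch of the grep approach and the "add qualifiers to the pattern" advice. It rebuilds the poster's dfxyz with one extra column, y2, which is hypothetical and not in the original post; it is there only to show what an unanchored pattern would match. The %in% form at the end is a common name-based alternative that nobody in the thread suggested, so treat it as an assumption rather than the advice given here.

```r
# Rebuild the example data; y2 is a hypothetical extra column for illustration
dfxyz <- data.frame(x = 1:10, y = 11:20,
                    z = factor(c(rep(0, 5), rep(1, 5))),
                    y2 = 21:30)

# An unanchored pattern matches both "y" and "y2"
loose <- grep("y", names(dfxyz))      # positions 2 and 4

# Anchoring with ^ and $ matches only the column named exactly "y"
exact <- grep("^y$", names(dfxyz))    # position 2

# Negative numeric indices drop the matched column.
# Only safe when something matched: -integer(0) would select no columns at all.
dropped_by_grep <- dfxyz[ , -exact]

# Name-based alternative: keep every column whose name is not "y"
dropped_by_name <- dfxyz[ , !(names(dfxyz) %in% "y")]

# Out-of-range negative indices are silently ignored, so nothing is dropped here
unchanged <- dfxyz[ , -(dfxyz$y)]
```

Testing the pattern on names(dfxyz) first, as suggested above, is what catches the y/y2 problem before any column is removed.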
Re: [R] Drop column from a data frame
John -
   You can use a syntax similar to what you've tried with the select= argument of the subset function:

subset(dfxyz,select=-y)
  x z
1 1 0
2 2 0
. . .
subset(dfxyz,select=-z)
  x  y
1 1 11
2 2 12
. . .

                                        - Phil Spector
                                          Statistical Computing Facility
                                          Department of Statistics
                                          UC Berkeley
                                          spec...@stat.berkeley.edu

On Sun, 26 Dec 2010, John Sorkin wrote:

> I am trying to drop a column of a data frame. The code below attempts to drop a numeric column (which does not work but gives no error or warning) and a factor column (which does not work but gives an error).
> I would appreciate someone telling me why my code does not work, and suggesting code that will work.
> Thanks,
> John
>
> rm(dfxyz,dfxz,dfxy)
> # create the data frame.
> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
> names(dfxyz)
> # try to drop y column
> # does not work, does not produce error message
> dfxz <- dfxyz[,-(dfxyz$y)]
> dfxz
> # try to drop z column
> # does not work, produces error message:
> # In Ops.factor(df$z) : - not meaningful for factors
> dfxy <- dfxyz[,-dfxyz$z]
> dfxy
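To make Phil's suggestion concrete, here is a small sketch using the poster's dfxyz. Dropping two columns at once with -c(y, z) and the comparison with "[" are additions for illustration and were not part of his reply.

```r
# Rebuild the poster's example data
dfxyz <- data.frame(x = 1:10, y = 11:20,
                    z = factor(c(rep(0, 5), rep(1, 5))))

# Drop one column by unquoted name
dfxz <- subset(dfxyz, select = -y)

# Drop several columns at once; the result is still a data frame
dfx <- subset(dfxyz, select = -c(y, z))

# For comparison: "[" collapses a single remaining column to a vector
# unless drop = FALSE is given
as_vector <- dfxyz[ , -c(2, 3)]
as_frame  <- dfxyz[ , -c(2, 3), drop = FALSE]
```

Because subset() keeps a data frame even when one column remains, it avoids the drop = FALSE detail that "[" needs.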
Re: [R] Drop column from a data frame
Assign NULL to the column:

> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
    x  y z
1   1 11 0
2   2 12 0
3   3 13 0
4   4 14 0
5   5 15 0
6   6 16 1
7   7 17 1
8   8 18 1
9   9 19 1
10 10 20 1
> dfxyz$y <- NULL
> dfxyz
    x z
1   1 0
2   2 0
3   3 0
4   4 0
5   5 0
6   6 1
7   7 1
8   8 1
9   9 1
10 10 1
>

On Sun, Dec 26, 2010 at 8:22 PM, John Sorkin wrote:
> I am trying to drop a column of a data frame. The code below attempts to drop
> a numeric column (which does not work but gives no error or warning) and a
> factor column (which does not work but gives an error).
> I would appreciate someone telling me why my code does not work, and
> suggesting code that will work.
> Thanks,
> John
>
> rm(dfxyz,dfxz,dfxy)
>
> # create the data frame.
> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
>
> names(dfxyz)
>
> # try to drop y column
> # does not work, does not produce error message
> dfxz <- dfxyz[,-(dfxyz$y)]
> dfxz
>
> # try to drop z column
> # does not work, produces error message:
> # In Ops.factor(df$z) : - not meaningful for factors
> dfxy <- dfxyz[,-dfxyz$z]
> dfxy
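The NULL assignment above changes dfxyz in place. If the original needs to stay around, one option is to work on a copy, as in the sketch below. The copies dfxz and dfx reuse the poster's variable names, and the list(NULL) form for removing several named columns is an addition that was not shown in the reply.

```r
# Rebuild the poster's example data
dfxyz <- data.frame(x = 1:10, y = 11:20,
                    z = factor(c(rep(0, 5), rep(1, 5))))

# Copy first so dfxyz itself is left untouched
dfxz <- dfxyz
dfxz$y <- NULL                     # removes column y from the copy

# Remove several columns by name in one step
dfx <- dfxyz
dfx[c("y", "z")] <- list(NULL)     # removes columns y and z from the copy
```

Unlike the negative-index attempts in the original question, this works the same way for numeric and factor columns because the column is addressed by name, not by its values.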
[R] Drop column from a data frame
I am trying to drop a column of a data frame. The code below attempts to drop a numeric column (which does not work but gives no error or warning) and a factor column (which does not work but gives an error).
I would appreciate someone telling me why my code does not work, and suggesting code that will work.
Thanks,
John

rm(dfxyz,dfxz,dfxy)

# create the data frame.
dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
dfxyz

names(dfxyz)

# try to drop y column
# does not work, does not produce error message
dfxz <- dfxyz[,-(dfxyz$y)]
dfxz

# try to drop z column
# does not work, produces error message:
# In Ops.factor(df$z) : - not meaningful for factors
dfxy <- dfxyz[,-dfxyz$z]
dfxy

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)