On Dec 26, 2010, at 8:22 PM, John Sorkin wrote:
I am trying to drop a column of a data frame. The code below
attempts to drop a numeric column (which does not work but gives no
error or warning) and a factor column (which does not work but gives
an error).
I would appreciate someone telling me why my code does not work, and
suggesting code that will work.
You are misusing the syntax of the "[" operation. When using negative
indices you can only use numeric or logical values; see:
?"["
Character indices always need to be "positive".
dfxyz[ , -2] # works
dfxyz[ , c(T,F,T)] # works
> dfxyz[ , -"y"]
Error in -"y" : invalid argument to unary operator
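One way to drop a column by name is to turn the name into a logical index or a positive character index first. This is a minimal sketch using the dfxyz data frame from the original post; the variable names keep and dfxz2 are only illustrative:

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# Logical index: TRUE for every column whose name is not "y".
keep <- !(names(dfxyz) %in% "y")
dfxz <- dfxyz[ , keep, drop = FALSE]

# Positive character index: the names left after removing "y".
dfxz2 <- dfxyz[ , setdiff(names(dfxyz), "y"), drop = FALSE]

names(dfxz)
names(dfxz2)
```

drop = FALSE is there so the result stays a data frame even if only one column is left.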
This next mechanism also works and is especially useful on data frames
with lots of columns:
dfxyz[ , -grep("y", names(dfxyz))]
But you need to be careful to make sure you know which columns will
match, and it's good practice to test the grepping expression first:
> grep("y", names(dfxyz))
[1] 2
If you only wanted to remove "y" and not "y2" you would need to add
qualifiers to the pattern.
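As a sketch of what such qualifiers might look like, the example below anchors the pattern with ^ and $ so that only the exact name "y" matches. The data frame dfxyz2 with its "y2" column is hypothetical, made up for illustration. The length check is there because a negated empty match, -integer(0), selects no columns at all rather than all of them:

```r
# Hypothetical data frame with both a "y" and a "y2" column.
dfxyz2 <- data.frame(x = 1:3, y = 4:6, y2 = 7:9)

# Unanchored pattern: matches every name containing "y".
loose <- grep("y", names(dfxyz2))

# Anchored pattern: matches only the name that is exactly "y".
idx <- grep("^y$", names(dfxyz2))

# Only negate the index when something actually matched.
res <- if (length(idx) > 0) dfxyz2[ , -idx, drop = FALSE] else dfxyz2
names(res)
```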
Thanks,
John
rm(dfxyz,dfxz,dfxy)
# create the data frame.
dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
dfxyz
names(dfxyz)
# try to drop y column
# does not work, does not produce error message
dfxz <- dfxyz[,-(dfxyz$y)]
Well, dfxyz$y does evaluate to a numeric vector with values 11:20, and
the data frame has no columns at positions 11 to 20. So it behaved as
documented. You asked for the data frame without some non-existent
(numbered) columns and it obliged.
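A small sketch of what that expression evaluates to, using the same dfxyz as in the post (neg and res are illustrative names):

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# The index is built from the column's values, not from its position.
neg <- -(dfxyz$y)
range(neg)

# Excluding positions 11 to 20 from a three-column data frame removes nothing.
res <- dfxyz[ , neg]
names(res)
```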
dfxz
# try to drop z column
# does not work, produces error message:
# In Ops.factor(df$z) : - not meaningful for factors
dfxy <- dfxyz[,-dfxyz$z]
Right, you cannot subtract (or negate) factors.
As Phil suggests, subset()-ting is often safer.
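For reference, a minimal sketch of the subset() approach on the same dfxyz. Inside select, column names can be used unquoted and negated, and this works for the factor column too because the names stand for column positions rather than for the column's contents:

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

dfxz <- subset(dfxyz, select = -y)        # drop the numeric column y
dfxy <- subset(dfxyz, select = -z)        # drop the factor column z
dfx  <- subset(dfxyz, select = -c(y, z))  # drop both at once

names(dfxz)
names(dfxy)
names(dfx)
```

Assigning NULL to a column of a copy (for example tmp <- dfxyz; tmp$y <- NULL) is another common way to remove a single column by name.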
--
David.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.