Re: [R] Drop column from a data frame

2010-12-27 Thread David Winsemius


On Dec 26, 2010, at 8:22 PM, John Sorkin wrote:

> I am trying to drop a column of a data frame. The code below
> attempts to drop a numeric column (which does not work but gives no
> error or warning) and a factor column (which does not work but gives
> an error).
> I would appreciate someone telling me why my code does not work, and
> suggesting code that will work.


You are misusing the syntax of the "[" operation. To leave columns out
you can only use negative numeric indices or logical values (see the
help page):


?"["

Character indices always need to be "positive".

dfxyz[ , -2]  # works
dfxyz[ , c(TRUE, FALSE, TRUE)]  # works

> dfxyz[ , -"y"]
Error in -"y" : invalid argument to unary operator
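
Since character indices have to be positive, one way to drop a column by name is to select every name except the unwanted one. The lines below are only a sketch of that idea using the data frame from the original post; the result names (no_y_logical, no_y_names, only_x) are made up for illustration.

```r
# Rebuild the example data frame from the original post.
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# Logical index: keep every column whose name is not "y".
no_y_logical <- dfxyz[ , !(names(dfxyz) %in% "y")]

# Positive character index: the same columns, selected by name.
no_y_names <- dfxyz[ , setdiff(names(dfxyz), "y")]

# If only one column survives, "[" returns a plain vector unless
# drop = FALSE is given.
only_x <- dfxyz[ , setdiff(names(dfxyz), c("y", "z")), drop = FALSE]
```

Both forms work without knowing the position of the column, which can matter when the column order changes.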

This next mechanism also works and is especially useful on data frames
with lots of columns:


dfxyz[ , -grep("y", names(dfxyz))]

But you need to be careful to make sure you know which columns will
match, and it's good practice to test the grepping expression first:

> grep("y", names(dfxyz))
[1] 2
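
One reason to test the grep() call first: if the pattern matches nothing, grep() returns integer(0), and negating that is still integer(0), so the subscript selects zero columns rather than all of them. The sketch below shows the pitfall and one possible guard; the helper name drop_matching is hypothetical, not something from base R.

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# No column name contains "w", so grep() returns integer(0).
hits <- grep("w", names(dfxyz))

# -integer(0) is integer(0): this keeps no columns at all.
emptied <- dfxyz[ , -hits]

# Hypothetical helper: only drop columns when the pattern matched something.
drop_matching <- function(df, pattern) {
  hits <- grep(pattern, names(df))
  if (length(hits) == 0) {
    return(df)
  }
  df[ , -hits, drop = FALSE]
}

kept_all <- drop_matching(dfxyz, "w")  # nothing matched, nothing dropped
kept_xz  <- drop_matching(dfxyz, "y")  # drops the y column
```

A logical index such as !grepl(pattern, names(df)) avoids the problem as well, because a vector of all TRUE keeps every column.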

If you only wanted to remove "y" and not "y2" you would need to add  
qualifiers to the pattern.
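
As a sketch of what such qualifiers might look like, anchoring the pattern with ^ and $ restricts the match to the exact name. The data frame df2 with a "y2" column is invented here purely for illustration.

```r
# Hypothetical data frame with two similarly named columns.
df2 <- data.frame(x = 1:3, y = 4:6, y2 = 7:9)

# Unanchored pattern: matches both "y" and "y2".
loose <- grep("y", names(df2))

# Anchored pattern: matches only the column named exactly "y".
exact <- grep("^y$", names(df2))

# Drop just the exact match.
kept <- df2[ , -exact]
```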



> Thanks,
> John
>
> rm(dfxyz,dfxz,dfxy)
>
> # create the data frame.
> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
>
> names(dfxyz)
>
> # try to drop y column
> # does not work, does not produce error message
> dfxz <- dfxyz[,-(dfxyz$y)]


Well, dfxyz$y does evaluate to a numeric vector with values 11:20 and  
there were no columns in that range. So it behaved as documented. You  
asked for the dataframe without some non-existent (numbered) columns  
and it obliged.
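
A small sketch of what happened there, using the same data frame (the names neg and unchanged are only for illustration): the negated y values are all beyond the third column, so nothing is removed.

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# The subscript that was actually passed: -11, -12, ..., -20.
neg <- -(dfxyz$y)

# None of those positions exist among the 3 columns, so all columns remain.
unchanged <- dfxyz[ , neg]
```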



> dfxz
>
> # try to drop z column
> # does not work, produces error message:
> # In Ops.factor(df$z) : - not meaningful for factors
> dfxy <- dfxyz[,-dfxyz$z]


Right, you cannot subtract (or negate) factors.

As Phil suggests, subset()-ting is often safer.

--
David.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Drop column from a data frame

2010-12-26 Thread Phil Spector

John -
   You can use a syntax similar to what you've tried with
the select= argument of the subset function:


subset(dfxyz,select=-y)

    x z
1   1 0
2   2 0
  . . .

subset(dfxyz,select=-z)

    x  y
1   1 11
2   2 12
  . . .
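
The same select= syntax should extend to several columns at once. This is only a sketch, and the result names are made up for illustration.

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# Drop two columns by listing their unquoted names.
only_x <- subset(dfxyz, select = -c(y, z))

# Drop a run of adjacent columns with the range form.
only_x_range <- subset(dfxyz, select = -(y:z))
```

subset() keeps the result a data frame even when a single column is left, unlike "[" with its default drop behaviour.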


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu




Re: [R] Drop column from a data frame

2010-12-26 Thread jim holtman
Assign NULL to the column:

> dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
> dfxyz
    x  y z
1   1 11 0
2   2 12 0
3   3 13 0
4   4 14 0
5   5 15 0
6   6 16 1
7   7 17 1
8   8 18 1
9   9 19 1
10 10 20 1
> dfxyz$y <- NULL
> dfxyz
    x z
1   1 0
2   2 0
3   3 0
4   4 0
5   5 0
6   6 1
7   7 1
8   8 1
9   9 1
10 10 1
>
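
Note that assigning NULL changes the data frame itself. A possible variation, sketched below on copies so the original stays intact (the names trimmed, trimmed2, and col are only illustrative), is to use "[[" when the column name is held in a variable, and list(NULL) to remove several columns at once.

```r
dfxyz <- data.frame(x = 1:10, y = 11:20, z = factor(c(rep(0, 5), rep(1, 5))))

# Work on a copy so dfxyz keeps all three columns.
trimmed <- dfxyz
col <- "y"               # column name stored in a variable
trimmed[[col]] <- NULL   # removes the y column from the copy

# Remove several columns in one assignment.
trimmed2 <- dfxyz
trimmed2[c("y", "z")] <- list(NULL)
```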




[R] Drop column from a data frame

2010-12-26 Thread John Sorkin
I am trying to drop a column of a data frame. The code below attempts to drop a 
numeric column (which does not work but gives no error or warning) and a factor 
column (which does not work but gives an error).
I would appreciate someone telling me why my code does not work, and suggesting 
code that will work.
Thanks,
John

rm(dfxyz,dfxz,dfxy)

# create the data frame.
dfxyz <- data.frame(x=1:10,y=11:20,z=factor(c(rep(0,5),rep(1,5))))
dfxyz

names(dfxyz)

# try to drop y column
# does not work, does not produce error message
dfxz <- dfxyz[,-(dfxyz$y)]
dfxz

# try to drop z column
# does not work, produces error message:
# In Ops.factor(df$z) : - not meaningful for factors
dfxy <- dfxyz[,-dfxyz$z]
dfxy



John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
