Hi, Katherine,

If the columns of your data frame are consistently named <stringwithoutdot>, and duplicated columns always appear as <stringwithoutdot.number>, then something like

df[ !grepl( "\\.", names( df))]

could help. (But if the data frame is large, as you say, it is probably more efficient to avoid producing duplicated columns in the first place.)
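If you would rather not rely on the column names at all, the columns can also be compared by content. The following is only a sketch under my own assumptions: it uses your example data, the names dup and df.unique are made up for illustration, and I have not tried it on a really large data frame.

```r
# Example data as in the original post.
df <- data.frame(id  = 1:6,
                 x   = c(15, 21, 14, 21, 14, 38),
                 y   = c(36, 38, 55, 11, 5, 18),
                 x.1 = c(15, 21, 14, 21, 14, 38),
                 z   = c("D", "B", "A", "F", "H", "P"))

# Treat the data frame as a list of columns: duplicated() then flags every
# column whose content equals that of an earlier column, whatever its name.
dup <- duplicated(as.list(df))

names(df)[dup]         # columns that repeat an earlier one (here: "x.1")

# Keep only the first copy of each set of identical columns.
df.unique <- df[!dup]
names(df.unique)       # "id" "x" "y" "z"
```

Unlike the name-based approach, this keeps a column called x.1 if its content actually differs from x, and it does not drop columns that merely happen to have a dot in their name.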

 Regards -- Gerrit


On Thu, 28 Mar 2013, Katherine Gobin wrote:

Dear R forum

Suppose I have a data.frame

df = data.frame(id = c(1:6), x = c(15, 21, 14, 21, 14, 38),
                y = c(36, 38, 55, 11, 5, 18), x.1 = c(15, 21, 14, 21, 14, 38),
                z = c("D", "B", "A", "F", "H", "P"))


df
  id  x  y x.1 z
1  1 15 36  15 D
2  2 21 38  21 B
3  3 14 55  14 A
4  4 21 11  21 F
5  5 14  5  14 H
6  6 38 18  38 P


Clearly columns x and x.1 are identical. In reality, I have a large data.frame
and can't make out which columns are identical, but I am sure that a column
named, say, x is repeated as x.1, x.2, etc.

How can I automatically identify and retain only one column (in this example,
column x) among the identical columns, along with the other non-identical
columns (viz. id, y and z)?


Regards

Katherine
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
