On Nov 25, 2012, at 10:22 AM, lind35 wrote:
Hello,
I have a new data set and an old data set. Both have the same columns
representing the same sort of measure. Within each data set (old and
new) are 18 groups (simplified to three groups below). Within each
group are individuals with unique ID numbers. These ID numbers may be
the same as other ID numbers in different groups, but a particular ID
number only appears once in each group.
The old data set does not include all of the individuals from the new
data set, meaning some IDs within groups in the new data set are not
found within the old data set, or vice versa. I am trying to extract
data from a particular column for unique individuals within a unique
group from the old data set and put that info into a column within the
row for that particular unique individual/group in the new data set.
However, I keep coming up with R errors.
Basically it's set up like this (I've simplified the data to
illustrate the important stuff):
old <- read.csv("/Users/Me/Desktop/old data.csv")
new <- read.csv("/Users/Me/Desktop/new data.csv")
new
[group] [ID] [column where I want to put the data, currently blank]
[1,] 1 800 __
[2,] 1 23 __
[3,] 1 542 __
[4,] 2 23 __
[5,] 2 2318 __
[6,] 2 45 __
[7,] 3 1345 __
[8,] 3 800 __
[9,] 3 232 __
old
[group] [ID] [data I want for the new.object]
[1,] 1 300 12.2
[2,] 1 155 10.8
[3,] 1 23 9.2
[4,] 2 45 15.6
[5,] 2 1289 5.5
[6,] 2 23 4.2
[7,] 3 800 3.3
[8,] 3 232 18.0
[9,] 3 1345 21.1
#and this is what I want to get as an end result
new
[1,] 1 800 __
[2,] 1 23 9.2
[3,] 1 542 __
[4,] 2 23 4.2
[5,] 2 2318 __
[6,] 2 45 15.6
[7,] 3 1345 21.1
[8,] 3 800 3.3
[9,] 3 232 18.0
I've tried the following code but keep getting error messages:
for (i in 1) {
+ new[i,3] <- old[which(old[,2] == new[i,2] & old[,1] == new[i,1]),3]
+ }
Error in `[<-.data.frame`(`*tmp*`, i, 11, value = numeric(0)) :
replacement has length zero
#OR
for (i in 1) {
+ data[[i,11]] <- as.numeric(old[[which(old[[,22]] == data[[i,2]] & old[[,1]] == data[[i,1]]),46]])
+ }
Error in `[[.data.frame`(old, , 22) :
argument "..1" is missing, with no default
I just want to ignore the IDs in the old data set that aren't in the
new data set. How do I do this?
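Both errors above can be reproduced on a small example. In the first
attempt, which() returns integer(0) when a group/ID pair from the new
data has no partner in the old data, so the right-hand side has length
zero and the assignment fails. In the second, `[[` extracts a single
element or column and does not accept an empty index, so old[[,22]]
is rejected; old[, 22] or old[[22]] would be the usual forms. The
following is a minimal sketch, assuming columns named group, ID and
value (hypothetical names; the real files have more columns):

```r
# Toy data mirroring the simplified tables above.
# Column names group, ID and value are assumptions for illustration.
old <- data.frame(group = rep(1:3, each = 3),
                  ID    = c(300, 155, 23, 45, 1289, 23, 800, 232, 1345),
                  value = c(12.2, 10.8, 9.2, 15.6, 5.5, 4.2, 3.3, 18.0, 21.1))
new <- data.frame(group = rep(1:3, each = 3),
                  ID    = c(800, 23, 542, 23, 2318, 45, 1345, 800, 232),
                  value = NA_real_)

# Row 1 of new (group 1, ID 800) has no match in old,
# so which() returns an empty integer vector.
first_hit <- which(old$ID == new$ID[1] & old$group == new$group[1])

# Guarded loop: assign only when exactly one matching row is found,
# otherwise leave the NA in place.
for (i in seq_len(nrow(new))) {
  hit <- which(old$ID == new$ID[i] & old$group == new$group[i])
  if (length(hit) == 1) new$value[i] <- old$value[hit]
}
```

The loop runs over seq_len(nrow(new)) rather than `1`, so every row of
the new data is visited.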
Actually you are not 'ignoring' the IDs in the new data, but rather
using them to set the value columns to missing in certain rows.
Perhaps (untested):
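value_cols <- c("value")  # placeholder: names of the columns holding the desired values
oldnew <- old[ , c("group", "ID", value_cols) ]
# blank out the values in rows whose ID/group pair does not occur in new
oldnew[ !paste(old$ID, old$group, sep="_") %in% paste(new$ID, new$group, sep="_") ,
        value_cols   # a character vector of names, or positions such as 3:length(oldnew)
      ] <- NA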
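If the goal is the table shown in the question (every row of the new
data, with the old value filled in where one exists), the same paste()
key can be combined with match(), or merge() can do the lookup. This is
only a sketch on toy data; the column names group, ID and value are
assumptions, not the names in the real files:

```r
# Toy data mirroring the simplified tables in the question.
# Column names group, ID and value are assumptions for illustration.
old <- data.frame(group = rep(1:3, each = 3),
                  ID    = c(300, 155, 23, 45, 1289, 23, 800, 232, 1345),
                  value = c(12.2, 10.8, 9.2, 15.6, 5.5, 4.2, 3.3, 18.0, 21.1))
new <- data.frame(group = rep(1:3, each = 3),
                  ID    = c(800, 23, 542, 23, 2318, 45, 1345, 800, 232))

# Composite key built from ID and group.
key_old <- paste(old$ID, old$group, sep = "_")
key_new <- paste(new$ID, new$group, sep = "_")

# match() returns, for each row of new, the row of old with the same key,
# or NA when there is none; indexing with NA yields NA in the result.
new$value <- old$value[match(key_new, key_old)]

# Alternative: a left join. all.x = TRUE keeps every row of new.
# Note that merge() sorts the result by the key columns.
merged <- merge(new[c("group", "ID")], old,
                by = c("group", "ID"), all.x = TRUE)
```

The match() version keeps the row order of the new data, which merge()
does not.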
--
David Winsemius, MD
Alameda, CA, USA
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help