I'm getting this error message:

nms<-names(data)[grep(vars,names(data))]
Warning message:
In grep(vars, names(data)) :
  argument 'pattern' has length > 1 and only the first element will be used

Is there a way around this?

On Thu, Jul 19, 2012 at 6:17 PM, Rui Barradas <ruipbarra...@sapo.pt> wrote:

> Hello,
>
> I guess so, and I can save you some typing.
>
> vars <- sort(apply(expand.grid("L", 1:8, 1:2), 1, paste, collapse=""))
>
> Then use it and see the result.
>
> Rui Barradas
>
> On 20-07-2012 00:00, Lib Gray wrote:
>
>> The variables are actually L11, L12, L21, L22, ... , L81, L82. Would just
>> creating a vector c(L11,... ,L82) be fine? (I'm about to try it, but I
>> wanted to check to see if that was going to be a big issue).
>>
>> On Thu, Jul 19, 2012 at 3:27 PM, Rui Barradas <ruipbarra...@sapo.pt> wrote:
>>
>>> Hello,
>>>
>>> Try the following. The data is your example of Patient A through E, but
>>> from the output of dput().
>>>
>>> dat <- structure(list(Patient = structure(c(1L, 1L, 1L, 1L, 1L, 2L,
>>> 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L), .Label = c("A",
>>> "B", "C", "D", "E"), class = "factor"), Cycle = c(1L, 2L, 3L,
>>> 4L, 5L, 1L, 2L, 1L, 3L, 4L, 5L, 1L, 2L, 4L, 5L, 1L, 2L, 3L),
>>> V1 = c(0.4, 0.3, 0.3, 0.4, 0.5, 0.4, 0.4, 0.9, 0.3, NA, 0.4,
>>> 0.2, 0.5, 0.6, 0.5, 0.1, 0.5, 0.4), V2 = c(0.1, 0.2, NA,
>>> NA, 0.2, NA, NA, 0.9, 0.5, NA, NA, 0.5, 0.7, 0.4, 0.5, NA,
>>> 0.3, 0.3), V3 = c(0.5, 0.5, 0.6, 0.4, 0.5, NA, NA, 0.9, 0.6,
>>> NA, NA, NA, NA, NA, NA, NA, NA, NA), V4 = c(1.5, 1.6, 1.7,
>>> 1.8, 1.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>>> NA), V5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>>> NA, NA, NA, NA, NA, NA)), .Names = c("Patient", "Cycle",
>>> "V1", "V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
>>> -18L))
>>>
>>> dat
>>>
>>> nms <- names(dat)[grep("^V[1-9]$", names(dat))]
>>> dd <- split(dat, dat$Patient)
>>> fun <- function(x) any(is.na(x)) && any(!is.na(x))
>>> ix <- sapply(dd, function(x) Reduce(`|`, lapply(x[, nms], fun)))
>>>
>>> dd[ix]
>>> do.call(rbind, dd[ix])
>>>
>>> I'm assuming that the variable names are as posted, V followed by one
>>> single digit 1-9.
>>> To keep the Patients with complete cases just negate the index 'ix',
>>> it's a logical index.
>>> Note also that dput() is the best way of posting a data example.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> On 19-07-2012 15:15, Lib Gray wrote:
>>>
>>>> Hello,
>>>>
>>>> I didn't give enough information when I sent a query before, so I'm
>>>> trying again with a more detailed explanation:
>>>>
>>>> In this data set, each patient has a different number of measured
>>>> variables (they represent tumors, so some people had 2 tumors, some
>>>> had 5, etc). The problem I have is that often in later cycles for a
>>>> patient, tumors that were originally measured are now missing (or a
>>>> "new" tumor showed up). We assume there are many different reasons
>>>> for why a tumor would be measured in one cycle and not another, and
>>>> so I want to subset OUT the "problem" patients to better study these
>>>> patterns.
>>>>
>>>> An example:
>>>>
>>>> Patient Cycle V1  V2  V3  V4  V5
>>>> A       1     0.4 0.1 0.5 1.5 NA
>>>> A       2     0.3 0.2 0.5 1.6 NA
>>>> A       3     0.3 NA  0.6 1.7 NA
>>>> A       4     0.4 NA  0.4 1.8 NA
>>>> A       5     0.5 0.2 0.5 1.5 NA
>>>>
>>>> I want to keep patient A; they have 4 measured tumors, but tumor 2 is
>>>> missing data for cycles 3 and 4.
>>>>
>>>> B       1     0.4 NA  NA  NA  NA
>>>> B       2     0.4 NA  NA  NA  NA
>>>>
>>>> I do not want to keep patient B; they have 1 tumor that is measured
>>>> consistently in both cycles.
>>>>
>>>> C       1     0.9 0.9 0.9 NA  NA
>>>> C       3     0.3 0.5 0.6 NA  NA
>>>> C       4     NA  NA  NA  NA  NA
>>>> C       5     0.4 NA  NA  NA  NA
>>>>
>>>> I do want to keep patient C; all their data is missing for cycle 4 and
>>>> cycle 5 only measured one tumor.
>>>>
>>>> D       1     0.2 0.5 NA  NA  NA
>>>> D       2     0.5 0.7 NA  NA  NA
>>>> D       4     0.6 0.4 NA  NA  NA
>>>> D       5     0.5 0.5 NA  NA  NA
>>>>
>>>> I do not want patient D; their two tumors were measured each cycle.
>>>>
>>>> E       1     0.1 NA  NA  NA  NA
>>>> E       2     0.5 0.3 NA  NA  NA
>>>> E       3     0.4 0.3 NA  NA  NA
>>>>
>>>> I DO want patient E; they only had one tumor register in
>>>> Cycle 1, but cycles 2 and 3 had two tumors.
>>>>
>>>> Thanks for any help!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
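
A note on the warning at the top of this thread: grep() takes a single regular expression as its 'pattern' argument, so passing the 16-element vector 'vars' means only "L11" is actually searched for. The sketch below shows three possible ways around that, none of which is from the original posts. It assumes the columns are named exactly L11, L12, ..., L81, L82 as described above; the data frame 'toy' and its contents are made up for illustration, and 'vars' is rebuilt with paste0() rather than the expand.grid() call, which should give the same 16 names.

```r
# The 16 variable names described in the thread: L11, L12, L21, ..., L82.
# (Built with paste0() here; an assumed equivalent of the expand.grid() call.)
vars <- paste0("L", rep(1:8, each = 2), 1:2)

# Hypothetical stand-in for the poster's data: two id columns plus the
# sixteen L columns, all filled with NA just so the names exist.
toy <- as.data.frame(matrix(NA_real_, nrow = 2, ncol = length(vars),
                            dimnames = list(NULL, vars)))
toy <- cbind(Patient = c("A", "B"), Cycle = 1:2, toy)

# Option 1: exact matching on the names, no regular expression involved.
nms1 <- names(toy)[names(toy) %in% vars]

# Option 2: collapse the vector into ONE anchored alternation pattern,
# i.e. "^(L11|L12|...|L82)$", so that 'pattern' has length 1.
pattern <- paste0("^(", paste(vars, collapse = "|"), ")$")
nms2 <- names(toy)[grep(pattern, names(toy))]

# Option 3: describe the naming scheme directly, in the same spirit as
# the "^V[1-9]$" pattern used earlier in the thread.
nms3 <- names(toy)[grep("^L[1-8][12]$", names(toy))]

# All three selections should agree.
identical(nms1, nms2)
identical(nms2, nms3)
```

Whichever version is used, the resulting 'nms' can then replace the one built from "^V[1-9]$" in the split()/sapply() code quoted above. Option 1 is probably the simplest when the full list of names is already in hand; options 2 and 3 keep the grep() idiom.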