And here is a simplification I just noticed:

date.grouping <- function(d) {
  # for each date in d calculate the date beginning the 6 month period which contains it
  POSIXct.dates <- as.POSIXct(paste(as.character(d),"01",sep="-"))
  breaks <- c(seq(from=min(POSIXct.dates), to=max(POSIXct.dates), by="6 months"), Inf)
  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=TRUE )), "%Y-%m" )
}

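For comparison, the same grouping can be sketched with plain integer month arithmetic, which avoids date-time classes and time zones altogether. This is only a sketch, not the method used above: `month.grouping` and its `id` argument are hypothetical names introduced here, and the `id` branch assumes that "index encounter" means each patient's earliest month, as described in the original question.

```r
# Hypothetical helper (not part of the thread's code): map each "yyyy-mm" date
# to the first month of the 6 month period that contains it.
month.grouping <- function(d, id = NULL) {
  parts <- strsplit(as.character(d), "-")
  year  <- as.integer(sapply(parts, `[`, 1))
  month <- as.integer(sapply(parts, `[`, 2))
  idx   <- year * 12L + (month - 1L)            # running month count

  # Periods are measured from the overall minimum date, or (assumption) from
  # each patient's own first month when a grouping vector id is supplied.
  start <- if (is.null(id)) rep(min(idx), length(idx)) else ave(idx, id, FUN = min)

  first <- start + 6L * ((idx - start) %/% 6L)  # first month of the period
  sprintf("%04d-%02d", first %/% 12L, first %% 12L + 1L)
}

# Sample data from the original question
ID   <- c(1, 1, 1, 1, 2, 2)
date <- c("2001-01", "2001-01", "2001-03", "2001-04", "2001-03", "2002-05")

month.grouping(date)        # periods counted from the earliest date overall
month.grouping(date, ID)    # periods counted from each patient's first month
```

On the sample data the first call puts the 2002-05 encounter in the period starting 2002-01, while the second call puts it in the period starting 2002-03, because patient 2's first encounter is in 2001-03. Either result can be passed to tapply in place of date.grouping(date).
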
patients <- read.table("clipboard",header=TRUE)
patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
patients2 <- as.data.frame( patients2 )
summary(patients2)
boxplot(patients2)

--- Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>Sorry but there was an error in the seq statement. Here it is again.
>
>
>date.grouping <- function(d) {
>  # for ea date in d calculate date beginning 6 month period which contains it
>  mat <- matrix(as.numeric(unlist(strsplit(as.character(d),"-"))),nr=2)
>  f <- function(x) do.call( "ISOdate", as.list(x) )
>  POSIXct.dates <- apply(rbind(mat,1),2,f) + ISOdate(1970,1,1)
>  breaks <- c(seq(from=min(POSIXct.dates), to=max(POSIXct.dates), by="6 mo"), Inf)
>  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=T )), "%Y-%m" )
>}
>
>patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
>patients2 <- as.data.frame( patients2 )
>
>summary(patients2)
>
>boxplot(patients2)
>
>
>
>--- Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>>
>>Try this. The function takes a vector of dates of the form yyyy-mm and produces a
>>new character vector of dates of the same form except the
>>output date is the beginning of the 6 month period in which the input date lies.
>>The 6 month intervals are measured from the minimum date.
>>
>>date.grouping <- function(d) {
>>  # for ea date in d calculate date beginning 6 month period which contains it
>>  mat <- matrix(as.numeric(unlist(strsplit(as.character(d),"-"))),nr=2)
>>  f <- function(x) do.call( "ISOdate", as.list(x) )
>>  POSIXct.dates <- apply(rbind(mat,1),2,f) + ISOdate(1970,1,1)
>>  breaks <- c(seq(from=min(POSIXct.dates), along=POSIXct.dates, by="6 mo"), Inf)
>>  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=T )), "%Y-%m" )
>>}
>>
>>patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
>>patients2 <- as.data.frame( patients2 )
>>
>>summary(patients2)
>>
>>boxplot(patients2)
>>
>>
>>
>>--- Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
>>>Hi,
>>>
>>>
>>>I am new to R, coming from a few years using Stata. I've been twisting my
>>>brain and checking several R and S references over the last few days to
>>>try to solve this data management problem: I have a data set with a unique
>>>patient identifier that is repeated along multiple rows, a variable with
>>>month of patient encounter, and a continuous variable for cost of
>>>individual encounters. The data looks like this:
>>>
>>>ID  date       cost
>>>1   "2001-01"  200.00
>>>1   "2001-01"  123.94
>>>1   "2001-03"  100.23
>>>1   "2001-04"  150.34
>>>2   "2001-03"  296.34
>>>2   "2002-05"  156.36
>>>
>>>
>>>I would like to obtain the median costs and boxplots for the sum of
>>>encounters happening in the first six months after the index encounter
>>>(first patient encounter) for each patient, then the mean and median costs
>>>for the costs happening from 6 to 12 months after the index encounter, and
>>>so on. Notice that the first ID has two encounters during the index date,
>>>making it more difficult to define a single row with the index encounter.
>>>
>>>Any help would be appreciated,
>>>
>>>
>>>Ricardo
>>>
>>>
>>>Ricardo Pietrobon, MD
>>>Assistant Professor of Surgery
>>>Duke University Medical Center
>>>Durham, NC 27710 US
>>>
>>>______________________________________________
>>>[EMAIL PROTECTED] mailing list
>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help