And here is a simplification I just noticed:

date.grouping <- function(d) {
  # For each yyyy-mm date in d, return the start (as yyyy-mm) of the
  # 6-month period containing it; periods are counted from the earliest date.
  POSIXct.dates <- as.POSIXct(paste(as.character(d), "01", sep="-"))
  breaks <- c(seq(from=min(POSIXct.dates), to=max(POSIXct.dates), by="6 months"), Inf)
  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=TRUE )), "%Y-%m" )
}
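
As a quick check of what a grouping function of this kind returns, here is a
minimal sketch run on the dates from the sample data in the original question.
The name date.grouping.sketch is hypothetical, and the sketch swaps in Date
objects and findInterval() in place of POSIXct and cut(), assuming the input is
always of the form yyyy-mm. It is an illustration, not a replacement for the
function above.

```r
# Hypothetical Date-based variant: same idea as the function above, but it
# uses Date objects and findInterval() instead of POSIXct and cut().
date.grouping.sketch <- function(d) {
  # Turn "yyyy-mm" into the first day of that month.
  dates  <- as.Date(paste(as.character(d), "01", sep = "-"))
  # Period start dates, every 6 months from the earliest date.
  breaks <- seq(from = min(dates), to = max(dates), by = "6 months")
  # For each date, pick the last break that is not after it.
  idx <- findInterval(as.numeric(dates), as.numeric(breaks))
  format(breaks[idx], "%Y-%m")
}

# Dates copied from the sample data in the original question.
d <- c("2001-01", "2001-01", "2001-03", "2001-04", "2001-03", "2002-05")
grp <- date.grouping.sketch(d)
print(grp)
# [1] "2001-01" "2001-01" "2001-01" "2001-01" "2001-01" "2002-01"
```

The breaks here are 2001-01-01, 2001-07-01 and 2002-01-01, so the five 2001
dates fall in the first period and 2002-05 falls in the third.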

patients <- read.table("clipboard", header=TRUE)
patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
patients2 <- as.data.frame( patients2 )

summary(patients2)

boxplot(patients2)
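
Note that the code above measures the 6-month periods from the earliest date in
the whole data set. If the periods should instead start at each patient's own
index encounter, as the original question describes, something along the
following lines might work. This is only a sketch: the data frame is typed in
from the sample rows rather than read from the clipboard, the helper names
(month.index, months.since.index, period) are made up for illustration, and it
assumes that "first six months" means the index month plus the five months
after it.

```r
# Sample rows from the original question, entered inline for reproducibility.
patients <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2),
  date = c("2001-01", "2001-01", "2001-03", "2001-04", "2001-03", "2002-05"),
  cost = c(200.00, 123.94, 100.23, 150.34, 296.34, 156.36),
  stringsAsFactors = FALSE
)

# Express each yyyy-mm as a running month count, so differences are in months.
ym <- do.call(rbind, strsplit(patients$date, "-"))
month.index <- as.numeric(ym[, 1]) * 12 + as.numeric(ym[, 2])

# Months elapsed since each patient's own first (index) encounter.
months.since.index <- month.index - ave(month.index, patients$ID, FUN = min)

# Period 0 covers months 0-5 after the index encounter, period 1 months 6-11, ...
period <- months.since.index %/% 6

# Sum of costs per patient (rows) and per 6-month period (columns).
totals <- with(patients, tapply(cost, list(ID = ID, period = period), sum))
print(totals)
```

With the sample rows, patient 1 has all encounters in period 0 and patient 2
has one in period 0 and one in period 2. The result can then be passed through
as.data.frame() to summary() and boxplot() in the same way as patients2 above.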



--- Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>Sorry but there was an error in the seq statement.  Here it is again.
>
>
>date.grouping <- function(d) {
>  # for ea date in d calculate date beginning 6 month period which contains it
>  mat <- matrix(as.numeric(unlist(strsplit(as.character(d),"-"))),nr=2)
>  f <- function(x) do.call( "ISOdate", as.list(x) )
>  POSIXct.dates <- apply(rbind(mat,1),2,f) + ISOdate(1970,1,1)
>  breaks <- c(seq(from=min(POSIXct.dates), to=max(POSIXct.dates), by="6 mo"), Inf)
>  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=T )), "%Y-%m" )
>}
>
>patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
>patients2 <- as.data.frame( patients2 )
>
>summary(patients2)
>
>boxplot(patients2)
>
>
>
>--- Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>>
>>Try this.  The function takes a vector of dates of the form yyyy-mm and
>>produces a new character vector of dates of the same form, except that each
>>output date is the beginning of the 6-month period in which the input date
>>lies.  The 6-month intervals are measured from the minimum date.
>>
>>date.grouping <- function(d) {
>>  # for ea date in d calculate date beginning 6 month period which contains it
>>  mat <- matrix(as.numeric(unlist(strsplit(as.character(d),"-"))),nr=2)
>>  f <- function(x) do.call( "ISOdate", as.list(x) )
>>  POSIXct.dates <- apply(rbind(mat,1),2,f) + ISOdate(1970,1,1)
>>  breaks <- c(seq(from=min(POSIXct.dates), along=POSIXct.dates, by="6 mo"), Inf)
>>  format( as.POSIXct( cut( POSIXct.dates, breaks, include.lowest=T )), "%Y-%m" )
>>}
>>
>>patients2 <- with( patients, tapply( cost, list(ID,date.grouping(date)), sum ) )
>>patients2 <- as.data.frame( patients2 )
>>
>>summary(patients2)
>>
>>boxplot(patients2)
>>
>>
>>
>>--- Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
>>>Hi,
>>>
>>>
>>>I am new to R, coming from a few years using Stata. I've been twisting my
>>>brain and checking several R and S references over the last few days to
>>>try to solve this data management problem: I have a data set with a unique
>>>patient identifier that is repeated along multiple rows, a variable with
>>>month of patient encounter, and a continuous variable for cost of
>>>individual encounters. The data looks like this:
>>>
>>>ID   date            cost
>>>1    "2001-01"       200.00
>>>1    "2001-01"       123.94
>>>1    "2001-03"       100.23
>>>1    "2001-04"       150.34
>>>2    "2001-03"       296.34
>>>2    "2002-05"       156.36
>>>
>>>
>>>I would like to obtain the median costs and boxplots for the sum of
>>>encounters happening in the first six months after the index encounter
>>>(first patient encounter) for each patient, then the mean and median costs
>>>for the costs happening from 6 to 12 months after the index encounter, and
>>>so on. Notice that the first ID has two encounters during the index date,
>>>making it more difficult to define a single row with the index encounter.
>>>
>>>Any help would be appreciated,
>>>
>>>
>>>Ricardo
>>>
>>>
>>>Ricardo Pietrobon, MD
>>>Assistant Professor of Surgery
>>>Duke University Medical Center
>>>Durham, NC 27710 US
>>>
>>>______________________________________________
>>>[EMAIL PROTECTED] mailing list
>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help