Dear R-listers,

I am a relatively inexperienced R-user currently migrating from Stata. I
am deeply frustrated by this data manipulation question: I know how I
could do it in Stata, but I cannot make it work in R.

I have a data frame of hospitalization data where each row represents an
admission. I need to know when patients were first discharged, but the
problem is that patients were sometimes transferred between hospital
departments. In my data a transfer looks like a new admission, except
that it has a 'start' date equal to the previous admission's 'stop'
date.

Here is an example:

id <- c(rep("a",4),rep("b",2), rep("c",5), rep("d",1))
start <- c(c(0,6,17,20),c(0,1),c(0,5,10,11,50),c(0))
stop <- c(c(6,12,20,30),c(1,10),c(3,10,11,30,55),c(6))
data <- data.frame(id, start, stop)  # cbind() would coerce start and stop to character
data
#    id start stop
# 1   a     0    6
# 2   a     6   12
# 3   a    17   20
# 4   a    20   30
# 5   b     0    1
# 6   b     1   10
# 7   c     0    3
# 8   c     5   10
# 9   c    10   11
# 10  c    11   30
# 11  c    50   55
# 12  d     0    6
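
To make the transfer rule concrete, here is a minimal sketch that only
flags the rows I would count as transfers. It assumes the rows are
already sorted by id and then by start. The names adm, prev_id,
prev_stop and is_transfer are made up for this illustration. It does not
collapse the rows into one line per patient.

```r
# Example data from this post, rebuilt so the sketch runs on its own.
id    <- c(rep("a", 4), rep("b", 2), rep("c", 5), "d")
start <- c(0, 6, 17, 20, 0, 1, 0, 5, 10, 11, 50, 0)
stop  <- c(6, 12, 20, 30, 1, 10, 3, 10, 11, 30, 55, 6)
adm   <- data.frame(id, start, stop, stringsAsFactors = FALSE)

# Assumption: rows are sorted by id and then by start.
n         <- nrow(adm)
prev_id   <- c(NA, adm$id[-n])    # id of the row above
prev_stop <- c(NA, adm$stop[-n])  # stop of the row above

# A row is a transfer when it belongs to the same patient as the row
# above and starts exactly when that row stopped.
adm$is_transfer <- !is.na(prev_id) &
  adm$id == prev_id &
  adm$start == prev_stop

adm
```

With the example data this should flag rows 2, 4, 6, 9 and 10. Row 4 is
flagged even though it belongs to patient a's second stay, so the flag
alone does not tell me where the first stay ends.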

So, what I want to end up with is this:

id start stop
a  0     12   # This patient was transferred at time 6 and discharged
              # at time 12. The admission starting at time 17 is
              # therefore irrelevant.
b  0     10
c  0     3
d  0     6

I have tried many variations on lapply, sapply, split, for loops, etc.,
all to no avail.

Thank you in advance for any assistance.

Best regards,
Peter Jepsen, MD.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
