[R] validation, calibration and Design

Williams Scott Mon, 11 Jul 2005 02:06:55 -0700

 

Hi R experts,


 

I am trying to do a prognostic model validation study, using cancer
survival data. There are 2 data sets - 1500 cases used to develop a
nomogram, and another of 800 cases used as an independent validation
cohort.  I have validated the nomogram in the original data (easy with
the Design tools), and now want to show that it also performs well in
the independent data using 60 month survival. I would also like to
show that the nomogram is significantly different to an existing model
based on 60 month survival data generated by it (eg by McNemar's test).

Hence, somewhat shortened:           

 

# using R 2.01 on Windows
library(Hmisc)
library(Design)

# data1: data frame with predictor variables A and B, plus cens and
# time (months) columns
ddist1 <- datadist(data1)
options(datadist='ddist1')

 

s1 <- Surv(data1$time, data1$cens)

cph.nomo <- cph(s1 ~ A+B, data=data1, surv=T, x=T, y=T, time.inc=60)

survcph <- Survival(cph.nomo)
surv5 <- function(lp) survcph(60, lp)
nomogram(cph.nomo, lp=T, conf.int=F, fun=surv5,
         funlabel="5 yr DFS")

 

# now have a useful nomogram model, with good discrimination and
# calibration when checked with validate and calibrate (not shown)
# ....move on to validation cohort of n=800

 

# data2: validation data with same predictor variables A, B, cens, time
# do I need to put data2 into datadist??

s2 <- Surv(data2$time, data2$cens)

 

# able to derive 60 month estimates of survival using
# (data.frame keeps one row per case; expand.grid would instead build
# every combination of the A and B values)
data2.est5 <- survest(cph.nomo, data.frame(A=data2$A, B=data2$B),
                      times=60, conf.int=0)

rcorr.cens(data2.est5$surv, s2) # tests discrimination of the model
# against the censored outcomes observed in the validation data

 

# I can't find a way to use calibrate in this setting though??
# Also, if I have the 5 year estimates for 2 different models, I can
#     use rcorr.cens to show discrimination, but which values are
#     suitable for a test of difference (eg with McNemar's)?
# I have tried predict / newdata function a number of ways but it
#     typically returns an error relating to unequal vector lengths
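To make the last two questions concrete, this is roughly the comparison I have in mind. It is only a sketch on simulated data, and it uses the survival package alone so that it runs on its own: coxph stands in for cph, concordance stands in for rcorr.cens, and the names sim, dev, val, fit.new, fit.old and the 0.5 cut-off are all hypothetical. Each model's 60 month estimate is dichotomised, scored against the 60 month status of the cases where that status is known, and the paired right/wrong calls go to mcnemar.test. I am not sure that dichotomising (and dropping cases censored before 60 months) is a sound basis for a test, hence the question.

```r
library(survival)

set.seed(1)
# Simulated stand-ins for the two cohorts (hypothetical data, not the study data)
sim <- function(n) {
  A <- rnorm(n)
  B <- rbinom(n, 1, 0.5)
  t.event <- rexp(n, rate = 0.01 * exp(0.7 * A + 0.5 * B))
  t.cens  <- runif(n, 24, 120)
  data.frame(A = A, B = B,
             time = pmin(t.event, t.cens),
             cens = as.numeric(t.event <= t.cens))
}
dev <- sim(1500)   # development cohort
val <- sim(800)    # validation cohort

# Two competing models, both built on the development cohort only
fit.new <- coxph(Surv(time, cens) ~ A + B, data = dev)
fit.old <- coxph(Surv(time, cens) ~ B, data = dev)

# Predicted 60 month survival, one value per row of newdata
surv60 <- function(fit, newdata) {
  sf <- survfit(fit, newdata = newdata, se.fit = FALSE)
  as.numeric(summary(sf, times = 60)$surv)
}
p.new <- surv60(fit.new, val)
p.old <- surv60(fit.old, val)

# Discrimination of each set of estimates against the observed outcomes
c.new <- concordance(Surv(time, cens) ~ p.new, data = val)$concordance
c.old <- concordance(Surv(time, cens) ~ p.old, data = val)$concordance

# 60 month status is known if the case had an event or was followed past 60
known   <- (val$time >= 60) | (val$cens == 1)
alive60 <- val$time[known] >= 60

# Paired right/wrong calls at the hypothetical 0.5 cut-off
ok.new <- factor((p.new[known] >= 0.5) == alive60, levels = c(FALSE, TRUE))
ok.old <- factor((p.old[known] >= 0.5) == alive60, levels = c(FALSE, TRUE))
tab <- table(new = ok.new, old = ok.old)
mcnemar.test(tab)
```

Fixing the factor levels keeps the table 2 x 2 even if one model happens to get every case right or wrong.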

 

 

What I can't work out is where to go now to derive a calibration curve of
the predicted 5 year result (data2.est5) against the observed (s2). Or can I
do it another way? For example, could I merge the 2 data frames and use
lines 1:1500 to build the model and the last 800 lines to validate?
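As an illustration of the merged-frame idea, here is a minimal sketch of the kind of result I am after. It uses simulated data and only the survival package (coxph rather than cph), and the names both, dev.rows, val.rows, pred60, grp and calib are hypothetical. The model is built on rows 1:1500, 60 month survival is predicted for the last 800 rows, and those predictions are cut into five groups so that the mean prediction in each group can be set against the Kaplan-Meier estimate at 60 months for the same group. The choice of five groups is arbitrary.

```r
library(survival)

set.seed(2)
n.dev <- 1500
n.val <- 800
n <- n.dev + n.val

# Simulated stand-in for the two data frames stacked with rbind()
A <- rnorm(n)
B <- rbinom(n, 1, 0.5)
t.event <- rexp(n, rate = 0.01 * exp(0.7 * A + 0.5 * B))
t.cens  <- runif(n, 24, 120)
both <- data.frame(A = A, B = B,
                   time = pmin(t.event, t.cens),
                   cens = as.numeric(t.event <= t.cens))
dev.rows <- 1:n.dev
val.rows <- (n.dev + 1):n

# Build the model on the first 1500 rows only
fit <- coxph(Surv(time, cens) ~ A + B, data = both[dev.rows, ])

# Predicted 60 month survival for the last 800 rows
val <- both[val.rows, ]
sf <- survfit(fit, newdata = val, se.fit = FALSE)
val$pred60 <- as.numeric(summary(sf, times = 60)$surv)

# Five equal-sized groups ordered by predicted survival
grp <- cut(val$pred60,
           breaks = quantile(val$pred60, probs = seq(0, 1, 0.2)),
           include.lowest = TRUE, labels = FALSE)

# Mean prediction vs Kaplan-Meier estimate at 60 months within each group
calib <- data.frame(
  predicted = as.numeric(tapply(val$pred60, grp, mean)),
  observed  = sapply(1:5, function(g) {
    km <- survfit(Surv(time, cens) ~ 1, data = val[grp == g, ])
    summary(km, times = 60, extend = TRUE)$surv
  })
)
calib

plot(calib$predicted, calib$observed, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Predicted 60 month survival",
     ylab = "Observed (Kaplan-Meier)")
abline(0, 1, lty = 2)   # line of perfect calibration
```

Points close to the dashed line would indicate that the 60 month predictions agree with what was observed in the validation rows.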

 

Obviously I am a novice, and sure to be missing something simple. I have
spent countless hours poring over Prof Harrell's text (which is great
but doesn't have a specific example of this) and Design Help plus the R
news archive with no success, so any help is very much appreciated. 

 

Scott Williams MD

Peter MacCallum Cancer Centre

Melbourne Australia

 


