Re: [R] rpart with interval censored data crashes R

2009-01-13 Thread Keith Jewell
Thanks for such a complete answer, that is very helpful.

Best regards,

Keith Jewell



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rpart with interval censored data crashes R

2009-01-12 Thread Terry Therneau

 Thank you for the input on rpart -- I just saw the message today.
 
 1. You are right, it should not crash.  It crashes simply because I (the
author) never ever tried using interval censored data in the call.  Real
users try the most amazing things.
  I'll fix it in my local version by putting in a "no no no" message.
However, my local version and the R version, maintained by Brian, have
drifted quite far apart.
  
  2. Rpart deals with right censored data using the same trick as Cox models:
it treats each observation as a Poisson process, that is, the number of events
seen over a given time window.  The fact that the number is always 0 or 1
doesn't hinder the mathematical trick, which is based on counting process
theory.
  BUT - the trick only works for right censored data.
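
  The Poisson view of right censored data can be illustrated with a small
sketch.  It is not rpart's internal code; the data are made up and the
variable names are assumptions for illustration only.  Each subject
contributes a count of 0 or 1 events over its follow-up time, and a Poisson
model with that time as exposure gives the same rate as the usual
exponential-survival estimate (events divided by total follow-up).  With
interval censoring the exposure time itself is unknown, which is presumably
why the trick does not carry over.

```r
# Toy right-censored data (made up for illustration, not from the thread).
time   <- c(2, 5, 1, 8, 4, 10)   # follow-up time per subject
status <- c(1, 0, 1, 1, 0, 0)    # 1 = event observed, 0 = right censored

# Exponential-survival MLE of the hazard: events divided by total exposure.
rate_direct <- sum(status) / sum(time)

# The same estimate from a Poisson model: each subject contributes a count
# of 0 or 1 events, with log(follow-up time) as the exposure offset.
fit <- glm(status ~ 1 + offset(log(time)), family = poisson)
rate_poisson <- unname(exp(coef(fit)))

print(c(direct = rate_direct, poisson = rate_poisson))
```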
  
  Using the midpoints of your intervals is the only approach that comes
readily to mind.
  
Terry Therneau
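
The midpoint workaround might look like the sketch below.  It assumes, as in
the posted data, that N is the lower bound of the interval, Y is the upper
bound, and Y = NA means no event had been seen by time N.  The helper
to_midpoint is hypothetical, not part of rpart or survival, and the rpart
call at the end is left as a comment because it has not been run here.

```r
# A few (N, Y) pairs in the same layout as the posted data:
# N = lower bound, Y = upper bound, Y = NA means right censored at N.
d <- data.frame(N = c(90, 1, 7, 0.1), Y = c(NA, 2, 8, 1))

# Hypothetical helper (not part of rpart or survival): replace each closed
# interval by its midpoint and keep right-censored times as they are.
to_midpoint <- function(lower, upper) {
  event <- !is.na(upper)
  time  <- ifelse(event, (lower + upper) / 2, lower)
  data.frame(time = time, status = as.integer(event))
}

mid <- to_midpoint(d$N, d$Y)
print(mid)

# With the real data one could then try an ordinary right-censored fit:
#   myD2 <- cbind(myD, to_midpoint(myD$N, myD$Y))
#   fit  <- rpart(Surv(time, status) ~ Salt + pH + Temp, data = myD2)
```

Note that the midpoint discards the width of each interval, so a narrow
interval and a wide one with the same centre are treated alike.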



Re: [R] rpart with interval censored data crashes R

2009-01-10 Thread David Winsemius

On a Leopard Mac with the Urbanek compiled 64 bit R, one sees this:

> library(rpart)
> library(survival)
Loading required package: splines
> fit <- rpart(Surv(N,Y,type="interval2")~Salt+pH+Temp, data=myD)

 *** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
 1: .C(C_rpartexp2, as.integer(length(dtimes)), as.double(dtimes), as.double(.Machine$double.eps), keep = integer(length(dtimes)))
 2: (get(paste("rpart", method, sep = ".")))(Y, offset, , wt)
 3: rpart(Surv(N, Y, type = "interval2") ~ Salt + pH + Temp, data = myD)


Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Choosing 4 does save the workspace.
--
David Winsemius
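
Until rpart itself rejects such input, a user-side check could stop with an
error before the response reaches compiled code.  The sketch below is only an
assumption about how such a guard might look; check_right_censored is a
hypothetical name, not an rpart or survival function.  It relies on Surv
objects carrying a "type" attribute, which is "interval" for a response built
with type = "interval2".

```r
# Hypothetical guard (not part of rpart): refuse Surv responses that are not
# right censored instead of passing them on to compiled code.
check_right_censored <- function(y) {
  type <- attr(y, "type")   # Surv objects carry their censoring type here
  if (!inherits(y, "Surv") || is.null(type)) {
    stop("response is not a Surv object")
  }
  if (type != "right") {
    stop("only right censored survival data are handled here, got type '",
         type, "'")
  }
  invisible(TRUE)
}

# Stand-in for what Surv(N, Y, type = "interval2") produces: a matrix of
# class "Surv" whose "type" attribute is "interval".  It is built by hand so
# that the sketch runs without the survival package.
y <- structure(cbind(time1 = 1, time2 = 2, status = 3),
               type = "interval", class = "Surv")

msg <- tryCatch(check_right_censored(y),
                error = function(e) conditionMessage(e))
print(msg)
```

In practice one would call check_right_censored(Surv(N, Y, type = "interval2"))
on the real response before handing the formula to rpart.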



[R] rpart with interval censored data crashes R

2009-01-09 Thread Keith Jewell
Hi Everyone,

This example code results in R 'crashing'; that is the R application closes 
with no warnings or error messages.
#---
myD <- read.table(stdin(), header=TRUE, nrows=20)
   Broth Salt   pH Temp    N  Y Growth
1    310  9.0 2.92   10 90.0 NA      0
2    615  6.0 7.82   30  1.0  2      1
3    217  2.0 7.34   10  7.0  8      1
4    338 10.0 4.44   10 90.0 NA      0
5    240  4.0 7.33   10 20.0 21      1
6    336 10.0 3.90   10 90.0 NA      0
7    279  7.0 6.73   10 90.0 NA      0
8   1021  9.0 5.03   45  8.0  9      1
9    974  7.0 4.01   45 90.0 NA      0
10   265  7.0 2.93   10 90.0 NA      0
11   934  4.0 5.28   45  0.1  1      1
12   669  9.0 5.03   30 90.0 NA      0
13   875 10.0 6.24   37  1.0  2      1
14   385  2.0 5.84   20  1.0  2      1
15   562  2.0 5.84   30  0.1  1      1
16   718  0.5 5.54   37  0.1  1      1
17   845  9.0 5.03   37  3.0  6      1
18   913  2.0 5.84   45  0.1  1      1
19   577  4.0 4.10   30 90.0 NA      0
20    20  0.5 7.44    8 24.0 27      1

library(rpart)
library(survival)
fit <- rpart(Surv(N,Y,type="interval2")~Salt+pH+Temp, data=myD)
#-

Professor Ripley helpfully pointed out that the documentation does not say 
that interval censoring is supported, and indeed the crash seems to happen 
only with interval censored data.

?rpart indicates that the dependent variable may be a survival object. 
Neither ?rpart nor "An Introduction to Recursive Partitioning Using the 
RPART Routines" (Therneau et al 1997) suggests that the dependent variable 
may contain interval censored data, but neither do they suggest it 
shouldn't; i.e. as far as I'm aware (!) this restriction is not documented.

This post has three purposes:

1) Bring this behaviour - especially the crash in response to 'bad' data - 
to the attention of the authors.

2) Seek an explanation of the restriction (if intentional). In my 
simplicity, it seems that interval censored data should be easier to handle 
than left or right censored - after all the information content is greater.

3) Seek guidance on how to work around the problem. I'm minded to replace 
the interval censored data by the mid points of the intervals. Does anyone 
have any comments on such an approach?

Any comments gratefully received.

Keith Jewell
==
Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status = Patched
 major = 2
 minor = 8.1
 year = 2009
 month = 01
 day = 07
 svn rev = 47502
 language = R
 version.string = R version 2.8.1 Patched (2009-01-07 r47502)

Windows Server 2003 x64 (build 3790) Service Pack 2

Locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United 
Kingdom.1252;LC_MONETARY=English_United 
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices, 
package:utils, package:datasets, package:methods, Autoloads, package:base
