[R] Updated package: survival_2.40-1

2016-11-01 Thread Therneau, Terry M., Ph.D.
Survival version 2.40 has been released to CRAN.  This is a warning that some users may see 
changes in results, however.


The heart of the issue can be shown with a simple example. Calculate the following simple 
set of intervals:

<<>>=
birth <- as.Date("1973/03/10")
start <- as.Date("1998/09/13") + 1:40
end   <- as.Date("1998/12/03") + rep(1:10, 4)
interval <- (end-start)
table(interval)
51 61 71 81
10 10 10 10
@

Each interval has a different start and end date, but there are only 4 unique intervals, 
each of which appears 10 times.

Now convert this to an age scale.

<<>>=
start.age <- as.numeric(start-birth)/365.25
end.age   <- as.numeric(end  -birth)/365.25
age.interval <- end.age - start.age
table(match(age.interval, unique(age.interval)))
1 2 3 4 5 6 7 8
9 1 5 5 1 9 7 3
@
There are now eight different age intervals instead of 4, and the 8 unique values appear 
between 1 and 9 times each.  Exact results likely will depend on your computer system. We 
have become a victim of round off error.


Some users prefer to use time in days and some prefer time in years, and those latter 
users expect, I am sure, survival analysis results to be identical on the two scales.  
Both the coxph and survfit routines treat tied event times in a special way, and this 
roundoff can make actual ties appear as non-tied values, however. Parametric survival such 
as \code{survreg} is not affected by this issue.


In survival version 2.40 this issue has been addressed for the coxph and survfit routines; 
input times are subjected to the same logic found in the all.equal routine in order to 
determine actual ties. The upshot is that some users may see changed results.
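
As a rough illustration of the idea (this is not the code used inside the survival package), times that differ by less than an all.equal-style relative tolerance can be mapped to a common value before ties are counted. The helper name collapse_ties and the way the tolerance is scaled are assumptions made for this sketch.

```r
# Hypothetical helper for illustration; not part of the survival package.
collapse_ties <- function(x, tol = sqrt(.Machine$double.eps)) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(x)
  # Start a new group whenever the gap to the previous unique value exceeds
  # a tolerance scaled by the typical magnitude of the data.
  new_group <- c(TRUE, diff(ux) > tol * mean(abs(ux)))
  # Representative (smallest) value of the group each unique value falls in.
  rep_value <- ux[new_group][cumsum(new_group)]
  rep_value[match(x, ux)]
}

# Age-scale intervals from the first example.
birth <- as.Date("1973/03/10")
start <- as.Date("1998/09/13") + 1:40
end   <- as.Date("1998/12/03") + rep(1:10, 4)
age.interval <- as.numeric(end - birth)/365.25 - as.numeric(start - birth)/365.25

tied <- collapse_ties(age.interval)
table(match(tied, unique(tied)))   # four groups of 10
```

Because the round-off differences here are many orders of magnitude smaller than the gaps between genuinely different intervals, the exact tolerance matters little for this data set.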


For the following test case cox1 and cox2 have identical coefficients in version 2.40, but 
different coefficients in prior versions.

<<>>=
library(survival)
ndata <- data.frame(id=1:30,
  birth.dt = rep(as.Date("1953/03/10"), 30),
  enroll.dt= as.Date("1993/03/10") + 1:30,
  end.dt   = as.Date("1996/10/21") + 1:30 +
  rep(1:10, 3),
  status= rep(0:1, length=30),
  x = 1:30)
ndata$enroll.age <- with(ndata, as.numeric(enroll.dt - birth.dt))/365.25
ndata$end.age<- with(ndata, as.numeric(end.dt - birth.dt))/365.25

fudays <- with(ndata, as.numeric(end.dt - enroll.dt))
fuyrs  <- with(ndata, as.numeric(end.age- enroll.age))
cox1 <- coxph(Surv(fudays, status) ~ x, data=ndata)
cox2 <- coxph(Surv(fuyrs,  status) ~ x, data=ndata)
@
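
The source of the discrepancy can be seen without fitting any model. The sketch below rebuilds the same dates in base R and compares the number of distinct follow-up times on the two scales; the count on the year scale is platform dependent, so no value is claimed for it.

```r
# Same dates as in the test case, rebuilt so this chunk runs on its own.
birth.dt  <- as.Date("1953/03/10")
enroll.dt <- as.Date("1993/03/10") + 1:30
end.dt    <- as.Date("1996/10/21") + 1:30 + rep(1:10, 3)

fudays <- as.numeric(end.dt - enroll.dt)
fuyrs  <- as.numeric(end.dt - birth.dt)/365.25 -
          as.numeric(enroll.dt - birth.dt)/365.25

length(unique(fudays))   # 10 distinct follow-up times in days
length(unique(fuyrs))    # platform dependent; may exceed 10

# Within all.equal's tolerance the two scales agree ...
isTRUE(all.equal(fuyrs * 365.25, fudays))
# ... and rounding back to whole days restores the exact ties.
all(round(fuyrs * 365.25) == fudays)
```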

This general issue of floating point precision arises often enough in R that it is part of 
the frequently asked questions; see FAQ 7.31 on CRAN. The author of the survival routines 
(me) has always used days as the scale for analysis -- just by habit, not for any 
particularly good reason -- so the issue had never appeared in my work nor in the survival 
package's test suite. Due to user input, this issue had been addressed earlier in the 
survfit routine, but only when the status variable was 0/1, not when it is a factor.


As a final footnote, the simple data set above also gives different results when coded in 
SAS: I am not alone in overlooking it.  As a consequence, the maintainer expects to get 
new emails that ``we have found a bug in your code: it gives a different answer than 
SAS''.  (This is an actual quote.)


Terry Therneau

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] function ave() with seq_along returning char sequence instead of numeric

2016-11-01 Thread Charles C. Berry

On Mon, 31 Oct 2016, Jeff Newmiller wrote:

> The help page describes the first argument x as a numeric... it is not 
> designed to accept character,

Actually it is so designed, but not advertised as such. See below.

> so the fact that you get anything even close to right is just a bonus.
>
> As the doctor says, "if it hurts, don't do that".
>
> ave( rep( 1, length( v ) ), v, FUN=seq_along )
> --


[snip]


Reading the code of `ave` and then `split<-.default`, you will see subset 
replacement, "x[i]<- ...", on the argument 'x'. So, the issue is having 
FUN and that replacement (and possible coercion) yield something 
useful/sensible. In other words, class(x) need not be "numeric".


For instance, operating on "Date" objects:


# start at 2016-01-02, step 10 days, ...
x <- as.Date("2016-01-01")+seq(1,1000,by=10)
z <- rep(1:10, 10)
 class(ave(x,z)) # Date class is preserved

[1] "Date"

ave(x,z) # mean date
  [1] "2017-03-27" "2017-04-06" "2017-04-16" "2017-04-26" ... 

ave(x,z,FUN=min) # earliest date

  [1] "2016-01-02" "2016-01-12" "2016-01-22" "2016-02-01" ...

However, trying to describe this feature in the help page without a lot of 
detail and examples might confuse more users than it would enlighten.
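
The coercion that prompted this thread can be reproduced with a small character vector. The vector v below is made up for illustration; the point is only that the subset replacement keeps the class of 'x', so integer counts assigned into a character vector come back as character.

```r
v <- c("a", "b", "a", "a", "b")   # made-up grouping values

# x is character, so the integer counts from seq_along are coerced
# to character by the "x[i] <- ..." replacement.
ave(v, v, FUN = seq_along)
# [1] "1" "1" "2" "3" "2"

# A numeric x of the same length keeps the result numeric.
ave(rep(1, length(v)), v, FUN = seq_along)
# [1] 1 1 2 3 2
```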


--
Chuck



Re: [R] function ave() with seq_along returning char sequence instead of numeric

2016-11-01 Thread S Ellison

> The help page describes the first argument x as a numeric... 
It also describes the _value_ as numeric. One for the help page issue list?

In fact there seems no obvious reason for a hard restriction to numeric*; the 
return value will depend largely on what FUN does, as there's no argument class 
check in the code for ave or for split(), which ave() uses. The principal 
requirement is presumably that FUN must accept a vector of class class(x) and 
return a vector of the same length as its argument (or, if there's grouping, a 
scalar) that split<- (or, with no grouping factor, '<-') can use.

*though plenty of reason to warn of unexpected consequences if not, of course
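
A small sketch of the two cases described above, using made-up data: a FUN that returns a vector as long as its argument, and a FUN that returns one value per group, which the replacement then recycles across the group.

```r
x <- c("b", "a", "d", "c")   # made-up character data
g <- c(1, 1, 2, 2)           # made-up grouping

# FUN returns a vector of the same length as each group.
ave(x, g, FUN = toupper)
# [1] "B" "A" "D" "C"

# FUN returns a single value per group, recycled over the group's positions.
ave(x, g, FUN = min)
# [1] "a" "a" "c" "c"
```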

S Ellison





Re: [R] bzip2

2016-11-01 Thread David Winsemius

> On Nov 1, 2016, at 2:23 AM, Josef Eschgfaeller wrote:
> 
> David Winsemius wrote:
> 
>> http://r.research.att.com/libs/
> 
> I installed xz with
> 
>tar fvxz xz-5.0.5-darwin10-bin2.tar.gz -C /
> 
> but then for R
> 
>./configure --enable-R-shlib
> 
> gives me the same error as before:
> --
> checking for BZ2_bzlibVersion in -lbz2... yes
> checking bzlib.h usability... yes
> checking bzlib.h presence... yes
> checking for bzlib.h... yes
> checking if bzip2 version >= 1.0.6... no
> checking whether bzip2 support suffices... configure: error:
>   bzip2 library and headers are required
> --
> I'm using Mac OS 10.6.8 Snow Leopard.

You ought to a) include more context in your replies, and b) post on R-SIG-Mac. 

The advice in the Admin-guide can be found here, although it says no testing 
has been done recently for SL:

https://cran.r-project.org/doc/manuals/r-release/R-admin.html#OS-X

I suspect there are more difficulties with SL builds than just this problem. 
You will probably need to include all your setup information and the full 
configure log when you post this on the correct mailing list.


> 
> Thanks
> Josef Eschgfaeller

David Winsemius
Alameda, CA, USA



Re: [R] bzip2

2016-11-01 Thread Josef Eschgfaeller
David Winsemius wrote:

> http://r.research.att.com/libs/

I installed xz with

tar fvxz xz-5.0.5-darwin10-bin2.tar.gz -C /

but then for R

./configure --enable-R-shlib

gives me the same error as before:
--
checking for BZ2_bzlibVersion in -lbz2... yes
checking bzlib.h usability... yes
checking bzlib.h presence... yes
checking for bzlib.h... yes
checking if bzip2 version >= 1.0.6... no
checking whether bzip2 support suffices... configure: error:
   bzip2 library and headers are required
--
I'm using Mac OS 10.6.8 Snow Leopard.

Thanks
Josef Eschgfaeller



Re: [R] Resetting Baseline Level of Predictor in svyglm Function

2016-11-01 Thread Anthony Damico
hi, i think you want

elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") )
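
A likely reason the original attempts had no effect is that the design object keeps its own copy of the variables from the time svrepdesign() was called, so releveling elsq1ch afterwards never reaches it. The base R sketch below mimics that with a plain list standing in for the design; the data, the column name math, and the levels other than "PreAlg or Less" are made up for illustration.

```r
# Made-up data; only the level "PreAlg or Less" comes from the thread.
dat <- data.frame(
  math = factor(c("Alg I", "PreAlg or Less", "Alg II", "PreAlg or Less"))
)

# Plain list standing in for a survey design that stores its own copy of the data.
design_like <- list(variables = dat)

# Releveling the original data frame afterwards does not reach the stored copy.
dat$math <- relevel(dat$math, "PreAlg or Less")
levels(dat$math)[1]                     # "PreAlg or Less"
levels(design_like$variables$math)[1]   # "Alg I"

# Releveling the stored copy is what changes the baseline.
design_like$variables$math <- relevel(design_like$variables$math, "PreAlg or Less")
levels(design_like$variables$math)[1]   # "PreAlg or Less"
```

With a real design object the same effect is obtained through update(), as in the line above, followed by refitting the model on the updated design.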





On Mon, Oct 31, 2016 at 9:05 PM, Courtney Benjamin wrote:

> Hello R Users:
>
> I am using the survey package in R for modeling with complex survey data.
> I am trying to reset the baseline level of certain predictor variables
> being used in a logistic regression without success. The following is a
> reproducible example:
>
> library(RCurl)
> library(survey)
>
> data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv")
> elsq1ch <- read.csv(text = data)
>
> #Specifying the svyrepdesign object which applies the BRR weights
> elsq1ch_brr<-svrepdesign(variables = elsq1ch[,1:16], repweights =
> elsq1ch[,18:217], weights = elsq1ch[,17], combined.weights = TRUE, type =
> "BRR")
> elsq1ch_brr
>
> #Log. Reg. model
> allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+
> F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=
> elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
> summary(allCC)
>
> ##Attempting to reset baseline level for predictor variable
> #Neither attempt worked
> elsq1ch$F1HIMATH <- C(elsq1ch$F1HIMATH,contr.treatment, base=1)
> elsq1ch$F1HIMATH <- relevel(elsq1ch$F1HIMATH,"PreAlg or Less")
>
> #Log. Reg. model with no changes in baseline levels for the predictors
> allCC <- svyglm(formula=F3ATTAINB~F1PARED+BYINCOME+F1RACE+F1SEX+
> F1RGPP2+F1HIMATH+F1RTRCC,family="binomial",design=
> elsq1ch_brr,subset=BYSCTRL==1&G10COHRT==1,na.action=na.omit)
> summary(allCC)
>
>
> Any guidance is greatly appreciated.
>
> Sincerely,
>
> Courtney
>
> Courtney Benjamin
>
> Broome-Tioga BOCES
>
> Automotive Technology II Teacher
>
> Located at Gault Toyota
>
> Doctoral Candidate-Educational Theory & Practice
>
> State University of New York at Binghamton
>
> cbenj...@btboces.org
>
> 607-763-8633
>
