[R] quantcut

2008-11-03 Thread Kåre Edvardsen
I'm trying to divide x into tertiles, but I end up with integer limits
even though x holds one decimal. The analysis is extremely sensitive to the
limits and I'd like to keep them exact. How can that be done?

> quartiles <- quantcut(x[x >= 0], q = seq(0, 1, by = 1/3))
> table(quartiles)
quartiles
[180,344] (344,448] (448,644]
    16467     16476     16452
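
The integer-looking limits are probably only a labelling effect: cut() formats interval labels with dig.lab = 3 significant digits by default, while the breaks themselves are not rounded. A minimal sketch in base R, assuming simulated data in place of the poster's x (the seed, sample size and value range are illustrative only):

```r
# Hypothetical data standing in for x: one decimal, roughly the range shown above
set.seed(1)
x <- round(runif(9000, min = 180, max = 644), 1)

# Tertile limits computed directly, so they can be inspected unrounded
breaks <- quantile(x, probs = seq(0, 1, by = 1/3), names = FALSE)
print(breaks)

# dig.lab controls how many digits the interval labels show;
# include.lowest keeps the minimum value in the first interval
tertiles <- cut(x, breaks = breaks, include.lowest = TRUE, dig.lab = 5)
table(tertiles)
```

Computing the breaks with quantile() first also makes it possible to reuse exactly the same limits on another data set.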


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trying to pass arrays as arguments to a function

2008-10-20 Thread Kåre Edvardsen
I'd like to avoid looping through an array in order to change values in
the array, as it takes too long.
I read in an earlier post that it can be done with do.call, but I never got
it to work. The idea is to change the value of y according to the values in
x. Wherever x holds the value 3, the corresponding value in y
should be set to 1.

So I tried the following, which gives an error message:

#
x <- c(1,2,3,2,2,3,1,1,3,3)
y <- c(0,0,1,1,0,0,1,0,0,1)

Change_y <- function() {

if (x == 3) {y <- 1}
  
}

do.call(Change_y, as.list(x,y))

Error in Change_y(1, 2, 3, 2, 2, 3, 1, 1, 3, 3) :
unused argument(s) ( ...)

##

How should it be done?

Cheers,
Kare
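
The error arises because do.call() passes every element of the list as a separate argument to a function that accepts none. For this task neither a loop nor do.call() should be needed; a minimal sketch using logical indexing on the vectors from the post:

```r
x <- c(1, 2, 3, 2, 2, 3, 1, 1, 3, 3)
y <- c(0, 0, 1, 1, 0, 0, 1, 0, 0, 1)

# x == 3 is a logical vector; assigning through it changes only the matching positions
y[x == 3] <- 1
y

# The same idea wrapped in a function (change_y is a hypothetical name);
# the modified vector has to be returned and assigned by the caller
change_y <- function(x, y) {
  y[x == 3] <- 1
  y
}
y2 <- change_y(x, c(0, 0, 1, 1, 0, 0, 1, 0, 0, 1))
```

Returning the vector matters because an assignment inside a function only changes a local copy.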



[R] Extended summary

2008-10-14 Thread Kåre Edvardsen
Is there a function providing more descriptive statistics than
summary()? I'm working on a coxph analysis and would like to have
more information on certain numbers.

If my call is something like:

Call:
coxph(formula = Surv(followup, CasesCancer) ~ age + BMI + parity + HRT)

I'd like to know:

* How many CasesCancer were excluded (not only the total number
excluded due to missing values)
* Distribution of variables (where the NAs are)

Cheers,
Kare
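
Both questions can be approached in base R before fitting. A minimal sketch, assuming a hypothetical data frame d whose columns are named after the variables in the call above, and assuming the default na.action (na.omit), under which a row is dropped if any model variable is missing:

```r
# Hypothetical data; only the column names come from the post
d <- data.frame(
  followup    = c(5, 3, 8, 2, 7, 4),
  CasesCancer = c(1, 0, 1, 1, 0, 1),
  age         = c(50, 61, NA, 45, 58, 66),
  BMI         = c(24.1, NA, 27.3, 22.8, 30.2, NA),
  parity      = c(2, 1, 0, 3, NA, 2),
  HRT         = c(0, 1, 1, 0, 0, 1)
)
vars <- c("followup", "CasesCancer", "age", "BMI", "parity", "HRT")

# Where the NAs are: number of missing values per model variable
na_per_variable <- colSums(is.na(d[, vars]))
na_per_variable

# Rows that survive na.omit, cross-tabulated against the event indicator
used <- complete.cases(d[, vars])
table(event = d$CasesCancer, used = used)

# Number of cancer cases lost because of missing covariates
excluded_events <- sum(d$CasesCancer == 1 & !used)
excluded_events
```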



Re: [R] Difference in p-values between R and SPSS

2008-09-12 Thread Kåre Edvardsen
I thought the difference was too big as well, so I tried both breslow and
efron, which gave the same result (still different from SPSS), and exact
runs forever, which is strange as I'm only using one dependent variable
here. It could be because n~50,000, and I haven't got the most powerful
computer either. I'm not aiming to get equal p-values, but I'm wondering
whether one of us has a bug in calculating the follow-up time...

Cheers,
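
A minimal sketch of how the tie-handling methods can be compared side by side, assuming the survival package and its bundled lung data (not the data from this thread). Recent versions of coxph() take the method through the ties argument; the older method argument is still accepted:

```r
library(survival)

# Bundled example data, NOT the data discussed in this thread
fit_efron   <- coxph(Surv(time, status) ~ age, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ age, data = lung, ties = "breslow")

# Wald p-value for age under each approximation for tied event times
p_efron   <- summary(fit_efron)$coefficients["age", "Pr(>|z|)"]
p_breslow <- summary(fit_breslow)$coefficients["age", "Pr(>|z|)"]
c(efron = p_efron, breslow = p_breslow)
```

If both approximations agree with each other but not with the other program, the input data (for example the follow-up time) is the next thing to compare.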

On Thu, 2008-09-11 at 13:07 -0700, Thomas Lumley wrote:

> That is a larger difference in p-values than I would expect due to
> numerical differences and stopping criteria.  My guess is that you are
> running across the different approximations for tied failure times.  If
> so, you will get better agreement with SPSS by using method="breslow" in
> coxph().
>
>    -thomas
>
> On Thu, 11 Sep 2008, Kåre Edvardsen wrote:
>
>> My apologies for asking slightly about SPSS in addition to R...
>>
>> Could not find an exact answer in the archives on whether R and SPSS may
>> give different p-vals when output for coeffs and conf-intervals are the
>> same.
>> Anyway, a colleague and I are doing a very simple coxreg analysis and
>> get the same results for the coefficient and confidence interval,
>>
>>              exp(coef) exp(-coef) lower .95 upper .95
>> age_at_entry      1.02       0.98      1.01      1.03
>>
>> but in R we get p = 0.00011, and SPSS gives p < 0.0001
>>
>> Should we worry about this difference in p-value or do R and SPSS
>> sometimes differ?
>>
>> All the best,
>> Kare
>
> Thomas Lumley Assoc. Professor, Biostatistics
> [EMAIL PROTECTED] University of Washington, Seattle



[R] Difference in p-values between R and SPSS

2008-09-11 Thread Kåre Edvardsen
My apologies for asking slightly about SPSS in addition to R...

Could not find an exact answer in the archives on whether R and SPSS may
give different p-vals when output for coeffs and conf-intervals are the
same.
Anyway, a colleague and I are doing a very simple coxreg analysis and
get the same results for the coefficient and confidence interval,

             exp(coef) exp(-coef) lower .95 upper .95
age_at_entry      1.02       0.98      1.01      1.03


but in R we get p = 0.00011, and SPSS gives p < 0.0001

Should we worry about this difference in p-value or do R and SPSS
sometimes differ?

All the best,
Kare
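
Identical output at two decimals does not pin down the p-value. A rough worked example, assuming the printed interval is a Wald interval on the log scale, so that the standard error can be backed out of the rounded limits:

```r
# Values as printed in the post (rounded to two decimals)
hr    <- 1.02
lower <- 1.01
upper <- 1.03

# Standard error of the log hazard ratio, recovered from the 95% interval
se <- (log(upper) - log(lower)) / (2 * qnorm(0.975))

# Wald statistic and two-sided p-value implied by the rounded numbers
z <- log(hr) / se
p <- 2 * pnorm(-abs(z))
p

# A printed 1.02 may stand for anything between 1.015 and 1.025;
# the implied p-values at those two extremes differ by orders of magnitude
p_range <- 2 * pnorm(-abs(log(c(1.015, 1.025)) / se))
p_range
```

So two programs can agree on every printed digit of the estimate and interval and still report p-values on either side of 0.0001.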



[R] Summary information

2008-06-24 Thread Kåre Edvardsen
Hi all!

I'm doing a coxph analysis on some 50,000 subjects. Is there a simple
function in R that provides some general model information, such as a
summary of the number of events and censored values, as in SAS?

I'm working together with someone using SAS, and the summary function in
R does not provide as detailed general information as SAS provides
(unless I have missed some other nice summary function in R).

Cheers,
Kare
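
A table of this kind can be built by hand from the event indicator. A minimal sketch in base R, assuming a hypothetical 0/1 status vector (1 = event, 0 = censored); the layout only loosely imitates what SAS prints:

```r
# Hypothetical event indicator: 1 = event, 0 = censored
status <- c(1, 0, 0, 1, 1, 0, 0, 0)

# Count both categories, keeping a zero count if one of them is absent
counts <- table(factor(status, levels = c(0, 1),
                       labels = c("censored", "event")))

summary_tab <- data.frame(
  total            = length(status),
  event            = counts[["event"]],
  censored         = counts[["censored"]],
  percent_censored = 100 * counts[["censored"]] / length(status)
)
summary_tab
```

To match a fitted model, the indicator should be restricted to the rows actually used in the fit, i.e. after removing observations with missing values.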



[R] Strange julian and/or strptime

2008-05-23 Thread Kåre Edvardsen
Hi r-helpers...

Why do I get this strange huge jump of 36524 days when changing the origin
from 1969-01-01 to 1968-12-31? It should still be close to zero! This
really messes up my calculations of follow-up times in my analyses.

> julian(strptime("010169", format = "%d%m%y"), origin = as.Date("1969-01-01"))
Time difference of -0.0417 days

> julian(strptime("311268", format = "%d%m%y"), origin = as.Date("1968-12-31"))
Time difference of 36524.96 days


Cheers,
Kare



Re: [R] Strange julian and/or strptime

2008-05-23 Thread Kåre Edvardsen
I'm on an Ubuntu Linux PC, and get the same wrong result as you.
I could not work out what the description of '%y' really meant, so I did
not realize this was operating-system specific in this sense. Anyway,
I'll find a way to work around this bug.

Have a nice weekend,
Kare
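
One possible workaround is to make the century explicit before parsing, so that '%y' is never used on input. A minimal sketch, assuming every date in the data belongs to the 1900s (parse_ddmmyy_1900 is a hypothetical helper name):

```r
# Turn "ddmmyy" into "ddmm19yy" and parse with a four-digit year (%Y).
# Assumption: all dates fall in 1900-1999.
parse_ddmmyy_1900 <- function(s) {
  with_century <- paste0(substr(s, 1, 4), "19", substr(s, 5, 6))
  as.Date(with_century, format = "%d%m%Y")
}

d <- parse_ddmmyy_1900(c("010169", "311268"))
d

# Differences between Date objects are in whole days
days_apart <- as.numeric(d[1] - d[2])
days_apart
```

Working with Date objects instead of the date-times returned by strptime() presumably also avoids the -0.0417 day (one hour) offset in the first result of the original post, which looks like a time-zone effect.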


On Fri, 2008-05-23 at 10:41 +0100, Prof Brian Ripley wrote:

> On my Linux box:
>
> > strptime("010169", format = "%d%m%y")
> [1] "1969-01-01"
> > strptime("311268", format = "%d%m%y")
> [1] "2068-12-31"
>
> From the help page:
>
>    '%y' Year without century (00-99). If you use this on input, which
>         century you get is system-specific.  So don't!  Often values
>         up to 69 (or 68) are prefixed by 20 and 70 (or 69) to 99 by
>         19.
>
> If all else fails, read the documentation (but the posting guide asks you
> to do that before posting).  Elsewhere on that help page it refers you to
> your OS documentation -- the posting guide asked for your OS 'at a
> minimum', but as you didn't follow it we have no idea which you used.
>
> On Fri, 23 May 2008, Kåre Edvardsen wrote:
>
>> Hi r-helpers...
>>
>> Why do I get this strange huge jump of 36524 days when changing origin
>> from 1969-01-01 to 1968-12-31. It should still be close to zero! This
>> really messes up my calculations of follow-up times in my analyses.
>>
>> > julian(strptime("010169", format = "%d%m%y"), origin = as.Date("1969-01-01"))
>> Time difference of -0.0417 days
>>
>> > julian(strptime("311268", format = "%d%m%y"), origin = as.Date("1968-12-31"))
>> Time difference of 36524.96 days
>>
>> Cheers,
>> Kare
 



Re: [R] Overall p-value from a factor in a coxph fit

2008-04-21 Thread Kåre Edvardsen
Prof. Paul,  Prof. Frank.

Thank you very much for helping me out. The Design package did the
trick.

Here is what the anova table looks like without using the Design package:

> anova(Fit1)
Analysis of Deviance Table
Cox model: response is Surv(Time, cancer)
Terms added sequentially (first to last)

        Df Deviance Resid. Df Resid. Dev
NULL                    16783     5341.8
relativ  1      0.0     16782    14995.0
hormone  3    939.4     16779    14055.6
.
.
.


As you can see, no p-values are reported.

Here is how it looks after loading Design:

> anova(Fit1)
                Wald Statistics          Response: Surv(Time, cancer)

 Factor     Chi-Square d.f. P
 relativ    6.08       1    0.0137
 hormone    8.68       3    0.0339
.
.
.


Regards,
Kare


On Fri, 2008-04-18 at 11:03 -0500, Frank E Harrell Jr wrote:

> Paul Johnson wrote:
>> On Fri, Apr 18, 2008 at 3:06 AM, Kåre Edvardsen [EMAIL PROTECTED] wrote:
>>> Hi all.
>>>
>>> If I run the simple regression when x is a categorical variable ( x <-
>>> factor(x) ):
>>>
>>> > MyFit <- coxph( Surv(start, stop, event) ~ x )
>>>
>>> How can I get the overall p-value on x other than for each dummy
>>> variable?
>>>
>>> > anova(MyFit)
>>>
>>> does NOT provide that information as previously suggested on the list.
>>
>> It should work...  Here's a self contained example showing that
>> anova does give the desired significance test for an lm model.
>>
>> > y <- rnorm(100)
>> > x <- gl(5,20)
>> > mod <- lm(y~x)
>> > anova(mod)
>> Analysis of Variance Table
>>
>> Response: y
>>           Df  Sum Sq Mean Sq F value Pr(>F)
>> x          4   6.575   1.644  1.5125 0.2047
>> Residuals 95 103.237   1.087
>>
>> If you provide a similar self contained example leading up to a coxph,
>> I would be glad to investigate your question.  You don't give enough
>> information for me to tell which version of coxph you are running, and
>> from what package.
>>
>> Suppose I guess that you are using the coxph from the package
>> survival. If so, it appears to me there is a bug in that package at
>> the moment.  The methods anova.coxph and drop1.coxph did exist at one
>> time, until very recently.  There is a thread in r-help (which I found
>> by typing RSiteSearch("anova.coxph") ) discussing recent troubles
>> with anova.coxph.
>>
>> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/118481.html
>>
>> As you see from the discussion in that thread, there used to be an
>> anova method for coxph, and in the version of survival I have now,
>> there is no such method.  The version I have is 2.34-1, Date:
>> 2008-03-31.
>>
>> Here's what I see after I run example(coxph) in order to create some
>> coxph objects, on which I can test the diagnostics:
>>
>> > drop1(test2)
>> Error in terms.default(terms1) : no terms component
>> > anova(test2)
>> Error in UseMethod("anova") : no applicable method for "anova"
>>
>> In that survival package, I do find anova.survreg, but not
>> anova.coxph. If you are using the survival package, I'd suggest you
>> contact Thomas Lumley directly, since he maintains it.
>>
>> I think if you had reported the exact error you saw, it would have
>> been easier for me to diagnose the trouble.
>>
>> HTH
>> pj
>
> In the meantime you can do
>
> library(Design)
> f <- cph( . . . )
> anova(f)  # multiple d.f. Wald statistics including tests of nonlinearity
>
> cph uses coxph but anova.Design is separate from the survival package.
>
> Frank



[R] Overall p-value from a factor in a coxph fit

2008-04-18 Thread Kåre Edvardsen
Hi all.

If I run the simple regression when x is a categorical variable ( x <-
factor(x) ):

> MyFit <- coxph( Surv(start, stop, event) ~ x )

How can I get the overall p-value on x other than for each dummy
variable?

> anova(MyFit)

does NOT provide that information as previously suggested on the list.

All the best,
Kare
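
When the factor is the only term in the model, an overall test can also be computed by hand from the fitted object as a likelihood ratio test. A minimal sketch, assuming the survival package and its bundled veteran data with plain right-censored times (not the poster's data, which uses start/stop intervals):

```r
library(survival)

# Bundled example data, not the poster's; celltype is a factor with 4 levels
fit <- coxph(Surv(time, status) ~ celltype, data = veteran)

# fit$loglik holds the log partial likelihood at the initial values
# (all coefficients zero, i.e. the null model) and at the final estimates
lrt <- unname(2 * (fit$loglik[2] - fit$loglik[1]))

# One degree of freedom per dummy variable: 3 for a 4-level factor
df <- length(coef(fit))

# Overall p-value for the factor as a whole
p <- pchisq(lrt, df = df, lower.tail = FALSE)
c(chisq = lrt, df = df, p = p)
```

With additional covariates in the model, the same idea applies to two nested fits, one with and one without the factor: twice the difference of their final log-likelihoods is compared against a chi-square distribution with as many degrees of freedom as the factor contributes coefficients.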
