[R] quantcut
I'm trying to divide x into tertiles, but I end up with integer limits even though x holds one decimal. The analysis is extremely sensitive to the limits, and I'd like to keep them exact. How can that be done?

    quartiles <- quantcut(x[x >= 0], q = seq(0, 1, by = 1/3))
    table(quartiles)
    quartiles
    [180,344] (344,448] (448,644]
        16467     16476     16452

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
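A possible explanation, offered as a sketch rather than a confirmed diagnosis: the break points are probably not rounded at all. Only the interval labels are, because cut() formats them with dig.lab = 3 significant digits by default. The example below uses base R's quantile() and cut() on made-up data (x here is hypothetical, not the poster's variable). As far as I know gtools::quantcut() passes extra arguments on to cut(), so dig.lab should work there as well, but check its help page before relying on that.

```r
# Hypothetical data with one decimal, standing in for the poster's x
set.seed(1)
x <- round(runif(1000, min = 180, max = 644), 1)

# Tertile break points; quantile() returns them unrounded
breaks <- quantile(x, probs = seq(0, 1, by = 1/3), na.rm = TRUE)

# dig.lab controls how many significant digits appear in the labels only;
# the comparison against the breaks always uses the exact values
tertiles <- cut(x, breaks = breaks, include.lowest = TRUE, dig.lab = 5)

print(breaks)
print(table(tertiles))
```

If the exact limits are needed elsewhere in the analysis, it is safer to keep the numeric breaks vector than to parse the limits back out of the factor labels.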
[R] Trying to pass arrays as arguments to a function
I'd like to avoid looping through an array to change its values, as that takes too long. I read in an earlier post that it can be done with do.call, but I never got it to work. The idea is to change the value of y according to the values in x: wherever x holds the value 3, the corresponding value in y should be set to 1. So I tried the following, which gives an error message:

    x <- c(1,2,3,2,2,3,1,1,3,3)
    y <- c(0,0,1,1,0,0,1,0,0,1)
    Change_y <- function() { if (x == 3) {y <- 1} }
    do.call(Change_y, as.list(x,y))
    Error in Change_y(1, 2, 3, 2, 2, 3, 1, 1, 3, 3) :
      unused argument(s) ( ...)

How should it be done?

Cheers,
Kare
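For what it's worth, this kind of update usually needs neither a loop nor do.call. A minimal sketch using logical indexing on the same x and y as in the post (y_new and y_alt are names introduced here for illustration):

```r
x <- c(1, 2, 3, 2, 2, 3, 1, 1, 3, 3)
y <- c(0, 0, 1, 1, 0, 0, 1, 0, 0, 1)

# x == 3 is a logical vector; assigning through it changes only
# the positions of y where x holds the value 3
y_new <- y
y_new[x == 3] <- 1

# Equivalent form that builds a new vector instead of modifying in place
y_alt <- ifelse(x == 3, 1, y)

print(y_new)
```

The original attempt fails for two separate reasons: Change_y() is declared without arguments, so do.call() has nowhere to put the list elements, and if() expects a single TRUE or FALSE rather than a whole vector.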
[R] Extended summary
Is there a function providing more descriptive statistics than summary()? I'm working on a coxph analysis and would like more information on certain numbers. If my call is something like:

    Call: coxph(formula = Surv(followup, CasesCancer) ~ age + BMI + parity + HRT)

I'd like to know:

* How many CasesCancer were excluded (not only the total number excluded due to missing values)
* The distribution of the variables (where the NAs are)

Cheers,
Kare
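I'm not aware of a single built-in function that reports this, but both numbers can be tabulated directly in base R. A minimal sketch, assuming the data sit in a data frame (here a simulated, hypothetical dat whose columns reuse the variable names from the call above):

```r
# Simulated stand-in for the poster's data; all values are made up
set.seed(42)
n <- 200
dat <- data.frame(
  followup    = rexp(n, rate = 0.1),
  CasesCancer = rbinom(n, 1, 0.2),
  age         = rnorm(n, 50, 10),
  BMI         = rnorm(n, 25, 4),
  parity      = rpois(n, 2),
  HRT         = rbinom(n, 1, 0.3)
)
dat$BMI[sample(n, 20)] <- NA
dat$HRT[sample(n, 15)] <- NA

# Rows with any NA in a model variable are dropped under na.action = na.omit
vars <- c("followup", "CasesCancer", "age", "BMI", "parity", "HRT")
used <- complete.cases(dat[, vars])

# Where the NAs are, variable by variable
na_per_variable <- colSums(is.na(dat[, vars]))

# Events and non-events, split by whether the row would enter the fit
events_by_use <- table(event = dat$CasesCancer, used_in_fit = used)

print(na_per_variable)
print(events_by_use)
```

The cell for event = 1 and used_in_fit = FALSE is the number of cancer cases lost to missing covariates.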
Re: [R] Difference in p-values between R and SPSS
I thought the difference was too big as well, so I tried both breslow and efron, and the result still differs in the same way. exact runs forever, which is strange as I'm only using one dependent variable here. It could be because n ~ 50,000, and I haven't got the most powerful computer either.

I'm not aiming to get equal p-values, but I'm wondering whether one of us has a bug in calculating the follow-up time...

Cheers,

On Thu, 2008-09-11 at 13:07 -0700, Thomas Lumley wrote:
> That is a larger difference in p-values than I would expect due to
> numerical differences and stopping criteria. My guess is that you are
> running across the different approximations for tied failure times. If
> so, you will get better agreement with SPSS by using method="breslow"
> in coxph().
>
>     -thomas
>
> On Thu, 11 Sep 2008, Kåre Edvardsen wrote:
> > My apologies for asking slightly about SPSS in addition to R...
> >
> > Could not find an exact answer in the archives on whether R and SPSS
> > may give different p-values when the output for coefficients and
> > confidence intervals is the same. Anyway, a colleague and I are doing
> > a very simple coxreg analysis and get the same results for the
> > coefficient and confidence interval,
> >
> >              exp(coef) exp(-coef) lower .95 upper .95
> > age_at_entry      1.02       0.98      1.01      1.03
> >
> > but in R we get p = 0.00011, and SPSS gives p < 0.0001
> >
> > Should we worry about this difference in p-value or do R and SPSS
> > sometimes differ?
> >
> > All the best,
> > Kare
>
> Thomas Lumley
> Assoc. Professor, Biostatistics
> University of Washington, Seattle
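To see how much the tie-handling approximation alone moves the p-value, the two fits can be put side by side. A minimal sketch, assuming the survival package is installed and using its bundled lung data as a stand-in (not the data from this thread). In current versions of survival the argument is named ties; the method="breslow" spelling quoted above is the older form, which I believe is still accepted.

```r
library(survival)

# Same model, two approximations for tied event times
fit_efron   <- coxph(Surv(time, status) ~ age, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ age, data = lung, ties = "breslow")

# Wald p-value for the age coefficient from each fit
p_efron   <- summary(fit_efron)$coefficients["age", "Pr(>|z|)"]
p_breslow <- summary(fit_breslow)$coefficients["age", "Pr(>|z|)"]

print(c(efron = p_efron, breslow = p_breslow))
```

If both approximations give nearly the same p-value and SPSS still disagrees, the follow-up time calculation is the more likely culprit, as suspected above.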
[R] Difference in p-values between R and SPSS
My apologies for asking slightly about SPSS in addition to R...

Could not find an exact answer in the archives on whether R and SPSS may give different p-values when the output for coefficients and confidence intervals is the same. Anyway, a colleague and I are doing a very simple coxreg analysis and get the same results for the coefficient and confidence interval,

                 exp(coef) exp(-coef) lower .95 upper .95
    age_at_entry      1.02       0.98      1.01      1.03

but in R we get p = 0.00011, and SPSS gives p < 0.0001

Should we worry about this difference in p-value or do R and SPSS sometimes differ?

All the best,
Kare
[R] Summary information
Hi all!

I'm doing a coxph analysis on some 50,000 subjects. Is there a simple function in R that provides general model information, such as a summary of the number of events and censored values, like in SAS?

I'm working together with someone using SAS, and the summary function in R does not provide as much general information as SAS does (unless I have missed some other nice summary function in R).

Cheers,
Kare
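The counts that SAS prints in its event/censored summary can be read off the fitted object. A minimal sketch, assuming the survival package and using its bundled lung data as a stand-in for the real cohort; n and nevent are components of the coxph object in the versions of survival I know, so treat their availability in very old versions as an assumption.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Subjects used in the fit, events among them, and the remainder (censored)
counts <- c(
  n        = fit$n,
  events   = fit$nevent,
  censored = fit$n - fit$nevent
)
counts["pct_censored"] <- 100 * counts["censored"] / counts["n"]

print(round(counts, 1))
```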
[R] Strange julian and/or strptime
Hi r-helpers...

Why do I get this strange huge jump of 36524 days when changing the origin from 1969-01-01 to 1968-12-31? It should still be close to zero! This really messes up the calculation of follow-up times in my analyses.

    julian(strptime("010169", format = "%d%m%y"), origin = as.Date("1969-01-01"))
    Time difference of -0.0417 days
    julian(strptime("311268", format = "%d%m%y"), origin = as.Date("1968-12-31"))
    Time difference of 36524.96 days

Cheers,
Kare
Re: [R] Strange julian and/or strptime
I'm on an Ubuntu Linux PC, and get the same wrong result as you. I could not work out what the description of '%y' really meant, so I did not realize this was operating-system specific in this sense. Anyway, I'll find a way to work around this bug.

Have a nice weekend,
Kare

On Fri, 2008-05-23 at 10:41 +0100, Prof Brian Ripley wrote:
> On my Linux box:
>
>     strptime("010169", format = "%d%m%y")
>     [1] "1969-01-01"
>     strptime("311268", format = "%d%m%y")
>     [1] "2068-12-31"
>
> From the help page:
>
>     '%y' Year without century (00-99). If you use this on input, which
>          century you get is system-specific. So don't! Often values
>          up to 69 (or 68) are prefixed by 20 and 70 (or 69) to 99 by 19.
>
> If all else fails, read the documentation (but the posting guide asks
> you to do that before posting).
>
> Elsewhere on that help page it refers you to your OS documentation --
> the posting guide asked for your OS 'at a minimum', but as you didn't
> follow it we have no idea which you used.
>
> On Fri, 23 May 2008, Kåre Edvardsen wrote:
> > Hi r-helpers...
> >
> > Why do I get this strange huge jump of 36524 days when changing the
> > origin from 1969-01-01 to 1968-12-31? It should still be close to
> > zero! This really messes up the calculation of follow-up times in my
> > analyses.
> >
> >     julian(strptime("010169", format = "%d%m%y"), origin = as.Date("1969-01-01"))
> >     Time difference of -0.0417 days
> >     julian(strptime("311268", format = "%d%m%y"), origin = as.Date("1968-12-31"))
> >     Time difference of 36524.96 days
> >
> > Cheers,
> > Kare
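One way to work around the system-specific '%y' behaviour is to supply the century explicitly and parse with '%Y'. A minimal sketch, assuming every date in the data falls in the 1900s; that assumption and the helper name parse_ddmmyy_1900s are mine, not from the thread.

```r
# Hypothetical helper: turn "ddmmyy" into a Date, forcing the century to 19
parse_ddmmyy_1900s <- function(s) {
  # Insert "19" between the month and the two-digit year: "311268" -> "31121968"
  full <- paste0(substr(s, 1, 4), "19", substr(s, 5, 6))
  as.Date(full, format = "%d%m%Y")
}

d <- parse_ddmmyy_1900s(c("010169", "311268"))

# Days since the origin used in the original post
days_since_origin <- as.numeric(d - as.Date("1968-12-31"))

print(d)
print(days_since_origin)
```

With the century fixed, the two dates are one day apart instead of a hundred years. Working with Date objects rather than POSIXlt also avoids the fractional -0.0417 days, which looks like a one-hour time zone offset.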
Re: [R] Overall p-value from a factor in a coxph fit
Prof. Paul, Prof. Frank.

Thank you very much for helping me out. The Design package did the trick.

Here is what the anova table looks like without the Design package:

    anova(Fit1)
    Analysis of Deviance Table
    Cox model: response is Surv(Time, cancer)
    Terms added sequentially (first to last)

            Df Deviance Resid. Df Resid. Dev
    NULL                    16783     5341.8
    relativ  1      0.0     16782    14995.0
    hormone  3    939.4     16779    14055.6
    . . .

As you see, no p-values are reported.

Here is how it looks after using Design:

    anova(Fit1)
    Wald Statistics          Response: Surv(Time, cancer)

    Factor   Chi-Square d.f. P
    relativ  6.08       1    0.0137
    hormone  8.68       3    0.0339
    . . .

Regards,
Kare

On Fri, 2008-04-18 at 11:03 -0500, Frank E Harrell Jr wrote:
> Paul Johnson wrote:
> > On Fri, Apr 18, 2008 at 3:06 AM, Kåre Edvardsen wrote:
> > > Hi all.
> > >
> > > If I run the simple regression when x is a categorical variable
> > > ( x <- factor(x) ):
> > >
> > >     MyFit <- coxph( Surv(start, stop, event) ~ x )
> > >
> > > How can I get the overall p-value on x other than for each dummy
> > > variable? anova(MyFit) does NOT provide that information as
> > > previously suggested on the list.
> >
> > It should work... Here's a self contained example showing that anova
> > does give the desired significance test for an lm model.
> >
> >     y <- rnorm(100)
> >     x <- gl(5,20)
> >     mod <- lm(y~x)
> >     anova(mod)
> >     Analysis of Variance Table
> >     Response: y
> >               Df  Sum Sq Mean Sq F value Pr(>F)
> >     x          4   6.575   1.644  1.5125 0.2047
> >     Residuals 95 103.237   1.087
> >
> > If you provide a similar self contained example leading up to a
> > coxph, I would be glad to investigate your question.
> >
> > You don't give enough information for me to tell which version of
> > coxph you are running, and from what package.
> >
> > Suppose I guess that you are using the coxph from the package
> > survival. If so, it appears to me there is a bug in that package at
> > the moment. The methods anova.coxph and drop1.coxph did exist until
> > very recently. There is a thread in r-help (which I found by typing
> > RSiteSearch("anova.coxph") ) discussing recent troubles with
> > anova.coxph.
> >
> > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/118481.html
> >
> > As you see from the discussion in that thread, there used to be an
> > anova method for coxph, and in the version of survival I have now,
> > there is no such method.
> >
> > The version I have is 2.34-1, Date: 2008-03-31.
> >
> > Here's what I see after I run example(coxph) in order to create some
> > coxph objects, on which I can test the diagnostics:
> >
> >     drop1(test2)
> >     Error in terms.default(terms1) : no terms component
> >     anova(test2)
> >     Error in UseMethod("anova") : no applicable method for "anova"
> >
> > In that survival package, I do find anova.survreg, but not
> > anova.coxph.
> >
> > If you are using the survival package, I'd suggest you contact
> > Thomas Lumley directly, since he maintains it.
> >
> > I think if you had reported the exact error you saw, it would have
> > been easier for me to diagnose the trouble.
> >
> > HTH
> > pj
>
> In the meantime you can do
>
>     library(Design)
>     f <- cph( . . . )
>     anova(f)  # multiple d.f. Wald statistics including tests of nonlinearity
>
> cph uses coxph but anova.Design is separate from the survival package.
>
> Frank
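An alternative that does not depend on any anova method is to compute the likelihood ratio test by hand from two nested fits. This is a different test from the Wald statistics that Design reports, so the p-values will not match exactly. A minimal sketch, assuming the survival package and using its bundled veteran data, with the four-level factor celltype playing the role of x:

```r
library(survival)

# Nested models: without and with the factor of interest
fit0 <- coxph(Surv(time, status) ~ karno, data = veteran)
fit1 <- coxph(Surv(time, status) ~ karno + celltype, data = veteran)

# loglik holds the log partial likelihood at the initial and final
# coefficient values; the second element is the fitted model's value
lr_stat <- 2 * (fit1$loglik[2] - fit0$loglik[2])

# Degrees of freedom = number of dummy variables added by the factor
lr_df <- length(coef(fit1)) - length(coef(fit0))

# Overall p-value for celltype, all levels tested jointly
p_overall <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(chisq = lr_stat, df = lr_df, p = p_overall))
```

Both fits must use exactly the same rows, so with missing values in x the smaller model should be refitted on the complete cases of the larger one.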
[R] Overall p-value from a factor in a coxph fit
Hi all.

If I run the simple regression when x is a categorical variable ( x <- factor(x) ):

    MyFit <- coxph( Surv(start, stop, event) ~ x )

How can I get the overall p-value on x other than for each dummy variable? anova(MyFit) does NOT provide that information as previously suggested on the list.

All the best,
Kare