[Rd] Different behavior of model.matrix between R 3.2 and R 3.1.1
Terry Therneau has been very helpful on r-help, but we can't figure out what change in R in the past months made extra columns appear in model.matrix when the terms object is subsetted to remove stratification factors in a Cox model. Terry has changed his logic in the survival package to avoid this issue, but the workaround requires generating a larger design matrix and then dropping columns. A simple example is below.

    strat <- function(x) x
    d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
    d$y <- c(1,3,2,4)
    f <- y ~ a * strat(b)
    m <- model.frame(f, data=d)
    Terms <- drop.terms(terms(f, data=d), 2)
    model.matrix(Terms, m)

      (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
    1           1   0              0              0
    2           1   1              0              0
    3           1   0              1              0
    4           1   1              0              1
    . . .

The column corresponding to a='a1' b='b2' should not be there (aa1:strat(b)b2). This does seem to be a change in R. Any help appreciated.

The factors and term.labels attributes of Terms are:

    attr(,"factors")
             a a:strat(b)
    y        0          0
    a        1          2
    strat(b) 0          1
    attr(,"term.labels")
    [1] "a"          "a:strat(b)"

Frank

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
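The workaround mentioned in the message (build the larger design matrix, then drop the unwanted columns) can be sketched with the same toy data. This is only an illustration of the idea using the "assign" attribute of the model matrix; it is not the code used in the survival package, and the names X, asgn, drop_idx, keep, and X2 are made up here.

```r
strat <- function(x) x
d <- expand.grid(a = c("a1", "a2"), b = c("b1", "b2"))
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)
m <- model.frame(f, data = d)
Terms <- terms(f, data = d)

# Build the design matrix from the full (unsubsetted) terms object
X <- model.matrix(Terms, m)

# attr(X, "assign") gives, for each column, the index of the term that
# produced it (0 for the intercept)
asgn <- attr(X, "assign")

# Drop only the columns generated by the strat(b) main effect
drop_idx <- which(attr(Terms, "term.labels") == "strat(b)")
keep <- asgn != drop_idx
X2 <- X[, keep, drop = FALSE]
colnames(X2)
```

Working from the full matrix keeps the contrast coding of the interaction consistent with the presence of the strat(b) main effect, which is why no aa1:strat(b)b2 column appears in this version.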
[Rd] CRAN and ggplot2 geom and stat extensions
I am thinking about adding several geom and stat extensions to ggplot2 in the Hmisc package. Doing this requires using non-exported ggplot2 functions, as discussed in http://stackoverflow.com/questions/18108406/creating-a-custom-stat-object-in-ggplot2

If I use the needed ggplot2::: notation, the package will no longer pass CRAN checks. Does anyone know of a solution? I'm assuming that Hadley doesn't want to export these functions, or he would have done so a long time ago given the number of users who have asked questions related to this.

Frank
--
Frank E Harrell Jr
Professor and Chairman, School of Medicine
Department of Biostatistics, Vanderbilt University
Re: [Rd] [RFC] A case for freezing CRAN
To me it boils down to one simple question: is an update to a package on CRAN more likely to (1) fix a bug, (2) introduce a bug or downward incompatibility, or (3) add a new feature or fix a compatibility problem without introducing a bug? I think the probability of (1) or (3) is much greater than the probability of (2), hence the current approach maximizes user benefit.

Frank
Re: [Rd] Solved and Question: Problem with S3 method dispatch and NAMESPACE
Thank you very much Peter. That did the trick.

Frank

Peter Dalgaard-2 wrote:

On Apr 24, 2013, at 15:59, Frank Harrell wrote:

I found that package quantreg has created a new generic for latex() [I wish it hadn't; this has been a generic in Hmisc for almost 2 decades]. When I require(quantreg) after loading rms, latex(anova.rms object) dispatches latex.default, but everything is fine if I don't load quantreg. rms has import(Hmisc) in NAMESPACE and is loaded before quantreg, hence the conflict. How do I make the generic from Hmisc take precedence?

library(quantreg, pos=x) with suitably large x is one idea.

Thanks
Frank

Frank Harrell wrote:

I have updated the rms package to extensively use NAMESPACE. I cannot get certain S3 methods to dispatch. For example I have in NAMESPACE

    S3method(anova, rms)
    S3method(latex, anova.rms)

anova.rms produces an object of class anova.rms and there is a latex.anova.rms function in rms. But when I do latex(anova(fit)) I get an invocation of latex.default. I have tried using anova.rms and `anova.rms` in S3method() to no avail. Any help appreciated. I'm using R 2.15.3

Frank

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd.mes@ Priv: PDalgd@

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
[Rd] Problem with S3 method dispatch and NAMESPACE
I have updated the rms package to extensively use NAMESPACE. I cannot get certain S3 methods to dispatch. For example I have in NAMESPACE

    S3method(anova, rms)
    S3method(latex, anova.rms)

anova.rms produces an object of class anova.rms and there is a latex.anova.rms function in rms. But when I do latex(anova(fit)) I get an invocation of latex.default. I have tried using anova.rms and `anova.rms` in S3method() to no avail. Any help appreciated. I'm using R 2.15.3

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
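For reference, the dispatch being expected here can be reproduced outside any package. The sketch below uses stand-in definitions written only for this illustration (a local latex generic and toy anova.rms and latex.anova.rms functions; none of it is the Hmisc or rms code) to show that an object of class anova.rms should reach latex.anova.rms rather than latex.default. Inside a package, the same link additionally depends on the S3method() registrations and on which latex generic is found first, which this sketch does not attempt to reproduce.

```r
# Stand-in generic and methods, defined here only for illustration
latex <- function(object, ...) UseMethod("latex")
latex.default <- function(object, ...) "latex.default called"
latex.anova.rms <- function(object, ...) "latex.anova.rms called"

# Toy anova method for class "rms" returning an object of class "anova.rms"
anova.rms <- function(object, ...) {
  structure(list(), class = "anova.rms")
}

fit <- structure(list(), class = "rms")  # placeholder for a fitted model
a <- anova(fit)

class(a)   # the class that latex() dispatches on
latex(a)   # should reach latex.anova.rms, not latex.default
```

Checking class(anova(fit)) first is a quick way to separate a problem with the object's class from a problem with method lookup.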
Re: [Rd] Regression stars
Uwe, I've been consulting for decades and have never once been asked for such stars. And when a clinical researcher puts a sentence in a study protocol saying that P<0.05 will be considered significant, I get them to take it out.

Frank

Uwe Ligges-3 wrote:

On 12.02.2013 14:54, Ben Bolker wrote:

Duncan Murdoch murdoch.duncan at gmail.com writes:

[snip]

Regarding stringsAsFactors: I'm not going to defend keeping it as is, I'll let the people who like it defend it.

Would someone (anyone) like to come forward and give us a defense of stringsAsFactors=TRUE -- even someone who doesn't personally like it but would like to play devil's advocate?

Sure: I will have to change all my scripts, my teaching examples, my book, and lots of code examples for research and particularly consulting jobs. Personally, I think having stringsAsFactors=TRUE is not too bad for read.table() but less useful for data.frame().

And since you ask for the devil's advocate already, related to the subject line: Removing stars is horrible for consulting. All those people from biology, medicine and other fields even ask us questions in terms of significance stars, which are obviously very common for them. Many of them will certainly ask us for the stars, and ask us to switch to another software product once they do not get them from R. They may not be interested in being taught about the advantages or disadvantages of p-values or stars.

There are different use cases of R, and I want to keep stars for consulting tasks where things have to be delivered within minutes. I am happy with or without for teaching, where I have the time and can easily talk about the sense and nonsense of p-values.

Best,
Uwe

What I will likely do is make a few changes so that character vectors are automatically changed to factors in modelling functions, so that operating with stringsAsFactors=FALSE doesn't trigger silly warnings.

Duncan Murdoch

[apologies for snipping context: gmane made me do it]

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
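The stringsAsFactors behavior under discussion can be seen directly by passing the argument explicitly to data.frame(). The data below are made up for illustration; setting the argument explicitly avoids depending on the default, which changed to FALSE in R 4.0.0.

```r
# Toy data, invented for this illustration
g <- c("ctl", "trt", "ctl", "trt")
y <- c(1.2, 2.3, 0.9, 2.8)

df_fac <- data.frame(g = g, y = y, stringsAsFactors = TRUE)
df_chr <- data.frame(g = g, y = y, stringsAsFactors = FALSE)

is.factor(df_fac$g)     # strings were converted to a factor
is.character(df_chr$g)  # strings were kept as character

# Modelling functions treat a character predictor as a factor,
# so both data frames lead to the same coefficients
coef(lm(y ~ g, data = df_fac))
coef(lm(y ~ g, data = df_chr))
```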
Re: [Rd] Regression stars
Great discussion. Tim's Sinclair quote is priceless and relates to the non-reproducible research done in some quarters. Norm's wish to remove stars altogether is entirely consistent with good statistical practice and would make a statement that R base adheres to good practice. I don't think it will work to add confidence intervals, because models can have nonlinear or interaction terms, and the reference cell for a factor variable may not be what the analyst would choose as a comparison group. I would like for us to find a way to, over time, implement Norm's wish to de-emphasize P-values in general. The harm done by P-values is immeasurable.

Frank

Norm Matloff wrote:

I appreciate Tim's comments. I myself have a social science paper coming out soon in which I felt forced to use p-values, given their ubiquity. However, I also told readers of the paper that confidence intervals are much more informative and I do provide them.

As I said earlier, there is no avoiding that, and R needs to report p-values for that reason. Instead, the question is what to do about the stars; I proposed eliminating them altogether. Star-crazed users know how to determine them themselves from the p-values, but deleting them from R would send a message.

I did say my proposal was bold, which really meant I was suggesting that R do SOMETHING to send that message, not necessarily star elimination. One such something would be the proposal I made, which would be to add confidence intervals to the output. This too could be just an option, but again offering that option would send a message. Indeed, I would suggest that the help page explain that confidence intervals are more informative. (The help page could make a similar statement regarding the stars.)

When I pitch R to people, I say that in addition to the large function and library base and the nice graphics capabilities, R is above all Statistically Correct--it's written by statisticians who know what they are doing, rather than some programmer simply implementing a formula from a textbook. I know that a lot of people feel this is one of R's biggest strengths. Given that, one might argue that R should do what it can to help users engage in good statistical practice. I think this was Frank's point.

Norm

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
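Confidence intervals of the kind Norm describes are already available for a fitted model through confint(). A small sketch using the built-in cars data, chosen here only as a convenient example:

```r
fit <- lm(dist ~ speed, data = cars)

# 95% confidence intervals for the coefficients
ci <- confint(fit, level = 0.95)
ci

# Point estimates alongside the interval limits
cbind(estimate = coef(fit), ci)
```

For a model this simple the intervals are easy to read. With nonlinear or interaction terms, or a factor whose reference cell is not the comparison of interest, per-coefficient intervals are harder to interpret, which is the objection raised in the reply above.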
[Rd] Regression stars
Today's GNU R tutorial at http://how-to.linuxcareer.com/a-quick-gnu-r-tutorial-to-statistical-models-and-graphics shows how bad statistical practice is being further perpetuated by significance stars still being the default in printed output from lm models.

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
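For readers who want to suppress the stars in their own sessions, the existing show.signif.stars option controls them. A minimal sketch using the built-in cars data (the object names fit, old, and out are arbitrary):

```r
fit <- lm(dist ~ speed, data = cars)

# Turn significance stars off for printed coefficient tables,
# remembering the previous setting
old <- options(show.signif.stars = FALSE)
out <- capture.output(print(summary(fit)))
options(old)  # restore the previous setting

# Show the captured summary
cat(out, sep = "\n")

# The "Signif. codes" legend is absent when stars are off
any(grepl("Signif. codes", out, fixed = TRUE))
```

Placing options(show.signif.stars = FALSE) in a startup file would apply the setting to every session, at the cost of output that differs from what other users see by default.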
Re: [Rd] Table Figures and Listings
It is a common misperception that FDA approves software other than the software used to drive medical devices such as pacemakers. You can use any software to make FDA submissions. FDA gets submissions in Minitab and Excel, for example. That being said, base R is CFR11 compliant (see www.r-project.org).

We developed the rreport package for R for clinical trials reporting, including FDA submission. rreport includes several very high-level functions for baseline stats, labs, AEs, etc. Once you marry R with LaTeX the sky is the limit. And the graphics in R can be used to convert many tired old listings to graphics, as we have done for AEs. See http://biostat.mc.vanderbilt.edu/Rreport and http://biostat.mc.vanderbilt.edu/wiki/pub/Main/StatGraphCourse/graphscourse.pdf

Frank

Orin Richards wrote:

Dear All,

I am fairly new to R. I work mainly in SAS. Now, I know that SAS is approved by the FDA for submissions. My question is, does the FDA approve {R} for clinical trial submissions?

Also, has anyone ever tried to produce TFL's using R? I would like to know how difficult it is to produce the TFL's in R as compared to SAS. I know that in SAS it is not difficult once you know what you are doing and what is required. My limited knowledge of R suggests that it may be a bit more difficult.

Can anyone please provide me with some guidance or sample code for producing a standard table or listing? A good starting point can be a demography table. I can produce a demog table quite easily in SAS. My R knowledge is limited; that's why I have asked for some sample code.

Thanks for your help.
Orin

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
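As a starting point for the demography table asked about, base R alone can produce simple summaries by treatment arm. The sketch below uses a small made-up data set (the column names arm, age, and sex and all values are assumptions for illustration) and is not the rreport approach.

```r
# Made-up data for illustration only
demog <- data.frame(
  arm = rep(c("Active", "Placebo"), each = 4),
  age = c(54, 61, 47, 58, 50, 66, 59, 45),
  sex = c("F", "M", "F", "M", "M", "F", "F", "M")
)

# Continuous variable: n, mean, and SD of age by arm
age_n    <- tapply(demog$age, demog$arm, length)
age_mean <- tapply(demog$age, demog$arm, mean)
age_sd   <- tapply(demog$age, demog$arm, sd)
age_tab  <- cbind(n = age_n, mean = age_mean, sd = age_sd)
round(age_tab, 1)

# Categorical variable: counts and column percentages of sex by arm
sex_n   <- table(demog$sex, demog$arm)
sex_pct <- round(100 * prop.table(sex_n, margin = 2), 1)
sex_n
sex_pct
```

The resulting matrices can then be passed to a formatting step (for example a LaTeX table generator) to get submission-style output.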
Re: [Rd] New code in R-devel: Rao score test for glm.
Economists re-invented the Rao efficient score test, calling it the Lagrange multiplier test. Please check the history of this test. Rao's paper was published in 1947. That being said, "score" would be more consistent with the survival and rms packages.

Frank

Brett Presnell wrote:

Thanks for doing this Peter. I'll have to install the development version to try this out.

One suggestion though. I'm pretty confident that plain old "score test" is a more common terminology than anything involving Rao's name (econometricians even call it the "Lagrange multiplier test"). In light of this, I think that it would be much better to use test = "score" rather than test = "Rao".

--
Brett Presnell
Department of Statistics
University of Florida
http://www.stat.ufl.edu/~presnell/

"We don't think that the popularity of an error makes it the truth." -- Richard Stallman

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
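For context, the test under discussion is requested through the test argument of anova() for glm fits. A minimal sketch with the built-in mtcars data (chosen here only for convenience), using the value "Rao" that the thread refers to:

```r
# Logistic regression of transmission type on weight
fit0 <- glm(am ~ 1,  family = binomial, data = mtcars)
fit1 <- glm(am ~ wt, family = binomial, data = mtcars)

# Rao score test comparing the nested models
tab <- anova(fit0, fit1, test = "Rao")
tab

# Likelihood ratio test on the same pair of models, for comparison
anova(fit0, fit1, test = "LRT")
```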