[Rd] Different behavior of model.matrix between R 3.2 and R3.1.1

2015-06-16 Thread Frank Harrell
Terry Therneau has been very helpful on r-help, but we can't figure out 
what change in R in the past months made extra columns appear in 
model.matrix when the terms object is subsetted to remove stratification 
factors in a Cox model.  Terry has changed his logic in the survival 
package to avoid this issue, but his approach requires generating a larger 
design matrix and then dropping columns.


A simple example is below.


strat <- function(x) x
d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
d$y <- c(1,3,2,4)
f <- y ~ a * strat(b)
m <- model.frame(f, data=d)
Terms <- drop.terms(terms(f, data=d), 2)
model.matrix(Terms, m)

  (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
1           1   0              0              0
2           1   1              0              0
3           1   0              1              0
4           1   1              0              1
. . .

The column corresponding to a='a1' b='b2' should not be there
(aa1:strat(b)b2).

This does seem to be a change in R.  Any help appreciated.
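
The workaround mentioned above (build the larger design matrix, then drop
columns) might look roughly like the following.  This is only a sketch on the
example data, not the code used in the survival package; the names X_full,
strata_term and X are made up for illustration.

```r
strat <- function(x) x
d <- expand.grid(a = c("a1", "a2"), b = c("b1", "b2"))
d$y <- c(1, 3, 2, 4)
f <- y ~ a * strat(b)

Terms <- terms(f, data = d)          # full terms: a, strat(b), a:strat(b)
m <- model.frame(f, data = d)
X_full <- model.matrix(Terms, m)     # larger design matrix, strata included

# The "assign" attribute maps each column to a term index (0 = intercept).
strata_term <- which(attr(Terms, "term.labels") == "strat(b)")
keep <- attr(X_full, "assign") != strata_term
X <- X_full[, keep, drop = FALSE]    # drop the strat(b) main-effect column
colnames(X)
```

Dropping by the "assign" attribute rather than by column name avoids
depending on how the interaction columns happen to be labelled.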


The "factors" and "term.labels" attributes of Terms are:

attr(,"factors")
         a a:strat(b)
y        0          0
a        1          2
strat(b) 0          1
attr(,"term.labels")
[1] "a"          "a:strat(b)"


Frank

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] CRAN and ggplot2 geom and stat extensions

2014-12-23 Thread Frank Harrell
I am thinking about adding several geom and stat extensions to ggplot2 
in the Hmisc package.  To do this requires using non-exported ggplot2 
functions as discussed in 
http://stackoverflow.com/questions/18108406/creating-a-custom-stat-object-in-ggplot2

If I use the needed ggplot2::: notation the package will no longer pass 
CRAN checks.  Does anyone know of a solution?  I'm assuming that Hadley 
doesn't want to export these functions or he would have done so a long 
time ago because of the number of users who have asked questions related 
to this.

Frank
-- 

Frank E Harrell Jr  Professor and Chairman  School of Medicine

Department of *Biostatistics*   *Vanderbilt University*




Re: [Rd] [RFC] A case for freezing CRAN

2014-03-19 Thread Frank Harrell
To me it boils down to one simple question: is an update to a package on 
CRAN more likely to (1) fix a bug, (2) introduce a bug or downward 
incompatibility, or (3) add a new feature or fix a compatibility problem 
without introducing a bug?  I think the probability of (1) | (3) is much 
greater than the probability of (2), hence the current approach 
maximizes user benefit.


Frank
--
Frank E Harrell Jr Professor and Chairman  School of Medicine
   Department of Biostatistics Vanderbilt University



Re: [Rd] Solved and Question: Problem with S3 method dispatch and NAMESPACE

2013-04-24 Thread Frank Harrell
Thank you very much Peter.  That did the trick.
Frank
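
The suggestion quoted below, library(quantreg, pos=x), relies on the order of
the search path: a name is resolved from the first entry that defines it.  The
toy sketch here only illustrates that ordering with attach(); the two lists are
made-up stand-ins for the real packages, and it does not reproduce S3 method
registration in namespaces.

```r
# Two toy "packages", each defining a function called latex (stand-ins only).
hmisc_like    <- list(latex = function(object, ...) "Hmisc-like generic")
quantreg_like <- list(latex = function(object, ...) "quantreg-like generic")

# Attach the first at the default position 2 ...
attach(hmisc_like, name = "toy:hmisc", warn.conflicts = FALSE)
# ... and the second further down the search path, behind toy:hmisc.
attach(quantreg_like, name = "toy:quantreg", pos = 3, warn.conflicts = FALSE)

# latex is taken from the first search-path entry that defines it.
which_latex <- latex(NULL)
print(which_latex)

# Clean up the search path again.
detach("toy:quantreg", character.only = TRUE)
detach("toy:hmisc", character.only = TRUE)
```

Calling the intended generic explicitly, as in Hmisc::latex(...), is another
way to avoid depending on attach order.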

Peter Dalgaard-2 wrote
 On Apr 24, 2013, at 15:59 , Frank Harrell wrote:
 
 I found that package quantreg has created a new generic for latex() [I wish
 it hadn't; this has been a generic in Hmisc for almost 2 decades].  When I
 require(quantreg) after loading rms, latex(anova.rms object) dispatches
 latex.default, but everything is fine if I don't load quantreg.  rms has
 import(Hmisc) in NAMESPACE and is loaded before quantreg, hence the
 conflict.  How do I make the generic from Hmisc take precedence?
 
 
 library(quantreg, pos=x) with suitably large x is one idea.
 
 Thanks
 Frank
 
 Frank Harrell wrote
 I have updated the rms package to extensively use NAMESPACE.  I cannot get
 certain S3 methods to dispatch.  For example I have in NAMESPACE
 
 S3method(anova, rms)
 S3method(latex, anova.rms)
 
 anova.rms produces an object of class anova.rms and there is a
 latex.anova.rms function in rms.  But when I do latex(anova(fit)) I get an
 invocation of latex.default.
 
 I have tried using "anova.rms" and `anova.rms` in S3method() to no avail.
 
 Any help appreciated.  I'm using R 2.15.3
 Frank
 
 
 
 
 
 -
 Frank Harrell
 Department of Biostatistics, Vanderbilt University
 
 -- 
 Peter Dalgaard, Professor
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501





-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-S3-method-dispatch-and-NAMESPACE-tp4665179p4665226.html
Sent from the R devel mailing list archive at Nabble.com.



[Rd] Problem with S3 method dispatch and NAMESPACE

2013-04-23 Thread Frank Harrell
I have updated the rms package to extensively use NAMESPACE.  I cannot get
certain S3 methods to dispatch.  For example I have in NAMESPACE

S3method(anova, rms)
S3method(latex, anova.rms)

anova.rms produces an object of class anova.rms and there is a
latex.anova.rms function in rms.  But when I do latex(anova(fit)) I get an
invocation of latex.default.
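
A few checks that can help narrow down this kind of dispatch problem are
sketched here.  The generic and methods are toy stand-ins defined on the spot
(in the real setting latex() comes from Hmisc and latex.anova.rms() from rms),
so this shows only the diagnostic steps, not the actual packages.

```r
# Toy stand-ins for the generic and its methods.
latex <- function(object, ...) UseMethod("latex")
latex.default   <- function(object, ...) "latex.default"
latex.anova.rms <- function(object, ...) "latex.anova.rms"

# An object carrying the same class that anova.rms() is said to return.
x <- structure(list(), class = "anova.rms")

class(x)                                  # the class dispatch will look for
# Is a method for this generic/class pair visible from here?
is.function(getS3method("latex", "anova.rms", optional = TRUE))
environmentName(environment(latex))       # which environment owns the generic in use

dispatched <- latex(x)
print(dispatched)
```

If the generic in use belongs to a different package than the one the method
was registered for, the method will not be found even though it exists.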

I have tried using "anova.rms" and `anova.rms` in S3method() to no avail.

Any help appreciated.  I'm using R 2.15.3
Frank



-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-S3-method-dispatch-and-NAMESPACE-tp4665179.html


Re: [Rd] Regression stars

2013-02-12 Thread Frank Harrell
Uwe, I've been consulting for decades and have never once been asked for such
stars.  And when a clinical researcher puts a sentence in a study protocol
that P<0.05 will be considered significant, I get them to take it out.
Frank

Uwe Ligges-3 wrote
 On 12.02.2013 14:54, Ben Bolker wrote:
 Duncan Murdoch murdoch.duncan at gmail.com writes:

[snip]

 Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
 I'll let the people who like it defend it.

Would someone (anyone) like to come forward and give us a defense
 of stringsAsFactors=TRUE -- even someone who doesn't personally like
 it but would like to play devil's advocate?
 
 Sure:
 I will have to change all my scripts, my teaching examples, my book, and 
 lots of code examples for research and particularly consulting jobs.
 
 Personally, I think having stringsAsFactors=TRUE is not too bad for 
 read.table() but less useful for data.frame().
 
 And since you ask for the devil's advocate already, related to the 
 subject line: Removing stars is horrible for consulting: With all those 
 people from biology, medicine and other fields who even ask us questions 
 in term of significance stars that are obviously very common for them. 
 Many of them will certainly ask us for the stars, and ask us to switch 
 to another software product once they do not get it from R. They may not 
 be interested in being taught about the advantages or disadvantages of 
 p-values or stars.
 
 There are different use cases of R, and I want to keep stars for 
 consulting tasks where things have to be delivered within minutes. I am 
 happy with or without for teaching, where I have the time and can easily 
 talk about the sense and nonsense of p-values.
 
 
 Best,
 Uwe
 

 What I will likely do is
 make a few changes so that character vectors are automatically changed
 to factors in modelling functions, so that operating with
 stringsAsFactors=FALSE doesn't trigger silly warnings.

 Duncan Murdoch


   [apologies for snipping context: gmane made me do it]






-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Regression-stars-tp4657795p4658268.html


Re: [Rd] Regression stars

2013-02-10 Thread Frank Harrell
Great discussion.   Tim's Sinclair quote is priceless and relates to the
non-reproducible research done in some quarters.   Norm's wish to remove
stars altogether is entirely consistent with good statistical practice and
would make a statement that R base adheres to good practice.  I don't think
it will work to add confidence intervals because models can have nonlinear
or interaction terms, and the reference cell for a factor variable may not
be what the analyst chooses for a comparison group.
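
The reference-cell point can be seen with a small sketch on the built-in
PlantGrowth data (chosen here only for illustration; it is not part of the
original discussion).  confint() reports intervals for contrasts against
whatever level happens to be the reference, which changes when the factor is
releveled.

```r
# Default reference level of group is "ctrl".
fit <- lm(weight ~ group, data = PlantGrowth)
ci_default <- confint(fit)        # intervals for trt1 and trt2 versus ctrl
rownames(ci_default)

# Relevel so that "trt1" becomes the comparison group.
pg <- transform(PlantGrowth, group = relevel(group, ref = "trt1"))
fit2 <- lm(weight ~ group, data = pg)
ci_releveled <- confint(fit2)     # intervals for ctrl and trt2 versus trt1
rownames(ci_releveled)
```

The same fitted model therefore yields different sets of intervals depending
on a coding choice the analyst may not have made deliberately.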

I would like for us to find a way to, over time, implement Norm's wish to
de-emphasize P-values in general.  The harm done by P-values is
immeasurable.

Frank

Norm Matloff wrote
 I appreciate Tim's comments.
 
 I myself have a social science paper coming out soon in which I felt
 forced to use p-values, given their ubiquity.  However, I also told
 readers of the paper that confidence intervals are much more informative
 and I do provide them.  As I said earlier, there is no avoiding that,
 and R needs to report p-values for that reason.  
 
 Instead, the question is what to do about the stars; I proposed
 eliminating them altogether.  Star-crazed users know how to determine
 them themselves from the p-values, but deleting them from R would send a
 message.
 
 I did say my proposal was bold, which really meant I was suggesting
 that R do SOMETHING to send that message, not necessarily star
 elimination.
 
 One such something would be the proposal I made, which would be to add
 confidence intervals to the output.  This too could be just an option,
 but again offering that option would send a message.  Indeed, I would
 suggest that the help page explain that confidence intervals are more
 informative.  (The help page could make a similar statement regarding
 the stars.)
 
 When I pitch R to people, I say that in addition to the large function
 and library base and the nice graphics capabilities, R is above all
 Statistically Correct--it's written by statisticians who know what they
 are doing, rather than some programmer simply implementing a formula
 from a textbook.  I know that a lot of people feel this is one of R's
 biggest strengths.  Given that, one might argue that R should do what it
 can to help users engage in good statistical practice.  I think this was
 Frank's point.
 
 Norm
 





-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Regression-stars-tp4657795p4658084.html


[Rd] Regression stars

2013-02-07 Thread Frank Harrell
Today's GNU R tutorial in
http://how-to.linuxcareer.com/a-quick-gnu-r-tutorial-to-statistical-models-and-graphics
points out how bad statistical practice is being further perpetuated, by
virtue of significance stars still being the default in printed output
from lm models.
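
For reference, the stars can already be turned off per session through the
show.signif.stars option.  A minimal sketch using the built-in cars data (the
data set is just a convenient example):

```r
# Turn significance stars off, remembering the previous setting.
old <- options(show.signif.stars = FALSE)

fit <- lm(dist ~ speed, data = cars)
out <- capture.output(print(summary(fit)))   # printed summary as text lines
cat(out, sep = "\n")

# Restore whatever setting was in place before.
options(old)
```

Putting the options() call in a startup file would make this the default for
one user, though it does not change what newcomers see out of the box.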




-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Regression-stars-tp4657795.html


Re: [Rd] Table Figures and Listings

2011-05-30 Thread Frank Harrell
It is a common misperception that FDA approves software other than the
software used to drive medical devices such as pacemakers.  You can use any
software to make FDA submissions.  FDA gets submissions in Minitab and
Excel, for example.  That being said, base R is CFR11 compliant (see
www.r-project.org).

We developed the rreport package for R for clinical trials reporting
including FDA submission.  rreport includes several very high-level
functions for baseline stats, labs, AEs, etc.  Once you marry R with LaTeX
the sky is the limit.  And the graphics in R can be used to convert many
tired old listings to graphics as we have done for AEs.  See
http://biostat.mc.vanderbilt.edu/Rreport and
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/StatGraphCourse/graphscourse.pdf
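
As a very small starting point for the demography table asked about below,
here is a base-R sketch.  The data frame is entirely made up for illustration,
and this is not how rreport builds its tables.

```r
# Hypothetical toy data: treatment arm, age and sex for ten subjects.
demo <- data.frame(
  arm = rep(c("A", "B"), each = 5),
  age = c(34, 45, 52, 61, 47, 39, 58, 44, 50, 63),
  sex = c("F", "M", "F", "M", "F", "M", "F", "F", "M", "M")
)

n_per_arm <- table(demo$arm)                    # subjects per arm
age_mean  <- tapply(demo$age, demo$arm, mean)   # mean age per arm
age_sd    <- tapply(demo$age, demo$arm, sd)     # SD of age per arm
sex_tab   <- table(demo$sex, demo$arm)          # sex by arm counts

# Assemble a simple summary with one column per arm.
demog <- rbind(N          = n_per_arm,
               "Age mean" = round(age_mean, 1),
               "Age SD"   = round(age_sd, 1),
               "Female n" = sex_tab["F", ],
               "Male n"   = sex_tab["M", ])
print(demog)
```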

Frank

Orin Richards wrote:
 
 Dear All,
 I am fairly new to R.  I work mainly in SAS.  Now, I know that SAS is
 approved by the FDA for submissions.  My question is, does the FDA approve
 {R} for clinical trial submissions?  Also, has anyone ever tried to produce
 TFLs using R?  I would like to know how difficult it is to produce the TFLs
 in R as compared to SAS.  I know that in SAS it is not difficult once you
 know what you are doing and what is required.  My limited knowledge of R
 suggests that it may be a bit more difficult.  Can anyone please provide me
 with some guidance or sample code for producing a standard table or
 listing?  A good starting point can be a demography table.  I can produce a
 demog table quite easily in SAS.  My R knowledge is limited; that's why I
 have asked for some sample code.
 
 
 
 Thanks for your help.
 
 
 Orin
 
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Table-Figures-and-Listings-tp3560634p3560653.html


Re: [Rd] New code in R-devel: Rao score test for glm.

2011-05-11 Thread Frank Harrell
Economists re-invented the Rao efficient score test, calling it the Lagrange
multiplier test.  Please check the history of this test.  Rao's paper was
published in 1947.  That being said, "score" would be more consistent with
the survival and rms packages.
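
For readers who want to try the test under discussion, a minimal sketch with
the built-in mtcars data follows.  It assumes a version of R in which
anova() for glm fits accepts test = "Rao"; the models themselves are arbitrary
examples.

```r
# Null and one-predictor logistic models on a built-in data set.
fit0 <- glm(am ~ 1,  family = binomial, data = mtcars)
fit1 <- glm(am ~ wt, family = binomial, data = mtcars)

# Rao efficient score test comparing the two nested models.
tab <- anova(fit0, fit1, test = "Rao")
print(tab)
```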
Frank

Brett Presnell wrote:
 
 Thanks for doing this Peter.  I'll have to install the development
 version to try this out.
 
 One suggestion though.  I'm pretty confident that plain old score test
 is a more common terminology than anything involving Rao's name
 (econometricians even call it the Lagrange multiplier test).  In light
 of this, I think that it would be much better to use test = "score"
 rather than test = "Rao".
 
 -- 
 Brett Presnell
 Department of Statistics
 University of Florida
 http://www.stat.ufl.edu/~presnell/
 
 We don't think that the popularity of an error makes it the truth.
-- Richard Stallman
 
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/New-code-in-R-devel-Rao-score-test-for-glm-tp3514262p3514679.html