Dear John et al.,

Curiously, Georges Monette (at York University in Toronto) and I were just talking last week about influence-statistic contours, and I wrote a couple of functions to show these for Cook's D and for covratio as functions of hat-values and studentized residuals. These differ a bit from the ones previously discussed here in that they show rule-of-thumb cut-offs for D and covratio, along with Bonferroni critical values for studentized residuals.
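For readers following along, the standard textbook relations behind such contours can be sketched as below. This is only a sketch under assumed notation (n observations, p regression coefficients, hat-values h_{ii}, standardized residuals r_i, studentized residuals t_i); it is not taken from the attached functions, and the particular rule-of-thumb cut-offs they use are not stated here.

```latex
% Assumed notation: n observations, p coefficients, hat-values h_{ii},
% r_i = standardized (internally studentized) residual,
% t_i = studentized (externally studentized) residual.
\begin{align*}
D_i &= \frac{r_i^2}{p}\,\frac{h_{ii}}{1-h_{ii}}
  && \text{(Cook's D)} \\
r_i^2 &= \frac{(n-p)\,t_i^2}{n-p-1+t_i^2}
  && \text{(standardized vs. studentized residual)} \\
r &= \pm\sqrt{\frac{D\,p\,(1-h)}{h}}
  && \text{(contour of constant } D \text{ in the } (h, r) \text{ plane)} \\
\mathrm{COVRATIO}_i &= \frac{1}{(1-h_{ii})\left(\dfrac{n-p-1+t_i^2}{n-p}\right)^{p}}
  && \text{(covratio)} \\
|t_i| &> t_{n-p-1;\,1-\alpha/(2n)}
  && \text{(two-sided Bonferroni outlier test)}
\end{align*}
```

Fixing D (or covratio) at a chosen cut-off and solving for the residual as a function of the hat-value gives one contour; the Bonferroni value appears as a pair of horizontal lines on the studentized-residual axis.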
I've attached a file with these functions, even though they're not that polished. More generally, I wonder whether it's not best to supply plots like these as separate functions rather than as a do-it-all plot method for lm objects.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of John Maindonald
> Sent: Wednesday, April 27, 2005 7:54 PM
> To: Martin Maechler
> Cc: David Firth; Werner Stahel; r-devel@stat.math.ethz.ch; Peter Dalgaard
> Subject: Re: [Rd] Enhanced version of plot.lm()
>
> On 28 Apr 2005, at 1:30 AM, Martin Maechler wrote:
>
> >>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
> >>>>>> on 27 Apr 2005 16:54:02 +0200 writes:
> >
> > PD> Martin Maechler <[EMAIL PROTECTED]> writes:
> > >> I'm about to commit the current proposal(s) to R-devel,
> > >> **INCLUDING** changing the default from 'which = 1:4' to
> > >> 'which = c(1:3,5)'
> > >>
> > >> and elicit feedback starting from there.
> > >>
> > >> One thing I think I would like is to use color for the Cook's
> > >> contours in the new 4th plot.
> >
> > PD> Hmm. First try running example(plot.lm) with the modified function and
> > PD> tell me which observation has the largest Cook's D. With the suggested
> > PD> new 4th plot it is very hard to tell whether obs #49 is potentially or
> > PD> actually influential. Plots #1 and #3 are very close to conveying the
> > PD> same information though...
> >
> > I shouldn't be teaching here, and I know that I'm getting into contested
> > territory (regression diagnostics; robustness; "The" Truth, etc., etc.)
> > but I believe there is no unique way to define "actually influential"
> > (hence I don't believe that it's extremely useful to know exactly
> > which Cook's D is largest).
> >
> > Partly because there are many statistics that can be derived from a
> > multiple regression fit, all of which are influenced in some way.
> > AFAIK, all observation-influence measures g(i) are functions of (r_i,
> > h_{ii}) and the latter are the quantities that "regression users"
> > should really know {without consulting a text book} and that are
> > generalizable {e.g. to "linear smoothers" such as gam()s (for
> > "non-estimated" smoothing parameter)}.
> >
> > Martin
>
> I agree with Martin. I like the idea of using color (red?)
> for the new Cook's contours. People who want (fairly)
> precise comparisons of the Cook's statistics can still use
> the present plot #4, perhaps as a follow-up to the new plot #5.
> It would be possible to label the Cookwise most extreme
> points with the actual values (to perhaps 2 significant figures,
> i.e., labeling on both sides of such points), but this would add
> what I consider unnecessary clutter to the graph.
>
> John.
>
> John Maindonald email: [EMAIL PROTECTED]
> phone : +61 2 (6125)3473 fax : +61 2(6125)5549
> Centre for Bioinformation Science, Room 1194, John Dedman
> Mathematical Sciences Building (Building 27) Australian
> National University, Canberra ACT 0200.
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel