It is also wise to make scatterplots, as shown by Anscombe's famous example of four scatterplots with the same R^2: the first shows the standard elliptical pattern implied by the assumptions, while the other three indicate very clearly that the assumptions are violated. See Anscombe (1973) "Graphs in Statistical Analysis", The American Statistician, 27: 17-21, reproduced in, e.g., du Toit, Steyn and Stumpf (1986) Graphical Exploratory Data Analysis (Springer).
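For instance, a minimal sketch using the 'anscombe' data set shipped with R (the layout below is just one way to arrange the plots):

data(anscombe)
## all four x-y pairs give essentially the same correlation ...
sapply(1:4, function(i) cor(anscombe[, i], anscombe[, i + 4]))
## ... yet the scatterplots look completely different
op <- par(mfrow = c(2, 2))
for (i in 1:4)
    plot(anscombe[, i], anscombe[, i + 4],
         xlab = names(anscombe)[i], ylab = names(anscombe)[i + 4])
par(op)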

hth. spencer graves

Prof Brian Ripley wrote:
On Tue, 17 Jun 2003, kan Liu wrote:


I want to calculate the R-squared between two variables. Can you advise
me how to identify and remove outliers before performing the R-squared
calculation?


Easy: you don't. It makes no sense to consider R^2 after arbitrary outlier removal: if I remove all but two points I get R^2 = 1!
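A toy sketch of that point (made-up numbers, only to illustrate):

set.seed(1)
x <- 1:20
y <- rnorm(20)                            # no real relationship at all
summary(lm(y ~ x))$r.squared              # close to 0
summary(lm(y[1:2] ~ x[1:2]))$r.squared    # 1 (up to rounding) once 18 "outliers" are removed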

R^2 is normally used to measure the success of a multiple regression, but as you mention two variables, did you just mean the Pearson product-moment correlation? It makes more sense to use a robust measure of correlation, as in cov.rob (package lqs), or even the Spearman or Kendall measures (cor.test in package ctest).
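A rough sketch of what that might look like (artificial data; note that cov.rob has since moved to package MASS and cor.test to stats, so adjust the library() call to your version of R):

library(MASS)                          # or library(lqs) on older versions
set.seed(2)
x <- rnorm(50)
y <- x + rnorm(50, sd = 0.5)
x[1] <- 10; y[1] <- -10                # plant one gross outlier
cor(x, y)                              # Pearson correlation is badly distorted
cov.rob(cbind(x, y), cor = TRUE)$cor   # robust correlation estimate
cor.test(x, y, method = "spearman")    # rank-based alternatives
cor.test(x, y, method = "kendall")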

If you intended to do this for a multiple regression, you need to do some sort of robust regression and use a robust measure of fit.
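One possible sketch, using rlm() from package MASS (the data are made up, and the "robust R^2" at the end is only one ad hoc possibility, not a standard definition):

library(MASS)
set.seed(3)
x1 <- rnorm(100); x2 <- rnorm(100)
y <- 1 + 2 * x1 - x2 + rnorm(100)
y[1:5] <- y[1:5] + 20                  # contaminate a few responses
fit <- rlm(y ~ x1 + x2)                # M-estimation, Huber weights by default
summary(fit)
## one crude robust analogue of R^2: compare robust spreads of residuals and y
1 - (mad(resid(fit)) / mad(y))^2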

