Re: [R] plotting residuals/error message
Dear Jason,

Your suspicion is correct: the cases with NAs in either ft_dpc_r or pid_x are absent from the residuals but not from pid_x. There are several ways to deal with this problem, but one straightforward way is to use na.exclude in place of the default na.omit in the call to lm, as in lm(ft_dpc_r ~ pid_x, na.action = na.exclude). Then residuals() will pad out the values it returns with NAs.

I hope this helps,
John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Gainous,Jason
Sent: Saturday, March 15, 2014 1:42 PM
To: r-help@r-project.org
Subject: [R] plotting residuals/error message

I am trying to plot the residuals from a linear model and I get the following error message:

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

The outcome is a continuous variable and the explanatory variable is ordinal. My immediate suspicion was that it had something to do with missing values. Each variable has missing values coded as NA. I can't figure it out. Please help. Thanks.

obama.mod = lm(ft_dpc_r~pid_x)
summary(obama.mod)

Call:
lm(formula = ft_dpc_r ~ pid_x)

Residuals:
    Min      1Q  Median      3Q     Max
-89.271 -14.783   0.729  10.729  84.193

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  101.5155     0.5797  175.12   <2e-16 ***
pid_x        -12.2441     0.1411  -86.76   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.85 on 5876 degrees of freedom
  (36 observations deleted due to missingness)
Multiple R-squared:  0.5616,  Adjusted R-squared:  0.5615
F-statistic:  7528 on 1 and 5876 DF,  p-value: < 2.2e-16

plot(pid_x, resid(obama.mod), ylab = "Model Residuals")
Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

Jason Gainous, Ph.D.
Associate Professor
Department of Political Science
University of Louisville
203 Ford Hall
Louisville, KY 40292
(502) 852-1660
Homepage: https://louisville.academia.edu/JasonGainous

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
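
The na.exclude suggestion in John's reply can be tried on a small simulated data set. This is only a sketch: the values, the seed, the positions of the NAs and the names mod_omit and mod_exclude are assumptions made for illustration, not the poster's data. It fits the same model twice and compares the length of the residual vector under each na.action.

```r
# Simulated stand-ins for the poster's variables (all values are made up).
set.seed(1)
pid_x    <- sample(1:7, 50, replace = TRUE)
ft_dpc_r <- 100 - 12 * pid_x + rnorm(50, sd = 20)

# Introduce missing values in both variables (3 incomplete cases in total).
pid_x[c(3, 17)] <- NA
ft_dpc_r[8]     <- NA

# na.omit is the usual default; it is spelled out here for clarity.
mod_omit    <- lm(ft_dpc_r ~ pid_x, na.action = na.omit)
mod_exclude <- lm(ft_dpc_r ~ pid_x, na.action = na.exclude)

length(pid_x)               # 50: the raw predictor still contains the NA cases
length(resid(mod_omit))     # 47: incomplete cases dropped, so plot() would fail
length(resid(mod_exclude))  # 50: residuals padded with NA, same length as pid_x

# With na.exclude the lengths agree; plot() skips the NA pairs.
plot(pid_x, resid(mod_exclude), ylab = "Model Residuals")
```

The fitted coefficients are the same under both settings; na.exclude only changes how residuals() and fitted() report the dropped cases.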
Re: [R] plotting residuals/error message
Thanks John. That worked. It was an easy solution. I'm in the process of switching from Stata and some of these small things are still hanging me up.

Jason

-----Original Message-----
From: John Fox [mailto:j...@mcmaster.ca]
Sent: Sunday, March 16, 2014 9:51 AM
To: Gainous,Jason
Cc: r-help@r-project.org
Subject: RE: [R] plotting residuals/error message

Dear Jason,

Your suspicion is correct: the cases with NAs in either ft_dpc_r or pid_x are absent from the residuals but not from pid_x. There are several ways to deal with this problem, but one straightforward way is to use na.exclude in place of the default na.omit in the call to lm, as in lm(ft_dpc_r ~ pid_x, na.action = na.exclude). Then residuals() will pad out the values it returns with NAs.

I hope this helps,
John
Re: [R] plotting residuals/error message
Did you not notice:

Residual standard error: 22.85 on 5876 degrees of freedom
  (36 observations deleted due to missingness)

?? (No residuals for missings...)

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch
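
Bert's point, that the model holds no residuals for the deleted cases, can be checked directly, and it suggests another of the "several ways" to handle the mismatch: leave the fit alone and plot the residuals against the predictor values the model actually used. The sketch below uses a tiny made-up data set; x, y, mod and keep are hypothetical names, not the poster's objects.

```r
# Tiny made-up data set with one NA in each variable.
x <- c(1, 2, NA, 4, 5, 6, 7, 1, 2, 3)
y <- c(90, 75, 60, NA, 40, 30, 15, 88, 80, 62)

# na.omit is the usual default; it is spelled out here for clarity.
mod <- lm(y ~ x, na.action = na.omit)

length(x)            # 10 raw cases
nobs(mod)            # 8 cases used in the fit
length(resid(mod))   # 8 residuals, one per case used
na.action(mod)       # the rows that were dropped (3 and 4)

# Option 1: take the predictor from the model frame, which holds only
# the cases that entered the fit.
plot(model.frame(mod)$x, resid(mod), ylab = "Model Residuals")

# Option 2: subset the raw predictor to the complete cases by hand.
keep <- complete.cases(x, y)
plot(x[keep], resid(mod), ylab = "Model Residuals")
```

Both options give the same plot here; the model-frame version has the advantage of not repeating the missing-value logic outside the model.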