Dear Denis,

You got me: I would have thought from ?summary.gam that r-sq.(adj) would be the same as the adjusted R^2 for a linear model. Note, however, that the percentage of deviance explained (28.5%) checks with the multiple R^2 from the linear model (0.285), as expected.
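As a rough check on what the lm side is reporting, the weighted R^2, its adjusted version, and the Gaussian "deviance explained" can be recomputed by hand in base R. This is only a sketch under assumptions: the data are copied from the original post, the names w_mean, rss, tss, r2, r2_adj and dev_expl are illustrative, and nothing here shows how mgcv 1.3-7 arrives at its own r-sq.(adj).

```r
# Data copied from Denis's original post (37 temperature bins).
Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62,
          0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12,
          3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62,
          5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22,
            26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3,
            1, 1, 1, 1, 1)
wm.sed <- c(0.000000000, 0.016129032, 0.000000000, 0.062046512,
            0.396459596, 0.189082949, 0.054757925, 0.142810440,
            0.168005168, 0.180804428, 0.111439628, 0.128799505,
            0.193707937, 0.105921610, 0.103497845, 0.028591837,
            0.217894389, 0.020535469, 0.080389068, 0.105234450,
            0.070213450, 0.050771363, 0.042074434, 0.102348837,
            0.049748344, 0.019100478, 0.005203125, 0.101711864,
            0.000000000, 0.000000000, 0.014808824, 0.000000000,
            0.222000000, 0.167000000, 0.000000000, 0.000000000,
            0.000000000)

fit <- lm(wm.sed ~ Temp, weights = N.sets)

# Weighted residual and total sums of squares; the total is taken
# about the weighted mean of the response.
w_mean <- weighted.mean(wm.sed, N.sets)
rss <- sum(N.sets * residuals(fit)^2)
tss <- sum(N.sets * (wm.sed - w_mean)^2)

n <- length(wm.sed)
p <- length(coef(fit))  # intercept + slope

r2     <- 1 - rss / tss
r2_adj <- 1 - (rss / (n - p)) / (tss / (n - 1))

# For a Gaussian model, "deviance explained" is
# 1 - deviance / null deviance, here taken from an equivalent glm fit.
gfit <- glm(wm.sed ~ Temp, weights = N.sets)
dev_expl <- 1 - deviance(gfit) / gfit$null.deviance

print(c(r2 = r2, r2_adj = r2_adj, dev_expl = dev_expl))
print(unlist(summary(fit)[c("r.squared", "adj.r.squared")]))
```

If the data were transcribed correctly, r2 and r2_adj should agree with the 0.285 and 0.2646 in the lm summary, and dev_expl with the 28.5% that gam reports. The 0.554 is the one number this check leaves unexplained.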
Maybe you should address this question to the package author.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: Denis Chabot [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 05, 2005 3:33 PM
> To: John Fox
> Cc: R list
> Subject: Re: [R] testing non-linear component in mgcv:gam
>
> Thank you everyone for your help, but my introduction to GAM
> is turning my brain to mush. I thought the one part of the
> output I understood the best was r-sq (adj), but now even
> this is becoming foggy.
>
> In my original message I mentioned a gam fit that turns out
> to be a linear fit. Out of curiosity I analysed it with a linear
> predictor only with the mgcv package, and then as a linear model.
> The output was identical in both, but the r-sq (adj) was 0.55
> in mgcv and 0.26 in lm. In lm I hope that my interpretation
> that 26% of the variance in y is explained by the linear
> relationship with x is valid. Then what does r2 mean in mgcv?
>
> Denis
>
> > summary.gam(lin)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> wm.sed ~ Temp
>
> Parametric coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
> Temp        -0.023792   0.006369  -3.736 0.000666 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> R-sq.(adj) =  0.554   Deviance explained = 28.5%
> GCV score = 0.09904   Scale est. = 0.093686   n = 37
>
> > summary(sed.true.lin)
>
> Call:
> lm(formula = wm.sed ~ Temp, weights = N.sets)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -0.6138 -0.1312 -0.0325  0.1089  1.1449
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
> Temp        -0.023792   0.006369  -3.736 0.000666 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.3061 on 35 degrees of freedom
> Multiple R-Squared: 0.285,  Adjusted R-squared: 0.2646
> F-statistic: 13.95 on 1 and 35 DF,  p-value: 0.000666
>
>
> On 05-10-05 at 09:45, John Fox wrote:
>
> > Dear Denis,
> >
> > Take a closer look at the anova table: The models provide identical
> > fits to the data. The differences in degrees of freedom and deviance
> > between the two models are essentially zero, 5.5554e-10 and 2.353e-11
> > respectively.
> >
> > I hope this helps,
> >  John
> >
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED]
> >> [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
> >> Sent: Wednesday, October 05, 2005 8:22 AM
> >> To: r-help@stat.math.ethz.ch
> >> Subject: [R] testing non-linear component in mgcv:gam
> >>
> >> Hi,
> >>
> >> I need further help with my GAMs. Most models I test are very
> >> obviously non-linear. Yet, to be on the safe side, I report the
> >> significance of the smooth (default output of mgcv's
> >> summary.gam) and confirm it deviates significantly from linearity.
> >>
> >> I do the latter by fitting a second model where the same predictor is
> >> entered without the s(), and then use anova.gam to compare the two. I
> >> thought this was the equivalent of the default output of anova.gam
> >> using package gam instead of mgcv.
> >>
> >> I wonder if this procedure is correct because one of my models
> >> appears to be linear. In fact mgcv estimates df to be exactly 1.0 so
> >> I could have stopped there. However I inadvertently repeated the
> >> procedure outlined above. I would have thought in this case the
> >> anova.gam comparing the smooth and the linear fit would for sure have
> >> been not significant. To my surprise, P was 6.18e-09!
> >>
> >> Am I doing something wrong when I attempt to confirm that the
> >> non-parametric part of a smoother is significant? Here is my example
> >> case where the relationship does appear to be linear:
> >>
> >> library(mgcv)
> >>
> >>> This is mgcv 1.3-7
> >>>
> >> Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62,
> >>           0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12,
> >>           3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62,
> >>           5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
> >> N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31,
> >>             22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2,
> >>             5, 3, 1, 1, 1, 1, 1)
> >> wm.sed <- c(0.000000000, 0.016129032, 0.000000000, 0.062046512,
> >>             0.396459596, 0.189082949, 0.054757925, 0.142810440,
> >>             0.168005168, 0.180804428, 0.111439628, 0.128799505,
> >>             0.193707937, 0.105921610, 0.103497845, 0.028591837,
> >>             0.217894389, 0.020535469, 0.080389068, 0.105234450,
> >>             0.070213450, 0.050771363, 0.042074434, 0.102348837,
> >>             0.049748344, 0.019100478, 0.005203125, 0.101711864,
> >>             0.000000000, 0.000000000, 0.014808824, 0.000000000,
> >>             0.222000000, 0.167000000, 0.000000000, 0.000000000,
> >>             0.000000000)
> >>
> >> sed.gam <- gam(wm.sed ~ s(Temp), weights = N.sets)
> >> summary.gam(sed.gam)
> >>
> >>> Family: gaussian
> >>> Link function: identity
> >>>
> >>> Formula:
> >>> wm.sed ~ s(Temp)
> >>>
> >>> Parametric coefficients:
> >>>             Estimate Std. Error t value Pr(>|t|)
> >>> (Intercept)  0.08403    0.01347   6.241 3.73e-07 ***
> >>> ---
> >>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >>>
> >>> Approximate significance of smooth terms:
> >>>         edf Est.rank     F  p-value
> >>> s(Temp)   1        1 13.95 0.000666 ***
> >>> ---
> >>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >>>
> >>> R-sq.(adj) =  0.554   Deviance explained = 28.5%
> >>> GCV score = 0.09904   Scale est. = 0.093686   n = 37
> >>>
> >>
> >> # testing non-linear contribution
> >> sed.lin <- gam(wm.sed ~ Temp, weights = N.sets)
> >> summary.gam(sed.lin)
> >>
> >>> Family: gaussian
> >>> Link function: identity
> >>>
> >>> Formula:
> >>> wm.sed ~ Temp
> >>>
> >>> Parametric coefficients:
> >>>              Estimate Std. Error t value Pr(>|t|)
> >>> (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
> >>> Temp        -0.023792   0.006369  -3.736 0.000666 ***
> >>> ---
> >>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >>>
> >>> R-sq.(adj) =  0.554   Deviance explained = 28.5%
> >>> GCV score = 0.09904   Scale est. = 0.093686   n = 37
> >>>
> >> anova.gam(sed.lin, sed.gam, test="F")
> >>
> >>> Analysis of Deviance Table
> >>>
> >>> Model 1: wm.sed ~ Temp
> >>> Model 2: wm.sed ~ s(Temp)
> >>>    Resid. Df Resid. Dev         Df  Deviance      F   Pr(>F)
> >>> 1 3.5000e+01      3.279
> >>> 2 3.5000e+01      3.279 5.5554e-10 2.353e-11 0.4521 6.18e-09 ***
> >>>
> >>
> >> Thanks in advance,
> >>
> >> Denis Chabot
> >>
> >> ______________________________________________
> >> R-help@stat.math.ethz.ch mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide!
> >> http://www.R-project.org/posting-guide.html
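
The "essentially zero" differences in the anova table can be turned back into the reported F and P by plain arithmetic. The sketch below assumes the usual analysis-of-deviance F test, F = (difference in deviance / difference in df) / scale, with the scale estimate taken from the summary.gam output. That this is what anova.gam did in mgcv 1.3-7 is an inference from the numbers, not something stated in the thread, and the variable names are illustrative.

```r
# Numbers copied from the anova.gam table and summary.gam output above.
d_df      <- 5.5554e-10  # difference in residual df between the models
d_dev     <- 2.353e-11   # difference in deviance between the models
scale_est <- 0.093686    # "Scale est." reported by summary.gam
resid_df  <- 35          # residual df of either model

# Assumed form of the F statistic (see the hedge in the lead-in).
F_stat <- (d_dev / d_df) / scale_est

# Upper-tail probability under an F distribution whose numerator df
# is practically zero.
p_value <- pf(F_stat, df1 = d_df, df2 = resid_df, lower.tail = FALSE)

print(c(F = F_stat, p = p_value))
```

F_stat comes out at about 0.452, matching the table. With a numerator df this close to zero the reference distribution is nearly degenerate, so the tail probability is tiny however unremarkable F is. The test is comparing two fits that are numerically the same model, so its P-value carries no information about non-linearity.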