Re: [R] unexpected GAM result - at least for me!
You may want to plot your smooth terms:

    plot(can3.gam, residuals = TRUE, pch = 1)

The roughly 4 and 7 estimated degrees of freedom on the two middle terms, s(crr) and s(ch), allow quite curvy smooths, so you might be overfitting the data (as somebody else mentioned earlier).

Also, you may want to look at the correlation between the smoothing variables. As a first step, compute the correlation matrix and plot each variable against the others; the plots make it easier to identify nonlinear dependencies. If one of these relationships is nearly perfect, you may face serious issues due to multicollinearity.

I am sorry if I am repeating somebody's earlier response.

Cheers,
Daniel

- cuncta stricte discussurus -

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Monica Pisica
Sent: Tuesday, April 01, 2008 2:44 PM
To: Duncan Murdoch
Cc: r-help@r-project.org
Subject: Re: [R] unexpected GAM result - at least for me!

Hi,

I've compared observed and predicted values and they match 100%. For a 90% probability of occurrence:

    table(can > 0, fitted(can3.gam) > 0.9)

            FALSE TRUE
      FALSE    23    0
      TRUE      0  125

So I guess it is a valid result, but a very unexpected one for me.

Thank you again for all the help,

Monica

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
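The checks suggested in this message could be run along the following lines. This is only a sketch: the thread's data are not available, so the data frame `dat` and the way its columns are simulated are assumptions made for illustration. Only the model terms and the plot() call are taken from the messages.

```r
library(mgcv)  # provides gam() and the plot method for fitted GAMs

# Hypothetical stand-in data; the real lidar metrics are not in the thread.
set.seed(42)
n    <- 148
be   <- runif(n)
crr  <- runif(n)
ch   <- runif(n)
home <- ch + rnorm(n, sd = 0.1)             # deliberately correlated with ch
can  <- rbinom(n, 1, plogis(-1 + 3 * crr))  # simulated presence/absence
dat  <- data.frame(can, be, crr, ch, home)

fit <- gam(can ~ s(be) + s(crr) + s(ch) + s(home),
           family = binomial, data = dat)

# One panel per smooth term, with partial residuals overlaid.
plot(fit, residuals = TRUE, pch = 1, pages = 1)

# Linear correlations among the predictors.
cors <- cor(dat[, c("be", "crr", "ch", "home")])
print(round(cors, 2))

# Scatterplot matrix; nonlinear dependence shows up here even when
# the correlation coefficient is modest.
pairs(dat[, c("be", "crr", "ch", "home")])
```

A correlation near 1 between two predictors (here ch and home, by construction) is the situation the message warns about.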
Re: [R] unexpected GAM result - at least for me!
Hi,

I've compared observed and predicted values and they match 100%. For a 90% probability of occurrence:

    table(can > 0, fitted(can3.gam) > 0.9)

            FALSE TRUE
      FALSE    23    0
      TRUE      0  125

So I guess it is a valid result, but a very unexpected one for me.

Thank you again for all the help,

Monica

> Date: Mon, 31 Mar 2008 09:30:01 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] unexpected GAM result - at least for me!
>
> I repeat: look at the data. Compare the observed and predicted. That's the only way to know whether this is reasonable or not. If you're getting reasonable predictions, then it's a valid fit.
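The observed-versus-predicted comparison in this message can be sketched as below. The vectors `observed` and `p_hat` are made-up values standing in for `can` and `fitted(can3.gam)`, which are not available here; the 0.9 cutoff is the one used in the message.

```r
# Hypothetical observed presence/absence and fitted probabilities.
observed <- c(0, 0, 1, 1, 1, 0, 1)
p_hat    <- c(0.02, 0.10, 0.95, 0.99, 0.97, 0.01, 0.93)

# Rows: observed presence; columns: fitted probability above the cutoff.
tab <- table(observed = observed > 0, predicted = p_hat > 0.9)
print(tab)

# Proportion of observations on the diagonal, i.e. classified correctly.
accuracy <- sum(diag(tab)) / sum(tab)
print(accuracy)
```

Note that this is an in-sample comparison: it checks the fit against the same data used to estimate it, so by itself it cannot rule out overfitting.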
[R] unexpected GAM result - at least for me!
Hi,

I am afraid I am not understanding something very fundamental, and no matter how much I look into S. Wood's book Generalized Additive Models, I still don't understand my result.

I am trying to model presence/absence (presence = 1, absence = 0) of a species using some lidar metrics (I have 4 of these). I am trying different models, and when I used gam I got this very weird (for me) result, which I thought was not possible, or at least I have no idea how to interpret it.

    can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home), family = 'binomial')
    summary(can3.gam)

    Family: binomial
    Link function: logit

    Formula:
    can > 0 ~ s(be) + s(crr) + s(ch) + s(home)

    Parametric coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)    85.39     162.88   0.524      0.6

    Approximate significance of smooth terms:
              edf Est.rank Chi.sq p-value
    s(be)   1.000        1  0.100   0.751
    s(crr)  3.929        8  0.380   1.000
    s(ch)   6.820        9  0.396   1.000
    s(home) 1.000        1  0.314   0.575

    R-sq.(adj) = 1   Deviance explained = 100%
    UBRE score = -0.81413   Scale est. = 1   n = 148

Is this a perfect fit with no statistical significance, an overfit, or what? It seems that the significance of the smooth terms is null. Of course, with such a model I predict presence/absence of the species perfectly.

Again, I hope you don't mind my asking you this. Any explanation will be very much appreciated.

Thanks,

Monica

PS. I've contacted the author of the book, who is also the package maintainer, but so far I haven't received a reply.
Re: [R] unexpected GAM result - at least for me!
On 3/31/2008 8:34 AM, Monica Pisica wrote:
> [...]
> Parametric coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)    85.39     162.88   0.524      0.6
> [...]
> Is this a perfect fit with no statistical significance, an overfit, or what? It seems that the significance of the smooth terms is null. Of course, with such a model I predict presence/absence of the species perfectly.

Look at the data. You can get a perfect fit to a logistic regression model fairly easily, and it looks as though you've got one. (In fact, the huge intercept suggests that all predictions will be 1. Do you actually have any variation in the data?)

Duncan Murdoch
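How a logistic regression ends up with a perfect fit can be seen on a small simulated example. This sketch uses plain glm() rather than gam(), and the predictor `x` is invented so that the response is completely determined by it; none of this is the poster's data.

```r
# Hypothetical data in which the sign of x fully determines the response,
# so the two classes are perfectly separated.
x <- c(seq(-3, -0.5, length.out = 20), seq(0.5, 3, length.out = 20))
y <- as.integer(x > 0)

# glm() normally warns here that fitted probabilities of 0 or 1 occurred;
# the warnings are suppressed only to keep the output short.
fit <- suppressWarnings(glm(y ~ x, family = binomial))

# Under separation the slope estimate and its standard error both tend to
# be very large, which makes the Wald z value small.
print(coef(summary(fit)))

# Residual deviance is essentially zero, i.e. "deviance explained" near 100%.
print(deviance(fit))

# The fitted probabilities classify every observation correctly.
print(table(observed = y, predicted = fitted(fit) > 0.5))
```

The pattern of large estimates, even larger standard errors, and a residual deviance near zero resembles the summary quoted above, though whether the same mechanism is at work there can only be judged from the actual data.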
Re: [R] unexpected GAM result - at least for me!
Thanks Duncan. Yes, I do have variation in the lidar metrics (be, ch, crr, and home), although there is quite a high correlation between ch and home. But even if I eliminate one metric (either ch or home), I end up with a deviance explained of 99.99%. The species has values of 0 and 1 since I am trying to predict presence/absence. Do you think it is still a valid result?

Thanks again,

Monica

> Date: Mon, 31 Mar 2008 08:47:48 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] unexpected GAM result - at least for me!
>
> Look at the data. You can get a perfect fit to a logistic regression model fairly easily, and it looks as though you've got one. (In fact, the huge intercept suggests that all predictions will be 1. Do you actually have any variation in the data?)
>
> Duncan Murdoch
Re: [R] unexpected GAM result - at least for me!
On 3/31/2008 9:01 AM, Monica Pisica wrote:
> Thanks Duncan. Yes, I do have variation in the lidar metrics (be, ch, crr, and home), although there is quite a high correlation between ch and home. But even if I eliminate one metric (either ch or home), I end up with a deviance explained of 99.99%. The species has values of 0 and 1 since I am trying to predict presence/absence. Do you think it is still a valid result?

I repeat: look at the data. Compare the observed and predicted. That's the only way to know whether this is reasonable or not. If you're getting reasonable predictions, then it's a valid fit.

(The tests and approximations used in the reported p-values may not be at all valid. I don't know what the requirements are for those in a GAM, but if you're getting a perfect fit, then they probably aren't being met.)

Duncan Murdoch
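The remark about p-values can be illustrated by comparing two tests of the same term on perfectly separated data. This is a sketch with an ordinary glm() on simulated values, not a GAM and not the thread's data; the assumption that the approximate tests for smooth terms break down in a similar way is exactly that, an assumption.

```r
# Hypothetical, perfectly separated data.
set.seed(2)
x <- c(runif(25, -3, -0.5), runif(25, 0.5, 3))
y <- as.integer(x > 0)

fit0 <- glm(y ~ 1, family = binomial)                    # intercept only
fit1 <- suppressWarnings(glm(y ~ x, family = binomial))  # adds x

# Wald test: estimate divided by its standard error, as shown by summary().
wald_p <- coef(summary(fit1))["x", "Pr(>|z|)"]

# Likelihood-ratio test: drop in deviance between the nested models.
lrt   <- anova(fit0, fit1, test = "Chisq")
lrt_p <- lrt[2, "Pr(>Chi)"]

print(c(wald = wald_p, lrt = lrt_p))
```

When the two tests disagree this sharply about a predictor that classifies every case correctly, the Wald-type p-value is the one not to trust, which is consistent with the advice to judge the fit by comparing observed and predicted values.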