Re: [R] unexpected GAM result - at least for me!

2008-04-02 Thread Daniel Malter
You may want to plot your smooth terms:

plot(can3.gam, residuals=TRUE, pch=1)

The roughly 4 and 7 estimated degrees of freedom on the two middle terms,
s(crr) and s(ch), can give you quite curvy smooths, and you might be
overfitting the data (as somebody else mentioned before). You may also want
to look at the correlation between the smoothed variables. Compute the
correlation matrix as a first step, then plot each of the variables against
the others, which makes it easier to identify nonlinear dependencies. If one
of these relationships is nearly perfect, you may face serious issues due to
multicollinearity.
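
A minimal sketch of the correlation check in base R. The thread does not include the data, so the four vectors below are simulated stand-ins that only borrow the names be, crr, ch and home; the strong ch/home relationship is built in on purpose.

```r
# Simulated stand-ins for the four lidar metrics; every value is made up.
set.seed(1)
n    <- 148
ch   <- runif(n, 0, 30)
home <- 0.6 * ch + rnorm(n, sd = 1)   # constructed to track ch closely
be   <- rnorm(n)
crr  <- runif(n)
X    <- data.frame(be, crr, ch, home)

# Linear (Pearson) and rank-based (Spearman) correlation matrices.
cor_lin  <- cor(X)
cor_rank <- cor(X, method = "spearman")
print(round(cor_lin, 2))
print(round(cor_rank, 2))

# Pairwise scatterplots with a smoother in each panel, to spot
# nonlinear dependence that a correlation coefficient can miss.
pairs(X, panel = panel.smooth)
```

With real data, X would simply be the data frame holding the predictors used in the gam() call.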

I am sorry if I am repeating somebody's earlier response.

Cheers,
Daniel



-
cuncta stricte discussurus
-

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On
Behalf Of Monica Pisica
Sent: Tuesday, April 01, 2008 2:44 PM
To: Duncan Murdoch
Cc: r-help@r-project.org
Subject: Re: [R] unexpected GAM result - at least for me!



Hi,


I've compared observed and predicted, and they match 100%.

For a 90% probability of occurrence:

table(can > 0, fitted(can3.gam) > 0.9)

        FALSE TRUE
  FALSE    23    0
  TRUE      0  125

So I guess it is a valid result, but a very unexpected one for me.

Thank you again for all the help,

Monica




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] unexpected GAM result - at least for me!

2008-04-01 Thread Monica Pisica


Hi,


I've compared observed and predicted, and they match 100%.

For a 90% probability of occurrence:

table(can > 0, fitted(can3.gam) > 0.9)

        FALSE TRUE
  FALSE    23    0
  TRUE      0  125

So I guess it is a valid result, but a very unexpected one for me.

Thank you again for all the help,

Monica



> Date: Mon, 31 Mar 2008 09:30:01 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] unexpected GAM result - at least for me!
>
> On 3/31/2008 9:01 AM, Monica Pisica wrote:
> > Thanks Duncan.
> >
> > Yes, I do have variation in the lidar metrics (be, ch, crr, and
> > home), although I have quite a high correlation between ch and home.
> > But even if I eliminate one metric (either ch or home) I end up with
> > a deviance explained of 99.99%. The species has values of 0 and 1
> > since I am trying to predict presence/absence.
> >
> > Do you think it is still a valid result?
>
> I repeat: look at the data. Compare the observed and predicted. That's
> the only way to know whether this is reasonable or not.
>
> If you're getting reasonable predictions, then it's a valid fit. (The
> tests and approximations used in the reported p-values may not be at all
> valid. I don't know what the requirements are for those in a GAM, but
> if you're getting a perfect fit, then they probably aren't being met.)
>
> Duncan Murdoch
>
> > Thanks again,
> >
> > Monica
> >
> > > Date: Mon, 31 Mar 2008 08:47:48 -0400
> > > From: [EMAIL PROTECTED]
> > > To: [EMAIL PROTECTED]
> > > CC: r-help@r-project.org
> > > Subject: Re: [R] unexpected GAM result - at least for me!
> > >
> > > On 3/31/2008 8:34 AM, Monica Pisica wrote:
> > > > Hi
> > > >
> > > > I am afraid I am not understanding something very fundamental,
> > > > and no matter how much I look into S. Wood's book Generalized
> > > > Additive Models, I still don't understand my result.
> > > >
> > > > I am trying to model presence/absence (presence = 1, absence = 0)
> > > > of a species using some lidar metrics (I have 4 of these). I am
> > > > using different models and such, and when I used gam I got this
> > > > very weird (for me) result, which I thought was not possible, or
> > > > at least I have no idea how to interpret it.
> > > >
> > > > can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home),
> > > >                 family = 'binomial')
> > > > summary(can3.gam)
> > > > Family: binomial
> > > > Link function: logit
> > > > Formula:
> > > > can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
> > > > Parametric coefficients:
> > > >             Estimate Std. Error z value Pr(>|z|)
> > > > (Intercept)    85.39     162.88   0.524      0.6
> > > > Approximate significance of smooth terms:
> > > >           edf Est.rank Chi.sq p-value
> > > > s(be)   1.000        1  0.100   0.751
> > > > s(crr)  3.929        8  0.380   1.000
> > > > s(ch)   6.820        9  0.396   1.000
> > > > s(home) 1.000        1  0.314   0.575
> > > > R-sq.(adj) =  1   Deviance explained =  100%
> > > > UBRE score = -0.81413  Scale est. = 1  n = 148
> > > >
> > > > Is this a perfect fit with no statistical significance, an
> > > > over-estimation, or what? It seems that the significance of the
> > > > smooth terms is null. Of course, with such a model I predict
> > > > presence/absence of the species perfectly.
> > > >
> > > > Again, I hope you don't mind my asking you this. Any explanation
> > > > will be very much appreciated.
> > >
> > > Look at the data. You can get a perfect fit to a logistic regression
> > > model fairly easily, and it looks as though you've got one. (In
> > > fact, the huge intercept suggests that all predictions will be 1.
> > > Do you actually have any variation in the data?)
> > >
> > > Duncan Murdoch




[R] unexpected GAM result - at least for me!

2008-03-31 Thread Monica Pisica


Hi


I am afraid I am not understanding something very fundamental, and no
matter how much I look into S. Wood's book Generalized Additive Models,
I still don't understand my result.

I am trying to model presence/absence (presence = 1, absence = 0) of a
species using some lidar metrics (I have 4 of these). I am using different
models and such, and when I used gam I got this very weird (for me) result,
which I thought was not possible, or at least I have no idea how to
interpret it.

can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home),
                family = 'binomial')
summary(can3.gam)
Family: binomial
Link function: logit
Formula:
can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    85.39     162.88   0.524      0.6
Approximate significance of smooth terms:
          edf Est.rank Chi.sq p-value
s(be)   1.000        1  0.100   0.751
s(crr)  3.929        8  0.380   1.000
s(ch)   6.820        9  0.396   1.000
s(home) 1.000        1  0.314   0.575
R-sq.(adj) =  1   Deviance explained =  100%
UBRE score = -0.81413  Scale est. = 1  n = 148

Is this a perfect fit with no statistical significance, an over-estimation,
or what? It seems that the significance of the smooth terms is null. Of
course, with such a model I predict presence/absence of the species
perfectly.

Again, I hope you don't mind my asking you this. Any explanation will be
very much appreciated.

Thanks,

Monica

PS. I've contacted the author of the book, who is also the package
maintainer, but I haven't had a reply yet.



Re: [R] unexpected GAM result - at least for me!

2008-03-31 Thread Duncan Murdoch
On 3/31/2008 8:34 AM, Monica Pisica wrote:
 
> Hi
>
> I am afraid I am not understanding something very fundamental, and no
> matter how much I look into S. Wood's book Generalized Additive Models,
> I still don't understand my result.
>
> I am trying to model presence/absence (presence = 1, absence = 0) of a
> species using some lidar metrics (I have 4 of these). I am using
> different models and such, and when I used gam I got this very weird
> (for me) result, which I thought was not possible, or at least I have
> no idea how to interpret it.
>
> can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home),
>                 family = 'binomial')
> summary(can3.gam)
> Family: binomial
> Link function: logit
> Formula:
> can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
> Parametric coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)    85.39     162.88   0.524      0.6
> Approximate significance of smooth terms:
>           edf Est.rank Chi.sq p-value
> s(be)   1.000        1  0.100   0.751
> s(crr)  3.929        8  0.380   1.000
> s(ch)   6.820        9  0.396   1.000
> s(home) 1.000        1  0.314   0.575
> R-sq.(adj) =  1   Deviance explained =  100%
> UBRE score = -0.81413  Scale est. = 1  n = 148
>
> Is this a perfect fit with no statistical significance, an
> over-estimation, or what? It seems that the significance of the smooth
> terms is null. Of course, with such a model I predict presence/absence
> of the species perfectly.
>
> Again, I hope you don't mind my asking you this. Any explanation will
> be very much appreciated.

Look at the data.  You can get a perfect fit to a logistic regression 
model fairly easily, and it looks as though you've got one.  (In fact, 
the huge intercept suggests that all predictions will be 1.  Do you 
actually have any variation in the data?)
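
The perfect-fit situation is easy to reproduce with an ordinary logistic regression. The sketch below is only an illustration: it uses base R's glm() rather than gam(), and x and y are made-up, completely separated data, not the poster's variables.

```r
# Made-up data in which x separates the two classes completely:
# every absence (0) has x < 0 and every presence (1) has x > 0.
x <- c(seq(-3, -0.5, length.out = 25), seq(0.5, 3, length.out = 25))
y <- rep(c(0, 1), each = 25)

# glm() warns that fitted probabilities are numerically 0 or 1;
# the warnings are suppressed here only to keep the output short.
fit <- suppressWarnings(glm(y ~ x, family = binomial))

# The slope estimate becomes very large and its standard error larger
# still, so the Wald z-test is uninformative despite the perfect fit.
print(coef(summary(fit)))

# Observed versus predicted at a 0.5 threshold.
tab <- table(observed = y == 1, predicted = fitted(fit) > 0.5)
print(tab)
```

The pattern to look for is the same as in the summary quoted above: a residual deviance of essentially zero together with large standard errors and p-values that are nowhere near significant.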

Duncan Murdoch



Re: [R] unexpected GAM result - at least for me!

2008-03-31 Thread Monica Pisica

Thanks Duncan.
 
Yes, I do have variation in the lidar metrics (be, ch, crr, and home),
although I have quite a high correlation between ch and home. But even if I
eliminate one metric (either ch or home) I end up with a deviance explained
of 99.99%. The species has values of 0 and 1 since I am trying to predict
presence/absence.
 
Do you think it is still a valid result?
 
Thanks again,
 
Monica

> Date: Mon, 31 Mar 2008 08:47:48 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@r-project.org
> Subject: Re: [R] unexpected GAM result - at least for me!
>
> On 3/31/2008 8:34 AM, Monica Pisica wrote:
> > Hi
> >
> > I am afraid I am not understanding something very fundamental, and
> > no matter how much I look into S. Wood's book Generalized Additive
> > Models, I still don't understand my result.
> >
> > I am trying to model presence/absence (presence = 1, absence = 0) of
> > a species using some lidar metrics (I have 4 of these). I am using
> > different models and such, and when I used gam I got this very weird
> > (for me) result, which I thought was not possible, or at least I
> > have no idea how to interpret it.
> >
> > can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home),
> >                 family = 'binomial')
> > summary(can3.gam)
> > Family: binomial
> > Link function: logit
> > Formula:
> > can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
> > Parametric coefficients:
> >             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)    85.39     162.88   0.524      0.6
> > Approximate significance of smooth terms:
> >           edf Est.rank Chi.sq p-value
> > s(be)   1.000        1  0.100   0.751
> > s(crr)  3.929        8  0.380   1.000
> > s(ch)   6.820        9  0.396   1.000
> > s(home) 1.000        1  0.314   0.575
> > R-sq.(adj) =  1   Deviance explained =  100%
> > UBRE score = -0.81413  Scale est. = 1  n = 148
> >
> > Is this a perfect fit with no statistical significance, an
> > over-estimation, or what? It seems that the significance of the
> > smooth terms is null. Of course, with such a model I predict
> > presence/absence of the species perfectly.
> >
> > Again, I hope you don't mind my asking you this. Any explanation
> > will be very much appreciated.
>
> Look at the data. You can get a perfect fit to a logistic regression
> model fairly easily, and it looks as though you've got one. (In fact,
> the huge intercept suggests that all predictions will be 1. Do you
> actually have any variation in the data?)
>
> Duncan Murdoch


Re: [R] unexpected GAM result - at least for me!

2008-03-31 Thread Duncan Murdoch
On 3/31/2008 9:01 AM, Monica Pisica wrote:
> Thanks Duncan.
>
> Yes, I do have variation in the lidar metrics (be, ch, crr, and home),
> although I have quite a high correlation between ch and home. But even
> if I eliminate one metric (either ch or home) I end up with a deviance
> explained of 99.99%. The species has values of 0 and 1 since I am
> trying to predict presence/absence.
>
> Do you think it is still a valid result?

I repeat:  look at the data. Compare the observed and predicted. That's 
the only way to know whether this is reasonable or not.

If you're getting reasonable predictions, then it's a valid fit.  (The 
tests and approximations used in the reported p-values may not be at all 
valid.  I don't know what the requirements are for those in a GAM, but 
if you're getting a perfect fit, then they probably aren't being met.)
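
One way to carry out the observed-versus-predicted comparison is a simple cross-tabulation. This is a sketch only: x, y and the 0.5 threshold are hypothetical choices, the data are simulated with overlapping classes, and glm() stands in for gam().

```r
# Simulated presence/absence data with overlapping classes; x and y are
# hypothetical stand-ins, not the lidar metrics from the thread.
set.seed(7)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(2 * x))

fit <- glm(y ~ x, family = binomial)

# Cross-tabulate observed presence against fitted probability > 0.5.
tab <- table(observed = y == 1, predicted = fitted(fit) > 0.5)
print(tab)

# Share of cases on the diagonal, i.e. classified correctly in-sample.
accuracy <- sum(diag(tab)) / sum(tab)
print(accuracy)
```

Because the table is computed on the same data used for fitting, it can look better than the model deserves; repeating it on held-out observations is a stricter check.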

Duncan Murdoch


  
> Thanks again,
>
> Monica
>
> > Date: Mon, 31 Mar 2008 08:47:48 -0400
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > CC: r-help@r-project.org
> > Subject: Re: [R] unexpected GAM result - at least for me!
> >
> > On 3/31/2008 8:34 AM, Monica Pisica wrote:
> > > Hi
> > >
> > > I am afraid I am not understanding something very fundamental, and
> > > no matter how much I look into S. Wood's book Generalized Additive
> > > Models, I still don't understand my result.
> > >
> > > I am trying to model presence/absence (presence = 1, absence = 0)
> > > of a species using some lidar metrics (I have 4 of these). I am
> > > using different models and such, and when I used gam I got this
> > > very weird (for me) result, which I thought was not possible, or
> > > at least I have no idea how to interpret it.
> > >
> > > can3.gam <- gam(can > 0 ~ s(be) + s(crr) + s(ch) + s(home),
> > >                 family = 'binomial')
> > > summary(can3.gam)
> > > Family: binomial
> > > Link function: logit
> > > Formula:
> > > can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
> > > Parametric coefficients:
> > >             Estimate Std. Error z value Pr(>|z|)
> > > (Intercept)    85.39     162.88   0.524      0.6
> > > Approximate significance of smooth terms:
> > >           edf Est.rank Chi.sq p-value
> > > s(be)   1.000        1  0.100   0.751
> > > s(crr)  3.929        8  0.380   1.000
> > > s(ch)   6.820        9  0.396   1.000
> > > s(home) 1.000        1  0.314   0.575
> > > R-sq.(adj) =  1   Deviance explained =  100%
> > > UBRE score = -0.81413  Scale est. = 1  n = 148
> > >
> > > Is this a perfect fit with no statistical significance, an
> > > over-estimation, or what? It seems that the significance of the
> > > smooth terms is null. Of course, with such a model I predict
> > > presence/absence of the species perfectly.
> > >
> > > Again, I hope you don't mind my asking you this. Any explanation
> > > will be very much appreciated.
> >
> > Look at the data. You can get a perfect fit to a logistic regression
> > model fairly easily, and it looks as though you've got one. (In fact,
> > the huge intercept suggests that all predictions will be 1. Do you
> > actually have any variation in the data?)
> >
> > Duncan Murdoch
 
 