Re: [R-sig-eco] low predicted vales in GAMs (Anna Renwick)

2009-12-12 Thread Highland Statistics Ltd.



--

Message: 1
Date: Fri, 11 Dec 2009 11:43:40 -
From: Anna Renwick anna.renw...@bto.org
Subject: [R-sig-eco] low predicted vales in GAMs
To: r-sig-ecology@r-project.org
Message-ID: bfd6df2c5ca142c58c272652fa017...@btodomain.bto.org
Content-Type: text/plain

Dear All

I have come across a problem with the GAM models I am running. Basically the
predicted values are consistently only about 0.4 of the actual values.

A bit more detail:

MODEL:

m4 <- gam(count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
              cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
              offset(log(fit.vec)),
          weights = wt,
          data = spat6,
          family = quasipoisson,
          start = rep(0, 26))

MODEL SUMMARY:

Family: quasipoisson
Link function: log

Formula:
count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
    cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
    offset(log(fit.vec))

Parametric coefficients:
                 Estimate Std. Error   t value Pr(>|t|)
(Intercept)    -5.296e+00  1.846e+00    -2.869 0.004166 **
ezM             1.651e+00  2.102e+00     0.785 0.432397
ezP             7.358e+00  2.047e+00     3.595 0.000332 ***
ezU            -1.061e+02  1.064e+07 -9.97e-06 0.92
cv01            7.405e-02  5.437e-03    13.620 <2e-16   ***
cv03            2.258e-02  5.145e-03     4.389 1.20e-05 ***
cv04            2.878e-02  4.839e-03     5.949 3.18e-09 ***
cv05            3.634e-02  5.326e-03     6.823 1.17e-11 ***
cv07            2.370e-02  5.712e-03     4.149 3.48e-05 ***
mtemp          -1.838e-01  1.750e-01    -1.050 0.293900
mtotalrain      1.872e-02  5.072e-03     3.692 0.000229 ***
ezM:mtemp       6.181e-02  2.204e-01     0.280 0.779197
ezP:mtemp      -7.028e-01  2.050e-01    -3.429 0.000619 ***
ezU:mtemp       8.697e-01  1.371e+06  6.34e-07 0.99
ezM:mtotalrain -3.393e-02  5.799e-03    -5.851 5.68e-09 ***
ezP:mtotalrain -1.901e-02  5.379e-03    -3.535 0.000417 ***
ezU:mtotalrain  3.510e-02  4.074e+04  8.62e-07 0.99
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Ref.df     F p-value
s(east,north) 8.736  8.736 28.88 <2e-16  ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.324   Deviance explained = -5.12e+03%
GCV score = 39.556  Scale est. = 39.056    n = 2038

 

 


Count = bird counts/square

Is this really an integer?

ez = environmental zone
cv = habitat types
mtemp = mean annual temperature
mtotalrain = mean total rain/year

Sample size is approximately 2000.

The offset fit.vec is bird detectability and the weighting is based on the
number of squares in each area surveyed. I believe that the strange deviance
explained is due to the weighting we have added to the model.

  
Why would you use a weighting factor in a Poisson/quasi-Poisson GLM/GAM?
See also the description of the weights argument in the help file for glm.
I am not sure what it would be doing here.
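
[Editorial sketch] One way to see what prior weights do in this setting: in a
Poisson GLM with a log link and an intercept, the fitted values satisfy
sum(w * (y - fitted)) = 0, so the weighted totals match, while the unweighted
ratio of fitted to observed totals can move well away from 1. The example
below illustrates this on simulated data with base R's glm(). The variable
names (x, wt, y) and the data-generating process are assumptions made purely
for illustration; they are not the data or weights from the original post.

```r
# Simulated data: hypothetical weights wt, with mean counts that happen to be
# lower where the weight is higher (an assumption made to show the effect).
set.seed(1)
n  <- 500
x  <- runif(n)
wt <- sample(1:10, n, replace = TRUE)
y  <- rpois(n, lambda = 5 * exp(0.5 + x) / wt)

fit_unweighted <- glm(y ~ x, family = poisson)
fit_weighted   <- glm(y ~ x, family = poisson, weights = wt)

# Unweighted fit: fitted and observed totals agree
ratio_unweighted <- sum(fitted(fit_unweighted)) / sum(y)

# Weighted fit: only the *weighted* totals agree ...
ratio_weighted_w <- sum(wt * fitted(fit_weighted)) / sum(wt * y)

# ... while the plain ratio of fitted to observed totals can be far from 1
ratio_weighted <- sum(fitted(fit_weighted)) / sum(y)

print(c(unweighted        = ratio_unweighted,
        weighted_w_totals = ratio_weighted_w,
        weighted_plain    = ratio_weighted))
```

Whether this mechanism explains the 0.4 ratio in the original model cannot be
told from the posted output alone; comparing the weighted and unweighted
ratios on the real data would be a quick check.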


 


I would have assumed that the predicted values divided by the real counts
should be around 1, however they are much lower and hence the model is
consistently predicting lower counts than were observed. I was wondering if
there is anything obvious which I am missing when carrying out these models.

  


You seem to have very large overdispersion (the scale estimate is about 39),
but that is another problem. I think your number of squares should actually
go into the offset (log-transformed, obviously).
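
[Editorial sketch] A minimal illustration of putting survey effort into the
offset, assuming the response is the total count per area and n_sq is the
number of squares surveyed in that area. All names (n_sq, x, dat, m_off) and
the simulated data are hypothetical; the original model also has a
detectability offset and further covariates that are left out here.

```r
library(mgcv)  # recommended package shipped with standard R installations

set.seed(42)
n    <- 400
x    <- runif(n)
n_sq <- sample(1:20, n, replace = TRUE)   # hypothetical squares surveyed per area
rate <- exp(0.3 + sin(2 * pi * x))        # true mean count per square
y    <- rpois(n, lambda = n_sq * rate)    # total count per area
dat  <- data.frame(y = y, x = x, n_sq = n_sq)

# Effort enters as an offset: log(E[y]) = log(n_sq) + f(x),
# so f(x) describes the log mean count per square.
m_off <- gam(y ~ s(x) + offset(log(n_sq)), family = quasipoisson, data = dat)

# An offset given inside the formula is included by predict(), so these are
# expected total counts for the observed effort.
pred_total <- as.numeric(predict(m_off, type = "response"))

# Expected count per square: predict with the effort set to one square.
pred_rate <- as.numeric(predict(m_off, newdata = transform(dat, n_sq = 1),
                                type = "response"))

# Fitted and observed totals should agree closely for this model
ratio <- sum(pred_total) / sum(dat$y)
print(ratio)
```

Note that an offset passed through the separate offset argument of gam(),
rather than inside the formula, is not used by predict(); keeping it in the
formula avoids that source of mismatch between fitted and predicted values.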


Alain

 


Many thanks,

Anna

 


Dr Anna R. Renwick
Research Ecologist
British Trust for Ornithology, 
The Nunnery, 
Thetford, 
Norfolk, 
IP24 2PU, 
UK
Tel: +44 (0)1842 750050; Fax: +44 (0)1842 750030 

 



[[alternative HTML version deleted]]



--

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


End of R-sig-ecology Digest, Vol 21, Issue 12
*

  



--


Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9


3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


Other books: http://www.highstat.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highs...@highstat.com
URL: www.highstat.com
URL: www.brodgar.com


Re: [R-sig-eco] low predicted vales in GAMs

2009-12-12 Thread Nicholas Lewin-Koh
Hi Anna,

A couple of thoughts:

Did you try fitting a straight Poisson model? The quasi-Poisson model allows
the variance to be a multiple of the mean rather than equal to it, and that
may be interacting with your weighting function. Also, how exactly are the
weights defined? Is it (# squares counted)/(total possible squares)? Or can
the weights be greater than 1?

Did you try fitting without the weights and the offset?

And lastly, are your bird counts highly clustered, i.e. lots of zeros and then
some high counts? If so, the model will probably oversmooth the high counts.
Since this is a spatial model you might want to look at geoRglm, or try
COZIGAM and model the zero inflation (if that is the case).
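
[Editorial sketch] The checks suggested above can be run in a few lines. The
example uses base R's glm() on simulated overdispersed counts; the data and
all object names are assumptions for illustration only. It shows that Poisson
and quasi-Poisson fits share the same coefficients, that the quasi-Poisson
standard errors are inflated by the square root of the Pearson dispersion, and
how to compare the observed share of zeros with what a Poisson fit expects.

```r
set.seed(7)
n <- 1000
x <- runif(n)
# Negative binomial counts: overdispersed, with many zeros (illustrative only)
y <- rnbinom(n, mu = exp(1 + x), size = 0.5)

fit_p  <- glm(y ~ x, family = poisson)
fit_qp <- glm(y ~ x, family = quasipoisson)

# Same point estimates under both families
coef_diff <- max(abs(coef(fit_p) - coef(fit_qp)))

# Pearson dispersion estimate; values far above 1 indicate overdispersion
phi <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)

# Quasi-Poisson standard errors are the Poisson ones times sqrt(phi)
se_ratio <- sqrt(diag(vcov(fit_qp))) / sqrt(diag(vcov(fit_p)))

# Share of zeros: observed versus expected under the fitted Poisson model
obs_zero <- mean(y == 0)
exp_zero <- mean(dpois(0, lambda = fitted(fit_p)))

print(c(phi = phi, obs_zero = obs_zero, exp_zero = exp_zero))
```

A large gap between obs_zero and exp_zero is only a rough indication of zero
inflation, since overdispersion alone also produces extra zeros.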

Hope this helps
Nicholas
