
Re: [R] analyzing binomial data with spatially correlated errors

2008-03-20 Thread Ben Bolker

Douglas Bates wrote:
| On Wed, Mar 19, 2008 at 3:02 PM, Ben Bolker <[EMAIL PROTECTED]> wrote:
|> Jean-Baptiste Ferdy  univ-montp2.fr> writes:
|>
|>  >
|>  > Dear R users,
|>  >
|>  > I want to explain binomial data by a series of fixed effects. My problem is
|>  > that my binomial data are spatially correlated. Naively, I thought I could
|>  > find something similar to gls to analyze such data. After some reading, I
|>  > decided that lmer is probably the tool I need. The model I want to fit would
|>  > look like
|>  >
|>  > lmer ( cbind(n.success,n.failure) ~ (x1 + x2 + ... + xn)^2 , family=binomial,
|>  > correlation=corExp(1,form=~longitude+latitude))
|>  >

| This is more than a notational difference.  In a linear model the
| effect of b is limited to the linear predictor and, through that, the
| mean.  The variance-covariance specification can be separated from the
| mean and, hence, can be specified separately.  It is easy to fool
| yourself into thinking that the same should be true for generalized
| linear models, just like it is easy to fool yourself into thinking
| that all the arguments for the lme function will work unchanged in
| lmer.

  Fair enough.  I guess the model I was thinking of was

  Y ~ Binomial(p,N)
  logit(p) ~ MVN(mu,Sigma)
  mu = (determined by model matrix and predictors)
  Sigma = (exponential spatial correlation matrix)*sigma^2

  This model is certainly different from the model that the original
poster may have been thinking of, because in the limit where there
is no extra-binomial variation, there can't be any correlation either.
On the other hand, it seems to be a sensible model.
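
  To make the structure of that model concrete, here is a minimal
simulation sketch in base R.  It only generates data from the model;
it does not fit anything.  The sample size, the range parameter 0.3,
sigma^2 = 1, the single covariate x1, the coefficients and N = 20
trials per site are all assumptions picked for illustration.

```r
set.seed(42)

n <- 50                                   # number of sites (assumed)
coords <- cbind(longitude = runif(n), latitude = runif(n))
D <- as.matrix(dist(coords))              # pairwise distances between sites

range_par <- 0.3                          # assumed correlation range
sigma2    <- 1                            # assumed variance sigma^2
Sigma <- sigma2 * exp(-D / range_par)     # exponential spatial covariance

X    <- cbind(1, x1 = rnorm(n))           # model matrix: intercept + one covariate
beta <- c(-0.5, 1)                        # assumed coefficients
mu   <- drop(X %*% beta)                  # mu determined by model matrix

# logit(p) ~ MVN(mu, Sigma): chol() returns upper-triangular R with R'R = Sigma,
# so mu + R' z has covariance Sigma when z is standard normal
eta <- mu + drop(t(chol(Sigma)) %*% rnorm(n))
p   <- plogis(eta)

# Y ~ Binomial(p, N)
N <- rep(20, n)
n.success <- rbinom(n, size = N, prob = p)
n.failure <- N - n.success

sim <- data.frame(coords, x1 = X[, "x1"], n.success, n.failure)
head(sim)
```

Setting sigma2 to 0 collapses eta to mu, which is the limit mentioned
above: no extra-binomial variation and therefore no correlation.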

  cheers
    Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] analyzing binomial data with spatially correlated errors

2008-03-20 Thread Douglas Bates

On Wed, Mar 19, 2008 at 3:02 PM, Ben Bolker <[EMAIL PROTECTED]> wrote:
> Jean-Baptiste Ferdy  univ-montp2.fr> writes:
>
>  >
>  > Dear R users,
>  >
>  > I want to explain binomial data by a series of fixed effects. My problem is
>  > that my binomial data are spatially correlated. Naively, I thought I could
>  > find something similar to gls to analyze such data. After some reading, I
>  > decided that lmer is probably the tool I need. The model I want to fit would
>  > look like
>  >
>  > lmer ( cbind(n.success,n.failure) ~ (x1 + x2 + ... + xn)^2 , family=binomial,
>  > correlation=corExp(1,form=~longitude+latitude))
>  >
>  > This doesn't work because lmer says it needs a random effect in the model.
>  > And, apart from the spatial random effect that I want to capture by computing
>  > the correlation matrix, I have no other random effect.
>  >
>  > There must be something I do not understand here... I can't get why gls can do
>  > this on gaussian data but lmer can't on binomial ones.
>  >
>
>   This is a hard problem.  The proximal issue is that lmer does not yet
>  include a correlation term (I'm a little surprised you didn't get an
>  error to that effect), and won't for some time since it is still in heavy
>  development for more basic features.

The lmer function does have a ... argument that will swallow up the
correlation argument (and proceed to ignore it).  I think there are
advantages to including a ... argument but one of the disadvantages is
this quietly ignoring arguments that are not defined in the function
(and are not mentioned in any documentation about the function).
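
A small sketch of that behaviour, using a made-up function rather than
lmer itself.  fit_sketch and fit_strict are hypothetical names
introduced only for this illustration, and the second function shows
one possible way to refuse unknown arguments; neither reflects how
lmer is actually written.

```r
# Hypothetical stand-in for a fitting function with a ... argument
fit_sketch <- function(formula, family = "gaussian", ...) {
  # anything captured by ... is never inspected, so it is silently dropped
  list(formula = formula, family = family)
}

res <- fit_sketch(y ~ x, family = "binomial", correlation = "corExp")
names(res)    # no trace of 'correlation', and no error or warning

# A stricter variant that complains about arguments it does not know
fit_strict <- function(formula, family = "gaussian", ...) {
  extra <- list(...)
  if (length(extra) > 0)
    stop("unused argument(s): ", paste(names(extra), collapse = ", "))
  list(formula = formula, family = family)
}

msg <- tryCatch(fit_strict(y ~ x, correlation = "corExp"),
                error = function(e) conditionMessage(e))
msg
```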

> If your data were normal you could
>  use gls from the nlme package, but nlme doesn't do generalized LMMs
>  (only LMMs and NLMMs).  You could *almost* use glmmPQL from the MASS package,
>  which allows you to fit any lme model structure
>  within a GLM 'wrapper', but as far as I know it wraps only lme (
>  which requires at least one random effect) and not gls.

I don't think it is at all trivial to define a model for a binary or
binomial response with a spatial correlation structure.  It may be
well-known; it's just that I don't know how to do it easily.  I just
saw the post by Roger Bivand, who gave a reference on one approach.
(Usually, though, when one finds oneself using a group factor with
only one level to define the random effects, it is a sign that
something suspicious is underway.  Expecting to estimate a variance
from a single observation should raise a few red flags.)

The way Jose and I approached correlation structures in lme is to
pre-whiten the data and the model matrices, conditional on the
parameters that determine the (additional) marginal correlation of the
responses.  That works when the conditional distribution of the
response is normal.  I don't see how it could work for a binary or
binomial response.  A linear combination of normals is normal.  A
non-trivial linear combination of binomials is not binomial.
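
For the normal case, the pre-whitening idea can be sketched in a few
lines of base R.  This is an illustration only, not the code used in
lme: the correlation parameter is treated as known (an assumed range of
0.3 in an exponential correlation), and the simulated data and
coefficients are made up.

```r
set.seed(1)

n <- 40
coords <- cbind(runif(n), runif(n))
Sigma  <- exp(-as.matrix(dist(coords)) / 0.3)   # correlation, range assumed known

X <- cbind(1, rnorm(n))                          # n by p model matrix
L <- t(chol(Sigma))                              # lower-triangular, L L' = Sigma
y <- drop(X %*% c(2, -1) + L %*% rnorm(n))       # y = X b + e, e ~ N(0, Sigma)

# Pre-whiten: multiply y and X by L^{-1}; the transformed errors are
# uncorrelated, so ordinary least squares applies
y_w <- forwardsolve(L, y)
X_w <- forwardsolve(L, X)
b_whitened <- qr.coef(qr(X_w), y_w)

# Direct generalized least squares, for comparison
Sinv  <- solve(Sigma)
b_gls <- solve(t(X) %*% Sinv %*% X, t(X) %*% Sinv %*% y)

all.equal(unname(drop(b_whitened)), unname(drop(b_gls)))
```

The step that has no counterpart for a binomial response is the
transformation of y: L^{-1} y is again normal here, whereas the same
linear combination of binomial counts is not binomial.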

In writing lmer I have found that I must think about the model very
carefully before I can determine a suitable computational method.  I
spent most of the month of January staring at a sequence of
transformations on the whiteboard in my office trying to determine
what should go where and how to implement the whole chain in data
structures and algorithms.

The normal distribution and linear predictors fit together in such a
way that one can factor out correlation structures or variance
functions.  Get away from the normal distribution and things start to
break.

Think of the typical way in which we write a linear model:

y = X b + e

where y is an n-dimensional response, X is an n by p model matrix, b
is a p-dimensional coefficient vector to be estimated and e is the
"noise" term with a multivariate normal distribution that has mean 0.
Now, try to write a generalized linear model that way.  It doesn't
work.  You must express a GLM differently, relating the mean to the
linear predictor, and taking into account the way the variance relates
to the mean.

This is more than a notational difference.  In a linear model the
effect of b is limited to the linear predictor and, through that, the
mean.  The variance-covariance specification can be separated from the
mean and, hence, can be specified separately.  It is easy to fool
yourself into thinking that the same should be true for generalized
linear models, just like it is easy to fool yourself into thinking
that all the arguments for the lme function will work unchanged in
lmer.

>   You could try gee or geoRglm -- neither trivially easy, I think ...
>
>   Ben Bolker

Re: [R] analyzing binomial data with spatially correlated errors

2008-03-20 Thread Rubén Roa-Ureta

Roger Bivand wrote:
> Ben Bolker  ufl.edu> writes:
> 
>> Jean-Baptiste Ferdy  univ-montp2.fr> writes:
>>
>>> Dear R users,
>>>
>>> I want to explain binomial data by a series of fixed effects. My 
>>> problem is that my binomial data are spatially correlated. Naively, 
>>> I thought I could find something similar to gls to analyze such
>>> data. After some reading, I decided that lmer is probably the tool
>>> I need. The model I want to fit would look like
>>>
> (...)
>> You could *almost* use glmmPQL from the MASS package,
>> which allows you to fit any lme model structure
>> within a GLM 'wrapper', but as far as I know it wraps only lme (
>> which requires at least one random effect) and not gls.
>>
> 
> The trick used in:
> 
> Dormann, C. F., McPherson, J. M., Araujo, M. B., Bivand, R.,
> Bolliger, J., Carl, G., Davies, R. G., Hirzel, A., Jetz, W., 
> Kissling, W. D., Kühn, I., Ohlemüller, R., Peres-Neto, P. R., 
> Reineking, B., Schröder, B., Schurr, F. M. & Wilson, R. J. (2007): 
> Methods to account for spatial autocorrelation in the analysis of 
> species distributional data: a review. Ecography 30: 609–628
> 
> (see online supplement), is to add a constant term "group", and set 
> random=~1|group. The specific use with a binomial family there is for 
> a (0,1) response, rather than a two-column matrix. 
> 
>>   You could try gee or geoRglm -- neither trivially easy, I think ...
> 
> The same paper includes a GEE adaptation, but for a specific spatial
> configuration rather than a general one.
> 
> Roger Bivand
> 
>>   Ben Bolker

I suggest you also check out the package geoRglm, where you can model 
binomial and Poisson spatially correlated data. I used it to model 
spatially correlated binomial data but without covariates, i.e. without 
your fixed effects (so my model was a logistic regression with the 
intercept only) (Reference below). But I understand that you can add 
covariates and use them to estimate the non-random set of predictors. 
Here is the geoRglm webpage:
http://www.daimi.au.dk/~olefc/geoRglm/
This approach would be like tackling the problem from the point of view 
of geostatistics, rather than from mixed models. But I believe the 
likelihood-based geostatistical model is the same as a generalized 
linear mixed model where the distance is the random effect.
In SAS you can do this using the macro glimmix but from the point of 
view of generalized linear mixed models because there they have 
implemented a correlation term, so that you can identify typical spatial 
correlation functions such as Gauss and exponential, particular cases of 
the Matern family.

Rubén

Roa-Ureta, R. and E.N. Niklitschek (2007) Biomass estimation from 
surveys with likelihood-based geostatistics. ICES Journal of Marine 
Science 64:1723-1734


Re: [R] analyzing binomial data with spatially correlated errors

2008-03-20 Thread Roger Bivand

Ben Bolker  ufl.edu> writes:

> 
> Jean-Baptiste Ferdy  univ-montp2.fr> writes:
> 
> > 
> > Dear R users,
> > 
> > I want to explain binomial data by a series of fixed effects. My 
> > problem is that my binomial data are spatially correlated. Naively, 
> > I thought I could find something similar to gls to analyze such
> > data. After some reading, I decided that lmer is probably the tool
> > I need. The model I want to fit would look like
> > 
(...)
> You could *almost* use glmmPQL from the MASS package,
> which allows you to fit any lme model structure
> within a GLM 'wrapper', but as far as I know it wraps only lme (
> which requires at least one random effect) and not gls.
> 

The trick used in:

Dormann, C. F., McPherson, J. M., Araujo, M. B., Bivand, R.,
Bolliger, J., Carl, G., Davies, R. G., Hirzel, A., Jetz, W., 
Kissling, W. D., Kühn, I., Ohlemüller, R., Peres-Neto, P. R., 
Reineking, B., Schröder, B., Schurr, F. M. & Wilson, R. J. (2007): 
Methods to account for spatial autocorrelation in the analysis of 
species distributional data: a review. Ecography 30: 609–628

(see online supplement), is to add a constant term "group", and set 
random=~1|group. The specific use with a binomial family there is for 
a (0,1) response, rather than a two-column matrix. 
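
A minimal sketch of that trick on simulated (0,1) data, assuming the
MASS and nlme packages are installed.  The data, the covariate x1, the
column name presence and the range 0.2 used for simulation are made up
for illustration; this is not the code from the supplement.  The call
is wrapped in tryCatch because a fit with a single-level grouping
factor may fail to converge.

```r
library(MASS)   # glmmPQL
library(nlme)   # corExp

set.seed(123)
n <- 150
d <- data.frame(longitude = runif(n), latitude = runif(n), x1 = rnorm(n))

# Simulate a spatially correlated linear predictor (assumed range 0.2)
Sigma <- exp(-as.matrix(dist(d[, c("longitude", "latitude")])) / 0.2)
eta   <- 0.8 * d$x1 + drop(t(chol(Sigma)) %*% rnorm(n))
d$presence <- rbinom(n, size = 1, prob = plogis(eta))

# The trick: a constant grouping factor so that lme gets its random effect
d$group <- factor(rep("a", n))

fit <- tryCatch(
  glmmPQL(presence ~ x1, random = ~ 1 | group, family = binomial, data = d,
          correlation = corExp(form = ~ longitude + latitude),
          verbose = FALSE),
  error = function(e) e
)

if (inherits(fit, "error")) {
  message("fit failed: ", conditionMessage(fit))
} else {
  print(fixef(fit))   # fixed-effect estimates on the logit scale
}
```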

>   You could try gee or geoRglm -- neither trivially easy, I think ...

The same paper includes a GEE adaptation, but for a specific spatial
configuration rather than a general one.

Roger Bivand

> 
>   Ben Bolker
>


Re: [R] analyzing binomial data with spatially correlated errors

2008-03-19 Thread Ben Bolker

Jean-Baptiste Ferdy  univ-montp2.fr> writes:

> 
> Dear R users,
> 
> I want to explain binomial data by a series of fixed effects. My problem is 
> that my binomial data are spatially correlated. Naively, I thought I could 
> find something similar to gls to analyze such data. After some reading, I 
> decided that lmer is probably the tool I need. The model I want to fit would 
> look like
> 
> lmer ( cbind(n.success,n.failure) ~ (x1 + x2 + ... + xn)^2 , family=binomial, 
> correlation=corExp(1,form=~longitude+latitude))
> 
> This doesn't work because lmer says it needs a random effect in the model. 
> And, apart from the spatial random effect that I want to capture by computing 
> the correlation matrix, I have no other random effect.
> 
> There must be something I do not understand here... I can't get why gls can do
> this on gaussian data but lmer can't on binomial ones.
> 

  This is a hard problem.  The proximal issue is that lmer does not yet 
include a correlation term (I'm a little surprised you didn't get an 
error to that effect), and won't for some time since it is still in heavy
development for more basic features.  If your data were normal you could 
use gls from the nlme package, but nlme doesn't do generalized LMMs 
(only LMMs and NLMMs).  You could *almost* use glmmPQL from the MASS package,
which allows you to fit any lme model structure
within a GLM 'wrapper', but as far as I know it wraps only lme (
which requires at least one random effect) and not gls.

  You could try gee or geoRglm -- neither trivially easy, I think ...

  Ben Bolker


[R] analyzing binomial data with spatially correlated errors

2008-03-19 Thread Jean-Baptiste Ferdy

Dear R users,

I want to explain binomial data by a series of fixed effects. My problem is 
that my binomial data are spatially correlated. Naively, I thought I could 
find something similar to gls to analyze such data. After some reading, I 
decided that lmer is probably the tool I need. The model I want to fit would 
look like

lmer ( cbind(n.success,n.failure) ~ (x1 + x2 + ... + xn)^2 , family=binomial, 
correlation=corExp(1,form=~longitude+latitude))

This doesn't work because lmer says it needs a random effect in the model. 
And, apart from the spatial random effect that I want to capture by computing 
the correlation matrix, I have no other random effect.

There must be something I do not understand here... I can't get why gls can do 
this on gaussian data but lmer can't on binomial ones.

Any help or thought on this would be welcome !
-- 
Jean-Baptiste Ferdy
Institut des Sciences de l'Évolution de Montpellier
CNRS UMR 5554
Université Montpellier 2
34 095 Montpellier cedex 05
tel. +33 (0)4 67 14 42 27
fax  +33 (0)4 67 14 36 22

