Patrick,

Specification of the spatial weights matrix (W) is important, and, in general, the connectedness of W influences both estimation and inference in the model. When you say that you do not know the "true rho", I suspect you mean that you do not know the true underlying spatial structure of the data, and thus the appropriate specification of the spatial weights matrix. One tool in the spdep package that may be helpful to you is the sp.correlogram function, which computes a spatial correlogram; other techniques, including semivariograms, have also been used. I would be interested in what others have to say about determining the optimal level of connectedness of W.
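A minimal sketch of how sp.correlogram might be used to see how Moran's I changes across neighbour lag orders. The data are simulated, and the seed, the 15-unit distance band, rho = 0.8 and the maximum order of 4 are all illustrative assumptions, not recommendations:

```r
library(spdep)

set.seed(1)   # hypothetical seed, only so the sketch is repeatable
N <- 200
pts <- cbind(runif(N, 0, 100), runif(N, 0, 100))

# Distance-band neighbours; the 15-unit threshold is an assumption
dnb <- dnearneigh(pts, 0, 15)
lw <- nb2listw(dnb, style = "W", zero.policy = TRUE)

# Simulate y = (I - rho W)^{-1} e with a known rho (assumed value)
rho <- 0.8
y <- as.vector(invIrW(lw, rho) %*% rnorm(N))

# Moran's I at neighbour lag orders 1 to 4
cg <- sp.correlogram(dnb, y, order = 4, method = "I", style = "W",
                     zero.policy = TRUE)
print(cg)
```

Calling plot(cg) draws the correlogram. The lag order at which Moran's I falls towards its expectation gives a rough, informal indication of how far the dependence reaches, which may help when choosing a distance threshold for W.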
Two classic references regarding connectedness of W are:

Florax, R.J.G.M. and Rey, S. 1995. The Impacts of Misspecified Spatial Interaction in Linear Regression Models. In Anselin, L. and Florax, R.J.G.M. (eds), New Directions in Spatial Econometrics. Berlin: Springer, 111-135.

Bell, K.P. and Bockstael, N.E. 2000. Applying the Generalized-Moment Estimation Approach to Spatial Problems Involving Micro level Data. The Review of Economics and Statistics, February 2000, 82 (1): 72-82.

Terry Griffin, Ph.D.
Associate Professor - Economics
University of Arkansas - Division of Agriculture
501.249.6360 (SMS)
tgrif...@uaex.edu

----- Original Message -----
From: "Patrick Downey" <pdow...@urban.org>
To: "Roger Bivand" <roger.biv...@nhh.no>
Cc: r-sig-...@stat.math.ethz.ch
Sent: Tuesday, September 6, 2011 1:02:04 PM
Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data

Hi Roger and Terry,

Thank you very much for your help and directing me towards Roger's spdep package, which of course had everything I needed.

I've now worked through this code and done some additional simulations. I have one remaining question. You say "the larger the distance threshold, the less well the spatial process is captured." I was wondering if you could further provide some information on this, either by explaining or referencing a document or webpage with explanation. Decreasing the distance threshold, as you suggest, radically alters the results and I'm looking for some guidance on how to select the appropriate distance threshold when I don't know the true rho (that is, with non-simulated data).
Thanks,
Mitch

-----Original Message-----
From: Roger Bivand [mailto:roger.biv...@nhh.no]
Sent: Thursday, September 01, 2011 2:20 PM
To: Downey, Patrick
Cc: r-sig-...@stat.math.ethz.ch
Subject: Re: [R-sig-Geo] Simulating spatially autocorrelated data

On Thu, 1 Sep 2011, Downey, Patrick wrote:

> Hello all,
>
> I'm trying to simulate a spatially autocorrelated random variable, and
> I cannot figure out what the problem is. All I want is a simple
> spatial lag model where
>
> Y = rho*W*Y + e
>
> Where e is a vector of iid normal random variables, rho is the
> autocorrelation, W is a row-normalized distance matrix (a spatial
> weights matrix), and Y is the random variable.
>
> I thought the following program should do it, but it's not working. At
> the end of the program, I calculate Moran's I, and it is not even
> close to rejecting the null hypothesis of no spatial autocorrelation,
> even when rho is very high (for example, below, rho is 0.95). Can
> someone please identify what the problem is and offer some guidance on
> how to fix it?
>
> PS - I apologize in advance, but I am not familiar with R's spatial
> packages. I've done very little spatial analysis in R, so if there's a
> package that can already do this, please recommend.
>
> BEGIN PROGRAM:
>
> install.packages("fields");library(fields)
> install.packages("ape");library(ape)
>
> N <- 200
> rho <- 0.95
>
> x.coord <- runif(N,0,100)
> y.coord <- runif(N,0,100)
>
> points <- cbind(x.coord,y.coord)
>
> e <- rnorm(N,0,1)
>
> dist.nonnorm <- rdist(points,points) # Matrix of Euclidean distances
> dist <- dist.nonnorm/rowSums(dist.nonnorm) # Row normalizing the distance matrix
> diag(dist) <- 0 # Ensuring that the main diagonal is exactly 0

I think that you are using the distances themselves as weights, rather than inverse distances, which would seem more sensible.
>
> I <- diag(N) # Identity matrix (not Moran's I)
>
> inv <- solve(I-rho*dist) # Inverting (I - rho*W)
> y <- as.vector(inv %*% e) # Generating data that is supposed to be spatially autocorrelated
>
> Moran.I(y,dist) # Does not reject null hypothesis of no spatial autocorrelation
>

As Terry Griffin says, you can use spdep for this:

library(spdep)
rho <- 0.95
N <- 200
x.coord <- runif(N,0,100)
y.coord <- runif(N,0,100)
points <- cbind(x.coord,y.coord)
e <- rnorm(N,0,1)
dnb <- dnearneigh(points, 0, 150)
dsts <- nbdists(dnb, points)
idw <- lapply(dsts, function(x) 1/x)
lw <- nb2listw(dnb, glist=idw, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw)

This reproduces your analysis, but with inverse distance weights (IDW). Here it is without IDW, using the raw distances as weights:

lw <- nb2listw(dnb, glist=dsts, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw)
# no autocorrelation

and here with IDW and a less inclusive distance threshold:

dnb <- dnearneigh(points, 0, 15)
dsts <- nbdists(dnb, points)
idw <- lapply(dsts, function(x) 1/x)
lw <- nb2listw(dnb, glist=idw, style="W")
inv <- invIrW(lw, rho)
y <- inv %*% e
moran.test(y, lw)

The larger the distance threshold, the less well the spatial process is captured. Alternatively, use idw <- lapply(dsts, function(x) 1/(x^2)), for example, to attenuate the weights more sharply.

Hope this clarifies,

Roger

> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: roger.biv...@nhh.no