Please provide a complete reproducible example. I appeal to everyone posting questions to give potential helpers something to work on. A request for a reproducible example is by far the most common response to postings that lack one, if they get any response at all.

Start with this and work backwards until you can reproduce your misunderstanding:

library(sf)          # provides st_read()
library(spatialreg)  # provides lagsarlm() and the predict method
col <- st_read(system.file("shapes/columbus.shp", package="spData"))
train <- col[col$EW == 1,]
test <- col[col$EW == 0,]
col.nb <- spdep::poly2nb(col)
train.nb <- spdep::poly2nb(train)
test.nb <- spdep::poly2nb(test)
attr(col.nb, "region.id")
attr(train.nb, "region.id")
attr(test.nb, "region.id")
train.mod <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
  listw=spdep::nb2listw(train.nb))
try(preds <- predict(train.mod, newdata=test,
  listw=spdep::nb2listw(test.nb)))
preds[2]
try(preds1 <- predict(train.mod, newdata=col,
  listw=spdep::nb2listw(col.nb)))
# warning
preds1[4]
try(preds2 <- predict(train.mod, newdata=test,
  listw=spdep::nb2listw(col.nb)))
preds2[2]

Using the complete set of weights permits the spatial process to flow between neighbouring members of train/test sets.
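A toy illustration of that point, as a base R sketch only: the five-region chain and the train/test split below are made up for the purpose, and the adjacency matrix merely stands in for a real neighbour object. Subsetting before building neighbours drops every link that crosses the split.

```r
# Made-up chain of five regions: 1-2-3-4-5 (not the columbus data)
n <- 5
adj <- matrix(0, n, n, dimnames = list(1:n, 1:n))
for (i in seq_len(n - 1)) {
  adj[i, i + 1] <- 1
  adj[i + 1, i] <- 1
}
in_train <- c(TRUE, TRUE, TRUE, FALSE, FALSE)  # assumed split

# Each link appears twice in a symmetric matrix, hence the division by 2
links_all   <- sum(adj) / 2
links_train <- sum(adj[in_train, in_train]) / 2
links_test  <- sum(adj[!in_train, !in_train]) / 2
# The off-diagonal block holds each crossing link exactly once
links_cross <- sum(adj[in_train, !in_train])

c(all = links_all, train = links_train, test = links_test, cross = links_cross)
```

Weights built separately from the train and test subsets know nothing about the crossing link between regions 3 and 4; weights built from the whole data keep it.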

Your problem is probably that your two data objects do not use row.names as expected:

attr(test.nb, "region.id") <- as.character(1:length(test.nb))
attr(train.nb, "region.id") <- as.character(1:length(train.nb))
train.mod1 <- lagsarlm(CRIME ~ INC + HOVAL, data=train,
  listw=spdep::nb2listw(train.nb))
try(preds3 <- predict(train.mod, newdata=test,
  listw=spdep::nb2listw(test.nb)))
# Error in predict.sarlm(train.mod, newdata = test, listw = spdep::nb2listw(test.nb)) :
#   mismatch between newdata and spatial weights. newdata should have
#   region.id as row.names

This error is to be expected: the "region.id" attribute of test.nb no longer matches the row.names of test. When the predict method tries to assign the newdata neighbours, it needs to identify the correct rows in newdata from the "region.id" attribute of the weights provided; here it cannot, so it fails as described.
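A rough way to check for this before calling predict, as a base R sketch: ids_cover() is a hypothetical helper of my own naming, not a function in spdep or spatialreg, and the test it applies (every row name of newdata must be found among the region.id values) is my assumption about what the error message asks for, not a copy of the internal check.

```r
# Hypothetical helper, not part of spdep or spatialreg: are all row names
# of newdata present in the "region.id" attribute of the weights?
ids_cover <- function(region_id, newdata_row_names) {
  all(as.character(newdata_row_names) %in% as.character(region_id))
}

# Toy stand-in: a subset keeps the row names it had in the full data set
newdata_rows <- c("3", "7", "12")

ok_ids  <- ids_cover(c("3", "7", "12"), newdata_rows)   # region.id from row.names
bad_ids <- ids_cover(as.character(1:3), newdata_rows)   # region.id reset to 1:n
c(ok = ok_ids, bad = bad_ids)
```

With the objects from the example, the equivalent call would be ids_cover(attr(test.nb, "region.id"), row.names(test)).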

When predicting for the test set, pass the weights for the whole data set as listw= together with newdata=test. If instead the two graphs do not neighbour each other, that is train.nb is separate from test.nb (think two islands), make sure that the region.ids and row.names do not overlap between the test and train sets.
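For the two-islands case, one possible way to keep the identifiers apart is sketched below in base R. The two small data frames and the "train_"/"test_" prefixes are made up for illustration; any scheme that gives unique row.names would do.

```r
# Made-up stand-ins for two separately read data sets; each gets the
# default row.names "1", "2", ... and so the identifiers collide
train_df <- data.frame(y = c(1.2, 0.7, 3.1))
test_df  <- data.frame(y = c(2.4, 1.9))
overlap_before <- intersect(row.names(train_df), row.names(test_df))

# Assumed fix: prefix the row.names so that no identifier is shared
row.names(train_df) <- paste0("train_", row.names(train_df))
row.names(test_df)  <- paste0("test_", row.names(test_df))
overlap_after <- intersect(row.names(train_df), row.names(test_df))

length(overlap_before)  # shared identifiers before renaming
length(overlap_after)   # shared identifiers after renaming
```

The neighbour objects should then be built (or rebuilt) after the renaming, so that their "region.id" attributes follow the new row.names; spdep::poly2nb() also accepts a row.names= argument for this purpose.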

Please use the example to explore the problem in your workflow, (re-)read Goulard et al. (2017) and the help page, and report back. Remember that you can only predict for a test set of reasonable size, because, as you will see from the underlying article, the spatial lag model case probably needs an inverted n x n matrix.
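To give a feel for the size constraint, here is a base R sketch under stated assumptions: the four-region chain, the value rho = 0.5 and the helper dense_gb() are all made up for illustration, and the inverse shown is the one from the reduced form of the spatial lag model, y = (I - rho W)^{-1} (X beta + e), not necessarily the exact computation done by the predict method.

```r
# Made-up row-standardised weights for a chain of four regions
n <- 4
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)

rho <- 0.5                           # assumed spatial coefficient
A_inv <- solve(diag(n) - rho * W)    # dense n x n inverse
dim(A_inv)

# Hypothetical helper: storage for one dense n x n matrix of doubles, in GiB
dense_gb <- function(n) 8 * n^2 / 1024^3
dense_gb(c(1e3, 1e4, 1e5))
```

The inverse is dense even when W is sparse, so storage grows with the square of n, which is why the number of observations has to stay moderate.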

Hope this clarifies

Roger




On Mon, 8 Jul 2019, Jiawen Ng wrote:

Another question on predict.sarlm!

Here is the line of code that is producing the error:
pred <- spatialreg::predict.sarlm(model, df, test.listw,zero.policy = T)

Here is the error:

Error in mat2listw(W, row.names = region.id.mixed, style = style) :
 non-unique row.names given
In addition: Warning messages:
1: In spatialreg::predict.sarlm(model, df, test.listw,  :
 some region.id are both in data and newdata
2: In subset(attr(listw.mixed, "region.id"), attr(listw.mixed, "region.id") %in%  :
 longer object length is not a multiple of shorter object length

Any idea how I can solve the non-unique row.names error?

Thank you!


_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: roger.biv...@nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
