Alrighty then! Say I create this adaptive bandwidth model using the original dataset "georgia":

    library(spgwr)

    coords <- cbind(georgia$x, georgia$y)

    # Select the adaptive proportion by AIC
    bwsel <- gwr.sel(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov + PctBlack,
                     data = georgia, coords = coords, adapt = TRUE,
                     gweight = gwr.Gauss, method = "aic")

    # Adaptive bandwidths at the calibration points
    bw1 <- gw.adapt(coords, coords, quant = bwsel)

    model1 <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov + PctBlack,
                  data = georgia, coords = coords, bandwidth = bw1, hatmatrix = TRUE)
    model1

Suppose I receive an updated data set (same dependent and independent
variables) and I wish to test the above model1's ability to predict the
dependent variable of these new data points. If this were a basic lm
regression in R, I would use predict(). I would like to understand how to do
the same with a GWR model.

I found the procedure below, but I would like to know, first, whether it is
capable of accomplishing this task and, second, whether I am specifying it
correctly. It seems to me that this procedure, as it stands, doesn't take into
account the appropriate bandwidths for the new data, say, "georgiaNewData":

    PredictionsOfNewData <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov + PctBlack,
                                data = gSRDF, adapt = TRUE, gweight = gwr.Gauss,
                                method = "aic", bandwidth = bw1, predictions = TRUE,
                                fit.points = georgiaNewData)
    PredictionsOfNewData

Thanks in advance for guidance and insight...

On Fri, Aug 30, 2013 at 9:01 AM, Roger Bivand <roger.biv...@nhh.no> wrote:

> Provide a reproducible code example of your problem using a built-in data
> set. No reproducible example, no response, as I cannot guess (and likely
> nobody else can either) what your specific misunderstanding is. Write code
> using, for example, the Georgia data set in the package. You seem to be
> assuming that you understand how GWR works; I don't think that you do, so
> you have to show what you mean in code.
>
> Roger
>
> On Fri, 30 Aug 2013, Paul Bidanset wrote:
>
>> Roger,
>>
>> I think all I would like to know is whether it is possible to apply a
>> calibrated GWR model to a hold-out sample and, if so, what the most
>> accurate way to do so is. I understand the pitfalls of GWR but would like
>> to learn as much as I can before progressing to the next spatial
>> methodology I learn in R.
>>
>> On Fri, Aug 30, 2013 at 3:37 AM, Roger Bivand <roger.biv...@nhh.no> wrote:
>>
>>> Paul, Luis,
>>>
>>> I suspect that your speculations are completely wrong-headed. Please
>>> provide a reproducible example with a built-in data set, so that there is
>>> at least minimal clarity in what you are guessing. Note in addition that
>>> GWR as a technique should not be used for anything other than exploration
>>> of possible mis-specification in the underlying model with the given
>>> data, as patterning in coefficients is induced by GWR for simulated
>>> covariates with no pattern.
>>>
>>> Roger
>>>
>>> On Fri, 30 Aug 2013, Luis Guerra wrote:
>>>
>>>>> Thank you Luis. When calibrating the adaptive model, using adapt=TRUE in
>>>>> the bandwidth selection created the proportion you speak of, which then
>>>>> allowed me to create a bandwidth matrix using gw.adapt. However, this
>>>>> has not worked for me with holdout samples. Have you had success in this
>>>>> regard?
>>>>
>>>> Now I get what you mean. Let's show an example:
>>>>
>>>>     bw <- gwr.sel(var ~ var1, data=yourdata, adapt=TRUE)
>>>>     m <- gwr(var ~ var1, data=yourdata, adapt=bw, fit.points=newdata)
>>>>
>>>> So an adaptive bandwidth (bw) is calculated based on "yourdata", while
>>>> you are fitting "newdata" later on using that previously found bw. I had
>>>> not thought about it previously. Let's see whether someone else can help
>>>> you (us).
>>>>
>>>>> I do not know the intended influence of these "fit.points".
>>>>> I would think that new localized regressions are not calculated, as
>>>>> we're testing the ability of the model and the previous data points to
>>>>> predict for these new ones, but I could be wrong. My current method,
>>>>> however, is producing much poorer results with the holdouts, which I am
>>>>> fairly sure is related to my inability to incorporate the bandwidths
>>>>> needed for the new points.
>>>>
>>>> Coming back to the previously created example, imagine that "newdata" is
>>>> a single point that you want to fit. Imagine now that "yourdata" is a
>>>> sample with 1000 cases. Then you are getting 1000 models with 1000
>>>> different intercepts and 1000 different beta values to adjust var1,
>>>> right? Which of all these parameters do you use for fitting "newdata"?
>>>> And something else: what would happen with "newdata" if it is far enough
>>>> away from "yourdata" and we were using a fixed bandwidth?
>>>>
>>>>> On Aug 29, 2013 8:56 PM, "Luis Guerra" <luispelay...@gmail.com> wrote:
>>>>>
>>>>>> Dear Paul,
>>>>>>
>>>>>> I am dealing with this kind of problem right now and, if I am not
>>>>>> wrong, when you want to apply an adaptive bandwidth, you should supply
>>>>>> a value for the "adapt" parameter instead of the "bandwidth" parameter.
>>>>>> This value lies between 0 and 1 and indicates the proportion of cases
>>>>>> around your regression point that should be included to estimate each
>>>>>> local model. So, depending on the number of points around each case,
>>>>>> the model will use a different bandwidth for each point to be fitted.
>>>>>>
>>>>>> Related to your question, do you know what influence the data supplied
>>>>>> in the "data" parameter has on the data to be fitted (supplied in the
>>>>>> "fit.points" parameter)?
>>>>>> I mean, you have to obtain new local models (one for each point to be
>>>>>> fitted), so I do not understand whether the "data" parameter is used
>>>>>> somehow...
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Luis
>>>>>>
>>>>>> On Fri, Aug 30, 2013 at 1:26 AM, Paul Bidanset <pbidan...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Folks,
>>>>>>>
>>>>>>> I was curious whether anyone has had experience applying an spgwr
>>>>>>> model with an adaptive bandwidth matrix to a holdout or validation
>>>>>>> sample. I am using the "fit.points" argument, which does not seem to
>>>>>>> allow for a new bandwidth calibrated around the holdout sample's XY
>>>>>>> coordinates. Any direction would be greatly appreciated. I am also
>>>>>>> open to other viable methods.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo@r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>> --
>>> Roger Bivand
>>> Department of Economics, NHH Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>>> e-mail: roger.biv...@nhh.no

--
Paul Bidanset
(757) 412-9217
pbidan...@gmail.com
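
For readers following the thread, here is a minimal base-R sketch of what "an adaptive bandwidth at a hold-out point" could mean. It is not spgwr's implementation and does not call gwr(). It assumes synthetic data, a nearest-neighbour rule for the bandwidth (the distance to the k-th nearest calibration point, with k = ceiling(q * n)), and a Gaussian kernel of the form exp(-0.5 * (d / bw)^2). All names (calib, holdout, q, predict_local) are hypothetical and chosen only for illustration.

```r
# Illustrative sketch only: synthetic data and hypothetical names.
# This is NOT spgwr's internal code and does not use the georgia data.
set.seed(42)

# Calibration sample: coordinates (x, y), one covariate x1, response z.
n <- 200
calib <- data.frame(x = runif(n), y = runif(n), x1 = rnorm(n))
# The slope on x1 drifts from west to east, so local fits should differ.
calib$z <- 1 + (1 + 2 * calib$x) * calib$x1 + rnorm(n, sd = 0.1)

# Hold-out points: same variables, response unknown.
holdout <- data.frame(x = c(0.1, 0.9), y = c(0.5, 0.5), x1 = c(1, 1))

# Adaptive proportion; stands in for the value a bandwidth search would return.
q <- 0.2

# Hypothetical helper: fit one local weighted regression centred on `pt`,
# using calibration data only, and predict the response at `pt`.
predict_local <- function(pt, calib, q) {
  # Distances from the hold-out point to every calibration point
  d <- sqrt((calib$x - pt$x)^2 + (calib$y - pt$y)^2)
  # Adaptive bandwidth: distance to the k-th nearest calibration point
  k <- ceiling(q * nrow(calib))
  bw <- sort(d)[k]
  # Gaussian kernel weights (assumed kernel form, see lead-in)
  w <- exp(-0.5 * (d / bw)^2)
  # Weighted least squares centred on the hold-out location
  fit <- lm(z ~ x1, data = calib, weights = w)
  c(bandwidth = bw, prediction = unname(predict(fit, newdata = pt)))
}

# One row per hold-out point: its own bandwidth and its own prediction.
res <- t(sapply(seq_len(nrow(holdout)),
                function(i) predict_local(holdout[i, ], calib, q)))
print(res)
```

In this sketch each hold-out point gets its own bandwidth and its own local coefficients, estimated from the calibration data alone, so none of the models fitted at the calibration points is reused. Whether gwr() behaves this way when given fit.points and predictions=TRUE is something to check against the spgwr documentation rather than infer from this example.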