Re: [R-sig-Geo] Holdout Sampling Adaptive Bandwidth SPGWR

Paul Bidanset Fri, 30 Aug 2013 07:29:34 -0700

Thank you. I'd like to subset into a specific county. Should there be
further partitioning from that level?
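
For reference, a minimal sketch of one such partition on the built-in Georgia data, following the suggestion quoted below to subset georgia. It assumes spgwr is installed; the 20% random holdout, the seed, and the names `train`, `holdout`, and `rmse` are illustrative choices, not part of the package. It relies on the documented behaviour that `gwr()` returns predictions at the fit points when `predictions=TRUE` and `fit.points` is a Spatial*DataFrame holding the right-hand-side variables; my understanding is that they are stored in a column named `pred`, which the code checks rather than assumes.

```r
# Sketch only: random holdout on the built-in Georgia counties.
library(spgwr)   # depends on sp

data(georgia)    # loads gSRDF, a SpatialPolygonsDataFrame of Georgia counties

f <- PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov + PctBlack

# Illustrative split: hold out a random 20% of the counties.
set.seed(42)
hold_idx <- sample(nrow(gSRDF), size = round(0.2 * nrow(gSRDF)))
train    <- gSRDF[-hold_idx, ]
holdout  <- gSRDF[hold_idx, ]

# With adapt=TRUE, gwr.sel() returns a proportion of data points (0 to 1),
# here chosen by AIC on the training counties only.
q <- gwr.sel(f, data = train, adapt = TRUE, gweight = gwr.Gauss,
             method = "aic", verbose = FALSE)

# The proportion goes in through `adapt`, not `bandwidth`; the distance
# bandwidth is then worked out separately around each holdout location.
fit <- gwr(f, data = train, adapt = q, gweight = gwr.Gauss,
           fit.points = holdout, predictions = TRUE)

# Assumption: predictions sit in the `pred` column of the returned SDF.
stopifnot("pred" %in% names(fit$SDF))
pred <- fit$SDF$pred
rmse <- sqrt(mean((holdout$PctBach - pred)^2))
rmse
```

Replacing the `sample()` line with a geographic rule (for example, all counties whose centroid lies west of some chosen easting) gives the other kind of partition, where the fit points lie far from the data points.
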



On Fri, Aug 30, 2013 at 10:19 AM, Roger Bivand <roger.biv...@nhh.no> wrote:

> On Fri, 30 Aug 2013, Paul Bidanset wrote:
>
>> Alrighty then!
>
> Thanks. Now make this your case by subsetting georgia in a way that
> matches your case (all counties west of x?, random set?), and we may be
> getting closer. In the geographical partition, the fit points are all a
> long way from the data points; in the random case, they aren't grouped in
> the same way. You may also need to run the model twice, passing the fitted
> model (fit.points == data.points) through to the next stage, but I'm unsure
> about that.
>
> Roger
>
>
>> Say I create this adaptive bandwidth model using the original dataset
>> "georgia":
>>
>> coords = cbind(georgia$x, georgia$y)
>> bwsel <- gwr.sel(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov +
>>     PctBlack, data=georgia, adapt=TRUE, coords, gweight=gwr.Gauss,
>>     method="aic")
>> bw1 <- gw.adapt(coords, coords, quant=bwsel)
>> model1 <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB + PctPov +
>>     PctBlack, data=georgia, bandwidth=bw1, coords, hatmatrix=TRUE)
>> model1
>>
>> Suppose I receive an updated data set (same dependent and independent
>> variables) and I wish to test the above model1's ability to predict the
>> dependent variable of these new data points. If this were a basic lm
>> regression in R, I would use the "predict()" function. I wish to better
>> understand how I would do so using a GWR model. I found the procedure
>> below, but I would like to know first whether it is capable of
>> accomplishing this task, and second, whether I am specifying it correctly.
>> It seems to me that this procedure, as it stands, doesn't take into account
>> the appropriate bandwidths for the new data, say, "georgiaNewData":
>>
>> PredictionsOfNewData <- gwr(PctBach ~ TotPop90 + PctRural + PctEld + PctFB +
>>     PctPov + PctBlack, data=gSRDF, adapt=TRUE, gweight=gwr.Gauss,
>>     method="aic", bandwidth=bw1, predictions=TRUE,
>>     fit.points=georgiaNewData)
>> PredictionsOfNewData
>>
>> Thanks in advance for guidance and insight...
>>
>>
>> On Fri, Aug 30, 2013 at 9:01 AM, Roger Bivand <roger.biv...@nhh.no>
>> wrote:
>>
>>> Provide a reproducible code example of your problem using a built-in data
>>> set. No reproducible example, no response, as I cannot guess (and likely
>>> nobody else can either) what your specific misunderstanding is. Code
>>> using, for example, the Georgia data set in the package. You seem to be
>>> assuming that you understand how GWR works; I don't think that you do, so
>>> you have to show what you mean in code.
>>>
>>> Roger
>>>
>>>
>>> On Fri, 30 Aug 2013, Paul Bidanset wrote:
>>>
>>>> Roger,
>>>>
>>>> I think all I would like to know is whether it is possible to apply a
>>>> calibrated GWR model to a hold-out sample, and if so, what the most
>>>> accurate way to do so is. I understand the pitfalls of GWR but would like
>>>> to learn as much as I can before progressing to the next spatial
>>>> methodology I learn in R.
>>>>
>>>>
>>>> On Fri, Aug 30, 2013 at 3:37 AM, Roger Bivand <roger.biv...@nhh.no>
>>>> wrote:
>>>>
>>>>> Paul, Luis,
>>>>>
>>>>> I suspect that your speculations are completely wrong-headed. Please
>>>>> provide a reproducible example with a built-in data set, so that there
>>>>> is at least minimal clarity in what you are guessing. Note in addition
>>>>> that GWR as a technique should not be used for anything other than
>>>>> exploration of possible mis-specification in the underlying model with
>>>>> the given data, as patterning in coefficients is induced by GWR for
>>>>> simulated covariates with no pattern.
>>>>>
>>>>> Roger
>>>>>
>>>>>
>>>>> On Fri, 30 Aug 2013, Luis Guerra wrote:
>>>>>
>>>>>>> Thank you Luis. When calibrating the adaptive model, using adapt=TRUE
>>>>>>> in the bandwidth selection created the proportion you speak of, which
>>>>>>> then allowed me to create a bandwidth matrix using gw.adapt. However,
>>>>>>> this has not worked for me with holdout samples. Have you had success
>>>>>>> in this regard?
>>>>>>
>>>>>> Now I get what you mean. Let's show an example:
>>>>>>
>>>>>> bw <- gwr.sel(var ~ var1, data=yourdata, adapt=TRUE)
>>>>>> m <- gwr(var ~ var1, data=yourdata, adapt=bw, fit.points=newdata)
>>>>>>
>>>>>> So an adaptive bandwidth (bw) is calculated based on "yourdata", while
>>>>>> you are fitting "newdata" later on using that previously found bw. I
>>>>>> had not thought about it previously. Let's see whether someone else can
>>>>>> help you (us).
>>>>>>
>>>>>>> I do not know the intended influence of these "fit.points". I would
>>>>>>> think that new localized regressions are not calculated, as we're
>>>>>>> testing the model and previous data points' ability to predict for
>>>>>>> these new ones, but I could be wrong. My current method, however, is
>>>>>>> producing much poorer results with the holdouts, which I am fairly
>>>>>>> sure is related to my inability to incorporate the new points'
>>>>>>> necessary bandwidths.
>>>>>>
>>>>>> Coming back to the previously created example, imagine that "newdata"
>>>>>> is a single point that you want to fit. Imagine now that "yourdata" is
>>>>>> a sample with 1000 cases. Then you are getting 1000 models with 1000
>>>>>> different intercepts and 1000 different beta values to adjust var1,
>>>>>> right? Which of all these parameters do you use for fitting "newdata"?
>>>>>> And something else: what would happen with "newdata" if it is far
>>>>>> enough away from "yourdata" and we were using a fixed bandwidth?
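
One way to picture the question above, as a hedged sketch rather than a description of spgwr's internals: a geographically weighted fit does not pick one of the models calibrated at the data points. It runs a separate distance-weighted regression centred on each fit point, using the data points as observations. The base-R code below uses simulated data; the helper names `adaptive_bw` and `local_fit`, the Gaussian kernel form, and the rule "bandwidth = distance to the k-th nearest data point" are illustrative assumptions.

```r
# Conceptual sketch in base R with simulated data; not spgwr's implementation.
set.seed(1)
n   <- 200
xy  <- cbind(runif(n), runif(n))          # data-point coordinates
x1  <- rnorm(n)
y   <- 1 + (1 + 2 * xy[, 1]) * x1 + rnorm(n, sd = 0.1)  # slope grows eastwards
dat <- data.frame(y = y, x1 = x1)

# Adaptive bandwidth at one fit point: distance to the k-th nearest data
# point, where k is the proportion q of the n data points.
adaptive_bw <- function(fp, xy, q) {
  d <- sqrt((xy[, 1] - fp[1])^2 + (xy[, 2] - fp[2])^2)
  sort(d)[max(2, ceiling(q * nrow(xy)))]
}

# One local model per fit point: Gaussian-weighted least squares on the
# data points, then a prediction from the fit point's own covariates.
local_fit <- function(fp, newx, xy, dat, q) {
  d  <- sqrt((xy[, 1] - fp[1])^2 + (xy[, 2] - fp[2])^2)
  bw <- adaptive_bw(fp, xy, q)
  w  <- exp(-0.5 * (d / bw)^2)
  m  <- lm(y ~ x1, data = dat, weights = w)
  list(coef = coef(m), bw = bw,
       pred = unname(predict(m, newdata = newx)))
}

# Two new locations with the same covariate value get different local models.
west <- local_fit(c(0.1, 0.5), data.frame(x1 = 1), xy, dat, q = 0.1)
east <- local_fit(c(0.9, 0.5), data.frame(x1 = 1), xy, dat, q = 0.1)
rbind(west = west$coef, east = east$coef)
```

In this picture a fixed bandwidth gives a fit point far from all data points weights that are all close to zero, so its local regression rests on almost no information. An adaptive bandwidth instead stretches until it covers the chosen share of data points.
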
>>>>>>
>>>>>>> On Aug 29, 2013 8:56 PM, "Luis Guerra" <luispelay...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Dear Paul,
>>>>>>>>
>>>>>>>> I am dealing with this kind of problem right now, and if I am not
>>>>>>>> wrong, when you want to apply an adaptive bandwidth, you should
>>>>>>>> supply a value for the "adapt" parameter instead of the "bandwidth"
>>>>>>>> parameter. This value lies between 0 and 1 and indicates the
>>>>>>>> proportion of cases around your regression point that should be
>>>>>>>> included to estimate each local model. So depending on the number
>>>>>>>> of points around each case, the model will use a different bandwidth
>>>>>>>> for each point to be fitted.
>>>>>>>>
>>>>>>>> Related to your question, do you know how the data passed in the
>>>>>>>> "data" parameter influence the data to be fitted (passed in the
>>>>>>>> "fit.points" parameter)? I mean, you have to obtain new local models
>>>>>>>> (one for each point to be fitted), so I do not understand whether
>>>>>>>> the "data" parameter is used somehow...
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Luis
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Aug 30, 2013 at 1:26 AM, Paul Bidanset <pbidan...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Folks,
>>>>>>>>>
>>>>>>>>> I was curious if anyone has had experience applying an SPGWR model
>>>>>>>>> with an adaptive bandwidth matrix to a holdout or validation sample.
>>>>>>>>> I am using the "fit.points" argument, which does not seem to allow
>>>>>>>>> for a new bandwidth calibrated around the holdout sample's XY
>>>>>>>>> coordinates. Any direction would be greatly appreciated. I am also
>>>>>>>>> open to other viable methods.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> R-sig-Geo mailing list
>>>>>>>>> R-sig-Geo@r-project.org
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>
>>>>>>>
>>>>>>  --
>>>>>>
>>>>> Roger Bivand
>>>>> Department of Economics, NHH Norwegian School of Economics,
>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>>> e-mail: roger.biv...@nhh.no
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>


-- 
Paul Bidanset
(757) 412-9217
pbidan...@gmail.com


_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
