Dear James,

In addition to what others have suggested, you may want to try a different modelling approach using zero-inflated models. If you are working on a rare disease, a zero-inflated model can accommodate the high number of zeroes better than standard models.
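To illustrate why zero inflation helps with many empty polygons, here is a minimal sketch of the zero-inflated Poisson probability mass function. It is only an illustration, not a fitted model: the function names and the values of `mu` (expected count per polygon) and `pi` (share of structural zeroes) are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Probability of observing k cases under a Poisson(mu) model."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson: with probability pi the polygon is a
    structural zero; otherwise the count follows Poisson(mu)."""
    base = (1.0 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base

# Hypothetical values, chosen only for illustration.
mu, pi = 0.5, 0.6
p0_poisson = poisson_pmf(0, mu)   # zero probability under plain Poisson
p0_zip = zip_pmf(0, mu, pi)       # zero probability with zero inflation
```

With these assumed values the zero-inflated model puts more probability on a count of zero than the plain Poisson model does, which is the behaviour wanted when most polygons have no cases.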
Best wishes,

Virgilio

On Tue, 2014-01-07 at 09:57 +0000, James Rooney wrote:
> Hi Roger,
>
> Thanks for your reply. Coding the joins is not a problem; I've already done
> that on a smaller scale in a different project.
>
> No postcodes in my country. I have polygon data from the census and I have
> geocoded cases for every case of a rare disease. This is all pretty much
> fixed; there is nothing I can do about it. I have performed an analysis based
> on about 3500 polygons and that works ok. However, the population data has bad
> maths properties. Therefore I'm now working with newer data using 18,000
> polygons and the same cases. This population data has better maths properties
> (i.e. population per polygon is more symmetrically distributed). But there are
> too many polygons - most of the polygons have no cases. So when I do Bayesian
> smoothing I just end up with a uniform map of Relative Risk = 1 everywhere, as
> all the polygons with cases are surrounded by polygons with no cases.
>
> I figure to get around this I either fiddle with the spatial weighting (seems
> unwise), or join polygons in some sensible fashion. My question was really
> wondering whether there are algorithms to deduce a list of polygon joins based
> on polygon properties. For example - I don't want to join urban and rural
> polygons as I am interested in the association of population density with
> incidence rate. I'm also interested in the relationship with social
> deprivation - so I don't want to join an area of high deprivation with an
> area of low deprivation. Basically I want to know: is there a package that
> will create me a join list based on such rules? I can of course write some
> code to do it but I was hoping not to have to spend the time on it!
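As a rough illustration of the rule-based join list described above, here is a naive greedy sketch. It is not an algorithm from any package: the function `build_join_list`, the attribute layout, the class labels and the `min_pop` target are all hypothetical. It only merges adjacent polygons that share the same density and deprivation classes, and stops growing a group once it reaches the population target.

```python
from collections import deque

def build_join_list(attrs, neighbours, min_pop):
    """Greedily group adjacent polygons that share the same class labels.

    attrs:      {polygon_id: (population, density_class, deprivation_class)}
    neighbours: {polygon_id: set of adjacent polygon_ids}
    Returns {polygon_id: group_id}.
    """
    group_of = {}
    next_group = 0
    for seed in sorted(attrs):
        if seed in group_of:
            continue
        label = attrs[seed][1:]   # (density_class, deprivation_class)
        group_of[seed] = next_group
        total = attrs[seed][0]
        queue = deque([seed])
        # Grow the group breadth-first until it reaches the population target.
        while queue and total < min_pop:
            current = queue.popleft()
            for nb in sorted(neighbours.get(current, ())):
                if total >= min_pop:
                    break
                if nb not in group_of and attrs[nb][1:] == label:
                    group_of[nb] = next_group
                    total += attrs[nb][0]
                    queue.append(nb)
        next_group += 1
    return group_of

# Hypothetical toy data: five polygons with made-up populations and classes.
attrs = {
    "a": (400, "urban", "high"),
    "b": (700, "urban", "high"),
    "c": (300, "rural", "low"),
    "d": (900, "rural", "low"),
    "e": (500, "urban", "low"),
}
neighbours = {
    "a": {"b", "c"}, "b": {"a", "e"}, "c": {"a", "d"},
    "d": {"c"}, "e": {"b"},
}
join_list = build_join_list(attrs, neighbours, min_pop=1000)
```

In this toy example polygon "e" has no compatible neighbour and stays on its own below the target, so a real rule set would need to say what happens to such leftovers. The resulting mapping is the kind of polygon-to-aggregate vector that a union step needs, and a different seed order or target would give a different aggregation.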
>
> James
> ________________________________________
> From: Roger Bivand [roger.biv...@nhh.no]
> Sent: 07 January 2014 08:28
> To: James Rooney
> Cc: r-sig-geo@r-project.org
> Subject: Re: [R-sig-Geo] algorthirm to join polygons based on population
> properties
>
> On Tue, 7 Jan 2014, James Rooney wrote:
>
> > Dear all,
> >
> > I have a dataset with very many more polygons than cases. I wish to apply
> > Bayesian smoothing to areal disease rates, however I have too many
> > polygons and need a smart way to combine them so that there are fewer
> > polygons overall.
> > Basically I need to only combine polygons of similar population density
> > and it would be best if the new polygons have a distribution of total
> > population that was within a limited range/normally distributed.
>
> This is not clear. Do you mean density (count/area) or just count? If you
> have "too many polygons", then probably you haven't thought through your
> sampling design - you need polygons with the correct support for the data
> collection protocol used. Are you looking at postcode polygons and sparse
> geocoded cases, with many empty postcodes? Are postcodes the relevant
> support?
>
> If you think through support first (Gotway & Young 2002), then ad hoc
> aggregation (that's the easy part) may be replaced by appropriate
> aggregation (postcodes by health agency, surgery, etc.). The aggregation
> can be done with rgeos::gUnaryUnion, but you need a vector assigning
> polygons to aggregates first, preferably coded so that the data can be
> joined with maptools::spCbind, using well-matched row.names of the
> aggregated SpatialPolygons and data.frame objects to key on observation IDs.
>
> First clarity on support, then aggregate polygons to appropriate support,
> then merge. Otherwise you are ignoring the uncertainty introduced into
> your Bayesian analysis by the aggregation (different aggregations will give
> different results). There are good chapters on this in the Handbook of
> Spatial Statistics by Gelfand and Wakefield/Lyons.
>
> Hope this clarifies,
>
> Roger
>
> >
> > I can of course come up with some way of doing this myself, but I'm not
> > keen to reinvent the wheel and so I am wondering - are there any smart
> > algorithms already out there for doing this kind of thing?
> >
> > Thanks,
> > James
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; fax +47 55 95 95 43
> e-mail: roger.biv...@nhh.no