Re: [postgis-users] Calculate variance of a multipoint
Just wanted to say thanks again for this help from last May.

It turns out that my problem was that I was inadvertently geocoding some state highway mile markers to US highways and vice versa. The dataset that identified the events didn't distinguish well between them... until I found a field named str1 (what a descriptive name!!) that provided the highway designation more clearly.

After getting that fixed, the upper end of the variance on my geocoding dropped sharply.

Aren

On Sat, May 28, 2011 at 8:09 AM, Aren Cambre a...@arencambre.com wrote:
> Thank you!

___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] Calculate variance of a multipoint
It's not quite clear to me what you are trying to demonstrate. Do you want to know the density of the points relative to their total size (area / number?), or relative to some defined area?

cheers
Ben

On 28/05/2011, at 6:19 AM, Aren Cambre wrote:
> Did anyone have thoughts on this? :-)
Re: [postgis-users] Calculate variance of a multipoint
It's to help me double-check my interpretation of a large dataset.

I have a collection of millions of traffic tickets. Each ticket has route name, milepost, and lat/long. I want to see how close the tickets of a particular route/milepost are to each other. E.g., take all tickets written for US 71, milepost 204: if they have a very large dispersion, then either I have an error in my analysis or the data is not good.

Aren

On Sat, May 28, 2011 at 2:32 AM, Ben Madin li...@remoteinformation.com.au wrote:
> It's not quite clear to me what you are trying to demonstrate. Do you want to know the density of the points relative to their total size (area / number?), or relative to some defined area?
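A minimal sketch of how individual tickets might be gathered into one multipoint per route/milepost, which is the shape of input the variance calculation in this thread works on. Everything named here is an assumption for illustration: the inline rows stand in for a hypothetical tickets table, and the column names route, milepost and geom as well as the coordinates are made up.

```sql
-- Sketch only: the inline rows stand in for a hypothetical tickets table
-- with columns route, milepost and a point geometry geom.
select m.route,
       m.milepost,
       st_numgeometries(m.the_geom) as n_points,
       st_astext(m.the_geom) as multipoint
from (
  -- st_collect used as an aggregate gathers each group's points into one multipoint
  select t.route, t.milepost, st_collect(t.geom) as the_geom
  from (
    select 'US 71'::text as route, 204 as milepost, 'POINT(1 2)'::geometry as geom
    union all
    select 'US 71'::text, 204, 'POINT(2 3)'::geometry
    union all
    select 'US 71'::text, 205, 'POINT(4 5)'::geometry
    union all
    select 'US 71'::text, 205, 'POINT(9 9)'::geometry
  ) as t
  group by t.route, t.milepost
) as m
order by m.route, m.milepost;
```

One thing to keep in mind if the points are stored as lat/long: st_distance on plain geometry returns values in the units of the coordinate system, so a variance built from it would be in squared degrees. That may still be good enough for ranking groups against each other, but projecting the points first (for example with st_transform) would give a result in linear units.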
Re: [postgis-users] Calculate variance of a multipoint
Thank you!

On Fri, May 27, 2011 at 9:33 PM, Stephen Woodbridge wood...@swoodbridge.com wrote:
> Aren, Your proposed approach sounds reasonable to me.
Re: [postgis-users] Calculate variance of a multipoint
Did anyone have thoughts on this? :-)

Aren

On Wed, May 4, 2011 at 2:12 PM, Aren Cambre a...@arencambre.com wrote:
> The more I think about it, is this a job for R? I know I need to start using R at some point, just haven't begun yet.
Re: [postgis-users] Calculate variance of a multipoint
Aren,

Your proposed approach sounds reasonable to me. You can do it all in one query like:

    select c.gid, sum(c.dist*c.dist)/count(*) as variance
    from (
      -- distance of each point from the centroid of its multipoint
      select b.gid, b.cent, st_distance(b.geom, b.cent) as dist
      from (
        -- explode each multipoint into its component points
        select a.gid,
               (st_dump(a.the_geom)).geom as geom,
               st_centroid(a.the_geom) as cent
        from (
          -- sample data
          select 99 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5)'::geometry as the_geom
          union all
          select 88 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5,3 5,9 9)'::geometry as the_geom
        ) as a
      ) as b
    ) as c
    group by c.gid
    order by variance desc;

You should be able to replace the select ... union all select ... with your table of multipoints.

-Steve W

On 5/27/2011 6:19 PM, Aren Cambre wrote:
> Did anyone have thoughts on this? :-)
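As a cross-check on the query above, here is a hedged sketch that gets at the same number a different way. It relies on two things not stated in the thread: that the centroid of a multipoint is the arithmetic mean of its points, and that the mean squared distance from that mean equals the population variance of the x values plus the population variance of the y values. It uses PostgreSQL's built-in var_pop aggregate together with st_x and st_y, and the same sample multipoints.

```sql
-- Cross-check sketch: var_pop(x) + var_pop(y) equals the mean squared
-- distance of the points from their arithmetic mean (the centroid).
select p.gid,
       var_pop(st_x(p.geom)) + var_pop(st_y(p.geom)) as variance
from (
  -- explode each multipoint into its component points
  select a.gid, (st_dump(a.the_geom)).geom as geom
  from (
    -- same sample data as in the query above
    select 99 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5)'::geometry as the_geom
    union all
    select 88 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5,3 5,9 9)'::geometry as the_geom
  ) as a
) as p
group by p.gid
order by variance desc;
```

Working the sample data by hand, both this and the centroid-distance form should come out at about 11.44 (103/9) for gid 88 and 2.5 for gid 99, so the multipoint with the stray point at (9 9) sorts first.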
[postgis-users] Calculate variance of a multipoint
Suppose you have a geometry type with a multipoint. How would you calculate the variance of the points in that multipoint?

I looked through the PostGIS 1.5 function reference and am not coming up with any easy way.

A hard way seems to be using st_centroid(multipoint) to find the multipoint's center. From there, I can calculate the distance of each point from that center and use it to calculate the variance (each distance is squared, all squared distances are added together, and the sum is divided by the number of points).

I guess my ultimate need is to measure relative dispersion of multipoints. The multipoints that have the most dispersion are suspect, but I need a way of identifying which ones are like this.

Aren
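Written out, the calculation described above might look like the following. This is only a sketch of the stated recipe, assuming the center c is the arithmetic mean of the n points p_i; the four example points are made up for the arithmetic.

```latex
\[
c = \frac{1}{n}\sum_{i=1}^{n} p_i,
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \lVert p_i - c \rVert^2
\]

% Worked example for the points (1,2), (2,3), (3,4), (4,5):
\[
c = (2.5,\ 3.5),
\qquad
\sigma^2 = \frac{4.5 + 0.5 + 0.5 + 4.5}{4} = \frac{10}{4} = 2.5
\]
```

Note that this divides by n rather than n - 1, so it is the population form of the variance, matching "divided by the number of points" in the description.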
Re: [postgis-users] Calculate variance of a multipoint
The more I think about it, is this a job for R? I know I need to start using R at some point, just haven't begun yet.

Aren

On Wed, May 4, 2011 at 1:42 PM, Aren Cambre a...@arencambre.com wrote:
> Suppose you have a geometry type with a multipoint. How would you calculate the variance of the points in that multipoint?