Re: [postgis-users] Calculate variance of a multipoint

2011-08-21 Thread Aren Cambre
Just wanted to again say thanks for this help from last May.

It turns out that my problem was that I was inadvertently geocoding some
state highway mile markers to US highways and vice versa. The dataset that
identified the events didn't distinguish well between them...until I found a
field named str1 (what a descriptive name!!) that provided the highway
designation more clearly.

After getting that fixed, the upper end of the variance on my geocoding
dropped sharply.

Aren


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Ben Madin
It's not quite clear to me what you are trying to demonstrate: do you want to
know the density of the points relative to their total size (area /
number?), or relative to some defined area?

cheers

Ben




Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Aren Cambre
It's to help me double check my interpretation of a large dataset.

I have a collection of millions of traffic tickets. Each ticket has a route
name, milepost, and lat/long. I want to see how close the tickets of a
particular route/milepost are to each other.

E.g., all tickets written for US 71, milepost 204: if they have a very large
dispersion, then either I have an error in my analysis or the data is not
good.

Aren
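
The per-route/milepost check described above could be sketched directly on
the individual ticket points, without building multipoints first. This is
only a sketch under assumptions: the tickets relation, its column names
(route, milepost, geom) and the four sample rows are hypothetical and not
from the real dataset, and the WITH syntax assumes PostgreSQL 8.4 or later.

```sql
-- Hypothetical sample rows standing in for the real ticket table.
WITH tickets (route, milepost, geom) AS (
    VALUES
        ('US 71', 204, ST_MakePoint(0, 0)),
        ('US 71', 204, ST_MakePoint(2, 0)),
        ('US 71', 205, ST_MakePoint(5, 5)),
        ('US 71', 205, ST_MakePoint(5, 5))
),
-- One centroid per route/milepost group.
centers AS (
    SELECT route, milepost, ST_Centroid(ST_Collect(geom)) AS cent
      FROM tickets
     GROUP BY route, milepost
)
-- Mean squared distance of each ticket from its group's centroid.
SELECT t.route, t.milepost, count(*) AS n_tickets,
       avg(ST_Distance(t.geom, c.cent) ^ 2) AS variance
  FROM tickets AS t
  JOIN centers AS c USING (route, milepost)
 GROUP BY t.route, t.milepost
 ORDER BY variance DESC;
```

With these sample rows, milepost 204 gets a variance of 1 and milepost 205
gets 0. Note that on unprojected lat/long geometries ST_Distance returns
degrees, so the values are only useful for ranking groups against each
other; projecting the points first would give a variance in real-world
units.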



Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Aren Cambre
Thank you!



Re: [postgis-users] Calculate variance of a multipoint

2011-05-27 Thread Aren Cambre
Did anyone have thoughts on this? :-)

Aren



Re: [postgis-users] Calculate variance of a multipoint

2011-05-27 Thread Stephen Woodbridge

Aren,

Your proposed approach sounds reasonable to me. You can do it all in one
query like:

-- variance = mean squared distance of each point from its centroid
select c.gid, sum(c.dist*c.dist)/count(*) as variance
  from (
    select b.gid, b.cent, st_distance(b.geom, b.cent) as dist
      from (
        -- st_dump expands each multipoint into one row per point
        select a.gid, (st_dump(a.the_geom)).geom as geom,
               st_centroid(a.the_geom) as cent
          from (
            -- sample data; replace with your own table
            select 99 as gid,
                   'MULTIPOINT(1 2,2 3,3 4,4 5)'::geometry as the_geom
            union all
            select 88 as gid,
                   'MULTIPOINT(1 2,2 3,3 4,4 5,3 5,9 9)'::geometry as the_geom
          ) as a
      ) as b
  ) as c
 group by c.gid order by variance desc;

You should be able to replace the select...union all select ... with
your table of multipoints.


-Steve W
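
For readers on newer software, the same calculation can be written more
compactly. The following is a sketch, not part of the original reply: it
assumes PostgreSQL 9.3 or later (for LATERAL), and the CTE name multipoints
is made up for illustration. It reuses the two sample multipoints from the
query above.

```sql
-- Sample multipoints from the query above; a real table with the same
-- two columns (gid, the_geom) could be substituted for this CTE.
WITH multipoints (gid, the_geom) AS (
    VALUES
        (99, 'MULTIPOINT(1 2,2 3,3 4,4 5)'::geometry),
        (88, 'MULTIPOINT(1 2,2 3,3 4,4 5,3 5,9 9)'::geometry)
)
SELECT m.gid,
       -- mean squared distance of each point from the multipoint's centroid
       avg(ST_Distance(d.geom, ST_Centroid(m.the_geom)) ^ 2) AS variance
  FROM multipoints AS m
 CROSS JOIN LATERAL ST_Dump(m.the_geom) AS d  -- one row per component point
 GROUP BY m.gid
 ORDER BY variance DESC;
```

For the sample data this should rank gid 88 first with a variance of about
11.44, followed by gid 99 with 2.5, matching what the nested-subquery
version computes.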



[postgis-users] Calculate variance of a multipoint

2011-05-04 Thread Aren Cambre
Suppose you have a geometry type with a multipoint. How would you calculate
the variance of the points in that multipoint?

I looked through the PostGIS 1.5 function reference and am not coming up
with any easy way.

A hard way seems to be using st_centroid(multipoint) to find the
multipoint's center. From there, I can calculate the distance of each point
from that center and use it to calculate the variance (each distance is
squared, all squared distances are added together, and the sum is divided
by the number of points).

I guess my ultimate need is to measure relative dispersion of multipoints.
The multipoints that have the most dispersion are suspect, but I need a way
of identifying which ones are like this.

Aren
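
The calculation described in this message can be written as a formula. The
notation is an assumption introduced here for illustration: p_i = (x_i, y_i)
are the n points of the multipoint, c is their centroid (for a multipoint,
the mean of the point coordinates), and d is the planar Euclidean distance.

```latex
\[
  \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} d(p_i, c)^2,
  \qquad
  c = \left( \frac{1}{n} \sum_{i=1}^{n} x_i,\; \frac{1}{n} \sum_{i=1}^{n} y_i \right)
\]

% Worked example for MULTIPOINT(1 2, 2 3, 3 4, 4 5):
\[
  c = (2.5,\ 3.5), \qquad
  \sigma^2 = \frac{4.5 + 0.5 + 0.5 + 4.5}{4} = 2.5
\]
```

This is the population form (dividing by n, as described in the message),
not the sample form that divides by n - 1.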


Re: [postgis-users] Calculate variance of a multipoint

2011-05-04 Thread Aren Cambre
The more I think about it, is this a job for R? I know I need to start using
R at some point, just haven't begun yet.

Aren
