Re: [postgis-users] Geocoder 2010, Pg 9, PostGIS 1.5

2011-05-28 Thread Johnathan Leppert
What are the exact errors and in what part of create_geocode.sql is it
failing?

Thanks,

Johnathan

On Sun, May 29, 2011 at 2:28 AM,  wrote:

>
>
>   I tried hooking up the Geocoder for TIGER 2010 as found in postgis_trunk
> but it is failing.. First try was using PostGIS 2.0 trunk, after problems
> next was
> with PostGIS 1.5.. both with Postgresql 9.0 under linux
>
>   It seems if I add the schema tiger_data to search_path
> then run create_geocode.sql again manually, it gets a little
> further.. but still, I end up with a database that looks like this:
>
>  List of relations
>  Schema | Name  |   Type   | Owner |Size|
> Description
>
> +---+--+---++-
>  public | geography_columns | view | dbb   | 0 bytes|
>  public | geometry_columns  | table| dbb   | 0 bytes|
>  public | spatial_ref_sys   | table| dbb   | 0 bytes|
>  tiger  | addr  | table| dbb   | 0 bytes|
>  tiger  | addr_gid_seq  | sequence | dbb   | 8192 bytes |
>  tiger  | county| table| dbb   | 0 bytes|
>  tiger  | county_gid_seq| sequence | dbb   | 8192 bytes |
>  tiger  | county_lookup | table| dbb   | 0 bytes|
>  tiger  | countysub_lookup  | table| dbb   | 0 bytes|
>  tiger  | cousub| table| dbb   | 0 bytes|
>  tiger  | cousub_gid_seq| sequence | dbb   | 8192 bytes |
>  tiger  | direction_lookup  | table| dbb   | 8192 bytes |
>  tiger  | edges | table| dbb   | 0 bytes|
>  tiger  | edges_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | faces | table| dbb   | 0 bytes|
>  tiger  | faces_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | featnames | table| dbb   | 0 bytes|
>  tiger  | featnames_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | loader_lookuptables   | table| dbb   | 16 kB  |
>  tiger  | loader_platform   | table| dbb   | 8192 bytes |
>  tiger  | loader_variables  | table| dbb   | 8192 bytes |
>  tiger  | place | table| dbb   | 0 bytes|
>  tiger  | place_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | place_lookup  | table| dbb   | 0 bytes|
>  tiger  | secondary_unit_lookup | table| dbb   | 8192 bytes |
>  tiger  | state | table| dbb   | 0 bytes|
>  tiger  | state_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | state_lookup  | table| dbb   | 8192 bytes |
>  tiger  | county_lookup | table| dbb   | 0 bytes|
>  tiger  | countysub_lookup  | table| dbb   | 0 bytes|
>  tiger  | cousub| table| dbb   | 0 bytes|
>  tiger  | cousub_gid_seq| sequence | dbb   | 8192 bytes |
>  tiger  | direction_lookup  | table| dbb   | 8192 bytes |
>  tiger  | edges | table| dbb   | 0 bytes|
>  tiger  | edges_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | faces | table| dbb   | 0 bytes|
>  tiger  | faces_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | featnames | table| dbb   | 0 bytes|
>  tiger  | featnames_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | loader_lookuptables   | table| dbb   | 16 kB  |
>  tiger  | loader_platform   | table| dbb   | 8192 bytes |
>  tiger  | loader_variables  | table| dbb   | 8192 bytes |
>  tiger  | place | table| dbb   | 0 bytes|
>  tiger  | place_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | place_lookup  | table| dbb   | 0 bytes|
>  tiger  | secondary_unit_lookup | table| dbb   | 8192 bytes |
>  tiger  | state | table| dbb   | 0 bytes|
>  tiger  | state_gid_seq | sequence | dbb   | 8192 bytes |
>  tiger  | state_lookup  | table| dbb   | 8192 bytes |
>  tiger  | street_type_lookup| table| dbb   | 24 kB  |
>  tiger  | zcta500   | table| dbb   | 0 bytes|
>  tiger  | zcta500_gid_seq   | sequence | dbb   | 8192 bytes |
>  tiger  | zip_lookup| table| dbb   | 0 bytes|
>  tiger  | zip_lookup_all| table| dbb   | 0 bytes|
>  tiger  | zip_lookup_base   | table| dbb   | 0 bytes|
>  tiger  | zip_state | table| dbb   | 0 bytes|
>  tiger  | zip_state_loc | table| dbb   | 0 bytes|
>
>
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>



-- 
*Johnathan*
Software Architect & Developer
Columbus, Ohio
*Follow me on Twitter: @iamleppert *
___

[postgis-users] Geocoder 2010, Pg 9, PostGIS 1.5

2011-05-28 Thread maplabs



  I tried hooking up the Geocoder for TIGER 2010 as found in postgis_trunk
but it is failing.. First try was using PostGIS 2.0 trunk, after 
problems next was 

with PostGIS 1.5.. both with Postgresql 9.0 under linux

  It seems if I add the schema tiger_data to search_path 
then run create_geocode.sql again manually, it gets a little
further.. but still, I end up with a database that looks like this:

                             List of relations
 Schema |         Name          |   Type   | Owner |   
 Size    | Description 

+---+--+---++-
 public | geography_columns     | view     | dbb   | 0 bytes    | 
 public | geometry_columns      | table    | dbb   | 0 bytes    | 
 public | spatial_ref_sys       | table    | dbb   | 0 bytes    | 
 tiger  | addr                  | table    | dbb   | 0 
bytes    | 

 tiger  | addr_gid_seq          | sequence | dbb   | 8192 bytes | 
 tiger  | county                | table    | dbb   | 0 
bytes    | 

 tiger  | county_gid_seq        | sequence | dbb   | 8192 bytes | 
 tiger  | county_lookup         | table    | dbb   | 0 bytes    | 
 tiger  | countysub_lookup      | table    | dbb   | 0 bytes    | 
 tiger  | cousub                | table    | dbb   | 0 
bytes    | 

 tiger  | cousub_gid_seq        | sequence | dbb   | 8192 bytes | 
 tiger  | direction_lookup      | table    | dbb   | 8192 bytes | 
 tiger  | edges                 | table    | dbb   | 0 
bytes    | 

 tiger  | edges_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | faces                 | table    | dbb   | 0 
bytes    | 

 tiger  | faces_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | featnames             | table    | dbb   | 0 bytes 
   | 

 tiger  | featnames_gid_seq     | sequence | dbb   | 8192 bytes | 
 tiger  | loader_lookuptables   | table    | dbb   | 16 kB      | 
 tiger  | loader_platform       | table    | dbb   | 8192 bytes | 
 tiger  | loader_variables      | table    | dbb   | 8192 bytes | 
 tiger  | place                 | table    | dbb   | 0 
bytes    | 

 tiger  | place_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | place_lookup          | table    | dbb   | 0 bytes 
   | 

 tiger  | secondary_unit_lookup | table    | dbb   | 8192 bytes | 
 tiger  | state                 | table    | dbb   | 0 
bytes    | 

 tiger  | state_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | state_lookup          | table    | dbb   | 8192 bytes | 
 tiger  | county_lookup         | table    | dbb   | 0 bytes    | 
 tiger  | countysub_lookup      | table    | dbb   | 0 bytes    | 
 tiger  | cousub                | table    | dbb   | 0 
bytes    | 

 tiger  | cousub_gid_seq        | sequence | dbb   | 8192 bytes | 
 tiger  | direction_lookup      | table    | dbb   | 8192 bytes | 
 tiger  | edges                 | table    | dbb   | 0 
bytes    | 

 tiger  | edges_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | faces                 | table    | dbb   | 0 
bytes    | 

 tiger  | faces_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | featnames             | table    | dbb   | 0 bytes 
   | 

 tiger  | featnames_gid_seq     | sequence | dbb   | 8192 bytes | 
 tiger  | loader_lookuptables   | table    | dbb   | 16 kB      | 
 tiger  | loader_platform       | table    | dbb   | 8192 bytes | 
 tiger  | loader_variables      | table    | dbb   | 8192 bytes | 
 tiger  | place                 | table    | dbb   | 0 
bytes    | 

 tiger  | place_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | place_lookup          | table    | dbb   | 0 bytes 
   | 

 tiger  | secondary_unit_lookup | table    | dbb   | 8192 bytes | 
 tiger  | state                 | table    | dbb   | 0 
bytes    | 

 tiger  | state_gid_seq         | sequence | dbb   | 8192 bytes | 
 tiger  | state_lookup          | table    | dbb   | 8192 bytes | 
 tiger  | street_type_lookup    | table    | dbb   | 24 kB      | 
 tiger  | zcta500               | table    | dbb   | 0 
bytes    | 

 tiger  | zcta500_gid_seq       | sequence | dbb   | 8192 bytes | 
 tiger  | zip_lookup            | table    | dbb   | 0 bytes 
   | 

 tiger  | zip_lookup_all        | table    | dbb   | 0 bytes    | 
 tiger  | zip_lookup_base       | table    | dbb   | 0 bytes    | 
 tiger  | zip_state             | table    | dbb   | 0 bytes 
   | 

 tiger  | zip_state_loc         | table    | dbb   | 0 bytes    | 


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Fw: re: Geocoder (from extras)

2011-05-28 Thread Stephen Woodbridge
I have had similar performance experiences working with tiger data in 
other applications, ie not this geocoder, where queries cost about 
200-400ms  initially querying a 30 million record streets table and then 
go to 20-40ms afterwards. I have always attributed this to page caching. 
My queries are typically spatial in nature and I cluster my data based 
on the spatial index. But for the geocoder, I would expect similar 
performance if you cluster your data by zipcode and then sort your input 
data by zipcode, you should get very good performance depending on your 
queries and indexes.


-Steve

On 5/28/2011 5:24 PM, Paragon Corporation wrote:

Mikal,
Can you send me the change you made and the indexes you added. When
adding some more data, I realized I had hardcoded an index for our local
state (MA) and I know without that index that that particular query does
run pretty slow. So just wondering if its along the same lines.
I've also fixed I think all the issues with running the loader on
Unix/Linux -- well at least I was able to get it to run on my CentOS.
Thanks to all who contributed input to that. I took bits and pieces from
many people's comments but couldn't apply a full diff from anyones since
I had already changed the code too much to safely apply any of those
patches.
How many states do you have loaded BTW? I just have CA loaded on my
CentOS -- which is an 8GB/8 core cloud server. I'm getting around 38ms -
450ms per test, but I have yet to load the other states.
It also seems to cache very well so that if I geocode an address on the
same street (not necessarily same address), the first call might take
450ms and the second 38ms. I suspect this might be because I also marked
a good chunk of the functions STABLE or IMMUTABLE.
Thanks,
Regina
http://www.postgis.us

*From:* postgis-users-boun...@postgis.refractions.net
[mailto:postgis-users-boun...@postgis.refractions.net] *On Behalf Of
*Mikal Laster
*Sent:* Friday, May 27, 2011 8:18 AM
*To:* postgis-users@postgis.refractions.net
*Subject:* [postgis-users] Fw: re: Geocoder (from extras)



--- On *Fri, 5/27/11, Mikal Laster //* wrote:


From: Mikal Laster 
Subject: re:[postgis-users] Geocoder (from extras)
To: postgis-users@postgis.refractions.net
Date: Friday, May 27, 2011, 7:22 AM

in response to
http://postgis.refractions.net/pipermail/postgis-users/2011-May/029566.html.
After creating some indexes and rewriting geocode_address. I was
able to get geocode to run in 483-523 ms for "5775 Perimeter Dr
Dublin, Ohio". This used to take 1700-2000 ms for me. I'm removing
the main inner qui



___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Fw: re: Geocoder (from extras)

2011-05-28 Thread Paragon Corporation
Mikal,
 
Can you send me the change you made and the indexes you added.  When adding
some more data, I realized I had hardcoded an index for our local state (MA)
and I know without that index that that particular query does run pretty
slow.  So just wondering if its along the same lines.
 
I've also fixed I think all the issues with running the loader on Unix/Linux
-- well at least I was able to get it to run on my CentOS.  Thanks to all
who contributed input to that.  I took bits and pieces from many people's
comments but couldn't apply a full diff from anyones since I had already
changed the code too much to safely apply any of those patches.
 
How many states do you have loaded BTW?  I just have CA loaded on my CentOS
-- which is an 8GB/8 core cloud server.  I'm getting around 38ms - 450ms per
test, but I have yet to load the other states.
It also seems to cache very well so that if I geocode an address on the same
street (not necessarily same address), the first call might take 450ms and
the second 38ms.  I suspect this might be because I also marked a good chunk
of the functions STABLE or IMMUTABLE.
 
Thanks,
Regina
http://www.postgis.us
 


  _  

From: postgis-users-boun...@postgis.refractions.net
[mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of Mikal
Laster
Sent: Friday, May 27, 2011 8:18 AM
To: postgis-users@postgis.refractions.net
Subject: [postgis-users] Fw: re: Geocoder (from extras)




--- On Fri, 5/27/11, Mikal Laster  wrote:




From: Mikal Laster 
Subject: re:[postgis-users] Geocoder (from extras)
To: postgis-users@postgis.refractions.net
Date: Friday, May 27, 2011, 7:22 AM



in response to
http://postgis.refractions.net/pipermail/postgis-users/2011-May/029566.html.
After creating some indexes and rewriting geocode_address. I was able to get
geocode to run in 483-523 ms for "5775 Perimeter Dr Dublin, Ohio". This used
to take 1700-2000 ms for me. I'm removing the main inner qui


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Aren Cambre
Thank you!

On Fri, May 27, 2011 at 9:33 PM, Stephen Woodbridge  wrote:

> Aren,
>
> Your purposed approach sounds reasonable to me. You can do it all in one
> query like:
>
> select c.gid, sum(c.dist*c.dist)/count(*) as variance
>  from (
>select b.gid, b.cent, st_distance(b.geom, b.cent) as dist
>  from (
>select a.gid, (st_dump(a.the_geom)).geom as geom,
> centroid(a.the_geom) as cent
>  from (
>select 99 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5)'::geometry as
> the_geom
>union all
>select 88 as gid, 'MULTIPOINT(1 2,2 3,3 4,4 5,3 5,9
> 9)'::geometry as the_geom
>) as a
>) as b
>) as c
>  group by gid order by variance desc;
>
> You should be able to replace the select...union all select ... with your
> table of multipoints.
>
> -Steve W
>
>
> On 5/27/2011 6:19 PM, Aren Cambre wrote:
>
>> Did anyone have thoughts on this? :-)
>>
>> Aren
>>
>> On Wed, May 4, 2011 at 2:12 PM, Aren Cambre > > wrote:
>>
>>The more I think about it, is this a job for R? I know I need to
>>start using R at some point, just haven't begun yet.
>>
>>Aren
>>
>>
>>On Wed, May 4, 2011 at 1:42 PM, Aren Cambre >> wrote:
>>
>>Suppose you have a geometry type with a multipoint. How would
>>you calculate the variance of the points in that multipoint?
>>
>>I looked through the PostGIS 1.5 function reference and am not
>>coming up with any easy way.
>>
>>A hard way seems to be using st_centroid(multipoint) to find the
>>multipoint's  center. From there, I can calculate the distance
>>of each point from its center, and use that towards calculating
>>the variance (each distance is squared, all squared distances
>>are added together, then divide by number of points).
>>
>>I guess my ultimate need is to measure relative dispersion of
>>multipoints. The multipoints that have the most dispersion are
>>suspect, but I need a way of identifying which ones are like this.
>>
>>Aren
>>
>>
>>
>>
>>
>> ___
>> postgis-users mailing list
>> postgis-users@postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Aren Cambre
It's to help me double check my interpretation of a large dataset.

I have a collection of millions of traffic tickets. Each ticket has route
name, milepost, and lat/long. I want to see how well the tickets of a
particular route/milepost are to each other.

E.g., all tickets written for US 71, milepost 204--if they have a very large
dispersion, then either I have an error in my analysis or the data is not
good.

Aren

On Sat, May 28, 2011 at 2:32 AM, Ben Madin
wrote:

> I'm not quite clear to me what you are trying to demonstrate - do you want
> to know the density of the points... relative to their total size (area /
> number?), or relative to some defined area?
>
> cheers
>
> Ben
>
>
> On 28/05/2011, at 6:19 AM, Aren Cambre wrote:
>
> Did anyone have thoughts on this? :-)
>
> Aren
>
> On Wed, May 4, 2011 at 2:12 PM, Aren Cambre  wrote:
>
>> The more I think about it, is this a job for R? I know I need to start
>> using R at some point, just haven't begun yet.
>>
>> Aren
>>
>>
>> On Wed, May 4, 2011 at 1:42 PM, Aren Cambre  wrote:
>>
>>> Suppose you have a geometry type with a multipoint. How would you
>>> calculate the variance of the points in that multipoint?
>>>
>>> I looked through the PostGIS 1.5 function reference and am not coming up
>>> with any easy way.
>>>
>>> A hard way seems to be using st_centroid(multipoint) to find the
>>> multipoint's  center. From there, I can calculate the distance of each point
>>> from its center, and use that towards calculating the variance (each
>>> distance is squared, all squared distances are added together, then divide
>>> by number of points).
>>>
>>> I guess my ultimate need is to measure relative dispersion of
>>> multipoints. The multipoints that have the most dispersion are suspect, but
>>> I need a way of identifying which ones are like this.
>>>
>>> Aren
>>>
>>
>>
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
>
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Calculate variance of a multipoint

2011-05-28 Thread Ben Madin
I'm not quite clear to me what you are trying to demonstrate - do you want to 
know the density of the points... relative to their total size (area / 
number?), or relative to some defined area?

cheers

Ben


On 28/05/2011, at 6:19 AM, Aren Cambre wrote:

> Did anyone have thoughts on this? :-)
> 
> Aren
> 
> On Wed, May 4, 2011 at 2:12 PM, Aren Cambre  wrote:
> The more I think about it, is this a job for R? I know I need to start using 
> R at some point, just haven't begun yet.
> 
> Aren
> 
> 
> On Wed, May 4, 2011 at 1:42 PM, Aren Cambre  wrote:
> Suppose you have a geometry type with a multipoint. How would you calculate 
> the variance of the points in that multipoint?
> 
> I looked through the PostGIS 1.5 function reference and am not coming up with 
> any easy way.
> 
> A hard way seems to be using st_centroid(multipoint) to find the multipoint's 
>  center. From there, I can calculate the distance of each point from its 
> center, and use that towards calculating the variance (each distance is 
> squared, all squared distances are added together, then divide by number of 
> points).
> 
> I guess my ultimate need is to measure relative dispersion of multipoints. 
> The multipoints that have the most dispersion are suspect, but I need a way 
> of identifying which ones are like this.
> 
> Aren
> 
> 
> ___
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users