Re: [postgis-users] speed of query
On 01/03/11 21:48, Michael Smedberg wrote: OK, next guess. It looks like your SQL might be pretty inefficient. For example, consider this SQL statement: SELECT a, b FROM ll WHERE ST_Within( ST_Point( ST_X(ST_Transform(the_geom, 4326)), ST_Y(ST_Transform(the_geom, 4326)) ), ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) ) I think that says something like: Look through the II table For each row, transform the_geom to 4326, and take the X For each row, transform the_geom to 4326, and take the Y Make a point from that X,Y Check if that point is within a box I don't think that'll use an index, and I think it will do a bunch of transformation work for every row. I think that instead of transforming every row in II to 4326, you'd probably be better served by transforming your bounding box to 2163 one time. I think the SQL would look something like this: SELECT a, b FROM ll WHERE ST_Within( the_geom, ST_Transform( ST_MakeBox2D( ST_Point( -91.048, 45.956 ), ST_Point( -90.973, 46.007 ) ), 2163 ) ) In any case, you should probably try looking at the output of EXPLAIN or EXPLAIN ANALYZE to understand whether your index is being used, etc. Hi Michael, Yes indeed - I think you summarised this brilliantly :) If your geometries are stored as SRID 2163, then the index bounding boxes will also be stored in SRID 2163 - hence the index can only be used for queries involving other SRID 2163 bounding boxes. Otherwise PostgreSQL assumes it has to convert your entire geometry column to SRID 4326 first in order to calculate the intersection, which involves scanning the entire table and converting all the geometries on the fly... ATB, Mark. -- Mark Cave-Ayland - Senior Technical Architect PostgreSQL - PostGIS Sirius Corporation plc - control through freedom http://www.siriusit.co.uk t: +44 870 608 0063 Sirius Labs: http://www.siriusit.co.uk/labs ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speed of query
Puneet, I'm thinking you are doing a lot of extra calculations here by constantly transforming the points. Why did you turn a geometry into two points and then make the geometry again... this would invalidate the use of an index on the same column? (Neither did you define a SRID for your box in the first query) Have a look at : file:///usr/local/pgsql/share/doc/contrib/postgis-1.5/postgis.html#ST_MakeBox2D I think the query (slightly modified here) --Return all features that fall reside or partly reside in a US national atlas coordinate bounding box --It is assumed here that the geometries are stored with SRID = 2163 (US National atlas equal area) SELECT feature_id, feature_name, the_geom FROM features WHERE the_geom && st_transform(ST_SetSRID(ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)),4326),2163) is what you are after (obviously you will have to transform the lat long points (or the box. cheers Ben On 02/03/2011, at 3:40 AM, Puneet Kishor wrote: > I have a table with ~ 13.25 million points. > > CREATE TABLE ll ( > gid serial NOT NULL, > latitude double precision, > longitude double precision, > a integer, > b integer, > the_geom geometry, > CONSTRAINT ll_pkey PRIMARY KEY (gid), > CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), > CONSTRAINT enforce_geotype_the_geom CHECK ( > geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL > ), > CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) > ) > WITH ( > OIDS=FALSE > ); > > I want to select the columns a,b for the rows that lie within a box made by > points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- > > Query 1 > > SELECT a, b FROM ll > WHERE > ST_Within( > ST_Point( > ST_X(ST_Transform(the_geom, 4326)), > ST_Y(ST_Transform(the_geom, 4326)) > ), > ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) > ) > > 31 rows returned in 46125 ms > > Query 2 > > SELECT a, b FROM ll > WHERE > ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND > ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND > ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND > ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 > > 31 rows returned in 25729 ms > > Query 3 > > SELECT a, b FROM ll > WHERE > longitude >= -91.048 AND > longitude <= -90.973 AND > latitude >= 45.956 AND > latitude <= 46.007 > > 31 rows returned in 4011 ms > > Query 4 > > I also have the same data in a SQLite database with an R*Tree index on > lat/lon. A query analogous to Query 3 returns fast enough to not even > register a time... a few milliseconds; effectively 0 seconds. > > What gives? > -- > Puneet Kishor > > > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speed of query
Hi, Indeed: ST_Point, ST_MakeBox2D and ST_Transform are IMMUTABLE, so the whole expression is computed once for all rows in this case. The only function called every time is ST_Within, which normally makes use of the spatial index. I wonder if the '&&' operator (ST_Overlaps) would be faster or not... JF 2011/3/1 Michael Smedberg : > OK, next guess. It looks like your SQL might be pretty inefficient. For > example, consider this SQL statement: > SELECT a, b FROM ll > WHERE > ST_Within( > ST_Point( > ST_X(ST_Transform(the_geom, 4326)), > ST_Y(ST_Transform(the_geom, 4326)) > ), > ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) > ) > I think that says something like: > Look through the II table > For each row, transform the_geom to 4326, and take the X > For each row, transform the_geom to 4326, and take the Y > Make a point from that X,Y > Check if that point is within a box > I don't think that'll use an index, and I think it will do a bunch of > transformation work for every row. > I think that instead of transforming every row in II to 4326, you'd probably > be better served by transforming your bounding box to 2163 one time. I > think the SQL would look something like this: > > SELECT > a, > b > FROM > ll > WHERE > ST_Within( > the_geom, > ST_Transform( > ST_MakeBox2D( > ST_Point( > -91.048, > 45.956 > ), > ST_Point( > -90.973, > 46.007 > ) > ), > 2163 > ) > ) > > In any case, you should probably try looking at the output of EXPLAIN or > EXPLAIN ANALYZE to understand whether your index is being used, etc. > > > > > On Tue, Mar 1, 2011 at 11:40 AM, Puneet Kishor wrote: >> >> I have a table with ~ 13.25 million points. >> >> CREATE TABLE ll ( >> gid serial NOT NULL, >> latitude double precision, >> longitude double precision, >> a integer, >> b integer, >> the_geom geometry, >> CONSTRAINT ll_pkey PRIMARY KEY (gid), >> CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), >> CONSTRAINT enforce_geotype_the_geom CHECK ( >> geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL >> ), >> CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) >> ) >> WITH ( >> OIDS=FALSE >> ); >> >> I want to select the columns a,b for the rows that lie within a box made >> by points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- >> >> Query 1 >> >> SELECT a, b FROM ll >> WHERE >> ST_Within( >> ST_Point( >> ST_X(ST_Transform(the_geom, 4326)), >> ST_Y(ST_Transform(the_geom, 4326)) >> ), >> ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) >> ) >> >> 31 rows returned in 46125 ms >> >> Query 2 >> >> SELECT a, b FROM ll >> WHERE >> ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND >> ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND >> ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND >> ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 >> >> 31 rows returned in 25729 ms >> >> Query 3 >> >> SELECT a, b FROM ll >> WHERE >> longitude >= -91.048 AND >> longitude <= -90.973 AND >> latitude >= 45.956 AND >> latitude <= 46.007 >> >> 31 rows returned in 4011 ms >> >> Query 4 >> >> I also have the same data in a SQLite database with an R*Tree index on >> lat/lon. A query analogous to Query 3 returns fast enough to not even >> register a time... a few milliseconds; effectively 0 seconds. >> >> What gives? >> -- >> Puneet Kishor >> >> >> >> ___ >> postgis-users mailing list >> postgis-users@postgis.refractions.net >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speed of query
OK, next guess. It looks like your SQL might be pretty inefficient. For example, consider this SQL statement: SELECT a, b FROM ll WHERE ST_Within( ST_Point( ST_X(ST_Transform(the_geom, 4326)), ST_Y(ST_Transform(the_geom, 4326)) ), ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) ) I think that says something like: Look through the II table For each row, transform the_geom to 4326, and take the X For each row, transform the_geom to 4326, and take the Y Make a point from that X,Y Check if that point is within a box I don't think that'll use an index, and I think it will do a bunch of transformation work for every row. I think that instead of transforming every row in II to 4326, you'd probably be better served by transforming your bounding box to 2163 one time. I think the SQL would look something like this: SELECT a, b FROM ll WHERE ST_Within( the_geom, ST_Transform( ST_MakeBox2D( ST_Point( -91.048, 45.956 ), ST_Point( -90.973, 46.007 ) ), 2163 ) ) In any case, you should probably try looking at the output of EXPLAIN or EXPLAIN ANALYZE to understand whether your index is being used, etc. On Tue, Mar 1, 2011 at 11:40 AM, Puneet Kishor wrote: > I have a table with ~ 13.25 million points. > > CREATE TABLE ll ( > gid serial NOT NULL, > latitude double precision, > longitude double precision, > a integer, > b integer, > the_geom geometry, > CONSTRAINT ll_pkey PRIMARY KEY (gid), > CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), > CONSTRAINT enforce_geotype_the_geom CHECK ( > geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL > ), > CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) > ) > WITH ( > OIDS=FALSE > ); > > I want to select the columns a,b for the rows that lie within a box made by > points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- > > Query 1 > > SELECT a, b FROM ll > WHERE > ST_Within( > ST_Point( > ST_X(ST_Transform(the_geom, 4326)), > ST_Y(ST_Transform(the_geom, 4326)) > ), > ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) > ) > > 31 rows returned in 46125 ms > > Query 2 > > SELECT a, b FROM ll > WHERE > ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND > ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND > ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND > ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 > > 31 rows returned in 25729 ms > > Query 3 > > SELECT a, b FROM ll > WHERE > longitude >= -91.048 AND > longitude <= -90.973 AND > latitude >= 45.956 AND > latitude <= 46.007 > > 31 rows returned in 4011 ms > > Query 4 > > I also have the same data in a SQLite database with an R*Tree index on > lat/lon. A query analogous to Query 3 returns fast enough to not even > register a time... a few milliseconds; effectively 0 seconds. > > What gives? > -- > Puneet Kishor > > > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speed of query
On Tuesday, March 1, 2011 at 3:30 PM, Michael Smedberg wrote: > It looks like maybe you don't have an index on your the_geom column? If so, > you might want to read the "Indexing the data" section of > http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_tut01 > I should have added the following information -- CREATE INDEX ll_the_geom_gist ON ll USING gist (the_geom); I do have an index. Ironically, I don't have an index on longitude and latitude columns, yet, searching on those columns is faster than on the_geom. > On Tue, Mar 1, 2011 at 11:40 AM, Puneet Kishor wrote: > > I have a table with ~ 13.25 million points. > > > > CREATE TABLE ll ( > > gid serial NOT NULL, > > latitude double precision, > > longitude double precision, > > a integer, > > b integer, > > the_geom geometry, > > CONSTRAINT ll_pkey PRIMARY KEY (gid), > > CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), > > CONSTRAINT enforce_geotype_the_geom CHECK ( > > geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL > > ), > > CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) > > ) > > WITH ( > > OIDS=FALSE > > ); > > > > I want to select the columns a,b for the rows that lie within a box made > > by points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- > > > > Query 1 > > > > SELECT a, b FROM ll > > WHERE > > ST_Within( > > ST_Point( > > ST_X(ST_Transform(the_geom, 4326)), > > ST_Y(ST_Transform(the_geom, 4326)) > > ), > > ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) > > ) > > > > 31 rows returned in 46125 ms > > > > Query 2 > > > > SELECT a, b FROM ll > > WHERE > > ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND > > ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND > > ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND > > ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 > > > > 31 rows returned in 25729 ms > > > > Query 3 > > > > SELECT a, b FROM ll > > WHERE > > longitude >= -91.048 AND > > longitude <= -90.973 AND > > latitude >= 45.956 AND > > latitude <= 46.007 > > > > 31 rows returned in 4011 ms > > > > Query 4 > > > > I also have the same data in a SQLite database with an R*Tree index on > > lat/lon. A query analogous to Query 3 returns fast enough to not even > > register a time... a few milliseconds; effectively 0 seconds. > > > > What gives? > > -- > > Puneet Kishor > > > > > > > > ___ > > postgis-users mailing list > > postgis-users@postgis.refractions.net > > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
Re: [postgis-users] speed of query
It looks like maybe you don't have an index on your the_geom column? If so, you might want to read the "Indexing the data" section of http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_tut01 On Tue, Mar 1, 2011 at 11:40 AM, Puneet Kishor wrote: > I have a table with ~ 13.25 million points. > > CREATE TABLE ll ( > gid serial NOT NULL, > latitude double precision, > longitude double precision, > a integer, > b integer, > the_geom geometry, > CONSTRAINT ll_pkey PRIMARY KEY (gid), > CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), > CONSTRAINT enforce_geotype_the_geom CHECK ( > geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL > ), > CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) > ) > WITH ( > OIDS=FALSE > ); > > I want to select the columns a,b for the rows that lie within a box made by > points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- > > Query 1 > > SELECT a, b FROM ll > WHERE > ST_Within( > ST_Point( > ST_X(ST_Transform(the_geom, 4326)), > ST_Y(ST_Transform(the_geom, 4326)) > ), > ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) > ) > > 31 rows returned in 46125 ms > > Query 2 > > SELECT a, b FROM ll > WHERE > ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND > ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND > ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND > ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 > > 31 rows returned in 25729 ms > > Query 3 > > SELECT a, b FROM ll > WHERE > longitude >= -91.048 AND > longitude <= -90.973 AND > latitude >= 45.956 AND > latitude <= 46.007 > > 31 rows returned in 4011 ms > > Query 4 > > I also have the same data in a SQLite database with an R*Tree index on > lat/lon. A query analogous to Query 3 returns fast enough to not even > register a time... a few milliseconds; effectively 0 seconds. > > What gives? > -- > Puneet Kishor > > > > ___ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users
[postgis-users] speed of query
I have a table with ~ 13.25 million points. CREATE TABLE ll ( gid serial NOT NULL, latitude double precision, longitude double precision, a integer, b integer, the_geom geometry, CONSTRAINT ll_pkey PRIMARY KEY (gid), CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), CONSTRAINT enforce_geotype_the_geom CHECK ( geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL ), CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 2163) ) WITH ( OIDS=FALSE ); I want to select the columns a,b for the rows that lie within a box made by points [-91.048, 45.956] and [-90.973, 46.007]. Here are my results -- Query 1 SELECT a, b FROM ll WHERE ST_Within( ST_Point( ST_X(ST_Transform(the_geom, 4326)), ST_Y(ST_Transform(the_geom, 4326)) ), ST_MakeBox2D(ST_Point(-91.048, 45.956), ST_Point(-90.973, 46.007)) ) 31 rows returned in 46125 ms Query 2 SELECT a, b FROM ll WHERE ST_X(ST_Transform(the_geom, 4326)) >= -91.048 AND ST_X(ST_Transform(the_geom, 4326)) <= -90.973 AND ST_Y(ST_Transform(the_geom, 4326)) >= 45.956 AND ST_Y(ST_Transform(the_geom, 4326)) <= 46.007 31 rows returned in 25729 ms Query 3 SELECT a, b FROM ll WHERE longitude >= -91.048 AND longitude <= -90.973 AND latitude >= 45.956 AND latitude <= 46.007 31 rows returned in 4011 ms Query 4 I also have the same data in a SQLite database with an R*Tree index on lat/lon. A query analogous to Query 3 returns fast enough to not even register a time... a few milliseconds; effectively 0 seconds. What gives? -- Puneet Kishor ___ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users