Robert, I have been haunted by similar issues but I haven't had time to figure out some explanation I really believe in.
Does this post with a reproducible case seem similar? It seems like it to me. http://postgis.refractions.net/pipermail/postgis-devel/2008-September/003547.html For now when these situations aren't working (the planner favors the spatial index due to a weird estimate when it is really going to be lots) I am using a constant geometry [e.g. geometryfromwkt(wkt geometry string)] for the join condition. I realize the feasibility of this depends on your application, or your desire to write a user-defined function. Why does the planner get different estimates for the same geometry being used in a constraint based on it coming from a table, or from a string in the query? I think the reason this is happening might be apparent in lwgeom_estimate.c, but I'm open to being totally wrong :). -Mose On Wed, Mar 18, 2009 at 11:32 AM, Paragon Corporation <[email protected]> wrote: > Robert, > > Which version of PostgreSQL are you running? I wander if you are being > bitten by one of these bugs > > --the description doesn't quite match any of these 2 - but it could be > another side effect of these > http://www.postgresql.org/docs/8.3/static/release-8-3-7.html > > Fix planner problem with sub-SELECT in the output list of a larger subquery > (Tom) > > Or > http://www.postgresql.org/docs/8.3/static/release-8-3-6.html > Fix planner misestimation of selectivity when transitive equality is > applied > to an outer-join clause (Tom) > > This could result in bad plans for queries like ... from a left join b on > a.a1 = b.b1 where a.a1 = 42 ... > > > > Hope that helps, > Regina > > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On Behalf Of > Burgholzer,Robert > Sent: Wednesday, March 18, 2009 9:05 AM > To: PostGIS Users Discussion > Subject: [postgis-users] Index Use Craziness > > I am having much trouble getting postgres to use the proper index. > > I have a table with about 20 million daily rainfall observation points, > with > between 0 and 19,000 points on any given day. In order to speed up > querying, I have created a GIST index on the geometry column, and an index > on the date column. > > What I want to do is to summarize the observed rainfall in another shape > table on any given day. If I just want to query points by date, this uses > the date index, and performs very quickly (see "CASE 1" below). > > When I try to introduce a geometry operator, the GIST index takes over, and > the index on the "thisdate" column is ignored. The weird thing is, I can > hear it churning the heck out of my hard drive checking on all the points. > (See "CASE 2" below). > > I even tried to sub-query the points for the given date, but the index did > not seem to be fooled ("CASE 3"). It gave me the exact same query plan!! > I > am unsure if I really understand the way a sub-query functions given this > result. > > Any help would be appreciated! > > Robert > > CASE 1: > wsp=# explain select count(*) from precip_gridded where thisdate = > '2009-01-12'; > > QUERY PLAN > ------------------------------------------------------------------------ > ---- > Aggregate (cost=479.43..479.44 rows=1 width=0) > -> Index Scan using pg_tdix on precip_gridded (cost=0.00..451.92 > rows=11004 > width=0) > Index Cond: (thisdate = '2009-01-12 00:00:00'::timestamp without > time zone) > (3 rows) > > > CASE 2: > explain select count(*) from precip_gridded where thisdate = '2009-01-12' > and the_geom && setsrid((select extent(the_geom) from proj_seggroups where > gid = 448),4326); > > QUERY PLAN > ------------------------------------------------------------------------ > ---- > Aggregate (cost=100.23..100.24 rows=1 width=0) > InitPlan > -> Aggregate (cost=5.87..5.88 rows=1 width=2807) > -> Index Scan using psg_giix on proj_seggroups > (cost=0.00..5.87 rows=1 width=2807) > Index Cond: (gid = 448) > -> Index Scan using pg_gix on precip_gridded (cost=0.00..94.35 > rows=1 width=0) > Index Cond: (the_geom && setsrid(($0)::geometry, 4326)) > Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without time > zone) AND (the_geom && setsrid(($0)::geometry, 4326))) > (8 rows) > > > CASE 3: Sub-query by date to try and force the use of date index wsp=# > explain select count(a.*) from (select * from precip_gridded where thisdate > = '2009-01-12') as a where a.the_geom && setsrid((select > extent(the_geom) from proj_seggroups where gid = 448),4326); > > QUERY PLAN > > ------------------------------------------------------------------------ > ---- > Aggregate (cost=100.23..100.24 rows=1 width=81) > InitPlan > -> Aggregate (cost=5.87..5.88 rows=1 width=2807) > -> Index Scan using psg_giix on proj_seggroups > (cost=0.00..5.87 rows=1 width=2807) > Index Cond: (gid = 448) > -> Index Scan using pg_gix on precip_gridded (cost=0.00..94.35 > rows=1 width=81) > Index Cond: (the_geom && setsrid(($0)::geometry, 4326)) > Filter: ((thisdate = '2009-01-12 00:00:00'::timestamp without time > zone) AND (the_geom && setsrid(($0)::geometry, 4326))) > (8 rows) > > > Robert W. Burgholzer > Surface Water Modeler > Office of Water Supply and Planning > Virginia Department of Environmental Quality [email protected] > 804-698-4405 > Open Source Modeling Tools: > http://sourceforge.net/projects/npsource/ > > > _______________________________________________ > postgis-users mailing list > [email protected] > http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > _______________________________________________ > postgis-users mailing list > [email protected] > http://postgis.refractions.net/mailman/listinfo/postgis-users >
_______________________________________________ postgis-users mailing list [email protected] http://postgis.refractions.net/mailman/listinfo/postgis-users
