lr,
Thanks for your response.
The dimensions of the problem are relevantly different, something like
this:
* number of building_types: 10.000
* number of buildings: 5 billion
* average number rows returned by the query (function of distance,
building type): 100
(the problems something other that buildings, it's stock exchange
messages & events, but the analogy is 1-on-1 )
I'm currently working of a small sample of that dataset, so other
issues might pop up.
.. so the child table approach (on the large 5 billion rows table) is
probably something I need to do anyway, although my current plan was
to segment on a different key...
The way I wanted it to work was: have a compound index that would
first filter on building type -resulting in spatial datasets of size
500K-, and *then* do distance queries on that resulting in approx 100
rows.
Another idea I just has is to define regions (like city) and use a
combined index on "building_type, city", and then do a filtering on
distance/combining adjacent cities using "building_types per city"
queries
On Mar 18, 2009, at 1:31 PM, Paragon Corporation wrote:
M.A.,
My gut feeling tells me having 2 partial indexes on the same field
as you
have described is just bad. How would the planner know which to
choose
unless you always have the filter in there. If you only have 2 sets
of data
then breaking it out is even more silly. If you later add more
building
types, then you would never do an indexed searches on those.
You'd be better off having a single gist index on the geometry and
possibly
a btree index on the building_type. Though if you only have 2
building
types, that index would never need to be used. PostgreSQL is quite
efficient at using bit map indexes. To my knowledge you can't really
combine a gist and btree type in a single compound index and you
wouldn't
want to anyway.
The child table approach is generally the preferred way of doing
this, but
even that is not necessary if you have less than a million records
or if you
wan that benefit of quickly reloading all buildings or all schools
(by being
able to do a TRUNCATE TABLE).
Regardig your query, your query as it stands will not use any
indexes. You
should write it using the ST_Dwithin function which does use indexes.
SELECT the_geom, building_name
FROM buildings
WHERE building_type = 'School' AND ST_DWithin(the_geom, 'POINT(100312
102312)', 5000)
Leo
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
M.A.
(Thijs) van den Berg
Sent: Wednesday, March 18, 2009 8:08 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Can I build an index on
combinednon-spatial&spatial columns?
Thanks, that's true. I think the performance benefit with be good
because I
have approx 10.000 building_type's (although, that's a lot of
indices to
choose between). A typical distance query for a specific
building_type will
result in 10-100 matches. Not using partial indexing will thus
result in
appox 500.000 rows for each query (all building types), which I
would then
have to filter on building_type, reducing the resultset with a
factor 10.000
Giving it some more thought, I think splitting the building table into
individual building_type tables (and attaching spatial indices to
those) might be the best solution.
From a technical point, I think the postgresql CREATE INDEX should
support
multiple USING to allow for mixed patial / non-spatial indices,
something
like this:
CREATE INDEX blabla ON buildings (USING HASH(building_type), USING
GIST(the_geom))
internally that could be rewritten as
CREATE INDEX blabla_hash_value1 ON buildings USING GIST(the_geom)
WHERE
hash(building_type)=value1 CREATE INDEX blabla_hash_value2 ON
buildings
USING GIST(the_geom) WHERE hash(building_type)=value2 ..
On Mar 18, 2009, at 12:43 PM, Kemal Bayram wrote:
Well according to EXPLAIN a partial index is taken into account, how
much of a performance benefit you gain ultimately depends on how many
records lie within your average distance.
If your data set is relatively static you may also want to consider
clustering your table too.
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of
M.A. (Thijs) van den Berg
Sent: 18 March 2009 13:00
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Can I build an index on combined
non-spatial&spatial columns?
Thanks :)
Does PostGIS support partial indices?
http://www.postgresql.org/docs/8.3/interactive/indexes-partial.html
If so.. I could do something like this:
CREATE INDEX building_type_spindx_school ON buildings USING
GIST(the_geom) WHERE building_type='School';
CREATE INDEX building_type_spindx_bar ON buildings USING
GIST(the_geom) WHERE building_type='Bar';
...
Yet another option would be to partition the buildings table into
child tables per building_type (ie have child table containing all
the schools, another child table having all the bars), and have
individua spatial indices on the child tables. Is that an option?
On Mar 18, 2009, at 11:36 AM, Kemal Bayram wrote:
I don't know about combining indexes but you want to
include a spatial
operator such as && in addition to Distance so that you can take
advantage of an index on "the_geom".
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On
Behalf Of
M.A. (Thijs) van den Berg
Sent: 18 March 2009 11:57
To: [email protected]
Subject: [postgis-users] Can I build an index on combined
non-spatial
&spatial columns?
Suppose I want to speedup the following type of queries with an
index:
SELECT the_geom, building_name
FROM buildings
WHERE Distance(the_geom, 'POINT(100312 102312)') < 5000 AND
building_type = 'School'
Is that possible, and if not, why not? My current option
is to write
my own index storage in C++, using a hashmap to filter
building_type,
and build separate spatial indices for each building type.
Can I do
something similar in postgresql?
Would my only option be to split the building_types into separate
tables?
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users