hello everybody,

we have spent some time on finally attacking cross-column correlation, an issue which has been bugging us in a couple of applications for some years. this is a WIP patch which adds dedicated syntax for cross-column statistics:
CREATE CROSS COLUMN STATISTICS ON tablename (field, ...);
DROP CROSS COLUMN STATISTICS ON tablename (field, ...);
we use dedicated syntax because we simply cannot keep track of all possible
correlations in the DB, so the admin has to take care of things explicitly. some
distant day somebody might want to write a mechanism to derive the desired
stats automatically, but this is beyond the scope of our project for now.
as far as the patch is concerned:
it hooks nicely into clauselist_selectivity() but has some rough edges:
even when a cross-col stat is found, the single-column selectivities are still
counted (= lowering the selectivity even more). this is a TODO.
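a minimal standalone sketch of the double-counting problem described above. this is not code from the patch: the function names, the "covered" flag array and the way the values are combined are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed current behaviour: the cross-column selectivity is applied,
 * but every single-column selectivity is still multiplied in on top.
 */
double
combine_with_double_counting(double cross_sel, const double *sels, size_t n)
{
    double result = cross_sel;

    assert(cross_sel >= 0.0 && cross_sel <= 1.0);
    for (size_t i = 0; i < n; i++)
        result *= sels[i];
    return result;
}

/*
 * Intended behaviour (the TODO): clauses already covered by the
 * cross-column statistic are skipped; only the rest is multiplied in.
 * covered[i] != 0 means clause i is covered by the statistic.
 */
double
combine_skipping_covered(double cross_sel, const double *sels,
                         const int *covered, size_t n)
{
    double result = cross_sel;

    assert(cross_sel >= 0.0 && cross_sel <= 1.0);
    for (size_t i = 0; i < n; i++)
    {
        if (!covered[i])
            result *= sels[i];
    }
    return result;
}
```

with three clauses of selectivity 0.5, 0.5 and 0.25, where the first two are covered by a statistic with joint selectivity 0.5, the first function yields 0.03125 while the second yields 0.125.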
this patch adds the grammar and the start of planner integration, with a static
selectivity value for now. the previous discussion about cross-column
statistics can be continued and will perhaps come to fruition soon.
how does it work? we try to find suitable statistics for an arbitrary-length
list of conditions so that the planner can use them directly rather than
multiplying all the selectivities. this should make estimates a lot more
precise.
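to illustrate the idea, here is a minimal standalone sketch in C. it is not taken from the patch: the CrossColStat struct, the bitmask representation of the clause list and both function names are hypothetical, and the exact-match lookup is just the simplest possible strategy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one cross-column statistic: it covers a set
 * of clauses (bit i set => clause i is covered) and stores their joint
 * selectivity.
 */
typedef struct CrossColStat
{
    unsigned int clause_mask;
    double       selectivity;
} CrossColStat;

/* Independence assumption: multiply all per-clause selectivities. */
double
independent_selectivity(const double *sels, size_t n)
{
    double result = 1.0;

    for (size_t i = 0; i < n; i++)
        result *= sels[i];
    return result;
}

/*
 * Use a statistic that covers exactly the n given clauses if one exists;
 * otherwise fall back to multiplying the individual selectivities.
 */
double
estimate_selectivity(const double *sels, size_t n,
                     const CrossColStat *stats, size_t nstats)
{
    unsigned int wanted;

    assert(n < 8 * sizeof(unsigned int));
    wanted = (1u << n) - 1u;    /* mask with the lowest n bits set */

    for (size_t i = 0; i < nstats; i++)
    {
        if (stats[i].clause_mask == wanted)
            return stats[i].selectivity;
    }
    return independent_selectivity(sels, n);
}
```

as an assumed example, take two conditions with selectivities 0.01 and 0.02 where every row matching the first also matches the second. multiplying gives 0.0002, while a cross-column statistic could report the joint selectivity of 0.01, a factor of 50 apart.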
the current approach can be extended to work with expressions as well as
"straight" conditions.
goal: to make cross-column correlation work for 9.2 ...
the purpose of this mail is mostly to get the race for a patch going and to see
if the approach as such is reasonable / feasible.
many thanks,
hans
cross-column-v5.patch
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
