On Sun, Dec 12, 2010 at 07:10:44PM -0800, Nathan Boley wrote:
> Another quick note: I think that storing the full contingency table is
> wasteful since the marginals are already stored in the single column
> statistics. Look at copulas [2] ( FWIW I think that Josh Tolley was
> looking at this a couple years back ).

Josh Tolley still looks at it occasionally, though time hasn't permitted any
sort of significant work for quite some time. The multicolstat branch on my
git.postgresql.org repository will create an empirical copula each
multi-column index, and stick it in pg_statistic. It doesn't yet do anything
useful with that information, nor am I convinced it's remotely bug-free. In a
brief PGCon discussion with Tom a while back, it was suggested a good place
for the planner to use these stats would be clausesel.c, which is responsible
for handling code such as "...WHERE foo > 4 AND foo > 5".

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Attachment: signature.asc
Description: Digital signature

Reply via email to