On Sun, Dec 12, 2010 at 07:10:44PM -0800, Nathan Boley wrote:
> Another quick note: I think that storing the full contingency table is
> wasteful since the marginals are already stored in the single column
> statistics. Look at copulas [2] ( FWIW I think that Josh Tolley was
> looking at this a couple years back ).

Josh Tolley still looks at it occasionally, though time hasn't permitted any sort of significant work for quite some time. The multicolstat branch on my git.postgresql.org repository will create an empirical copula for each multi-column index, and stick it in pg_statistic. It doesn't yet do anything useful with that information, nor am I convinced it's remotely bug-free. In a brief PGCon discussion with Tom a while back, it was suggested that a good place for the planner to use these stats would be clausesel.c, which is responsible for handling clauses such as "...WHERE foo > 4 AND foo > 5".

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
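
For readers unfamiliar with the term, here is a minimal sketch of what an empirical copula over two columns might look like. It is not the code in the multicolstat branch; the function names and the bin count are hypothetical, chosen only for illustration. The idea is to replace each value by its rank within its own column, then histogram the rank pairs, so the grid records only the dependence between the columns and leaves the marginals to the single-column statistics.

```python
from bisect import bisect_right


def column_ranks(values):
    """Return the 1-based empirical rank of each value within its column.

    Tied values share the highest rank among the ties.
    """
    ordered = sorted(values)
    return [bisect_right(ordered, v) for v in values]


def empirical_copula(xs, ys, bins=4):
    """Return a bins x bins grid of joint frequencies in rank space.

    Hypothetical helper: grid[i][j] is the fraction of rows whose x rank
    falls in the i-th rank bucket and whose y rank falls in the j-th.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(xs)
    counts = [[0] * bins for _ in range(bins)]
    for rx, ry in zip(column_ranks(xs), column_ranks(ys)):
        # Integer arithmetic for ceil(rank * bins / n) - 1, which avoids
        # floating-point rounding at bucket boundaries.
        i = (rx * bins - 1) // n
        j = (ry * bins - 1) // n
        counts[i][j] += 1
    return [[c / n for c in row] for row in counts]


if __name__ == "__main__":
    # Perfectly correlated columns put all the mass on the diagonal.
    col = list(range(1, 9))
    for row in empirical_copula(col, col, bins=4):
        print(row)
```

With perfectly correlated columns the mass sits on the diagonal, while independent columns spread it evenly across the grid; a planner could in principle compare a grid cell against the product of the marginal selectivities to correct a multi-column estimate.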