On 13.12.2010 01:05, Robert Haas wrote:
> This is a good idea, but I guess the question is what you do next. If
> you know that the "applicability" is 100%, you can disregard the
> restriction clause on the implied column. And if it has no
> implicatory power, then you just do what we do now. But what if it
> has some intermediate degree of implicability?

Well, I think you've missed the e-mail from Florian Pflug - he actually
pointed out that the 'implicativeness' Heikki mentioned is called
conditional probability. And conditional probability can be used to
express the "AND" probability we are looking for (the selectivity).

For two columns, this is actually pretty straightforward - as Florian
wrote, the equation is

   P(A and B) = P(A|B) * P(B) = P(B|A) * P(A)

where P(B) may be estimated from the current histogram, and P(A|B) may
be estimated from the contingency table (see the previous mails). And
"P(A and B)" is actually the value we're looking for.

Anyway, there really is no "intermediate" degree of applicability, it
just gives you the right estimate.

And AFAIR this is easily extensible to more than two columns, as

   P(A and B and C) = P(A and (B and C)) = P(A|(B and C)) * P(B and C)

so it's basically a recursion.

Well, I hope my statements are really correct - it's been a few years
since I gained my degree in statistics ;-)

regards
Tomas
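
To illustrate the two-column identity above (this is not code from the thread): a minimal Python sketch, assuming the contingency table is simply a count of (a, b) value pairs taken from a sample. The function name, the predicates and the sample rows are all hypothetical.

```python
from collections import Counter


def estimate_and_selectivity(rows, a_pred, b_pred):
    """Estimate P(A and B) as P(A|B) * P(B) from sampled (a, b) pairs."""
    n = len(rows)
    if n == 0:
        return 0.0

    # Contingency table: how often each (a, b) combination occurs.
    table = Counter(rows)

    # Rows satisfying B, and rows satisfying both A and B.
    n_b = sum(c for (a, b), c in table.items() if b_pred(b))
    if n_b == 0:
        return 0.0
    n_ab = sum(c for (a, b), c in table.items() if a_pred(a) and b_pred(b))

    p_b = n_b / n            # P(B), what a per-column histogram would give
    p_a_given_b = n_ab / n_b  # P(A|B), read off the contingency table
    return p_a_given_b * p_b


def is_prague(city):
    return city == "Prague"


def is_zip_11000(zip_code):
    return zip_code == 11000


# Hypothetical sample of (city, zip) pairs in which the zip code implies the city.
rows = [("Prague", 11000)] * 4 + [("Brno", 60200)] * 4 + [("Prague", 12000)] * 2

# Estimate using the conditional probability.
correlated = estimate_and_selectivity(rows, is_prague, is_zip_11000)

# Estimate under the independence assumption, P(A) * P(B), for comparison.
p_a = sum(1 for city, _ in rows if is_prague(city)) / len(rows)
p_b = sum(1 for _, z in rows if is_zip_11000(z)) / len(rows)
independent = p_a * p_b
```

In this sample every row with zip 11000 is in Prague, so P(A|B) is 1 and the estimate equals P(B), i.e. 4 of the 10 rows. The independence assumption multiplies P(A) by P(B) and therefore comes out lower.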