Robert Haas wrote: > On Mon, Sep 7, 2009 at 8:51 PM, Karl Denninger<k...@denninger.net> wrote: > >> Robert Haas wrote: >> >> On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger<k...@denninger.net> wrote: >> >> >> There was a previous thread and I referenced it. I don't have the other one >> in my email system any more to follow up to it. >> >> I give up; the attack-dog crowd has successfully driven me off. Ciao. >> >> >> Perhaps I'm biased by knowing some of the people involved, but I don't >> think anyone on this thread has been anything but polite. It would >> certainly be great if PostgreSQL could properly estimate the >> selectivity of expressions like this without resorting to nasty hacks, >> but it can't, and unfortunately, there's really no possibility of that >> changing any time soon. Even if someone implements a fix today, the >> soonest it will appear in a production release is June 2010. So, any >> suggestion for improvement is going to be in the form of suggesting >> that you modify the schema in some way. I know that's not really what >> you're looking for, but unfortunately it's the best we can do. >> >> As far as I can tell, it is not correct to say that you referenced the >> previous thread. I do not see any such reference. >> >> ...Robert >> >> >> >> I was asking about modifying the schema. >> >> The current schema is an integer being used as a bitmask. If the planner >> knows how to handle a type of "bit(X)" (and will at least FILTER rather than >> NESTED LOOP it on a select, as happens for an Integer used in this fashion), >> that change is easier than splitting it into individual boolean fields. >> > > Well, the first several replies seem to address that question - I > think we all agree that won't help. I'm not sure what you mean by "at > least FILTER rather than NESTED LOOP it on a select". However, > typically, the time when you get a nested loop is when the planner > believes that the loop will be executed very few times (in other > words, the outer side will return very few rows). It probably isn't > the case that the planner COULDN'T choose to execute the query in some > other way; rather, the planner believes that the nested loop is faster > because of a (mistaken) belief about how many rows the > bitmap-criterion will actually match. All the suggestions you've > gotten upthread are tricks to enable the planner to make a better > estimate, which will hopefully cause it to choose a better plan. > > As a general statement, selectivity estimation problems are very > painful to work around and often involve substantial application > redesign. In all honesty, I think you've run across one of the easier > variants. As painful as it is to hear the word easy applied to a > problem that's killing you, there actually IS a good solution to this > problem: use individual boolean fields. I know that's not what you > want to do, but it's better than "sorry, you're hosed, no matter how > you do this it ain't gonna work". And I do think there are a few in > the archives that fall into that category. > > Good luck, and sorry for the bad news. > > ...Robert > The individual boolean fields don't kill me and in terms of some of the application issues they're actually rather easy to code for.
The problem with re-coding for them is extensibility (by those who install and administer the package); a mask leaves open lots of extra bits for "site-specific" use, where hard-coding booleans does not, and since the executable is a binary it instantly becomes a huge problem for everyone but me. It does appear, however, that a bitfield doesn't evaluate any differently than does an integer used with a mask, so there you have it..... it is what it is, and if I want this sort of selectivity in the search I have no choice. -- Karl
<<attachment: karl.vcf>>
-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance