On 20.1.2011 03:36, Robert Haas wrote:
> On Wed, Jan 19, 2011 at 5:13 PM, Tomas Vondra <t...@fuzzy.cz> wrote:
>>>> Regarding the crash scenario - if the commit fails, just throw away the
>>>> local estimator copy, it's not needed. I'm not sure how to take care of
>>>> the case when commit succeeds and the write of the merged estimator
>>>> fails, but I think it might be possible to write the estimator to xlog
>>>> or something like that. So it would be replayed during recovery etc. Or
>>>> is it a stupid idea?
>>>
>>> It's not stupid, in the sense that that is what you'd need to do if
>>> you want to avoid ever having to rescan the table. But it is another
>>> thing that I think is going to be way too expensive.
>>
>> Way too expensive? All you need to put into the logfile is a copy of the
>> estimator, which is a few kBs. How is that 'way too expensive'?
>
> At this point, this is all a matter of religion, right? Neither of us
> has a working implementation we've benchmarked. But yes, I believe
> you're going to find that implementing some kind of streaming
> estimator is going to impose a... <pulls number out of rear end> 6%
> performance penalty, even after you've optimized the living daylights
> out of it. Now you might say... big deal, it improves my problem
> queries by 100x. OK, but if you could get the same benefit by doing
> an occasional full table scan during off hours, you could have the
> same performance with a *0%* performance penalty. Even better, the
> code changes would be confined to ANALYZE rather than spread out all
> over the system, which has positive implications for robustness and
> likelihood of commit.
Good point.

What I was trying to do was to continuously update the estimator with new
data - that was the whole idea behind collecting the new values (which
might lead to problems with memory etc., as you've pointed out) and
updating a transaction-local copy of the estimator (which I think is a
good idea).

But this might be another option - let the user decide whether to update
the estimates continuously (and pay the price) or to do it during off
hours (and pay almost nothing). That sounds like a very good solution to
me.

> I'm not trying to argue you out of working on this. It's obviously
> your time to spend, and if it works better than I think it will, great!
> I'm merely offering you an opinion on what will probably happen if you
> go this route - namely, it'll carry an unpalatable run-time penalty.
> That opinion may be worth no more than what you paid for it, but there
> you have it.

Yes, and I appreciate all the feedback. But I still believe this can be
done so that users who don't need the feature don't pay for it.

regards
Tomas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers