Simon Riggs wrote: > This type of work is 90% analysis, 10% coding. You'll need to do a lot > of investigation, lots of discussion and listening.
I absolutely agree with you and I am not about to rush into coding right now. First of all I'm going to dig a lot in the PG sources, readme's and so on. It's a good school of coding and DBMS internals understanding. > > That's what I want to do: > > 1. Replace not very useful indexCorrelation with indexClustering. > > An opinion such as "not very useful" isn't considered sufficient > explanation or justification for a change around here. Sometimes the indexCorrelation even wrongful. There are many examples of overestimation of index scan cost (data well-clustered but not ordered - correlation is low) and some cases of underestimation when tuples look like well ordered with high degree of correlation, but index scan actually causes random page fetches (1-3-2-4-6-5, for example. On server without RAID it is VERY slow. 25 times slower than bitmap index scan). If we have special clustering measure we can more precisely estimate pages count. The next step could be to introduce 'ordering' as a measure of pages access sequentiality. Without the 'ordering' all we can assume that pages are fetched in random order. Anyhow, if index access cost is overestimated we can set random_page_cost=2. (Is it true in a production database with smart RAID?) Moreover, I think problem is more complex. With assumption that index access is always random we dip in another problem: overestimation of master table index scan. If it is small enough PG can choose seq scan instead of index scan even if the last one actually much cheaper because of caching. That is why caching should be taking into account during joining cost calculation. > > 2. Consider caching of inner table in a nested loops join during > > estimation total cost of the join. > I'd work on one thing at a time and go into it deeply. Good news. So I'm very interested in what you think about my ideas. Is it wrong or too naive? ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend