On Mon, 2006-09-11 at 06:20 -0700, Say42 wrote:
> I intend to play with some optimizer aspects. Just for fun.

Cool. If you think it's fun (it is), you're halfway there.

> I'm a novice in DBMS development, so I can't promise any usable
> results, but I will try, even if it ends up as yet another failed
> attempt.

This type of work is 90% analysis, 10% coding. You'll need to do a lot
of investigation, and lots of discussion and listening.

> That's what I want to do:
> 1. Replace the not-very-useful indexCorrelation with indexClustering.

An opinion such as "not very useful" isn't considered sufficient
explanation or justification for a change around here.

> 2. Consider caching of the inner table in a nested loop join when
> estimating the total cost of the join.
>
> More details:
> 1. During ANALYZE we have sample rows. For every N-th sample row we
> can scan the indexes on a qual like 'value >= index_first_column' and
> fetch the first N row TIDs. Estimating the number of fetched heap
> pages is not hard; to get the index clustering value, just divide the
> page count by the sample row count.
> 2. This one is much harder, and may be beyond me altogether. The main
> ideas:
> - Split the page-fetch cost and the CPU cost into separate variables,
> and don't add them together before join estimation.
> - Final path cost estimation should be done in the join cost
> estimation, taking into account the number of inner-table accesses
> (= K). CPU cost is directly proportional to K, but page fetches can
> be estimated by the Mackert and Lohman formula using the total tuple
> count (K * inner_table_selectivity * inner_table_total_tuples).
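To make (1) concrete: once the first N TIDs have been fetched starting
from a sample key, the proposed clustering value is just the fraction
of distinct heap pages among their block numbers. Here is a minimal
standalone sketch of that arithmetic; names like clustering_estimate
are made up for illustration and are not existing PostgreSQL code:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned int BlockNumber;

static int
block_cmp(const void *a, const void *b)
{
    BlockNumber ba = *(const BlockNumber *) a;
    BlockNumber bb = *(const BlockNumber *) b;

    return (ba > bb) - (ba < bb);
}

/*
 * Fraction of distinct heap pages among the block numbers of the
 * fetched TIDs.  A well-clustered index revisits the same few pages
 * (value near 1/tuples_per_page); a random one touches a new page for
 * almost every TID (value near 1.0).
 */
static double
clustering_estimate(BlockNumber *blocks, int ntids)
{
    int         distinct = 0;
    int         i;

    if (ntids <= 0)
        return 0.0;

    qsort(blocks, ntids, sizeof(BlockNumber), block_cmp);
    for (i = 0; i < ntids; i++)
        if (i == 0 || blocks[i] != blocks[i - 1])
            distinct++;

    return (double) distinct / ntids;
}

int
main(void)
{
    /* 8 TIDs on 2 pages vs. 8 TIDs on 8 pages */
    BlockNumber clustered[] = {10, 10, 10, 10, 11, 11, 11, 11};
    BlockNumber scattered[] = {3, 97, 12, 55, 41, 88, 7, 64};

    printf("clustered: %.3f\n", clustering_estimate(clustered, 8)); /* 0.250 */
    printf("scattered: %.3f\n", clustering_estimate(scattered, 8)); /* 1.000 */
    return 0;
}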
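For (2), the Mackert and Lohman approximation is the same one
costsize.c already uses to turn a tuple-fetch count into a page-fetch
count, given the table size T in pages and an assumed cache size b in
pages (in PostgreSQL, b comes from effective_cache_size). A simplified
standalone sketch of the formula; pages_fetched_ml is a made-up name,
and the real code also rounds up with ceil():

#include <stdio.h>

/*
 * Mackert & Lohman estimate of page fetches when retrieving
 * tuples_fetched tuples from a table of T pages through a cache of b
 * pages.  This counts fetch operations, so once the cache starts
 * thrashing (T > b with many tuples) the result can exceed T.
 */
static double
pages_fetched_ml(double tuples_fetched, double T, double b)
{
    double      pages;

    if (T <= b)
    {
        /* whole table fits in cache: result saturates at T */
        pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        if (pages > T)
            pages = T;
    }
    else
    {
        /* cache smaller than table: linear growth past the limit */
        double      lim = (2.0 * T * b) / (2.0 * T - b);

        if (tuples_fetched <= lim)
            pages = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        else
            pages = b + (tuples_fetched - lim) * (T - b) / T;
    }
    return pages;
}

int
main(void)
{
    /* total tuples over the whole join, per the formula quoted above */
    double      K = 1000.0;          /* number of inner-table accesses */
    double      sel = 0.001;         /* inner_table_selectivity */
    double      ntuples = 100000.0;  /* inner_table_total_tuples */
    double      tuples_fetched = K * sel * ntuples;

    printf("pages fetched: %.0f\n",
           pages_fetched_ml(tuples_fetched, 5000.0, 2000.0));
    return 0;
}

The point of the proposal, as I read it, is that this formula would be
applied once over all K inner scans rather than K times over a single
scan, which is what lets repeat visits to cached pages show up in the
estimate.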
I'd work on one thing at a time and go into it deeply.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend