On Mon, Nov 9, 2015 at 2:42 AM, Simon Riggs <si...@2ndquadrant.com> wrote:

> On 9 November 2015 at 10:08, <kawami...@tkl.iis.u-tokyo.ac.jp> wrote:
>>
>> We guessed the cause of this error would be in the cost model of
>> Postgres, investigated the source code of the optimizer, and found the
>> cause of the problem: it is in the index cost estimation process. When
>> scanning the inner table, if the loop count is greater than 1, the I/O
>> cost is charged entirely as random access. In the case of Query2, each
>> loop (i.e. each inner table scan) reads much of its data sequentially
>> via the clustered index, so it seems wrong for the optimizer to treat
>> that I/O workload as random access.
>
> We don't have a clustered index in Postgres. We do store correlation
> stats for the index, which are used in some places to reduce cost.
>
> Could you look some more at this and see if a model change that uses
> the correlation can improve this?
That is already happening: min_IO_cost is computed on the assumption of
perfect correlation, max_IO_cost on the assumption of no correlation, and
the code then interpolates between the two based on the observed
correlation:

    run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);

But min_IO_cost is very pessimistic in this particular case: once
loop_count > 1, even the perfect-correlation estimate charges
spc_random_page_cost for every page fetched, so the interpolation has
nothing to work with. So I think their proposed change is already exactly
what you are asking them to try.

Cheers,

Jeff
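P.S. To make the pessimism concrete, here is a tiny standalone C model of
the relevant branch of cost_index() (src/backend/optimizer/path/costsize.c).
Variable names mirror the real code, but this is a sketch, not the actual
source: fetched_pages() is a crude stand-in of my own for the real
index_pages_fetched() (the Mackert-Lohman approximation), and the input
numbers are made up.

/*
 * Toy model of the I/O cost logic in cost_index().  Names mirror
 * costsize.c, but this is illustrative, not the real code.
 */
#include <math.h>
#include <stdio.h>

/* crude stand-in for index_pages_fetched(): cap fetches at the heap size */
static double
fetched_pages(double tuples_fetched, double rel_pages)
{
    return fmin(tuples_fetched, rel_pages);
}

int
main(void)
{
    double spc_random_page_cost = 4.0;  /* default random_page_cost */
    double spc_seq_page_cost = 1.0;     /* default seq_page_cost */
    double rel_pages = 10000;           /* made-up heap size of inner table */
    double tuples_fetched = 5000;       /* made-up per-scan estimate */
    double indexSelectivity = 0.05;
    double indexCorrelation = 1.0;      /* perfectly clustered heap */
    double loop_count = 100;            /* rescans driven by the outer side */
    double pages_fetched, min_IO_cost, max_IO_cost, csquared, io_cost;

    if (loop_count > 1)
    {
        /*
         * Repeated-scan branch: both extremes charge random_page_cost for
         * every page, so a correlation of 1.0 earns no sequential-read
         * discount -- the pessimism discussed above.
         */
        pages_fetched = fetched_pages(tuples_fetched * loop_count, rel_pages);
        max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;

        pages_fetched = ceil(indexSelectivity * rel_pages);
        pages_fetched = fetched_pages(pages_fetched * loop_count, rel_pages);
        min_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
    }
    else
    {
        /*
         * Single-scan branch: the best case is one random fetch followed
         * by sequential reads, which is what a clustered heap gives you.
         */
        pages_fetched = fetched_pages(tuples_fetched, rel_pages);
        max_IO_cost = pages_fetched * spc_random_page_cost;

        pages_fetched = ceil(indexSelectivity * rel_pages);
        min_IO_cost = spc_random_page_cost;
        if (pages_fetched > 1)
            min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
    }

    /* interpolate between the extremes on the squared correlation */
    csquared = indexCorrelation * indexCorrelation;
    io_cost = max_IO_cost + csquared * (min_IO_cost - max_IO_cost);

    printf("min=%.1f max=%.1f interpolated=%.1f\n",
           min_IO_cost, max_IO_cost, io_cost);
    return 0;
}

With those made-up numbers, min_IO_cost and max_IO_cost both come out to
400, so the interpolation is a no-op and a correlation of 1.0 changes
nothing; in the loop_count == 1 branch the same correlation would collapse
the cost toward the mostly-sequential estimate.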