On 21 December 2012 01:06, Kevin Grittner <kgri...@mail.com> wrote:
> Huan Ruan wrote:
>> Kevin Grittner wrote:
>>> Frankly, at 12 microseconds per matched pair of rows, I think
>>> you're doing OK.
>>
>> This plan is the good one, I want the indexscan nested loop join and
>> this is only achieved after making all these costing factors change.
>> Before that, it was hash join and was very slow.
>>
>> However, I'm worried about the config changes being too 'extreme',
>> i.e. both sequential I/O and random I/O have the same cost and being
>> only 0.1. So, I was more wondering why I have to make such dramatic
>> changes to convince the optimiser to use NL join instead of hash
>> join. And also, I'm not sure what impact these changes will have on
>> other queries yet. e.g. will a query that's fine with hash join now
>> choose NL join and run slower?
>
> I understand the concern, but PostgreSQL doesn't yet have a knob to
> turn for "cache hit ratio". You essentially need to build that into
> the page costs. Since your cache hit ratio (between shared buffers
> and the OS) is so high, the cost of page access relative to CPU
> costs has declined and there isn't any effective difference between
> sequential and random access. As the level of caching changes, you
> may need to adjust. In one production environment where there was
> significant caching, but far enough from 100% to matter, we tested
> various configurations and found the fastest plans being chosen
> with seq_page_cost = 0.3 and random_page_cost = 0.5. Tune to your
> workload.

Thanks Kevin. I think I get some ideas now that I can try on the
production server when we switch.
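For anyone following along: the page costs Kevin mentions can be tried per-session before touching postgresql.conf, which also answers the worry about affecting other queries. A minimal sketch, assuming the values from Kevin's example (the table and column names here are hypothetical, just to show the shape of the test):

```sql
-- Apply the suggested cost constants to this session only, so other
-- connections keep the server defaults while we experiment.
SET seq_page_cost = 0.3;
SET random_page_cost = 0.5;

-- Re-run the problem query and check whether the planner now picks
-- the nested loop join over the hash join.
EXPLAIN ANALYZE
SELECT i.*
FROM invoice i
JOIN customer c ON c.id = i.customer_id
WHERE c.region = 'APAC';

-- Put this session back on the server-wide settings when done.
RESET seq_page_cost;
RESET random_page_cost;
```

If the plans look good across a representative workload, the same values can then be made permanent in postgresql.conf (or per-database via `ALTER DATABASE ... SET`).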