Re: pgsql: Add parallel-aware hash joins.

2018-01-24 Thread Tom Lane
Andres Freund writes: > On 2018-01-22 23:17:47 +1300, Thomas Munro wrote: >> Here is a patch that halves the size of the test tables used. >> ... Does this pass repeatedly on gaur? > I'd say, let's just commit it and see? Oh, sorry, I forgot I was on the hook to check that. The news isn't good:

Re: pgsql: Add parallel-aware hash joins.

2018-01-24 Thread Andres Freund
Hi, On 2018-01-22 23:17:47 +1300, Thomas Munro wrote: > On Thu, Dec 28, 2017 at 5:26 PM, Tom Lane wrote: > Here is a patch that halves the size of the test tables used. I don't > want them to be too small because I want some real parallel > processing at least sometimes. On my slowest syst

Re: pgsql: Add parallel-aware hash joins.

2018-01-22 Thread Tom Lane
I wrote: > Anyway, it looks like we should write off these longfin timings as > ambient noise :-( Here's a possibly more useful graph of regression test timings over the last year. I pulled this from the buildfarm database: it is the reported runtime for the "installcheck-C" step in each successf

Re: pgsql: Add parallel-aware hash joins.

2018-01-22 Thread Tom Lane
Thomas Munro writes: > On Mon, Jan 22, 2018 at 11:17 PM, Thomas Munro > wrote: >> It looks to me like longfin's average and >> variation increased half way between fa330f9a and 18042840, somewhere >> near Dec 7 to 9, when we went from ~40s +/- 1 to ~50s with several >> seconds' variation. Was th

Re: pgsql: Add parallel-aware hash joins.

2018-01-22 Thread Thomas Munro
On Mon, Jan 22, 2018 at 11:17 PM, Thomas Munro wrote: > It looks to me like longfin's average and > variation increased half way between fa330f9a and 18042840, somewhere > near Dec 7 to 9, when we went from ~40s +/- 1 to ~50s with several > seconds' variation. Was there some other environmental c

Re: pgsql: Add parallel-aware hash joins.

2018-01-22 Thread Thomas Munro
On Thu, Dec 28, 2017 at 5:26 PM, Tom Lane wrote: > Thomas Munro writes: >> On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane wrote: >>> Aside from the instability problems, I'm pretty unhappy about how much >>> the PHJ patch has added to the runtime of "make check". I do not think >>> any one feature c

Re: pgsql: Add parallel-aware hash joins.

2018-01-04 Thread Thomas Munro
On Fri, Jan 5, 2018 at 5:00 AM, Tom Lane wrote: > The early returns indicate that that problem is fixed; Thanks for your help and patience with that. I've made a list over here so we don't lose track of the various things that should be improved in this area, and will start a new thread when I h

Re: pgsql: Add parallel-aware hash joins.

2018-01-04 Thread Tom Lane
I wrote: > I ran a couple dozen test cycles on gaur without a failure. That's > not enough to really prove anything, but it's more successes than I was > getting before. I pushed the patch so we can see what the rest of the > buildfarm thinks. The early returns indicate that that problem is fixe

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Tom Lane
Thomas Munro writes: > I spent a lot of time trying and failing to get the world's slowest 32 > bit powerpc emulation to reproduce this. Bleugh. Before we rip that > test out, would you mind checking if this passes repeatedly on gaur or > pademelon? I ran a couple dozen test cycles on gaur with

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Tom Lane
Thomas Munro writes: > I spent a lot of time trying and failing to get the world's slowest 32 > bit powerpc emulation to reproduce this. Bleugh. Before we rip that > test out, would you mind checking if this passes repeatedly on gaur or > pademelon? Will do, but that machine is none too fast it

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Thomas Munro
On Thu, Jan 4, 2018 at 2:17 PM, Tom Lane wrote: > So this patch has been in place for two weeks, the buildfarm has still > got the measles, and we're entering the January commitfest with a lot > of other work to get done. I realize that the two weeks were mostly > holiday time, but it's time to h

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Andres Freund
Hi, On 2018-01-03 20:17:04 -0500, Tom Lane wrote: > So this patch has been in place for two weeks, the buildfarm has still > got the measles, and we're entering the January commitfest with a lot > of other work to get done. I realize that the two weeks were mostly > holiday time, but it's time to

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Tom Lane
So this patch has been in place for two weeks, the buildfarm has still got the measles, and we're entering the January commitfest with a lot of other work to get done. I realize that the two weeks were mostly holiday time, but it's time to have some urgency about clearing the buildfarm failures.

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Tom Lane
Thomas Munro writes: > On Wed, Jan 3, 2018 at 2:38 PM, Tom Lane wrote: >> Hm. That could do it, except it doesn't really account for the observed >> result that slower single-processor machines seem more prone to the >> bug. Surely they should be less likely to get multiple workers activated.

Re: pgsql: Add parallel-aware hash joins.

2018-01-03 Thread Thomas Munro
On Wed, Jan 3, 2018 at 2:38 PM, Tom Lane wrote: > Thomas Munro writes: >> I mean that ExecChooseHashTableSize() estimates the hash table size like >> this: >> inner_rel_bytes = ntuples * tupsize; > >> ... but then at execution time, in the Parallel Hash case, we do >> memory accounting not i

Re: pgsql: Add parallel-aware hash joins.

2018-01-02 Thread Tom Lane
Thomas Munro writes: > On Sun, Dec 31, 2017 at 1:00 PM, Tom Lane wrote: >> "Size estimation error"? Why do you think it's that? We have exactly >> the same plan in both cases. > I mean that ExecChooseHashTableSize() estimates the hash table size like this: > inner_rel_bytes = ntuples * tup

Re: pgsql: Add parallel-aware hash joins.

2018-01-02 Thread Thomas Munro
On Sun, Dec 31, 2017 at 1:00 PM, Tom Lane wrote: >> Right. That's apparently unrelated and is the last build-farm issue >> on my list (so far). I had noticed that certain BF animals are prone >> to that particular failure, and they mostly have architectures that I >> don't have so a few things a

Re: pgsql: Add parallel-aware hash joins.

2018-01-01 Thread Thomas Munro
On Tue, Jan 2, 2018 at 11:42 AM, Andres Freund wrote: > Pushed your updated version. Thanks. That should leave just the test failures like this on certain machines: *** 6103,6109 $$); initially_multibatch | increased_batches --+--- ! t

Re: pgsql: Add parallel-aware hash joins.

2018-01-01 Thread Andres Freund
On 2017-12-31 10:59:26 +1300, Thomas Munro wrote: > On Sun, Dec 31, 2017 at 5:16 AM, Andres Freund wrote: > >> In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and > >> size > >> information. Refactor so that participants report only on batches they > >> worked > >> on rathe

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Tom Lane
Thomas Munro writes: > On Sun, Dec 31, 2017 at 11:34 AM, Tom Lane wrote: >> ... This isn't quite 100% reproducible on gaur/pademelon, >> but it fails more often than not seems like, so I can poke into it >> if you can say what info would be helpful. > Right. That's apparently unrelated and is t

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Thomas Munro
On Sun, Dec 31, 2017 at 11:34 AM, Tom Lane wrote: > Thomas Munro writes: >> You mentioned that prairiedog sees the problem about one time in >> thirty. Would you mind checking if it goes away with this patch >> applied? > > I've run 55 cycles of "make installcheck" without seeing a failure > wit

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Tom Lane
Thomas Munro writes: >> This is explained by the early exit case in >> ExecParallelHashEnsureBatchAccessors(). With just the right timing, >> it finishes up not reporting the true nbatch number, and never calling >> ExecParallelHashUpdateSpacePeak(). > Hi Tom, > You mentioned that prairiedog se

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Thomas Munro
On Sun, Dec 31, 2017 at 5:16 AM, Andres Freund wrote: >> In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and size >> information. Refactor so that participants report only on batches they >> worked >> on rather than trying to report on all of them, and teach explain.c to >>

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Andres Freund
Hi, On 2017-12-31 02:51:26 +1300, Thomas Munro wrote: > You mentioned that prairiedog sees the problem about one time in > thirty. Would you mind checking if it goes away with this patch > applied? > > -- > Thomas Munro > http://www.enterprisedb.com > From cbed027275039cc5debf8db89342a133a831c

Re: pgsql: Add parallel-aware hash joins.

2017-12-30 Thread Thomas Munro
On Fri, Dec 29, 2017 at 2:21 AM, Thomas Munro wrote: > On Thu, Dec 28, 2017 at 5:15 PM, Thomas Munro > wrote: >> On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane wrote: >>> !Buckets: 1024 (originally 2048) Batches: 1 >>> (originally 1) Memory Usage: 0kB >>> ! Execution t

Re: pgsql: Add parallel-aware hash joins.

2017-12-28 Thread Thomas Munro
On Thu, Dec 28, 2017 at 5:15 PM, Thomas Munro wrote: > On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane wrote: >> !Buckets: 1024 (originally 2048) Batches: 1 >> (originally 1) Memory Usage: 0kB >> ! Execution time: 243.120 ms >> >> I don't have enough insight to be totall

Re: pgsql: Add parallel-aware hash joins.

2017-12-27 Thread Tom Lane
Thomas Munro writes: > On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane wrote: >> Aside from the instability problems, I'm pretty unhappy about how much >> the PHJ patch has added to the runtime of "make check". I do not think >> any one feature can justify adding 20% to that. Can't you cut down the >

Re: pgsql: Add parallel-aware hash joins.

2017-12-27 Thread Thomas Munro
On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane wrote: > Thomas Munro writes: >> I'll address the instability of the regression test output separately. > > If you're still looking for data on that --- prairiedog is able to > reproduce the "multibatch = f" variant about one time in thirty. > I modified

Re: pgsql: Add parallel-aware hash joins.

2017-12-27 Thread Tom Lane
Thomas Munro writes: > I'll address the instability of the regression test output separately. If you're still looking for data on that --- prairiedog is able to reproduce the "multibatch = f" variant about one time in thirty. I modified the test case to print out the full EXPLAIN ANALYZE output r

Re: pgsql: Add parallel-aware hash joins.

2017-12-26 Thread Thomas Munro
On Fri, Dec 22, 2017 at 9:22 PM, Andres Freund wrote: > On 2017-12-22 21:16:10 +1300, Thomas Munro wrote: >> Andres, your machine francolin crashed -- got a core file? > > Unfortunately not - it appears the buildfarm cleared it away :( I now have a workload that fails within a few minutes or so o

Re: pgsql: Add parallel-aware hash joins.

2017-12-22 Thread Andres Freund
Hi, On 2017-12-22 21:16:10 +1300, Thomas Munro wrote: > Andres, your machine francolin crashed -- got a core file? Unfortunately not - it appears the buildfarm cleared it away :( Might try to reproduce it on that machine... Greetings, Andres Freund

Re: pgsql: Add parallel-aware hash joins.

2017-12-22 Thread Thomas Munro
On Fri, Dec 22, 2017 at 1:48 AM, Thomas Munro wrote: > I don't think that's quite it, because it should never have set > 'writing' for any batch number >= nbatch. > > It's late here, but I'll take this up tomorrow and either find a fix > or figure out how to avoid antisocial noise levels on the bu

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Tom Lane
Andres Freund writes: > On 2017-12-21 08:49:46 +, Andres Freund wrote: >> Add parallel-aware hash joins. > There are two relatively mundane failures: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2017-12-21%2008%3A48%3A12 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Thomas Munro
On Thu, Dec 21, 2017 at 10:55 PM, Andres Freund wrote: > Thomas, I wonder if the problem is that PHJ_GROW_BATCHES_ELECTING > updates, via ExecParallelHashJoinSetUpBatches(), HashJoinTable->nbatch, > while other backends also access ->nbatch in > ExecParallelHashCloseBatchAccessors(). Both happen

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Andres Freund
On 2017-12-21 01:55:50 -0800, Andres Freund wrote: > On 2017-12-21 01:29:40 -0800, Andres Freund wrote: > > On 2017-12-21 08:49:46 +, Andres Freund wrote: > > > Add parallel-aware hash joins. > > > > There are two relatively mundane failures: > > https://buildfarm.postgresql.org/cgi-bin/show_log.p

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Andres Freund
On 2017-12-21 01:29:40 -0800, Andres Freund wrote: > On 2017-12-21 08:49:46 +, Andres Freund wrote: > > Add parallel-aware hash joins. > > There are two relatively mundane failures: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2017-12-21%2008%3A48%3A12 > https://buildfarm.pos

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Thomas Munro
On Thu, Dec 21, 2017 at 10:29 PM, Andres Freund wrote: > On 2017-12-21 08:49:46 +, Andres Freund wrote: >> Add parallel-aware hash joins. > > There are two relatively mundane failures: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2017-12-21%2008%3A48%3A12 > https://buildfarm.

Re: pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Andres Freund
On 2017-12-21 08:49:46 +, Andres Freund wrote: > Add parallel-aware hash joins. There are two relatively mundane failures: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2017-12-21%2008%3A48%3A12 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=termite&dt=2017-12-21%2008%3

pgsql: Add parallel-aware hash joins.

2017-12-21 Thread Andres Freund
Add parallel-aware hash joins. Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash. While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, mea