Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-09-23 Thread Michael Paquier
On Mon, Aug 31, 2020 at 03:13:06PM -0700, Melanie Plageman wrote: > Attached is the current version of adaptive hash join with two > significant changes as compared to v10: The CF bot is complaining about a regression test failure: @@ -2465,7 +2465,7 @@ Gather (actual rows=469 loops=1)

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-06-25 Thread Tomas Vondra
On Thu, Jun 25, 2020 at 03:09:44PM -0700, Melanie Plageman wrote: On Tue, Jun 23, 2020 at 3:24 PM Tomas Vondra wrote: I started looking at the patch to refresh my knowledge both of this patch and parallel hash join, but I think it needs a rebase. The changes in 7897e3bb90 apparently touched

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-06-24 Thread Jesse Zhang
Hi Tomas, On Tue, Jun 23, 2020 at 3:24 PM Tomas Vondra wrote: > > Now, a couple comments / questions about the code. > > > nodeHash.c > -- > > > 1) MultiExecPrivateHash says this > >/* > * Not subject to skew optimization, so either insert normally > * or save to batch file if

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-06-23 Thread Tomas Vondra
On Mon, Jun 08, 2020 at 05:12:25PM -0700, Melanie Plageman wrote: On Wed, May 27, 2020 at 7:25 PM Melanie Plageman wrote: I've attached a rebased patch which includes the "provisionally detach" deadlock hazard fix approach Alas, the "provisional detach" logic proved incorrect (see last

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-05-04 Thread David Kimura
On Wed, Apr 29, 2020 at 4:44 PM David Kimura wrote: > > Following patch adds logic to create a batch 0 file for serial hash join so > that even in pathalogical case we do not need to exceed work_mem. Updated the patch to spill batch 0 tuples after it is marked as fallback. A couple questions

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-04-30 Thread Thomas Munro
On Fri, May 1, 2020 at 2:30 AM Melanie Plageman wrote: > I made a few edits to the message and threw it into a draft patch (on > top of master, of course). I didn't want to junk up peoples' inboxes, so > I didn't start a separate thread, but, it will be pretty hard to > collaboratively edit the

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-04-30 Thread Alvaro Herrera
On 2020-Apr-30, Melanie Plageman wrote: > On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas wrote: > > I haven't looked at the patch in detail, but thanks [...] > Thanks for taking a look, Heikki! Hmm. We don't have Heikki's message in the archives. In fact, the last message from Heikki

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-04-30 Thread Melanie Plageman
On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas wrote: > On 29/04/2020 05:03, Melanie Plageman wrote: > > I've attached a patch which should address some of the previous feedback > > about code complexity. Two of my co-workers and I wrote what is > > essentially a new prototype of the idea.

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-04-30 Thread David Kimura
On Wed, Apr 29, 2020 at 4:39 PM Melanie Plageman wrote: > > In addition to many assorted TODOs in the code, there are a few major > projects left: > - Batch 0 falling back > - Stripe barrier deadlock > - Performance improvements and testing > Batch 0 never spills. That behavior is an artifact

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-01-09 Thread Melanie Plageman
On Tue, Jan 7, 2020 at 4:14 PM Thomas Munro wrote: > * I am uneasy about BarrierArriveExplicitAndWait() (a variant of > BarrierArriveAndWait() that lets you skip directly to a given phase?); > perhaps you only needed that for a circular phase system, which you > could do with modular phase

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-01-07 Thread Thomas Munro
On Mon, Dec 30, 2019 at 4:34 PM Melanie Plageman wrote: > So, I finally have a prototype to share of parallel hashloop fallback. Hi Melanie, Thanks for all your continued work on this! I started looking at it today; it's a difficult project and I think it'll take me a while to grok. I do have

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-09-10 Thread Tomas Vondra
On Fri, Sep 06, 2019 at 10:54:13AM -0700, Melanie Plageman wrote: On Thu, Sep 5, 2019 at 10:35 PM Thomas Munro wrote: Seems like a good time for me to try to summarise what I think the main problems are here: 1. The match-bit storage problem already discussed. The tuples that each process

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-09-06 Thread Melanie Plageman
On Thu, Sep 5, 2019 at 10:35 PM Thomas Munro wrote: > Seems like a good time for me to try to summarise what I think the > main problems are here: > > 1. The match-bit storage problem already discussed. The tuples that > each process receives while reading from SharedTupleStore are >

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-09-05 Thread Thomas Munro
On Wed, Jul 31, 2019 at 6:47 AM Melanie Plageman wrote: > So, I've rewritten the patch to use a BufFile for the outer table > batch file tuples' match statuses and write bytes to and from the file > which start as 0 and, upon encountering a match for a tuple, I set its > bit in the file to 1

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Peter Geoghegan
On Tue, Jul 30, 2019 at 8:07 PM Melanie Plageman wrote: > For the actual write to disk, I'm pretty sure I get that for free from > the BufFile API, no? > I was more thinking about optimizing when I call BufFileWrite at all. Right. Clearly several existing buffile.c users regularly have very

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Melanie Plageman
On Tue, Jul 30, 2019 at 4:36 PM Robert Haas wrote: > On Tue, Jul 30, 2019 at 2:47 PM Melanie Plageman > wrote: > > I did the "needlessly dumb implementation" Robert mentioned, though, > > I thought about it and couldn't come up with a much smarter way to > > write match bits to a file. I think

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Robert Haas
On Tue, Jul 30, 2019 at 2:47 PM Melanie Plageman wrote: > I did the "needlessly dumb implementation" Robert mentioned, though, > I thought about it and couldn't come up with a much smarter way to > write match bits to a file. I think there might be an optimization > opportunity in not writing the

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-30 Thread Melanie Plageman
So, I've rewritten the patch to use a BufFile for the outer table batch file tuples' match statuses and write bytes to and from the file which start as 0 and, upon encountering a match for a tuple, I set its bit in the file to 1 (also rebased with current master). It, of course, only works for

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-13 Thread Tomas Vondra
On Wed, Jul 03, 2019 at 02:22:09PM -0700, Melanie Plageman wrote: On Tue, Jun 18, 2019 at 3:24 PM Melanie Plageman wrote: These questions will probably make a lot more sense with corresponding code, so I will follow up with the second version of the state machine patch once I finish it. I

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-07-03 Thread Melanie Plageman
On Tue, Jun 18, 2019 at 3:24 PM Melanie Plageman wrote: > > These questions will probably make a lot more sense with corresponding > code, so I will follow up with the second version of the state machine > patch once I finish it. > > I have changed the state machine and resolved the questions I

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-18 Thread Melanie Plageman
On Thu, Jun 13, 2019 at 7:10 AM Robert Haas wrote: > On Tue, Jun 11, 2019 at 2:35 PM Melanie Plageman > wrote: > > Let me try to articulate what I think the bitmap implementation would > look > > like: > > > > Before doing chunked hashloop join for any batch, we would need to > > know how many

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-13 Thread Robert Haas
On Tue, Jun 11, 2019 at 2:35 PM Melanie Plageman wrote: > Let me try to articulate what I think the bitmap implementation would look > like: > > Before doing chunked hashloop join for any batch, we would need to > know how many tuples are in the outer batch to make the bitmap the > correct size.

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-11 Thread Melanie Plageman
On Fri, Jun 7, 2019 at 7:30 AM Robert Haas wrote: > On Thu, Jun 6, 2019 at 7:31 PM Melanie Plageman > wrote: > > I'm not sure I understand why you would need to compare the original > > tuples to the unmatched tuples file. > > I think I was confused. Actually, I'm still not sure I understand

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-07 Thread Robert Haas
On Fri, Jun 7, 2019 at 10:17 AM Tomas Vondra wrote: > Yes, they could get quite big, and I think you're right we need to > keep that in mind, because it's on the outer (often quite large) side of > the join. And if we're aiming to restrict memory usage, it'd be weird to > just ignore this. > >

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-07 Thread Robert Haas
On Thu, Jun 6, 2019 at 7:31 PM Melanie Plageman wrote: > I'm not sure I understand why you would need to compare the original > tuples to the unmatched tuples file. I think I was confused. Actually, I'm still not sure I understand this part: > Then, you iterate again through the outer side a

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-07 Thread Tomas Vondra
On Thu, Jun 06, 2019 at 04:33:31PM -0700, Melanie Plageman wrote: On Tue, Jun 4, 2019 at 12:15 PM Robert Haas wrote: On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman wrote: > One question I have is, how would the OR'd together bitmap be > propagated to workers after the first chunk? That is,

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-07 Thread Tomas Vondra
On Thu, Jun 06, 2019 at 04:37:19PM -0700, Melanie Plageman wrote: On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote: Admittedly I don't have a patch, just a bunch of handwaving. One reason I haven't attempted to write it is because although I know how to do the non-parallel version using a

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-06 Thread Melanie Plageman
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote: > Admittedly I don't have a patch, just a bunch of handwaving. One > reason I haven't attempted to write it is because although I know how > to do the non-parallel version using a BufFile full of match bits in > sync with the tuples for outer

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-06 Thread Melanie Plageman
On Tue, Jun 4, 2019 at 12:15 PM Robert Haas wrote: > On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman > wrote: > > One question I have is, how would the OR'd together bitmap be > > propagated to workers after the first chunk? That is, when there are > > no tuples left in the outer bunch, for a

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-06 Thread Melanie Plageman
On Tue, Jun 4, 2019 at 12:08 PM Robert Haas wrote: > On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman > wrote: > > Oops! You are totally right. > > I will amend the idea: > > For each chunk on the inner side, loop through both the original batch > > file and the unmatched outer tuples file

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Tomas Vondra
On Tue, Jun 04, 2019 at 03:08:24PM -0400, Robert Haas wrote: On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman wrote: Oops! You are totally right. I will amend the idea: For each chunk on the inner side, loop through both the original batch file and the unmatched outer tuples file created for

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Robert Haas
On Tue, Jun 4, 2019 at 3:09 PM Melanie Plageman wrote: > One question I have is, how would the OR'd together bitmap be > propagated to workers after the first chunk? That is, when there are > no tuples left in the outer bunch, for a given inner chunk, would you > load the bitmaps from each worker

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Melanie Plageman
On Tue, Jun 4, 2019 at 6:05 AM Robert Haas wrote: > On Sun, May 19, 2019 at 7:07 PM Thomas Munro > wrote: > > Unfortunately that bits-in-order scheme doesn't work for parallel > > hash, where the SharedTuplestore tuples seen by each worker are > > non-deterministic. So perhaps in that case we

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Robert Haas
On Tue, Jun 4, 2019 at 2:47 PM Melanie Plageman wrote: > Oops! You are totally right. > I will amend the idea: > For each chunk on the inner side, loop through both the original batch > file and the unmatched outer tuples file created for the last chunk. > Emit any matches and write out any

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Melanie Plageman
On Tue, Jun 4, 2019 at 5:43 AM Robert Haas wrote: > On Mon, Jun 3, 2019 at 5:10 PM Melanie Plageman > wrote: > > I was talking to Jeff Davis about this on Saturday, and, he felt that > > there might be a way to solve the problem differently if we thought of > > the left join case as performing

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Robert Haas
On Sun, May 19, 2019 at 7:07 PM Thomas Munro wrote: > Unfortunately that bits-in-order scheme doesn't work for parallel > hash, where the SharedTuplestore tuples seen by each worker are > non-deterministic. So perhaps in that case we could use the > HEAP_TUPLE_HAS_MATCH bit in the outer tuple

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-04 Thread Robert Haas
On Mon, Jun 3, 2019 at 5:10 PM Melanie Plageman wrote: > I was talking to Jeff Davis about this on Saturday, and, he felt that > there might be a way to solve the problem differently if we thought of > the left join case as performing an inner join and an antijoin > instead. > > Riffing on this

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-06-03 Thread Melanie Plageman
On Sun, May 19, 2019 at 4:07 PM Thomas Munro wrote: > On Sat, May 18, 2019 at 12:15 PM Melanie Plageman > wrote: > > On Thu, May 16, 2019 at 3:22 PM Thomas Munro > wrote: > >> Admittedly I don't have a patch, just a bunch of handwaving. One > >> reason I haven't attempted to write it is

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-20 Thread Melanie Plageman
On Sun, May 19, 2019 at 4:07 PM Thomas Munro wrote: > On Sat, May 18, 2019 at 12:15 PM Melanie Plageman > wrote: > > On Thu, May 16, 2019 at 3:22 PM Thomas Munro > wrote: > >> Admittedly I don't have a patch, just a bunch of handwaving. One > >> reason I haven't attempted to write it is

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-20 Thread Tomas Vondra
On Mon, May 20, 2019 at 01:25:52PM +1200, Thomas Munro wrote: On Mon, May 20, 2019 at 12:22 PM Tomas Vondra wrote: On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote: >First let me restate the PostgreSQL terminology for this stuff so I >don't get confused while talking about it: > >*

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-19 Thread Andres Freund
Hi, On 2019-05-20 13:25:52 +1200, Thomas Munro wrote: > In PostgreSQL, it's always inner = right, outer = left. You can see > that reflected in plannodes.h and elsewhere: > > /* > * these are defined to avoid confusion problems with "left" > * and "right" and

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-19 Thread Thomas Munro
On Mon, May 20, 2019 at 12:22 PM Tomas Vondra wrote: > On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote: > >First let me restate the PostgreSQL terminology for this stuff so I > >don't get confused while talking about it: > > > >* The inner side of the join = the right side = the side

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-19 Thread Tomas Vondra
On Mon, May 20, 2019 at 11:07:03AM +1200, Thomas Munro wrote: On Sat, May 18, 2019 at 12:15 PM Melanie Plageman wrote: On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote: Admittedly I don't have a patch, just a bunch of handwaving. One reason I haven't attempted to write it is because

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-19 Thread Thomas Munro
On Sat, May 18, 2019 at 12:15 PM Melanie Plageman wrote: > On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote: >> Admittedly I don't have a patch, just a bunch of handwaving. One >> reason I haven't attempted to write it is because although I know how >> to do the non-parallel version using a

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-17 Thread Melanie Plageman
On Thu, May 16, 2019 at 3:22 PM Thomas Munro wrote: > Admittedly I don't have a patch, just a bunch of handwaving. One > reason I haven't attempted to write it is because although I know how > to do the non-parallel version using a BufFile full of match bits in > sync with the tuples for outer

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Thomas Munro
On Fri, May 17, 2019 at 11:46 AM Tomas Vondra wrote: > On Thu, May 16, 2019 at 06:58:43PM -0400, Tom Lane wrote: > >Thomas Munro writes: > >> On Fri, May 17, 2019 at 4:39 AM Tomas Vondra > >> wrote: > >>> I kinda like the idea with increasing the spaceAllowed value. Essentially, > >>> if we

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Tomas Vondra
On Thu, May 16, 2019 at 06:58:43PM -0400, Tom Lane wrote: Thomas Munro writes: On Fri, May 17, 2019 at 4:39 AM Tomas Vondra wrote: I kinda like the idea with increasing the spaceAllowed value. Essentially, if we decide adding batches would be pointless, increasing the memory budget is the

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Tom Lane
Thomas Munro writes: > On Fri, May 17, 2019 at 4:39 AM Tomas Vondra > wrote: >> I kinda like the idea with increasing the spaceAllowed value. Essentially, >> if we decide adding batches would be pointless, increasing the memory >> budget is the only thing we can do anyway. > But that's not OK,

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Tomas Vondra
On Fri, May 17, 2019 at 10:21:56AM +1200, Thomas Munro wrote: On Fri, May 17, 2019 at 4:39 AM Tomas Vondra wrote: I think this is a step in the right direction, but as I said on the other thread(s), I think we should not disable growth forever and recheck once in a while. Otherwise we'll end

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Thomas Munro
On Fri, May 17, 2019 at 4:39 AM Tomas Vondra wrote: > I think this is a step in the right direction, but as I said on the other > thread(s), I think we should not disable growth forever and recheck once > in a while. Otherwise we'll end up in sad situation with non-uniform data > sets, as poined

Re: Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-16 Thread Tomas Vondra
On Thu, May 16, 2019 at 01:22:31PM +1200, Thomas Munro wrote: ... Here's a quick hack to show that a 95% cut-off fixes those examples. I don't really know how to choose the number, but I suspect it should be much closer to 100 than 50. I think this is the easiest of three fundamental problems

Avoiding hash join batch explosions with extreme skew and weird stats

2019-05-15 Thread Thomas Munro
Hello, As discussed elsewhere[1][2], our algorithm for deciding when to give up on repartitioning (AKA increasing the number of batches) tends to keep going until it has a number of batches that is a function of the number of distinct well distributed keys. I wanted to move this minor issue away