Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-03-20 Thread Bryce Cutt
The skew buckets are dynamically flushed just like buckets in a dynamic hash join would be. - Bryce Cutt On Fri, Mar 20, 2009 at 5:51 PM, Robert Haas wrote: > On Fri, Mar 20, 2009 at 8:45 PM, Bryce Cutt wrote: >> On Fri, Mar 20, 2009 at 5:35 PM, Robert Haas wrote: >>> If the

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-03-20 Thread Bryce Cutt
way through if too many inner tuples fall into the new "skew buckets" (formerly IM buckets) and dump the tuples back into the main buckets. The potential win is still pretty high though. - Bryce Cutt On Fri, Mar 20, 2009 at 5:35 PM, Robert Haas wrote: > On Fri, Mar 20, 2009 at 8:14
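The fallback described in this message can be sketched roughly as follows. This is only an illustrative Python model, not code from the patch: the function name `build_hash_table`, the `max_skew_tuples` limit, and the dictionary-based buckets are all assumptions made for the example. It routes inner tuples whose join key is a known frequent value into separate skew buckets, and if too many tuples land there it gives up and moves them back into the main buckets.

```python
from collections import defaultdict


def build_hash_table(inner_tuples, skew_keys, max_skew_tuples, key=lambda t: t[0]):
    """Hypothetical sketch of a build phase with skew buckets and a fallback.

    skew_keys       -- join-key values assumed to be frequent on the probe side
    max_skew_tuples -- assumed limit on tuples held in skew buckets
    Returns (main_buckets, skew_buckets, skew_enabled).
    """
    main_buckets = defaultdict(list)
    skew_buckets = {k: [] for k in skew_keys}
    skew_count = 0
    skew_enabled = True

    for tup in inner_tuples:
        k = key(tup)
        if skew_enabled and k in skew_buckets:
            skew_buckets[k].append(tup)
            skew_count += 1
            if skew_count > max_skew_tuples:
                # Too many inner tuples fell into the skew buckets:
                # dump them back into the main buckets and stop using skew buckets.
                for bucket in skew_buckets.values():
                    for moved in bucket:
                        main_buckets[key(moved)].append(moved)
                skew_buckets = {}
                skew_enabled = False
        else:
            main_buckets[k].append(tup)

    return main_buckets, skew_buckets, skew_enabled
```

With a generous limit the frequent key stays in its skew bucket; with a tight limit the sketch abandons the skew buckets part way through the build and every tuple ends up in the main buckets.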

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-03-02 Thread Bryce Cutt
nodeHashjoin.c (in ExecHashJoin) has been rearranged so that in the single batch case short circuit evaluation requires only the first test in the IF to be checked. The "limited skew" check mentioned in Case 2 above is a simple check in the ExecHashJoinDetectSkew function. - Bryce Cutt

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-02-26 Thread Bryce Cutt
en at all and instead of the if statement happening per tuple it is run just once per join. We have to test this a bit more but it should further reduce the overhead. Hopefully we will have the new patch ready to go this weekend. - Bryce Cutt On Thu, Feb 26, 2009 at 7:45 AM, Tom Lane

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-01-06 Thread Bryce Cutt
on finalizing this patch, especially in regard to style issues. Thank you for all the help. - Dr. Ramon Lawrence and Bryce Cutt On Sun, Jan 4, 2009 at 6:48 PM, Robert Haas wrote: >> 1) Isn't ExecHashFreezeNextMCVPartition actually a most common TUPLE >> partition, rather t

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-30 Thread Bryce Cutt
the probe side is an operator other than a seq scan (such as another hashjoin) the code can now find the stats tuple for the underlying relation. The new idea of limiting the number of MCVs to a percentage of memory has not been added yet. - Bryce Cutt On Mon, Dec 29, 2008 at 8:55 PM, Robert

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-23 Thread Bryce Cutt
ccepted. It relies on getting the stats tuple for the join during the planning phase (in the cost function) and estimating the benefit that would have on the join cost. - Bryce Cutt On Mon, Dec 22, 2008 at 6:15 AM, Joshua Tolley wrote: > On Sun, Dec 21, 2008 at 10:25:59PM -0500, Rob

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-20 Thread Bryce Cutt
use the examine_variable() function in selfuncs.c except I would first need a PlannerInfo and I don't think I can get that from inside the join initialization code. - Bryce Cutt On Mon, Dec 15, 2008 at 8:51 PM, Robert Haas wrote: > I have to admit that I haven't fully grokked what

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-05 Thread Bryce Cutt
Tom is correct that it can be removed in the final version. - Bryce Cutt On Wed, Nov 5, 2008 at 7:22 AM, Joshua Tolley wrote: > On Wed, Nov 5, 2008 at 8:20 AM, Tom Lane wrote: >> Joshua Tolley wr