On Tue, Dec 23, 2008 at 10:14:29AM -0500, Robert Haas wrote:
> > It's equivalent to our assumption that distributions of values in
> > columns in the same table are independent. Making that assumption in
> > this case would probably result in occasional dramatic speed
> > improvements similar to th
> It's equivalent to our assumption that distributions of values in
> columns in the same table are independent. Making that assumption in
> this case would probably result in occasional dramatic speed
> improvements similar to the ones we've seen in less complex joins,
> offset by just-as-occasion
On Tue, Dec 23, 2008 at 09:22:27AM -0500, Robert Haas wrote:
> On Tue, Dec 23, 2008 at 2:21 AM, Bryce Cutt wrote:
> > Because there is no nice way in PostgreSQL (that I know of) to derive
> > a histogram after a join (on an intermediate result), currently
> > usingMostCommonValues is only enabled o
On Tue, Dec 23, 2008 at 2:21 AM, Bryce Cutt wrote:
> Because there is no nice way in PostgreSQL (that I know of) to derive
> a histogram after a join (on an intermediate result), currently
> usingMostCommonValues is only enabled on a join when the outer (probe)
> side is a table scan (seq scan only
Because there is no nice way in PostgreSQL (that I know of) to derive
a histogram after a join (on an intermediate result), currently
usingMostCommonValues is only enabled on a join when the outer (probe)
side is a table scan (seq scan only actually). See
getMostCommonValues (soon to be called
Exec
On Sun, Dec 21, 2008 at 10:25:59PM -0500, Robert Haas wrote:
> [Some performance testing.]
I (finally!) have a chance to post my performance testing results... my
apologies for the really long delay.
Unfortunately I'm not seeing wonderful speedups with the particular
queries I did in this case.
[Some performance testing.]
I ran this query 10x with this patch applied, and then 10x again with
enable_hashjoin_usestatmcvs set to false to disable the optimization:
select sum(1) from (select * from part, lineitem where p_partkey = l_partkey) x;
With the optimization enabled, the query took b
Robert,
I thoroughly appreciate the constructive criticism.
The compile errors are due to my development process being convoluted.
I will endeavor to not waste your time in the future with errors
caused by my development process.
I have updated the code to follow the conventions and suggestions
ql.org [mailto:pgsql-hackers-
> ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: December 17, 2008 7:54 PM
> To: Lawrence, Ramon
> Cc: Tom Lane; pgsql-hackers@postgresql.org; Bryce Cutt
> Subject: Re: [HACKERS] Proposed Patch to Improve Performance of Multi-
> Batch Hash Join for
Dr. Lawrence:
I'm still working on reviewing this patch. I've managed to load the
sample TPCH data from tpch1g1z.zip after changing the line endings to
UNIX-style and chopping off the trailing vertical bars. (If anyone is
interested, I have the results of pg_dump | bzip2 -9 on the resulting
data
I have to admit that I haven't fully grokked what this patch is about
just yet, so what follows is mostly a coding style review at this
point. It would help a lot if you could add some comments to the new
functions that are being added to explain the purpose of each at a
very high level. There's
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> I'm a tad worried about what happens when the values that are frequently
> occurring in the outer relation are also frequently occurring in the
> inner (which hardly seems an improbable case). Don't you stand a severe
> risk
"Lawrence, Ramon" <[EMAIL PROTECTED]> writes:
> We propose a patch that improves hybrid hash join's performance for
> large multi-batch joins where the probe relation has skew.
> ...
> The basic idea
> is to keep build relation tuples in a small in-memory hash table that
> have join values that are
On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote:
> The error is caused by me Asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now and with the new patch
> version I have attached al
On Thu, Nov 6, 2008 at 5:31 PM, Lawrence, Ramon <[EMAIL PROTECTED]> wrote:
>> -Original Message-
>> > Minor question on this patch. AFAICS there is another patch that seems
>> > to be aiming at exactly the same use case. Jonah's Bloom filter patch.
>> >
>> > Shouldn't we have a dust off
> -Original Message-
> > Minor question on this patch. AFAICS there is another patch that seems
> > to be aiming at exactly the same use case. Jonah's Bloom filter patch.
> >
> > Shouldn't we have a dust off to see which one is best? Or at least a
> > discussion to test whether they overlap
On Thu, Nov 6, 2008 at 3:52 PM, Simon Riggs <[EMAIL PROTECTED]> wrote:
>
> On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote:
>
>> Stay tuned.
>
> Minor question on this patch. AFAICS there is another patch that seems
> to be aiming at exactly the same use case. Jonah's Bloom filter patch.
>
>
On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote:
> Stay tuned.
Minor question on this patch. AFAICS there is another patch that seems
to be aiming at exactly the same use case. Jonah's Bloom filter patch.
Shouldn't we have a dust off to see which one is best? Or at least a
discussion to
On Wed, Nov 5, 2008 at 5:06 PM, Bryce Cutt <[EMAIL PROTECTED]> wrote:
> The error is caused by me Asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now and with the new patch
> version I have a
On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote:
> The error is caused by me Asserting against the wrong variable. I
> never noticed this as I apparently did not have assertions turned on
> on my development machine. That is fixed now and with the new patch
> version I have attached al
The error is caused by me Asserting against the wrong variable. I
never noticed this as I apparently did not have assertions turned on
on my development machine. That is fixed now and with the new patch
version I have attached all assertions are passing with your query and
my test queries. I add
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
On Wed, Nov 5, 2008 at 8:20 AM, Tom Lane wrote:
> Joshua Tolley writes:
>> On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote:
>>> We propose a patch that improves hybrid hash join's performance for large
>>> multi-batch joins where the
Joshua Tolley <[EMAIL PROTECTED]> writes:
> On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote:
>> We propose a patch that improves hybrid hash join's performance for large
>> multi-batch joins where the probe relation has skew.
> I also recommend modifying docs/src/sgml/config.sgml t
On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote:
>We propose a patch that improves hybrid hash join's performance for large
>multi-batch joins where the probe relation has skew.
I also recommend modifying docs/src/sgml/config.sgml to include the
enable_hashjoin_usestatmcvs
On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote:
>We propose a patch that improves hybrid hash join's performance for large
>multi-batch joins where the probe relation has skew.
I'm running into problems with this patch. It applies cleanly, and the
technique you provided fo
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> What alternatives are there for people who do not run Windows?
>
> regards, tom lane
The TPC-H generator is a standard code base provided at
http://www.tpc.org/tpch/. We have been able to compile this code on
Linux.
However, we
"Lawrence, Ramon" <[EMAIL PROTECTED]> writes:
> The easiest way to test would be to generate your own TPC-H data and
> load it into a database for testing. I have posted the TPC-H generator
> at:
> http://people.ok.ubc.ca/rlawrenc/TPCHSkew.zip
> The generator can produce skewed data sets. It was
On Sun, Nov 2, 2008 at 4:48 PM, Lawrence, Ramon <[EMAIL PROTECTED]> wrote:
> Joshua,
>
> Thank you for offering to review the patch.
>
> The easiest way to test would be to generate your own TPC-H data and
> load it into a database for testing. I have posted the TPC-H generator
> at:
>
> http://pe
Okanagan
E-mail: [EMAIL PROTECTED]
> -Original Message-
> From: Joshua Tolley [mailto:[EMAIL PROTECTED]
> Sent: November 1, 2008 3:42 PM
> To: Lawrence, Ramon
> Cc: pgsql-hackers@postgresql.org; Bryce Cutt
> Subject: Re: [HACKERS] Proposed Patch to Improve Performance of M
On Mon, Oct 20, 2008 at 4:42 PM, Lawrence, Ramon <[EMAIL PROTECTED]> wrote:
> We propose a patch that improves hybrid hash join's performance for large
> multi-batch joins where the probe relation has skew.
>
> Project name: Histojoin
> Patch file: histojoin_v1.patch
>
> This patch implements the H
We propose a patch that improves hybrid hash join's performance for
large multi-batch joins where the probe relation has skew.
Project name: Histojoin
Patch file: histojoin_v1.patch
This patch implements the Histojoin join algorithm as an optional
feature added to the standard Hybrid Hash
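
For readers skimming the thread, here is a rough sketch of the general idea discussed above: keep build-side tuples whose join values are among the probe side's most common values (MCVs) in a small in-memory table, and partition everything else into batches. This is not the Histojoin patch code. The function name, the parameters, and the way the MCVs are obtained (counted directly from the probe rows rather than read from planner statistics) are all assumptions made for illustration, and the "batches" are plain Python lists standing in for spill files.

```python
from collections import Counter, defaultdict


def skew_aware_hash_join(build, probe, build_key, probe_key,
                         mcv_count=2, num_batches=4):
    """Toy multi-batch hash join that special-cases frequent probe keys.

    Returns (joined_pairs, deferred), where deferred is the number of probe
    rows that had to wait for a later batch instead of joining immediately.
    """
    # Assumption: stand-in for planner statistics. A real system would read
    # the most common values from stored statistics, not scan the probe side.
    counts = Counter(probe_key(row) for row in probe)
    mcvs = {value for value, _ in counts.most_common(mcv_count)}

    # Build phase: rows matching a probe-side MCV stay in memory; the rest
    # are partitioned by hash, as in an ordinary multi-batch hash join.
    in_memory = defaultdict(list)
    build_batches = [defaultdict(list) for _ in range(num_batches)]
    for row in build:
        key = build_key(row)
        if key in mcvs:
            in_memory[key].append(row)
        else:
            build_batches[hash(key) % num_batches][key].append(row)

    # Probe phase: rows with an MCV key are joined right away; the others
    # are deferred to their batch (this is where spilling would happen).
    results = []
    probe_batches = [[] for _ in range(num_batches)]
    deferred = 0
    for row in probe:
        key = probe_key(row)
        if key in mcvs:
            results.extend((b, row) for b in in_memory.get(key, []))
        else:
            probe_batches[hash(key) % num_batches].append(row)
            deferred += 1

    # Process the deferred batches one at a time.
    for batch_no, rows in enumerate(probe_batches):
        table = build_batches[batch_no]
        for row in rows:
            results.extend((b, row) for b in table.get(probe_key(row), []))

    return results, deferred
```

With a skewed probe side, most probe rows hit the in-memory table and never reach the deferred batches, which is where the savings would be expected to come from. The sketch does not address the concerns raised in the thread about values that are also frequent on the build side.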