On Mon, Jan 27, 2014 at 10:00 AM, Simon Riggs wrote:
> On 27 January 2014 17:44, Pavel Stehule wrote:
>
> > This topic is interesting - we found very bad performance with hashing
> > large tables with high work_mem. MergeJoin with quicksort was
> > significantly faster.
>
> I've seen this also
On 27/01/14 18:00, Simon Riggs wrote:
On 27 January 2014 17:44, Pavel Stehule wrote:
This topic is interesting - we found very bad performance with hashing large
tables with high work_mem. MergeJoin with quicksort was significantly
faster.
I've seen this also.
I didn't do deeper research - th
On 27 January 2014 17:44, Pavel Stehule wrote:
> This topic is interesting - we found very bad performance with hashing large
> tables with high work_mem. MergeJoin with quicksort was significantly
> faster.
I've seen this also.
> I didn't do deeper research - there is a possibility of virtualizat
2014-01-27 Stephen Frost
> * Simon Riggs (si...@2ndquadrant.com) wrote:
> > I don't see anything for 9.4 in here now.
>
> Attached is what I was toying with (thought I had attached it previously
> somewhere.. perhaps not), but in re-testing, it doesn't appear to do
> enough to move things in the
* Stephen Frost (sfr...@snowman.net) wrote:
> * Simon Riggs (si...@2ndquadrant.com) wrote:
> > I don't see anything for 9.4 in here now.
>
> Attached [...]
I'm apparently bad at this 'attaching' thing, particularly on this
subject.
Here it is.
Thanks,
Stephen
colordiff
* Simon Riggs (si...@2ndquadrant.com) wrote:
> I don't see anything for 9.4 in here now.
Attached is what I was toying with (thought I had attached it previously
somewhere.. perhaps not), but in re-testing, it doesn't appear to do
enough to move things in the right direction in all cases. I did
On 25 January 2014 23:08, Simon Riggs wrote:
> On 25 January 2014 22:33, Stephen Frost wrote:
>
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>
>>> AFAICT, there was no consensus in this thread on what to do, which
>>> probably has something to do with the lack of concrete performance
>>> tests prese
> On 26-Jan-2014, at 4:38, Simon Riggs wrote:
>
>> On 25 January 2014 22:33, Stephen Frost wrote:
>>
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>
>>> AFAICT, there was no consensus in this thread on what to do, which
>>> probably has something to do with the lack of concret
On 25 January 2014 22:33, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> AFAICT, there was no consensus in this thread on what to do, which
>> probably has something to do with the lack of concrete performance
>> tests presented to back up any particular proposal.
>
> This I en
Tom,
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > In the end, I believe we absolutely should do something about this.
> > Hashing a 64M-row table (requiring upwards of 8G) instead of hashing
> > a 2M-row table is really bad of us.
>
> Huh? I don't see anything in the threa
Stephen Frost writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> Uh, where are we on this? Is it a TODO?
> I've been strongly considering my previous patch which tweaked
> NTUP_PER_BUCKET to '1' (instead of the default '10') when there's
> sufficient work_mem for it. There was recently anoth
* Bruce Momjian (br...@momjian.us) wrote:
> Uh, where are we on this? Is it a TODO?
I've been strongly considering my previous patch which tweaked
NTUP_PER_BUCKET to '1' (instead of the default '10') when there's
sufficient work_mem for it. There was recently another complaint on IRC
about our te
Uh, where are we on this? Is it a TODO?
---
On Wed, Jul 3, 2013 at 01:39:28PM +0530, Atri Sharma wrote:
> Hi all,
>
> I have been working on a patch for the above discussed
> functionalities. I made an array of int32s, one
Hi all,
I have been working on a patch for the above discussed
functionalities. I made an array of int32s, one for each bucket in a
hash table(the array is per hash table).
I am using the hash value directly to set the corresponding bit in the
bit field. Specifically:
bitfield_orvalue = 1
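The message is cut off, but one plausible reading of the design is a 32-bit word per bucket, with a bit derived from the hash value ORed in during the build phase and tested before walking a bucket chain at probe time. Everything below (names, bit selection) is a hypothetical sketch, not Atri's actual code:

```c
#include <stdbool.h>
#include <stdint.h>

/* One 32-bit word per hash bucket; nbuckets assumed a power of two. */
typedef struct BloomField
{
    uint32_t   *words;
    uint32_t    nbuckets;
} BloomField;

/* Build phase: OR a bit derived from the hash value into the
 * bucket's word (the "bitfield_orvalue" step in the message). */
static void
bloom_set(BloomField *bf, uint32_t hashvalue)
{
    uint32_t bucketno = hashvalue & (bf->nbuckets - 1);
    uint32_t bit = (hashvalue >> 16) % 32;  /* reuse higher hash bits */

    bf->words[bucketno] |= UINT32_C(1) << bit;
}

/* Probe phase: if the bit is clear, the key cannot be in this
 * bucket, so the chain walk can be skipped entirely. */
static bool
bloom_test(const BloomField *bf, uint32_t hashvalue)
{
    uint32_t bucketno = hashvalue & (bf->nbuckets - 1);
    uint32_t bit = (hashvalue >> 16) % 32;

    return (bf->words[bucketno] >> bit) & 1;
}
```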
On Sun, Jun 23, 2013 at 2:45 AM, Simon Riggs wrote:
> On 23 June 2013 03:16, Stephen Frost wrote:
>
> > Will think on it more.
>
> Some other thoughts related to this...
>
> * Why are we building a special kind of hash table? Why don't we just
> use the hash table code that we use in every other pla
>
> Isn't it currently the last week of this month? :) I'm guessing you
> mean July.
Heh, no. I lose track of time these days. Alright, second week of July then.
> I really don't see that happening, to be honest.. I think it would be
> interesting to try some of the surrogate-additional-hashing
* Atri Sharma (atri.j...@gmail.com) wrote:
> Right, let me look. Although, I am pretty busy atm with ordered set
> functions, so will get it done maybe last week of this month.
Isn't it currently the last week of this month? :) I'm guessing you
mean July.
> Another thing I believe in is that we s
On Wed, Jun 26, 2013 at 9:20 PM, Stephen Frost wrote:
> * Atri Sharma (atri.j...@gmail.com) wrote:
>> My point is that I would like to help in the implementation, if possible. :)
>
> Feel free to go ahead and implement it.. I'm not sure when I'll have a
> chance to (probably not in the next week
* Atri Sharma (atri.j...@gmail.com) wrote:
> My point is that I would like to help in the implementation, if possible. :)
Feel free to go ahead and implement it.. I'm not sure when I'll have a
chance to (probably not in the next week or two anyway). Unfortunately,
the bigger issue here is really
On Wed, Jun 26, 2013 at 7:12 PM, Stephen Frost wrote:
> Atri,
>
> * Atri Sharma (atri.j...@gmail.com) wrote:
>> I just popped in here on Simon's advice to put an idea I had about
>> optimizing hash joins on this thread.
>
> I'd encourage reading the thread a bit first, in the future.. :)
>
Yeah, I
Atri,
* Atri Sharma (atri.j...@gmail.com) wrote:
> I just popped in here on Simon's advice to put an idea I had about
> optimizing hash joins on this thread.
I'd encourage reading the thread a bit first, in the future.. :)
> Essentially, I was thinking of using bloom filters in the part where
>
On Sun, Jun 23, 2013 at 8:41 PM, Heikki Linnakangas wrote:
> On 23.06.2013 01:48, Simon Riggs wrote:
>>
>> On 22 June 2013 21:40, Stephen Frost wrote:
>>
>>> I'm actually not a huge fan of this as it's certainly not cheap to do. If it
>>> can be shown to be better than an improved heuristic t
On 23.06.2013 01:48, Simon Riggs wrote:
On 22 June 2013 21:40, Stephen Frost wrote:
I'm actually not a huge fan of this as it's certainly not cheap to do. If it
can be shown to be better than an improved heuristic then perhaps it would
work but I'm not convinced.
We need two heuristics, it w
On Sunday, June 23, 2013, Simon Riggs wrote:
> On 23 June 2013 03:16, Stephen Frost wrote:
>
> > Will think on it more.
>
> Some other thoughts related to this...
>
> * Why are we building a special kind of hash table? Why don't we just
> use the hash table code that we use in every other place in
On Sunday, June 23, 2013, Simon Riggs wrote:
> On 23 June 2013 03:16, Stephen Frost wrote:
>
> > Still doesn't really address the issue of dups though.
>
> Checking for duplicates in all cases would be wasteful, since often we
> are joining to the PK of a smaller table.
Well, that's what nd
On 23 June 2013 03:16, Stephen Frost wrote:
> Will think on it more.
Some other thoughts related to this...
* Why are we building a special kind of hash table? Why don't we just
use the hash table code that we use in every other place in the backend.
If that code is so bad why do we use it everywhe
On 23 June 2013 03:16, Stephen Frost wrote:
> Still doesn't really address the issue of dups though.
Checking for duplicates in all cases would be wasteful, since often we
are joining to the PK of a smaller table.
If duplicates are possible at all for a join, then it would make sense
to build
On Saturday, June 22, 2013, Simon Riggs wrote:
> On 22 June 2013 21:40, Stephen Frost wrote:
>
> > I'm actually not a huge fan of this as it's certainly not cheap to do. If it
> > can be shown to be better than an improved heuristic then perhaps it would
> > work but I'm not convinced.
>
>
On Saturday, June 22, 2013, Heikki Linnakangas wrote:
> On 22.06.2013 19:19, Simon Riggs wrote:
>
>> So I think that (2) is the best route: Given that we know with much
>> better certainty the number of rows in the scanned-relation, we should
>> be able to examine our hash table after it has been
On 22 June 2013 21:40, Stephen Frost wrote:
> I'm actually not a huge fan of this as it's certainly not cheap to do. If it
> can be shown to be better than an improved heuristic then perhaps it would
> work but I'm not convinced.
We need two heuristics, it would seem:
* an initial heuristic to
On 22.06.2013 19:19, Simon Riggs wrote:
So I think that (2) is the best route: Given that we know with much
better certainty the number of rows in the scanned-relation, we should
be able to examine our hash table after it has been built and decide
whether it would be cheaper to rebuild the hash t
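Option (2) above amounts to a post-build check. A minimal sketch of the decision, with an invented threshold standing in for whatever cost comparison would actually be used, might look like this (not actual PostgreSQL code):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * After the build side has been loaded, compare the observed average
 * bucket chain length against the target and decide whether rebuilding
 * the hash table with more buckets would pay off. The 2x factor is an
 * illustrative placeholder, not a tuned value.
 */
static bool
should_rebuild(size_t ntuples, size_t nbuckets, double target_per_bucket)
{
    double avg = (double) ntuples / (double) nbuckets;

    return avg > 2.0 * target_per_bucket;
}
```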
On Saturday, June 22, 2013, Simon Riggs wrote:
> On 22 June 2013 14:48, Stephen Frost wrote:
>
> > Based on your argument that we want to have a bucket load which is, on
> > average, the size of NTUP_PER_BUCKET, I have to '-1' on this.
>
> That is not my argument. I am pointing out that the co
On Sat, Jun 22, 2013 at 9:48 AM, Stephen Frost wrote:
>> The correct calculation that would match the objective set out in the
>> comment would be
>>
>> dbuckets = (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;
>
> This looks to be driving the size of the hash table size off of "how
> many of th
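The corrected calculation quoted above can be written out as a tiny helper: how many tuples fit in the memory budget, divided by the target tuples-per-bucket, gives the bucket count. Names here are illustrative; the real computation lives in ExecChooseHashTableSize in nodeHash.c.

```c
#include <stddef.h>

#define NTUP_PER_BUCKET 10      /* historical default under discussion */

/*
 * The formula from the quoted message: size the bucket array so that
 * a hash table filling its memory budget averages NTUP_PER_BUCKET
 * tuples per bucket.
 */
static size_t
compute_dbuckets(size_t hash_table_bytes, size_t tupsize)
{
    return (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;
}
```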
On Sat, Jun 22, 2013 at 9:15 AM, Simon Riggs wrote:
> Previous discussions of Hash Joins have noted that the performance
> decreases when the average number of tuples per bucket increases.
> O(N^2) effects are seen.
>
> We've argued this about many different ways, yet all of those
> discussions ha
On 22 June 2013 14:48, Stephen Frost wrote:
> Based on your argument that we want to have a bucket load which is, on
> average, the size of NTUP_PER_BUCKET, I have to '-1' on this.
That is not my argument. I am pointing out that the comments claim the
code does that, and yet the code does not.
Simon,
* Simon Riggs (si...@2ndquadrant.com) wrote:
> Previous discussions of Hash Joins have noted that the performance
> decreases when the average number of tuples per bucket increases.
Having the hash table so small that we have hash bucket collisions with
different 32bit hash values is extre