Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Bruce Momjian
Are there any TODOs here? --- Robert Haas wrote: On Fri, Apr 3, 2009 at 5:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Robert Haas
On Tue, Apr 7, 2009 at 9:55 AM, Bruce Momjian br...@momjian.us wrote: Are there any TODOs here? I'd say that all of the items listed in my original email could be TODOs. I'm planning to work on as many of them as I have time for. Ramon Lawrence is also working on some related ideas, as

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Bruce Momjian
Robert Haas wrote: On Tue, Apr 7, 2009 at 9:55 AM, Bruce Momjian br...@momjian.us wrote: Are there any TODOs here? I'd say that all of the items listed in my original email could be TODOs. I'm planning to work on as many of them as I have time for. Ramon Lawrence is also working on some

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Robert Haas
On Tue, Apr 7, 2009 at 5:11 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: On Tue, Apr 7, 2009 at 9:55 AM, Bruce Momjian br...@momjian.us wrote: Are there any TODOs here? I'd say that all of the items listed in my original email could be TODOs.  I'm planning to work on as many

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Bruce Momjian
Robert Haas wrote: On Tue, Apr 7, 2009 at 5:11 PM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: On Tue, Apr 7, 2009 at 9:55 AM, Bruce Momjian br...@momjian.us wrote: Are there any TODOs here? I'd say that all of the items listed in my original email could be TODOs. ?I'm

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Chris Dunlop
On 2009-04-03, Simon Riggs si...@2ndquadrant.com wrote: On Fri, 2009-04-03 at 18:03 +0100, Greg Stark wrote: I wonder if we need a whole class of index algorithms to deal specifically with read-only tables I think we can drop the word index from the sentence as well. Read-only isn't an

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-07 Thread Robert Haas
On Tue, Apr 7, 2009 at 5:57 PM, Bruce Momjian br...@momjian.us wrote: I think perhaps Optimizer / Executor would be more appropriate, since these are not about hash indices but rather about hash joins.  I will look at doing that. Yes, please. Done. See what you think... Also I think the

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-04 Thread Dimitri Fontaine
Hi, Le 3 avr. 09 à 22:29, Tom Lane a écrit : Correct, but you've got the details all wrong. The real problem is that the planner might discard a join path hash(A,B) at level 2 because it loses compared to, say, merge(A,B). But when we get to level three, perhaps hash(hash(A,B),C) would've

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-04 Thread Robert Haas
On Fri, Apr 3, 2009 at 5:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Correct, but you've got the details all wrong.  The real problem is that the planner might discard a join path

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 1:48 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Robert Haas wrote: While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash joins.  So here are a few thoughts.  Some of these have been

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Kenneth Marshall
On Fri, Apr 03, 2009 at 08:03:33AM -0400, Robert Haas wrote: On Fri, Apr 3, 2009 at 1:48 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Robert Haas wrote: While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Lawrence, Ramon
While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash joins. So here are a few thoughts. Some of these have been brought up before. 1. When the hash is not expected to spill to disk, it preserves the pathkeys of the outer side of

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 11:24 AM, Lawrence, Ramon ramon.lawre...@ubc.ca wrote: I would be interested in working with you on any of these changes to hash join if you decide to pursue them.   I am especially interested in looking at the hash aggregation code and potentially improving its

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Simon Riggs
On Thu, 2009-04-02 at 22:08 -0400, Robert Haas wrote: 3. Avoid building the exact same hash table twice in the same query. This happens more often you'd think. For example, a table may have two columns creator_id and last_updater_id which both reference person (id). If you're considering a

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Greg Stark
1. When the hash is not expected to spill to disk, it preserves the pathkeys of the outer side of the join.  If the optimizer were allowed to assume that, it could produce significantly more efficient query plans in some cases. This is definitely possible, but you will have to dynamically

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Greg Stark
On Fri, Apr 3, 2009 at 5:41 PM, Simon Riggs si...@2ndquadrant.com wrote: I would be especially interested in using a shared memory hash table that *all* backends can use - if the table is mostly read-only, as dimension tables often are in data warehouse applications. That would give zero

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Simon Riggs
On Fri, 2009-04-03 at 18:03 +0100, Greg Stark wrote: I wonder if we need a whole class of index algorithms to deal specifically with read-only tables I think we can drop the word index from the sentence as well. Read-only isn't an isolated case. Often you find many read-only tables alongside

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote: Read-only isn't an isolated case. Often you find many read-only tables alongside rapidly changing tables. So even the busiest of databases can benefit from read-only optimisations. Having changes to those tables cause much heavier additional work is

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Lawrence, Ramon
I would be especially interested in using a shared memory hash table that *all* backends can use - if the table is mostly read-only, as dimension tables often are in data warehouse applications. That would give zero startup cost and significantly reduced memory. I think that's a

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 12:55 PM, Greg Stark st...@enterprisedb.com wrote: 1. When the hash is not expected to spill to disk, it preserves the pathkeys of the outer side of the join.  If the optimizer were allowed to assume that, it could produce significantly more efficient query plans in some

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I don't see why hash_inner_and_outer can't walk the outer path looking for suitable hashes to reuse. I think the question is how aggressive we want to be in performing that search. Correct, but you've got the details all wrong. The real problem is

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I don't see why hash_inner_and_outer can't walk the outer path looking for suitable hashes to reuse.  I think the question is how aggressive we want to be in performing that search.

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Correct, but you've got the details all wrong.  The real problem is that the planner might discard a join path hash(A,B) at level 2 because it loses compared to, say, merge(A,B).  But

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-03 Thread Grzegorz Jaskiewicz
On 3 Apr 2009, at 19:44, Lawrence, Ramon wrote: I would be especially interested in using a shared memory hash table that *all* backends can use - if the table is mostly read-only, as dimension tables often are in data warehouse applications. That would give zero startup cost and

[HACKERS] a few crazy ideas about hash joins

2009-04-02 Thread Robert Haas
While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash joins. So here are a few thoughts. Some of these have been brought up before. 1. When the hash is not expected to spill to disk, it preserves the pathkeys of the outer side of the

Re: [HACKERS] a few crazy ideas about hash joins

2009-04-02 Thread Heikki Linnakangas
Robert Haas wrote: While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash joins. So here are a few thoughts. Some of these have been brought up before. 1. When the hash is not expected to spill to disk, it preserves the pathkeys of