Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-28 Thread Stefan Kaltenbrunner
Matteo Beccati wrote: Stefan Kaltenbrunner wrote: too bad - however any idea on one of the other troubling querys (q21) I mentioned in the mail I resent to the list (after the original one got lost)? http://archives.postgresql.org/pgsql-hackers/2006-09/msg02011.php What happens if you

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-27 Thread Matteo Beccati
Stefan Kaltenbrunner wrote: too bad - however any idea on one of the other troubling querys (q21) I mentioned in the mail I resent to the list (after the original one got lost)? http://archives.postgresql.org/pgsql-hackers/2006-09/msg02011.php What happens if you increase statistics for

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-26 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: http://www.kaltenbrunner.cc/files/analyze_q9_beta1.txt The next problem seems to be the drastic misestimation of this join size: - Nested Loop (cost=0.00..6872092.36 rows=135 width=28) (actual time=94.762..14429291.129

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-26 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: It evidently thinks that most of the rows in the join of part and partsupp won't have any matching rows in lineitem, whereas on average there are about 7 matching rows apiece. So that's totally wacko, and it's not immediately

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-26 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: It evidently thinks that most of the rows in the join of part and partsupp won't have any matching rows in lineitem, whereas on average there are about 7 matching rows apiece. So that's totally wacko, and it's

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-25 Thread Stefan Kaltenbrunner
[RESENT in a modified version since the original reply seems to have ben lost by the listserver which seems to happen sometimes lately] Tom Lane wrote: Matteo Beccati [EMAIL PROTECTED] writes: Tom Lane ha scritto: Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-25 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: http://www.kaltenbrunner.cc/files/analyze_q9_beta1.txt The next problem seems to be the drastic misestimation of this join size: - Nested Loop (cost=0.00..6872092.36 rows=135 width=28) (actual time=94.762..14429291.129 rows=3554044 loops=1)

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-22 Thread Stefan Kaltenbrunner
Tom Lane wrote: Matteo Beccati [EMAIL PROTECTED] writes: Tom Lane ha scritto: Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using something like that: if (histogram is large/representative enough) Well, the question is exactly what is large enough? I feel a bit

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-20 Thread Tom Lane
Matteo Beccati [EMAIL PROTECTED] writes: Tom Lane ha scritto: Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using something like that: if (histogram is large/representative enough) Well, the question is exactly what is large enough? I feel a bit uncomfortable about

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-20 Thread Matteo Beccati
Hi, Tom Lane wrote: I've committed this change with (for now) 100 as the minimum histogram size to use. Stefan, are you interested in retrying your benchmark? A first try with ltree gave big improvements on my smaller data set: the estimated row count is correct or off by only 1 row. I'm

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-19 Thread Mark Cave-Ayland
On Mon, 2006-09-18 at 17:46 +0200, Matteo Beccati wrote: Tom Lane ha scritto: Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using something like that: if (histogram is large/representative enough) Well, the question is exactly what is large enough? I feel a bit

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-19 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: I'm not sure what we could do about the concurrent-sessions issue, but we could make some sort of attack on the query complexity issue by pro-rating the effective_cache_size among all the tables used by a query.

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-19 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: ok - the planner switches to a different plan at about 2.5GB of effective_cache_size resulting in the following [ much better ] plan: OK, so it sounds like it'd be a good idea to try to pro-rate effective_cache_size among all the tables in the

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-18 Thread Matteo Beccati
Hi, Tom Lane wrote: Hmm ... pattern_sel already applies the operator directly to the most_common_vals, but in this situation those aren't common enough to help much. With such an extensive histogram it is awfully tempting to assume that the histogram members are a representative sample, and

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-18 Thread Tom Lane
Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using something like that: if (histogram is large/representative enough) Well, the question is exactly what is large enough? I feel a bit uncomfortable about applying the idea to a histogram with only 10 entries (especially

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-18 Thread Matteo Beccati
Tom Lane ha scritto: Matteo Beccati [EMAIL PROTECTED] writes: I cannot see anything bad by using something like that: if (histogram is large/representative enough) Well, the question is exactly what is large enough? I feel a bit uncomfortable about applying the idea to a histogram with only

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
[already sent a variant of that yesterday but it doesn't look like it made it to the list] Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Apparently we've made the planner a bit too optimistic about the savings that can be expected from repeated indexscans

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Stefan Kaltenbrunner wrote: [already sent a variant of that yesterday but it doesn't look like it made it to the list] Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Apparently we've made the planner a bit too optimistic about the savings that can be

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that is the explain for the 4h+ Q9: http://www.kaltenbrunner.cc/files/analyze_q9.txt The big problem there seems to be the drastic misestimation of the number of rows matching the p_name ~~ '%ghost%' condition. What does pg_stats have for the

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: that is the explain for the 4h+ Q9: http://www.kaltenbrunner.cc/files/analyze_q9.txt The big problem there seems to be the drastic misestimation of the number of rows matching the p_name ~~ '%ghost%' condition. What does

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: some additional numbers(first one is with default settings, second is with enable_nestloop = 'off', third one is with enable_nestloop = 'off' and enable_hashjoin='off'): http://www.kaltenbrunner.cc/files/analyze_q7.txt I'm inclined to think you

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: The big problem there seems to be the drastic misestimation of the number of rows matching the p_name ~~ '%ghost%' condition. What does pg_stats have for the p_name column? http://www.kaltenbrunner.cc/files/pg_stat_p_name.txt

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: some additional numbers(first one is with default settings, second is with enable_nestloop = 'off', third one is with enable_nestloop = 'off' and enable_hashjoin='off'): http://www.kaltenbrunner.cc/files/analyze_q7.txt I'm

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: I'm not sure what we could do about the concurrent-sessions issue, but we could make some sort of attack on the query complexity issue by pro-rating the effective_cache_size among all the tables used by a query. hmm not sure i

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-13 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Could we see the actual EXPLAIN ANALYZE results for the slow plan? http://www.kaltenbrunner.cc/files/dbt3_explain_analyze.txt Well, indeed it seems that the hash join is just an innocent bystander: the bulk of

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-13 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Apparently we've made the planner a bit too optimistic about the savings that can be expected from repeated indexscans occurring on the inside of a join. effective_cache_size was set to 10GB(my fault for copying over the conf

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-13 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Apparently we've made the planner a bit too optimistic about the savings that can be expected from repeated indexscans occurring on the inside of a join. effective_cache_size was set to 10GB(my fault for copying

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-13 Thread David Fetter
On Wed, Sep 13, 2006 at 10:47:09AM -0400, Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Apparently we've made the planner a bit too optimistic about the savings that can be expected from repeated indexscans occurring on the inside of a join.

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-12 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: btw - the hashjoin is bad was more or less based on the observation that nearly all of the cpu is burned in hash-related functions in the profile (when profiling over a longer period of time those accumulate even more % of the time

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-12 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Tom Lane wrote: Could we see the actual EXPLAIN ANALYZE results for the slow plan? http://www.kaltenbrunner.cc/files/dbt3_explain_analyze.txt Well, indeed it seems that the hash join is just an innocent bystander: the bulk of the runtime (all but

[HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-11 Thread Stefan Kaltenbrunner
Hi! I'm about to do some benchmarking on -HEAD one some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ... it seems that the issue is caused by the following query: (in case it gets

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-11 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: I'm about to do some benchmarking on -HEAD one some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ... Could we see the actual EXPLAIN ANALYZE

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-11 Thread Stefan Kaltenbrunner
Tom Lane wrote: Stefan Kaltenbrunner [EMAIL PROTECTED] writes: I'm about to do some benchmarking on -HEAD one some hardware I have available and it seems I'm hitting a rather weird issue causing the osdl dbt3 benchmark to run very slow and eating CPU time for hours ... Could we see the

Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-11 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: btw - the hashjoin is bad was more or less based on the observation that nearly all of the cpu is burned in hash-related functions in the profile (when profiling over a longer period of time those accumulate even more % of the time than in the