On Fri, Dec 18, 2015 at 8:47 PM, Robert Haas wrote:

>> Yes, you are right that create_gather_path() sets parallel_safe to false
>> unconditionally, but whenever we are building a non-partial path we
>> should carry forward the parallel_safe state to its parent, and it seems
>> like that part is missing here.
> Ah, right.  Woops.  I can't exactly replicate your results, but I've
> attempted to fix this in a systematic way in the new version attached
> here (parallel-join-v3.patch).

I have tested with the latest patch, and the problem is solved.

During my testing I observed one more behaviour in hash join, where the
parallel hash join takes more time than the normal hash join. Here I have
ensured that, apart from the hash column, there is one more condition on
another column which forces random page fetches.

I think this behaviour looks similar to what Amit reported in the thread
http://www.postgresql.org/message-id/caa4ek1+s3ud2g1wskeaw_fzgp8jeyw3ycnvtueplihe_e1d...@mail.gmail.com

create table t1 (c1 int, c2 int, c3 text);
create table t2 (c1 int, c2 int, c3 text);

insert into t1 values(generate_series(1,10000000),
                      generate_series(1,10000000), repeat('x', 1000));
insert into t2 values(generate_series(1,3000000),
                      generate_series(1,3000000), repeat('x', 5));

analyze t1;
analyze t2;

set max_parallel_degree=6;

postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=474378.39..474378.40 rows=1 width=0) (actual time=34507.573..34507.573 rows=1 loops=1)
   ->  Gather  (cost=96436.00..471878.39 rows=1000000 width=0) (actual time=2004.186..33918.216 rows=2999950 loops=1)
         Number of Workers: 6
         ->  Hash Join  (cost=95436.00..370878.39 rows=1000000 width=0) (actual time=2077.085..18651.868 rows=428564 loops=7)
               Hash Cond: (t1.c1 = t2.c1)
               Join Filter: ((t2.c2 + t1.c1) > 100)
               Rows Removed by Join Filter: 7
               ->  Parallel Seq Scan on t1  (cost=0.00..235164.93 rows=1538462 width=4) (actual time=0.741..13199.231 rows=1428571 loops=7)
               ->  Hash  (cost=46217.00..46217.00 rows=3000000 width=8) (actual time=2070.827..2070.827 rows=3000000 loops=7)
                     Buckets: 131072  Batches: 64  Memory Usage: 2861kB
                     ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=8) (actual time=0.027..904.607 rows=3000000 loops=7)
 Planning time: 0.292 ms
 Execution time: 34507.857 ms
(13 rows)

postgres=# set max_parallel_degree=0;
SET
postgres=# explain analyze SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t2.c2 + t1.c1 > 100;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1823853.06..1823853.07 rows=1 width=0) (actual time=17833.067..17833.067 rows=1 loops=1)
   ->  Hash Join  (cost=95436.00..1821353.06 rows=1000000 width=0) (actual time=1286.788..17558.987 rows=2999950 loops=1)
         Hash Cond: (t1.c1 = t2.c1)
         Join Filter: ((t2.c2 + t1.c1) > 100)
         Rows Removed by Join Filter: 50
         ->  Seq Scan on t1  (cost=0.00..1528572.04 rows=10000004 width=4) (actual time=2.728..9881.659 rows=10000000 loops=1)
         ->  Hash  (cost=46217.00..46217.00 rows=3000000 width=8) (actual time=1279.688..1279.688 rows=3000000 loops=1)
               Buckets: 131072  Batches: 64  Memory Usage: 2861kB
               ->  Seq Scan on t2  (cost=0.00..46217.00 rows=3000000 width=8) (actual time=0.029..588.887 rows=3000000 loops=1)
 Planning time: 0.314 ms
 Execution time: 17833.143 ms
(11 rows)

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


On Fri, Dec 18, 2015 at 8:47 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Dec 18, 2015 at 3:54 AM, Dilip Kumar <dilipbal...@gmail.com> wrote:
>> On Fri, Dec 18, 2015 at 7:59 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> Uh oh.  That's not supposed to happen.  A GatherPath is supposed to
>>> have parallel_safe = false, which should prevent the planner from
>>> using it to form new partial paths.  Is this with the latest version
>>> of the patch?
>>> The plan output suggests that we're somehow reaching
>>> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
>>> don't immediately see how that's possible, because
>>> create_gather_path() sets parallel_safe to false unconditionally, and
>>> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
>>> that path is parallel_safe.
>>
>> Yes, you are right that create_gather_path() sets parallel_safe to false
>> unconditionally, but whenever we are building a non-partial path we
>> should carry forward the parallel_safe state to its parent, and it seems
>> like that part is missing here.
>
> Ah, right.  Woops.  I can't exactly replicate your results, but I've
> attempted to fix this in a systematic way in the new version attached
> here (parallel-join-v3.patch).
>
>>> Do you have a self-contained test case that reproduces this, or any
>>> insight as to how it's happening here?
>>
>> This is the TPC-H benchmark case; we can set it up like this:
>> 1. git clone https://tkej...@bitbucket.org/tkejser/tpch-dbgen.git
>> 2. compile using make
>> 3. ./dbgen -v -s 5
>> 4. ./qgen
>
> Thanks.  After a bit of fiddling I was able to get this to work.  I'm
> attaching two other patches that seem to help this case quite
> considerably.  The first (parallel-reader-order-v1) causes Gather to
> read from the same worker repeatedly until it can't get another tuple
> from that worker without blocking, and only then move on to the next
> worker.  With 4 workers, this seems to be drastically more efficient
> than what's currently in master - I saw the time for Q5 drop from over
> 17 seconds to about 6 (this was an assert-enabled build running with
> EXPLAIN ANALYZE, though, so take those numbers with a grain of salt).
> The second (gather-disuse-physical-tlist.patch) causes Gather to force
> underlying scan nodes to project, which is a good idea here for
> reasons very similar to why it's a good idea for the existing node
> types that use disuse_physical_tlist: forcing extra data through the
> Gather node is bad.  That shaved another half second off this query.
>
> The exact query I was using for testing was:
>
> explain (analyze, verbose) select n_name, sum(l_extendedprice * (1 -
> l_discount)) as revenue from customer, orders, lineitem, supplier,
> nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
> and l_suppkey = s_suppkey and c_nationkey = s_nationkey and
> s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name =
> 'EUROPE' and o_orderdate >= date '1995-01-01' and o_orderdate < date
> '1995-01-01' + interval '1' year group by n_name order by revenue
> desc;
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company