On 25 March 2015 at 10:27, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Fri, Mar 20, 2015 at 5:36 PM, Amit Kapila <amit.kapil...@gmail.com>
> wrote:
> >
> > So the patches have to be applied in below sequence:
> > HEAD Commit-id : 8d1f2390
> > parallel-mode-v8.1.patch [2]
> > assess-parallel-safety-v4.patch [1]
> > parallel-heap-scan.patch [3]
> > parallel_seqscan_v11.patch (Attached with this mail)
> >
> > The reason for not using the latest commit in HEAD is that latest
> > version of assess-parallel-safety patch was not getting applied,
> > so I generated the patch at commit-id where I could apply that
> > patch successfully.
> >
> > [1] - http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
> > [2] - http://www.postgresql.org/message-id/ca+tgmozjjzynpxchl3gr7nwruzkazpmpvkatdt5shvc5cd7...@mail.gmail.com
> > [3] - http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com
> >
>
> Fixed the reported issue on assess-parallel-safety thread and another
> bug caught while testing joins and integrated with latest version of
> parallel-mode patch (parallel-mode-v9 patch).
>
> Apart from that I have moved the Initialization of dsm segment from
> InitNode phase to ExecFunnel() (on first execution) as per suggestion
> from Robert. The main idea is that as it creates large shared memory
> segment, so do the work when it is really required.
>
> HEAD Commit-Id: 11226e38
> parallel-mode-v9.patch [2]
> assess-parallel-safety-v4.patch [1]
> parallel-heap-scan.patch [3]
> parallel_seqscan_v12.patch (Attached with this mail)
>
> [1] - http://www.postgresql.org/message-id/ca+tgmobjsuefipok6+i9werugeab3ggjv7jxlx+r6s5syyd...@mail.gmail.com
> [2] - http://www.postgresql.org/message-id/ca+tgmozfsxzhs6qy4z0786d7iu_abhbvpqfwlthpsvgiecz...@mail.gmail.com
> [3] - http://www.postgresql.org/message-id/ca+tgmoyjetgeaxuszrona7bdtwzptqexpjntv1gkcavmgsd...@mail.gmail.com
>

Okay, with my pgbench_accounts partitioned into 300, I ran:

SELECT DISTINCT bid FROM pgbench_accounts;

The query never returns, and I also get this:

grep -r 'starting background worker process "parallel worker for PID 12165"' postgresql-2015-03-25_112522.log | wc -l
2496

2,496 workers? This is with parallel_seqscan_degree set to 8. If I set
it to 2, this number goes down to 626, and with 16, goes up to 4320.

Here's the query plan:

                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=38856527.50..38856529.50 rows=200 width=4)
   Group Key: pgbench_accounts.bid
   ->  Append  (cost=0.00..38806370.00 rows=20063001 width=4)
         ->  Seq Scan on pgbench_accounts  (cost=0.00..0.00 rows=1 width=4)
         ->  Funnel on pgbench_accounts_1  (cost=0.00..192333.33 rows=100000 width=4)
               Number of Workers: 8
               ->  Partial Seq Scan on pgbench_accounts_1  (cost=0.00..1641000.00 rows=100000 width=4)
         ->  Funnel on pgbench_accounts_2  (cost=0.00..192333.33 rows=100000 width=4)
               Number of Workers: 8
               ->  Partial Seq Scan on pgbench_accounts_2  (cost=0.00..1641000.00 rows=100000 width=4)
         ->  Funnel on pgbench_accounts_3  (cost=0.00..192333.33 rows=100000 width=4)
               Number of Workers: 8
...
               ->  Partial Seq Scan on pgbench_accounts_498  (cost=0.00..10002.10 rows=210 width=4)
         ->  Funnel on pgbench_accounts_499  (cost=0.00..1132.34 rows=210 width=4)
               Number of Workers: 8
               ->  Partial Seq Scan on pgbench_accounts_499  (cost=0.00..10002.10 rows=210 width=4)
         ->  Funnel on pgbench_accounts_500  (cost=0.00..1132.34 rows=210 width=4)
               Number of Workers: 8
               ->  Partial Seq Scan on pgbench_accounts_500  (cost=0.00..10002.10 rows=210 width=4)

Still not sure why 8 workers are needed for each partial scan. I would
expect 8 workers to be used for 8 separate scans. Perhaps this is just
my misunderstanding of how this feature works.

--
Thom
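
A rough arithmetic check on the worker counts reported above, as a sketch only. It assumes (this is an assumption read off the plan, not confirmed against the patch) that every Funnel node asks for parallel_seqscan_degree workers of its own, so the number of worker starts in the log should scale with the degree. The names observed_starts and implied_worker_groups are made up for this illustration.

```python
# Worker-start counts taken from the log figures quoted in this mail,
# keyed by the parallel_seqscan_degree setting in effect.
observed_starts = {2: 626, 8: 2496, 16: 4320}


def implied_worker_groups(starts_by_degree):
    """Divide each worker-start count by its degree.

    If each Funnel node launched `degree` workers (an assumption),
    the result approximates how many Funnel nodes launched workers.
    """
    return {degree: starts / degree
            for degree, starts in starts_by_degree.items()}


groups = implied_worker_groups(observed_starts)
for degree in sorted(groups):
    print(f"degree={degree}: {observed_starts[degree]} starts, "
          f"about {groups[degree]:g} groups of {degree} workers")
```

At degree 2 and degree 8 the quotient is nearly the same (313 and 312), which would be consistent with one set of workers per Funnel node rather than one shared set for the whole query. The degree 16 figure comes out lower (270); nothing in the numbers here explains that.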