Hello all, This is to propose a patch for enabling parallel index-only scans. With the patch of parallel index scan around [1], adding the mechanism for parallelising index-only scans makes sense. Without this mechanism for the queries preferring index-only scans, it is likely that at higher scale parallel index or parallel seq scan serve as cheaper alternative to index-only scans and then query performance might suffer because of the added processing of index or seq scans.
Performance ----------------- Consider the performance of a simple query on TPC-H schema, explain analyse select count(*) from lineitem where l_shipdate < date '1995-01-03'; Without parallel index-only scan, parallel seq scan got picked and it took around 23 seconds for the query to execute, > Finalize Aggregate (cost=651586.63..651586.64 rows=1 width=8) (actual > time=22853.872..22853.873 rows=1 loops=1) > -> Gather (cost=651586.21..651586.62 rows=4 width=8) (actual > time=22853.684..22853.864 rows=5 loops=1) > Workers Planned: 4 > Workers Launched: 4 > -> Partial Aggregate (cost=650586.21..650586.22 rows=1 width=8) > (actual time=22850.489..22850.489 rows=1 loops=5) > -> Parallel Seq Scan on lineitem (cost=0.00..618021.73 > rows=13025795 width=0) (actual time=0.035..20553.495 rows=10342437 loops=5) > Filter: (l_shipdate < '1995-01-03'::date) > Rows Removed by Filter: 13656485 > Planning time: 0.225 ms > Execution time: 22855.196 ms > However, with parallel index-only scan, it took only 8.5 seconds, Finalize Aggregate (cost=568883.69..568883.70 rows=1 width=8) (actual > time=8548.993..8548.993 rows=1 loops=1) > -> Gather (cost=568883.27..568883.68 rows=4 width=8) (actual > time=8548.789..8548.976 rows=5 loops=1) > Workers Planned: 4 > Workers Launched: 4 > -> Partial Aggregate (cost=567883.27..567883.28 rows=1 width=8) > (actual time=8541.929..8541.929 rows=1 loops=5) > -> Parallel Index Only Scan using idx_l_shipdate on > lineitem (cost=0.57..535318.78 rows=13025795 width=0) (actual > time=0.113..5866.729 rows=10342437 loops=5) > Index Cond: (l_shipdate < '1995-01-03'::date) > Heap Fetches: 0 > Planning time: 0.266 ms > Execution time: 8569.735 ms The effect of parallel index-only scan can be seen more in some more complex queries where parallelism is enabled till top of the tree, e.g, following query takes 118 ms on head, explain analyse select sum(l_extendedprice * l_discount) as revenue, avg(o_orderkey) from orders, lineitem where o_orderkey < 10000 and o_orderkey = l_orderkey group by l_shipmode order by revenue Sort (cost=24396.44..24396.45 rows=7 width=75) (actual > time=118.823..118.825 rows=7 loops=1) > Sort Key: (sum((lineitem.l_extendedprice * lineitem.l_discount))) > Sort Method: quicksort Memory: 25kB > -> HashAggregate (cost=24396.23..24396.34 rows=7 width=75) (actual > time=118.749..118.786 rows=7 loops=1) > Group Key: lineitem.l_shipmode > -> Nested Loop (cost=1.13..24293.11 rows=10312 width=27) > (actual time=0.096..73.198 rows=9965 loops=1) > -> Index Only Scan using orders_pkey on orders > (cost=0.56..46.48 rows=2578 width=4) (actual time=0.072..1.663 rows=2503 > loops=1) > Index Cond: (o_orderkey < 10000) > Heap Fetches: 0 > -> Index Scan using idx_lineitem_orderkey on lineitem > (cost=0.57..6.45 rows=296 width=31) (actual time=0.018..0.023 rows=4 > loops=2503) > Index Cond: (l_orderkey = orders.o_orderkey) > Planning time: 1.062 ms > Execution time: 118.977 ms With parallel index-only scan, the performance improves to 40 ms, Sort (cost=7191.33..7191.35 rows=7 width=75) (actual time=40.475..40.476 > rows=7 loops=1) > Sort Key: (sum((lineitem.l_extendedprice * lineitem.l_discount))) > Sort Method: quicksort Memory: 25kB > -> Finalize GroupAggregate (cost=7190.78..7191.23 rows=7 width=75) > (actual time=40.168..40.451 rows=7 loops=1) > Group Key: lineitem.l_shipmode > -> Sort (cost=7190.78..7190.85 rows=28 width=75) (actual > time=40.105..40.127 rows=35 loops=1) > Sort Key: lineitem.l_shipmode > Sort Method: quicksort Memory: 29kB > -> Gather (cost=7187.22..7190.11 rows=28 width=75) > (actual time=39.344..39.983 rows=35 loops=1) > Workers Planned: 4 > Workers Launched: 4 > -> Partial HashAggregate (cost=6187.22..6187.31 > rows=7 width=75) (actual time=25.981..26.011 rows=7 loops=5) > Group Key: lineitem.l_shipmode > -> Nested Loop (cost=1.13..6084.10 rows=10312 > width=27) (actual time=0.139..16.352 rows=1993 loops=5) > -> Parallel Index Only Scan using > orders_pkey on orders (cost=0.56..27.14 rows=644 width=4) (actual > time=0.082..0.366 rows=501 loops=5) > Index Cond: (o_orderkey < 10000) > Heap Fetches: 0 > -> Index Scan using > idx_lineitem_orderkey on lineitem (cost=0.57..6.45 rows=296 width=31) > (actual time=0.020..0.025 rows=4 loops=2503) > Index Cond: (l_orderkey = > orders.o_orderkey) > Planning time: 1.170 ms > Execution time: 40.898 ms Test configuration and machine detail: -------------------------------------------------- TPC-H: scale facto : 20 work_mem : 64MB shared buffer : 1GB max_parallel_workers_per_gather : 4 Machine : IBM POWER, 4 socket machine with 512 GB RAM. random_page_cost = seq_page_cost = 0.1 We kept random and seq page cost same since warm cache environment was ensured, thus, keeping a value of random_page_cost that is higher than seq_page_cost doesn't makes much sense in this setup. Also, some additional indexes were build, for lineitem table l_shipdate, l_shipmode, and l_returnflag, on orders table index is added for o_comment and o_orderdate individually. The point to note here is that with parallel index-only scan, not only the heap fetches are saved but also the aggregates/sort can be pushed down to workers and thus the total time of query can be improved. Clearly, the performance of queries improved significantly with this new operator and considering the changes required after parallel index scan patches is less if evaluated against the improvement in performance it offers. Attached file: ------------------ 1. parallel_index_only_v1.patch This patch is to be applied over parallel index scan [1]. * Thanks to my colleagues Dilip Kumar and Amit Kapila for their support and feedback. [1] https://www.postgresql.org/message-id/CAOGQiiOneen9WEppO6V_myKpQ97CrjBQJ0Pv7ri0rxmMYvLcTg%40mail.gmail.com -- Regards, Rafia Sabih EnterpriseDB: http://www.enterprisedb.com/
parallel_index_only_v1.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers