Apologies for not including this in the original email. The other index, job_u_createtime_2cy0wgyqpani8, is on pc_job(CreateTime, Retired, Subtype, ID). The optimizer chooses Nested Loop when choosing that index, vs Hash Join when choosing the index in the first plan that I posted. It seems like the choice of the Hash Join in the 1st plan that I posted is collateral damage from the seemingly unnecessary need to do the sort.
Here's the plan without forcing the index: Limit (cost=1.00..52692.73 rows=10 width=20) (actual time=55219.289..87704.704 rows=10 loops=1) Buffers: shared hit=9579294 read=328583 I/O Timings: read=1157740.299 -> Nested Loop (cost=1.00..2007555.82 rows=381 width=20) (actual time=55219.288..87704.695 rows=10 loops=1) Buffers: shared hit=9579294 read=328583 I/O Timings: read=1157740.299 -> Index Scan using job_u_createtime_2cy0wgyqpani8 on pc_job groot (cost=0.56..1800117.94 rows=153696 width=28) (actual time=102.075..79470.670 rows=5650 loops=1) Index Cond: ((groot.retired = 0) AND (groot.subtype = 7)) Filter: (groot.closedate IS NULL) Rows Removed by Filter: 14994857 Buffers: shared hit=9563981 read=321566 I/O Timings: read=1149579.949 -> Index Scan using pc_policy_pk on pc_policy policy_0 (cost=0.43..1.35 rows=1 width=8) (actual time=1.456..1.456 rows=0 loops=5650) Index Cond: (policy_0.id = groot.policyid) Filter: ((policy_0.retired = 0) AND (policy_0.producercodeofserviceid = ANY ('{248,1092,1848,74101,103158,103159,117402,122618,129215,132420,135261,137719}'::bigint[]))) Rows Removed by Filter: 1 Buffers: shared hit=15313 read=7017 I/O Timings: read=8160.350 Planning time: 2.209 ms Execution time: 87705.116 ms Thanks, Jerry On Wed, Jan 17, 2024 at 6:39 AM Jerry Brenner <jbren...@guidewire.com> wrote: > We are on 13.9. > I'm wondering why a sort is required for this query, as the index should > be providing the required ordering to satisfy the ORDER BY clause. Does it > have to do with the IS NULL predicate on the leading key column in the > index? > > There's an index, job_u_closedate_g9cdc6ghupib, on pc_job(CloseDate, > Retired, Subtype, CreateTime, ID). All columns have ASC sort order and > NULLs LAST. > > - pc_job is the probe table in a hash join > - There are IS NULL and equality predicates on the 3 leading columns > in the index and the last 2 key columns (CreateTime, ID) are the ordering > columns in the query > - So, the Index Scan of job_u_closedate_g9cdc6ghupib is returning the > rows in the sorted order > - NOTE: The sort is cheap, but I'm investigating this because > "CloseDate IS NULL" is very selective and without forcing the index the > optimizer is choosing a different sort avert index that does not include > CloseDate and hence a lot of time is spent filtering out rows on that > predicate against the heap. > > Here's the query > > SELECT /* ISNULL:pc_job.CloseDate:, KeyTable:pc_job; */ gRoot.ID col0, > gRoot.Subtype col1, gRoot.CreateTime col2 > FROM pc_job gRoot INNER JOIN pc_policy policy_0 > ON policy_0.ID = gRoot.PolicyID > WHERE gRoot.Subtype = 7 AND gRoot.CloseDate IS NULL > AND gRoot.Retired = 0 > AND policy_0.ProducerCodeOfServiceID IN > > (248,1092,1848,74101,103158,103159,117402,122618,129215,132420,135261,137719) > AND policy_0.Retired = 0 > ORDER BY col2 ASC, col0 ASC LIMIT 10 > > Here's the query plan: > > Limit (cost=107826.77..107826.79 rows=10 width=20) (actual > time=13149.872..13149.877 rows=10 loops=1) > Buffers: shared hit=2756 read=40121 > I/O Timings: read=105917.908 > -> Sort (cost=107826.77..107827.72 rows=381 width=20) (actual > time=13149.871..13149.874 rows=10 loops=1) > Sort Key: groot.createtime, groot.id > Sort Method: top-N heapsort Memory: 25kB > Buffers: shared hit=2756 read=40121 > I/O Timings: read=105917.908 > -> Hash Join (cost=15632.51..107818.53 rows=381 width=20) (actual > time=578.511..13149.658 rows=144 loops=1) > Buffers: shared hit=2750 read=40121 > I/O Timings: read=105917.908 > -> Index Scan using job_u_closedate_g9cdc6ghupib on pc_job > groot (cost=0.56..91783.14 rows=153696 width=28) (actual > time=3.864..12562.568 rows=75558 loops=1) > Index Cond: ((groot.closedate IS NULL) AND > (groot.retired = 0) AND (groot.subtype = 7)) > Buffers: shared hit=2721 read=27934 > I/O Timings: read=58781.220 > -> Hash (cost=15427.92..15427.92 rows=16322 width=8) (actual > time=543.298..543.299 rows=13016 loops=1) > Buffers: shared hit=29 read=12187 > I/O Timings: read=47136.688 > -> Index Scan using policy_n_producerco_3e8i0ojsyckhx on > pc_policy policy_0 (cost=0.43..15427.92 rows=16322 width=8) (actual > time=6.149..540.501 rows=13016 loops=1) > Index Cond: ((policy_0.producercodeofserviceid = > ANY > ('{248,1092,1848,74101,103158,103159,117402,122618,129215,132420,135261,137719}'::bigint[])) > AND (policy_0.retired = 0)) > Buffers: shared hit=29 read=12187 > I/O Timings: read=47136.688 > Planning time: 0.538 ms > Execution time: 13150.301 ms > > Thanks, > > Jerry > > >