Hi all, I was wondering if anyone had any further insight as to why the query plan would have changed so dramatically between queries, even when isolated by a transaction?
Is it possible for database statistics to change after a query has been run (independent of transactions)? Thanks in advance, Fenn. On Wed, Apr 16, 2014 at 2:25 PM, Fenn Bailey <fenn.bai...@gmail.com> wrote: > Some more information on this: > >> >> > I say "hang", where really I've given up after ~12 hours execution. The >> > exact same query can then be terminated and run in <90 seconds, with >> none >> > of the underlying data changing. >> >> But does the plan change? >> >> > As it turns out it does. Some further examination shows that the plan > changes, at least partially due to the fact that I now see this query > attempts to modify the data that it's querying (ie: if any updates have > occurred, the subsequent query is not the same). > > That said, all queries were performed in a transaction, so when they're > aborted, no modification should be made to the underlying data. > > This makes it even more odd that on a subsequent run of the same query > (after the rollback) it will complete in ~140 seconds (vs the several hours > it was taking on the first run). > > Is it possible that the initial run is changing the statistics available > on the table, hence providing a different query plan (even on the same > data)? > > Query plan on first/second run follows - > > First run (failure): > --------------------------- > Update on ad_events e (cost=1426714795.64..1426749795.64 rows=1000000 > width=237) > -> Hash Join (cost=1426714795.64..1426749795.64 rows=1000000 > width=237) > Hash Cond: (x.ad_event_id = e.ad_event_id) > -> Subquery Scan on x (cost=1426638856.33..1426651356.33 > rows=1000000 width=144) > -> Sort (cost=1426638856.33..1426641356.33 rows=1000000 > width=108) > Sort Key: s.ad_user_id, s.ad_session_id, > t."timestamp" > -> Nested Loop Left Join > (cost=12046.09..1426539198.49 rows=1000000 width=108) > Join Filter: ((t."timestamp" >= > s.session_start) AND (t."timestamp" <= s.session_end) AND ((s.ad_user_id = > u.ad_user_id) OR (s.ad_user_id = u.merged_id))) > -> Hash Join (cost=12046.09..203878.09 > rows=1000000 width=92) > Hash Cond: (t.user_id = > u.orig_distinct_id) > -> Seq Scan on ad_events_mv t > (cost=0.00..173082.00 rows=1000000 width=112) > Filter: (NOT processed) > -> Hash (cost=7932.15..7932.15 > rows=329115 width=67) > -> Seq Scan on ad_users u > (cost=0.00..7932.15 rows=329115 width=67) > -> Materialize (cost=0.00..5478.88 rows=63392 > width=32) > -> Seq Scan on ad_sessions s > (cost=0.00..5161.92 rows=63392 width=32) > -> Hash (cost=50938.58..50938.58 rows=2000058 width=101) > -> Seq Scan on ad_events e (cost=0.00..50938.58 > rows=2000058 width=101) > > Second run (completed in ~148 seconds): > ------------------------------ > Update on ad_events e (cost=39730372.76..39765372.76 rows=1000000 > width=237) > -> Hash Join (cost=39730372.76..39765372.76 rows=1000000 width=237) > Hash Cond: (x.ad_event_id = e.ad_event_id) > -> Subquery Scan on x (cost=39654433.45..39666933.45 > rows=1000000 width=144) > -> Sort (cost=39654433.45..39656933.45 rows=1000000 > width=108) > Sort Key: s.ad_user_id, s.ad_session_id, t."timestamp" > -> Nested Loop Left Join > (cost=12054.20..39554775.61 rows=1000000 width=108) > Join Filter: ((t."timestamp" >= > s.session_start) AND (t."timestamp" <= s.session_end)) > -> Hash Join (cost=12046.09..203878.09 > rows=1000000 width=92) > Hash Cond: (t.user_id = > u.orig_distinct_id) > -> Seq Scan on ad_events_mv t > (cost=0.00..173082.00 rows=1000000 width=112) > Filter: (NOT processed) > -> Hash (cost=7932.15..7932.15 > rows=329115 width=67) > -> Seq Scan on ad_users u > (cost=0.00..7932.15 rows=329115 width=67) > -> Bitmap Heap Scan on ad_sessions s > (cost=8.11..39.22 rows=8 width=32) > Recheck Cond: ((ad_user_id = > u.ad_user_id) OR (ad_user_id = u.merged_id)) > -> BitmapOr (cost=8.11..8.11 rows=8 > width=0) > -> Bitmap Index Scan on > ad_sessions_ad_user_id_idx (cost=0.00..4.05 rows=4 width=0) > Index Cond: (ad_user_id = > u.ad_user_id) > -> Bitmap Index Scan on > ad_sessions_ad_user_id_idx (cost=0.00..4.05 rows=4 width=0) > Index Cond: (ad_user_id = > u.merged_id) > -> Hash (cost=50938.58..50938.58 rows=2000058 width=101) > -> Seq Scan on ad_events e (cost=0.00..50938.58 > rows=2000058 width=101) > --------------- > > From a quick look, the plans diverge at: > -> Seq Scan on ad_sessions s (cost=0.00..5161.92 rows=63392 width=32) > vs > -> Bitmap Heap Scan on ad_sessions s (cost=8.11..39.22 rows=8 width=32) > > I will continue to look further into this in an effort to "fix" the > queries but it would be great if someone could give me insight as to what > could be causing the inconsistency in plan. > > Cheers! >