Hi Aeham,

To figure out what is going on, it would probably be sufficient to execute the following queries and send me the results:
EXPLAIN SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset FROM jobqueue t0 WHERE t0.checkaction='R' AND t0.checktime<=2414771559577 ORDER BY t0.docpriority LIMIT 20;

SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset FROM jobqueue t0 WHERE t0.checkaction='R' AND t0.checktime<=2414771559577 ORDER BY t0.docpriority LIMIT 20;

These should both read directly from the index in question; from them we can see immediately what kinds of documents the scan has to skip over before it reaches the active ones.

Thanks!
Karl

On Fri, Oct 31, 2014 at 2:42 PM, Karl Wright <[email protected]> wrote:

> Hi Aeham,
>
> I've looked at this at some level of detail now, and have some
> conclusions.
>
> (1) The first query you provided is in fact using the proper index for
> ordering; it's just not obvious from the plan. The way you can tell is
> whether there is an explicit sort step above the limit: if the plan has no
> separate Sort node, the index itself is supplying the order. The primary
> jobqueue index scan is happening within an anti-join and a semi-join; you
> can verify this by cutting out all the EXISTS clauses, getting a reduced
> plan that way, and comparing.
> For instance, here's an explain of the simple query:
>
> >>>>>>
> dbname=# explain SELECT
> t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset
> FROM jobqueue t0 ORDER BY t0.docpriority LIMIT 240;
>                                            QUERY PLAN
> -------------------------------------------------------------------------------------------------
>  Limit  (cost=0.29..38.20 rows=240 width=161)
>    ->  Index Scan using i1413214448001 on jobqueue t0  (cost=0.29..2312.39 rows=14634 width=161)
> (2 rows)
> <<<<<<
>
> Applying the filtering criteria:
>
> >>>>>>
> dbname=# explain SELECT
> t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset
> FROM jobqueue t0 WHERE (t0.status='P' OR t0.status='G') AND
> t0.checkaction='R' AND t0.checktime<=2414771559577 ORDER BY t0.docpriority
> LIMIT 240;
>                                            QUERY PLAN
> ------------------------------------------------------------------------------------------------
>  Limit  (cost=0.29..68.98 rows=240 width=161)
>    ->  Index Scan using i1413214448001 on jobqueue t0  (cost=0.29..2389.31 rows=8347 width=161)
>          Index Cond: ((checkaction = 'R'::bpchar) AND (checktime <= 2414771559577::bigint))
>          Filter: ((status = 'P'::bpchar) OR (status = 'G'::bpchar))
> (4 rows)
> <<<<<<
>
> Now, this is a bit strange, because there is a filter on the status, so
> the index isn't helping there, and yet the index includes the status
> column:
>
> "i1413214448001" btree (docpriority, status, checkaction, checktime)
>
> But maybe this is a statistical optimization?  I don't know.
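A side note on the Filter-versus-Index Cond question above: one variant that might be worth trying, purely as an assumption rather than something verified against this database, is writing the OR of status equalities as an IN list. PostgreSQL can plan an IN list as a `status = ANY ('{...}')` condition inside a single index scan, whereas an OR of separate equalities often stays a Filter:

```sql
-- Hypothetical variant (not from the thread): the IN list may become an
-- "= ANY(...)" index condition on the status column of i1413214448001.
EXPLAIN SELECT t0.id, t0.jobid, t0.dochash, t0.docid,
       t0.status, t0.failtime, t0.failcount, t0.priorityset
FROM jobqueue t0
WHERE t0.status IN ('P', 'G')
  AND t0.checkaction = 'R'
  AND t0.checktime <= 2414771559577
ORDER BY t0.docpriority
LIMIT 240;
```

Whether the planner actually pushes the condition into the index scan here would have to be confirmed from the resulting plan.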
> Anyhow, as soon as a NOT EXISTS clause is added, you see an anti-join:
>
> >>>>>>
> dbname=# explain SELECT
> t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset
> FROM jobqueue t0 WHERE (t0.status='P' OR t0.status='G') AND
> t0.checkaction='R' AND t0.checktime<=2414771559577 AND NOT EXISTS(SELECT
> 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND
> t3.eventname=t4.name) ORDER BY t0.docpriority LIMIT 240;
>                                            QUERY PLAN
> ----------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.57..231.40 rows=240 width=161)
>    ->  Nested Loop Anti Join  (cost=0.57..7957.24 rows=8273 width=161)
>          ->  Index Scan using i1413214448001 on jobqueue t0  (cost=0.29..2389.31 rows=8347 width=161)
>                Index Cond: ((checkaction = 'R'::bpchar) AND (checktime <= 2414771559577::bigint))
>                Filter: ((status = 'P'::bpchar) OR (status = 'G'::bpchar))
>          ->  Nested Loop  (cost=0.29..0.66 rows=1 width=8)
>                ->  Index Scan using i1413214448007 on prereqevents t3  (cost=0.14..0.17 rows=1 width=524)
>                      Index Cond: (t0.id = owner)
>                ->  Index Only Scan using events_pkey on events t4  (cost=0.14..0.48 rows=1 width=516)
>                      Index Cond: (name = (t3.eventname)::text)
> (10 rows)
> <<<<<<
>
> The index scan is still there and is apparently providing the sorting,
> which is still good; it's just inside an anti-join.
>
> So you can take this further and add the two remaining EXISTS/NOT EXISTS
> clauses, and then your first plan makes sense.
> But your proposed *second* plan is clearly *not* using the docpriority
> index for sorting, since it sorts explicitly later on:
>
> >>>>>>
>  Limit  (cost=122440.72..122441.42 rows=280 width=177) (actual time=1010.722..1010.722 rows=0 loops=1)
>    ->  Sort  (cost=122440.72..122451.13 rows=4164 width=177) (actual time=1010.719..1010.719 rows=0 loops=1)
>          Sort Key: t0.docpriority
>          Sort Method: quicksort  Memory: 25kB
> <<<<<<
>
> So what we need to understand now is why the first query is slow. It's
> reading from the correct index but is nevertheless taking a long time,
> possibly because there are a lot of records with status 'P' or 'G' that
> belong to jobs not in state 'a' or 'A', all of which it has to go through
> before it assembles 240 results. Can you confirm this picture?
>
> Thanks,
> Karl
>
>
> On Fri, Oct 31, 2014 at 1:09 PM, Aeham Abushwashi <
> [email protected]> wrote:
>
>> That's great news, thanks Karl! On my part, I've been digging deeper
>> into the execution plans of the two queries, and I've made a similar
>> observation of the ORDER BY skewing the original query's plan
>> significantly. However, I have yet to figure out why the same ORDER BY
>> clause doesn't have a similar impact on the modified query that uses a
>> JOIN. I'll look into this a bit more over the weekend.
>>
>> Best regards,
>> Aeham
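One way to check the picture Karl describes (many 'P'/'G' rows belonging to inactive jobs ahead of the active ones) is a grouping count. This is a hypothetical diagnostic sketch: the `jobs` table name, its `id` and `status` columns, and the join condition are assumptions, not taken from the thread.

```sql
-- Hypothetical diagnostic (assumed schema): for the rows the slow query
-- walks, count how many belong to jobs in each job state.  A large count
-- for states other than 'a'/'A' would confirm the hypothesis.
SELECT t1.status AS jobstatus, COUNT(*)
FROM jobqueue t0
JOIN jobs t1 ON t0.jobid = t1.id
WHERE (t0.status = 'P' OR t0.status = 'G')
  AND t0.checkaction = 'R'
  AND t0.checktime <= 2414771559577
GROUP BY t1.status;
```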
