> the answer is that is because it is a GIN index. Make the same index only as btree, and you should get good performance as it can filter the tags within a given project without visiting the table.
Currently I have this GIN index: "index_subscriptions_on_project_id_and_tags" gin (project_id, tags) WHERE trashed_at IS NULL It uses the btree_gin extension and works perfectly for tag search, except for the "NOT" operator. I don't understand why it doesn't use the GIN index also for the "NOT" operator. The problem is that I cannot create the same index with BTree, because PG doesn't support BTree on array :( On Wed, Nov 13, 2019 at 12:30 PM Jeff Janes <jeff.ja...@gmail.com> wrote: > On Wed, Nov 13, 2019 at 4:20 AM Marco Colli <collimarc...@gmail.com> > wrote: > >> Replying to the previous questions: >> - work_mem = 64MB (there are hundreds of connections) >> - the project 123 has more than 7M records, and those that don't have the >> tag 'en' are 4.8M >> >> >>> What was the plan for the one that took 500ms? >> >> >> This is the query / plan without the filter on tags: >> >> SELECT COUNT(*) FROM "subscriptions" WHERE "subscriptions"."project_id" = >> 123 AND "subscriptions"."trashed_at" IS NULL; >> >> QUERY PLAN >> >> >> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ >> Finalize Aggregate (cost=291342.67..291342.68 rows=1 width=8) (actual >> time=354.556..354.556 rows=1 loops=1) >> -> Gather (cost=291342.05..291342.66 rows=6 width=8) (actual >> time=354.495..374.305 rows=7 loops=1) >> Workers Planned: 6 >> Workers Launched: 6 >> -> Partial Aggregate (cost=290342.05..290342.06 rows=1 >> width=8) (actual time=349.799..349.799 rows=1 loops=7) >> -> Parallel Index Only Scan using >> index_subscriptions_on_project_id_and_uid on subscriptions >> (cost=0.56..287610.27 rows=1092713 width=0) (actual time=0.083..273.018 >> rows=1030593 loops=7) >> Index Cond: (project_id = 123) >> Heap Fetches: 280849 >> Planning Time: 0.753 ms >> Execution Time: 374.483 ms >> (10 rows) >> > > My previous comment about the bitmap index scan taking half the time was a > slip of the eye, I was comparing *cost* of the bitmap index scan to the > *time* of the overall plan. But then the question is, why isn't it doing > an index-only scan on "index_subscriptions_on_project_id_and_tags"? And > the answer is that is because it is a GIN index. Make the same index only > as btree, and you should get good performance as it can filter the tags > within a given project without visiting the table. > > Cheers, > > Jeff > >>