Hello Michael, Tom: thanks for all the insights and information in your previous mails.
A quick update on the EXPLAIN outputs (this time using EXPLAIN (ANALYZE, BUFFERS, VERBOSE)).

*The good:*

*LOG Time: | 2022-02-28 09:30:01.400777+01 | order rows: | 9668*

 Limit  (cost=616.37..653.30 rows=10 width=22) (actual time=1.062..1.063 rows=1 loops=1)
   Output: version, content
   Buffers: shared hit=154
   ->  Bitmap Heap Scan on orderstore."order"  (cost=616.37..793.63 rows=48 width=22) (actual time=1.061..1.062 rows=1 loops=1)
         Output: version, content
         Recheck Cond: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
         Heap Blocks: exact=1
         Buffers: shared hit=154
         ->  Bitmap Index Scan on idx_fulltext_content  (cost=0.00..616.36 rows=48 width=0) (actual time=1.053..1.053 rows=1 loops=1)
               Index Cond: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
               Buffers: shared hit=153
 Planning:
   Buffers: shared hit=50
 Planning Time: 0.408 ms
 *Execution Time: 1.079 ms*

*pg_stat_all_tables:*

 n_tup_ins           | 102
 *n_tup_upd           | 1554*
 n_tup_del           | 0
 n_tup_hot_upd       | 0
 n_live_tup          | 9668
 *n_dead_tup          | 69*
 n_mod_since_analyze | 61
 n_ins_since_vacuum  | 8
 last_vacuum         | 2022-02-25 07:54:46.196508+01
 last_autovacuum     |
 last_analyze        | 2022-02-28 03:20:38.761482+01
 last_autoanalyze    |

*The bad:*

*LOG Time: | 2022-02-28 09:45:01.662702+01 | order rows: | 9668*

LIMIT 10:

 Limit  (cost=0.00..805.63 rows=10 width=22) (actual time=24175.964..25829.767 rows=1 loops=1)
   Output: version, content
   Buffers: shared hit=26284 read=12550 dirtied=4
   ->  Seq Scan on orderstore."order"  (cost=0.00..3867.01 rows=48 width=22) (actual time=24175.962..25829.763 rows=1 loops=1)
         Output: version, content
         Filter: (jsonb_to_tsvector('english'::regconfig, "order".content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
         Rows Removed by Filter: 9667
         Buffers: shared hit=26284 read=12550 dirtied=4
 Planning:
   Buffers: shared hit=50
 Planning Time: 0.377 ms
 *Execution Time: 25829.778 ms*

*pg_stat_all_tables:*

 n_tup_ins           | 102
 *n_tup_upd           | 1585*
 n_tup_del           | 0
 n_tup_hot_upd       | 0
 n_live_tup          | 9668
 *n_dead_tup          | 100*
 n_mod_since_analyze | 92
 n_ins_since_vacuum  | 8
 last_vacuum         | 2022-02-25 07:54:46.196508+01
 last_autovacuum     |
 last_analyze        | 2022-02-28 03:20:38.761482+01
 last_autoanalyze    |

*The ugly:*

It should be mentioned that the table in question mainly lives in TOAST land (though I have no idea whether this also influences the query planner):

  oid   | table_schema | table_name | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total  | index  | toast | table
--------+--------------+------------+--------------+-------------+-------------+-------------+-------------+--------+--------+-------+-------
 155544 | orderstore   | order      |         9570 |   229826560 |   120184832 |    98557952 |    11083776 | 219 MB | 115 MB | 94 MB | 11 MB

Since tinkering with the text search functions is out of the question, we came up with three possible ways of dealing with this issue:

- significantly increase the LIMIT clause or omit it altogether (meh ...)
- use SET random_page_cost = 0.5 inside the transaction to convince the query planner to prefer the index (tested and works; see the sketch below)
- schedule an hourly vacuum job for the table (the most likely solution we will settle on, since it comes with the least implementation effort)

None of these seems very elegant or viable in the long run ... we'll see.

Ah, yes: our global settings for random_page_cost and autovacuum/analyze are set to the defaults.

Will JSON processing see any improvements in PG 14/15?
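For reference, the per-transaction workaround we tested looks roughly like this. It is only a sketch: the SELECT is reconstructed from the plans above rather than copied from the application, and SET LOCAL keeps the change scoped to the transaction, so the global default (4.0) is untouched.

 BEGIN;
 -- lower random_page_cost for this transaction only
 SET LOCAL random_page_cost = 0.5;

 -- query shape reconstructed from the plans above
 SELECT version, content
   FROM orderstore."order"
  WHERE jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb) @@ '1.20709841'::tsquery
  LIMIT 10;

 COMMIT;

The hourly job would presumably boil down to something like:

 VACUUM ANALYZE orderstore."order";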
We are about to update to 14 in the near future, and our devs say this topic is the main trigger for doing so.

Any further thoughts on the case are very much appreciated.

kr p.

On Thu, 24 Feb 2022 at 17:10, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Peter Adlersburg <peter.adlersb...@gmail.com> writes:
> > Limit  (cost=0.00..804.97 rows=10 width=22) (actual time=23970.845..25588.432 rows=1 loops=1)
> >   ->  Seq Scan on "order"  (cost=0.00..3863.86 rows=48 width=22) (actual time=23970.843..25588.429 rows=1 loops=1)
> >         Filter: (jsonb_to_tsvector('english'::regconfig, content, '["all"]'::jsonb) @@ '''1.20709841'''::tsquery)
> >         Rows Removed by Filter: 9652
> > Planning Time: 0.430 ms
> > Execution Time: 25588.448 ms
>
> I think the expense here comes from re-executing jsonb_to_tsvector
> a lot of times.  By default that's estimated as 100 times more expensive
> than a simple function (such as addition), but these results make it
> seem like that's an understatement.  You might try something like
>
> alter function jsonb_to_tsvector(regconfig, jsonb, jsonb) cost 1000;
>
> to further discourage the planner from picking this plan shape.
>
> Possibly the cost estimate for ts_match_tq (the function underlying
> this variant of @@) needs to be bumped up as well.
>
> (Bear in mind that pg_dump will not propagate such hacks on
> system-defined objects, so you'll need a note to reapply
> any such changes after dump/reload or pg_upgrade.)
>
>                         regards, tom lane
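PS: a minimal sketch of what Tom's suggestion amounts to, for when we try it. The value 1000 is just his proposed starting point, and, as he notes, the change has to be reapplied after dump/reload or pg_upgrade.

 -- current planner cost estimates (both overloads of the function show up here)
 SELECT oid::regprocedure, procost
   FROM pg_proc
  WHERE proname = 'jsonb_to_tsvector';

 -- make per-row evaluation look more expensive, discouraging the seq-scan plan
 ALTER FUNCTION jsonb_to_tsvector(regconfig, jsonb, jsonb) COST 1000;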