On 7/17/23 04:13, basti wrote:
[snip]
The Indexes:
volkszaehler=# SELECT tablename,indexname,indexdef FROM pg_indexes WHERE
tablename LIKE 'data%' ORDER BY tablename,indexname;
tablename | indexname | indexdef
-----------+--------------------------------+--------------------------------------------------------------------------------------------------
data | idx_16391_idx_adf3f36372f5a1aa | CREATE INDEX
idx_16391_idx_adf3f36372f5a1aa ON volkszaehler.data USING btree (channel_id)
data | idx_16391_primary | CREATE UNIQUE INDEX
idx_16391_primary ON volkszaehler.data USING btree (channel_id, "timestamp")
data | idx_data_timestamp | CREATE INDEX
idx_data_timestamp ON volkszaehler.data USING btree ("timestamp")
(3 rows)
The Query:
volkszaehler=# explain analyze SELECT COUNT(DISTINCT DATE_TRUNC('day',
TIMESTAMP 'epoch' + timestamp * INTERVAL '1 millisecond')) FROM data WHERE
channel_id = 5 AND timestamp >= 0;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=590793.68..590793.69 rows=1 width=8) (actual
time=15449.536..15449.539 rows=1 loops=1)
-> Seq Scan on data (cost=0.00..382037.82 rows=16700469 width=8)
(actual time=247.092..3833.495 rows=16669429 loops=1)
Filter: (("timestamp" >= 0) AND (channel_id = 5))
Rows Removed by Filter: 1215163
Planning Time: 0.374 ms
JIT:
Functions: 5
Options: Inlining true, Optimization true, Expressions true, Deforming
true
Timing: Generation 1.201 ms, Inlining 158.833 ms, Optimization 59.816
ms, Emission 28.472 ms, Total 248.322 ms
Execution Time: 15451.093 ms
(10 rows)
Round about 16 sec is too long, the frontend run in timeout or other erros.
What fraction of the rows in the table meet the "WHERE channel_id = 5 AND
timestamp >= 0" qualification?
If it's high (and "high" can be a seemingly low value), then Postgresql will
decide it's cheaper to sequentially scan the table.
A simple count look like
volkszaehler=# explain analyze SELECT count (channel_id) FROM data;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=208089.76..208089.77 rows=1 width=8) (actual
time=3514.293..3523.842 rows=1 loops=1)
-> Gather (cost=208089.55..208089.76 rows=2 width=8) (actual
time=3514.247..3523.800 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=207089.55..207089.56 rows=1 width=8)
(actual time=3427.139..3427.141 rows=1 loops=3)
-> Parallel Seq Scan on data (cost=0.00..188457.44
rows=7452844 width=8) (actual time=0.137..2121.695 rows=5962263 loops=3)
Planning Time: 0.247 ms
JIT:
Functions: 11
Options: Inlining false, Optimization false, Expressions true,
Deforming true
Timing: Generation 2.665 ms, Inlining 0.000 ms, Optimization 1.612 ms,
Emission 48.580 ms, Total 52.857 ms
Execution Time: 3525.656 ms
(12 rows)
Why are the indexes not used ?
Is it a Problem of the DATE_TRUNC ?
How can I optimize?
Best regards
--
Born in Arizona, moved to Babylonia.