On 7/17/23 04:13, basti wrote:
[snip]

The Indexes:

volkszaehler=# SELECT tablename,indexname,indexdef FROM pg_indexes WHERE tablename LIKE 'data%' ORDER BY tablename,indexname;

 tablename |           indexname            | indexdef
-----------+--------------------------------+--------------------------------------------------------------------------------------------------  data      | idx_16391_idx_adf3f36372f5a1aa | CREATE INDEX idx_16391_idx_adf3f36372f5a1aa ON volkszaehler.data USING btree (channel_id)  data      | idx_16391_primary              | CREATE UNIQUE INDEX idx_16391_primary ON volkszaehler.data USING btree (channel_id, "timestamp")  data      | idx_data_timestamp             | CREATE INDEX idx_data_timestamp ON volkszaehler.data USING btree ("timestamp")
(3 rows)


The Query:

volkszaehler=# explain analyze SELECT COUNT(DISTINCT DATE_TRUNC('day', TIMESTAMP 'epoch' + timestamp * INTERVAL '1 millisecond')) FROM data WHERE channel_id = 5 AND timestamp >= 0;

                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------  Aggregate  (cost=590793.68..590793.69 rows=1 width=8) (actual time=15449.536..15449.539 rows=1 loops=1)    ->  Seq Scan on data  (cost=0.00..382037.82 rows=16700469 width=8) (actual time=247.092..3833.495 rows=16669429 loops=1)
         Filter: (("timestamp" >= 0) AND (channel_id = 5))
         Rows Removed by Filter: 1215163
 Planning Time: 0.374 ms
 JIT:
   Functions: 5
   Options: Inlining true, Optimization true, Expressions true, Deforming true    Timing: Generation 1.201 ms, Inlining 158.833 ms, Optimization 59.816 ms, Emission 28.472 ms, Total 248.322 ms
 Execution Time: 15451.093 ms
(10 rows)

Round about 16 sec is too long, the frontend run in timeout or other erros.

What fraction of the rows in the table meet the "WHERE channel_id = 5 AND timestamp >= 0" qualification?

If it's high (and "high" can be a seemingly low value), then Postgresql will decide it's cheaper to sequentially scan the table.


A simple count look like

volkszaehler=# explain analyze SELECT count (channel_id) FROM data;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------  Finalize Aggregate  (cost=208089.76..208089.77 rows=1 width=8) (actual time=3514.293..3523.842 rows=1 loops=1)    ->  Gather  (cost=208089.55..208089.76 rows=2 width=8) (actual time=3514.247..3523.800 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=207089.55..207089.56 rows=1 width=8) (actual time=3427.139..3427.141 rows=1 loops=3)                ->  Parallel Seq Scan on data (cost=0.00..188457.44 rows=7452844 width=8) (actual time=0.137..2121.695 rows=5962263 loops=3)
 Planning Time: 0.247 ms
 JIT:
   Functions: 11
   Options: Inlining false, Optimization false, Expressions true, Deforming true    Timing: Generation 2.665 ms, Inlining 0.000 ms, Optimization 1.612 ms, Emission 48.580 ms, Total 52.857 ms
 Execution Time: 3525.656 ms
(12 rows)

Why are the indexes not used ?
Is it a Problem of the DATE_TRUNC ?
How can I optimize?

Best regards








--
Born in Arizona, moved to Babylonia.


Reply via email to