Is partitioning the table by month an option? If your reports are month-based, partitioning can improve performance.
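A rough sketch of what that could look like with declarative range partitioning, using the table and column names from the query quoted below (the column types and the monthly partition names are just assumptions):

  CREATE TABLE smartmeter (
      ts      timestamptz NOT NULL,
      stroml1 double precision,   -- types assumed; only the names appear in the query
      stroml2 double precision,
      stroml3 double precision
  ) PARTITION BY RANGE (ts);

  -- one partition per month; names are hypothetical
  CREATE TABLE smartmeter_2023_03 PARTITION OF smartmeter
      FOR VALUES FROM ('2023-03-01') TO ('2023-04-01');
  CREATE TABLE smartmeter_2023_04 PARTITION OF smartmeter
      FOR VALUES FROM ('2023-04-01') TO ('2023-05-01');

  -- keep the brin index; created on the parent, it cascades to all partitions
  CREATE INDEX ON smartmeter USING brin (ts);

With a WHERE clause on ts covering roughly a month, the planner can then prune all partitions outside the requested range.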
Felipph

On Sun, 16 Apr 2023 at 19:10, Andres Freund <and...@anarazel.de> wrote:

> Hi,
>
> On 2023-04-16 19:00:33 +0200, Clemens Eisserer wrote:
> > I am currently trying to migrate an influxdb 1.7 smarthome database to
> > postgresql (13.9) running on my Raspberry Pi 3.
> > It works quite well, but for the queries executed by grafana I get a
> > bit higher execution times than I'd hoped for.
> >
> > Example:
> > table smartmeter with non-null column ts (timestamp with time zone)
> > and a brin index on ts, no pk to avoid a btree index.
> > Sensor values are stored every 5s, so for 1 month there are about 370k
> > rows - and in total the table currently holds about 3M rows.
> > The query to display the values for 1 month takes ~3s, with the bitmap
> > heap scan as well as aggregation taking up most of the time, with
> > sorting in between.
> >
> > Is there anything that could be improved?
> > With influxdb I was able to view 3 and 6 month graphs, with
> > postgresql it simply takes too long.
> >
> > I am currently running the 32-bit ARMv6 build, would it be a big
> > improvement running ARMv8/64-bit?
>
> Yes, I suspect so. On a 64bit system most of the datatypes you're dealing
> with are going to be pass-by-value, i.e. not incur memory allocation
> overhead, whereas timestamps, doubles, etc. will all require allocations
> on a 32bit system.
>
> > smarthomedb=> explain analyze SELECT floor(extract(epoch from
> > ts)/10800)*10800 AS "time", AVG(stromL1) as l1, AVG(stromL2) as l2,
> > AVG(stroml3) as l3 FROM smartmeter WHERE ts BETWEEN
> > '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
> > GROUP BY time order by time;
> >
> >                            QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  GroupAggregate  (cost=117490.70..132536.10 rows=376135 width=32)
> > (actual time=2061.253..2974.336 rows=236 loops=1)
> >    Group Key: ((floor((date_part('epoch'::text, ts) / '10800'::double
> > precision)) * '10800'::double precision))
> >    ->  Sort  (cost=117490.70..118431.04 rows=376135 width=20) (actual
> > time=2058.407..2410.467 rows=371810 loops=1)
> >          Sort Key: ((floor((date_part('epoch'::text, ts) /
> > '10800'::double precision)) * '10800'::double precision))
>
> Given the number of rows you're sorting on a somewhat slow platform, the
> complexity of the expression here might be a relevant factor, particularly
> on a 32bit system (see above), due to the memory allocations we'll end up
> doing.
>
> I don't know how much control over the query generation you have. Consider
> rewriting
>   floor(extract(epoch from ts)/10800)*10800 AS "time"
> to something like
>   date_bin('3h', ts, '2001-01-01 00:00')
>
> >          Sort Method: external merge  Disk: 10960kB
> >          ->  Bitmap Heap Scan on smartmeter  (cost=112.09..74944.93
> > rows=376135 width=20) (actual time=88.336..1377.862 rows=371810
> > loops=1)
>
> Given the time spent in the bitmap heap scan, it might be beneficial to
> increase effective_io_concurrency some.
>
> >                Recheck Cond: ((ts >= '2023-03-16
> > 10:51:28.397+01'::timestamp with time zone) AND (ts <= '2023-04-16
> > 10:51:28.397+02'::timestamp with time zone))
> >                Rows Removed by Index Recheck: 2131
> >                Heap Blocks: lossy=4742
>
> The lossiness might also incur some overhead, so increasing work_mem a bit
> will help some.
>
> Greetings,
>
> Andres Freund
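For reference, the full query with the date_bin() rewrite Andres suggests above could look roughly like this (note that date_bin() only exists from PostgreSQL 14 on, so it would mean upgrading from 13.9):

  SELECT date_bin('3 hours', ts, '2001-01-01 00:00') AS "time",
         AVG(stromL1) AS l1, AVG(stromL2) AS l2, AVG(stroml3) AS l3
  FROM smartmeter
  WHERE ts BETWEEN '2023-03-16T09:51:28.397Z' AND '2023-04-16T08:51:28.397Z'
  GROUP BY 1
  ORDER BY 1;

Grouping and sorting on a timestamptz instead of a computed double precision value avoids the per-row float arithmetic and allocations in the sort key that Andres points out.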
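The two settings Andres mentions can be tried per session before changing them globally; the values here are only illustrative starting points, not recommendations from the thread:

  SET effective_io_concurrency = 32;  -- more prefetching for the bitmap heap scan; tune to the SD card/disk
  SET work_mem = '64MB';              -- larger exact bitmaps (fewer lossy heap blocks), and the sort may avoid spilling to disk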