2018-06-24 22:32 GMT+02:00 Dmitry Dolgov <9erthali...@gmail.com>: > > On 23 June 2018 at 08:47, Pavel Stehule <pavel.steh...@gmail.com> wrote: > > > > > > 2018-06-23 8:35 GMT+02:00 Pavel Stehule <pavel.steh...@gmail.com>: > >> > >> Hi > >> > >> I try to measure effect of JIT tuple deform and I don't see any possible > >> effect. > >> > >> Is it this feature active in master branch? > >> > >> Is possible to see this feature in EXPLAIN ANALYZE? > > > > > > Unfortunately I got slowdown > > > > 0. shared buffers = 1GB > > 1. create table with 50 int columns > > 2. insert into this table 4M rows > > Hi, > > Looks like I can reproduce the situation you're talking about (with some > minor > differences) > > =# explain analyze select sum(data45) from test_deform; > QUERY PLAN > ------------------------------------------------------------ > ------------------- > Finalize Aggregate > (cost=127097.71..127097.72 rows=1 width=8) > (actual time=813.957..813.957 rows=1 loops=1) > -> Gather > (cost=127097.50..127097.71 rows=2 width=8) > (actual time=813.946..813.950 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate > (cost=126097.50..126097.51 rows=1 width=8) > (actual time=802.585..802.585 rows=1 loops=3) > -> Parallel Seq Scan on test_deform > (cost=0.00..121930.80 rows=1666680 width=4) > (actual time=0.040..207.326 rows=1333333 loops=3) > Planning Time: 0.212 ms > JIT: > Functions: 6 > Generation Time: 3.076 ms > Inlining: false > Inlining Time: 0.000 ms > Optimization: false > Optimization Time: 1.328 ms > Emission Time: 8.601 ms > Execution Time: 820.127 ms > (16 rows) > > =# set jit_tuple_deforming to off; > > =# explain analyze select sum(data45) from test_deform; > QUERY PLAN > ------------------------------------------------------------ > ------------------- > Finalize Aggregate > (cost=127097.71..127097.72 rows=1 width=8) > (actual time=742.578..742.578 rows=1 loops=1) > -> Gather > (cost=127097.50..127097.71 rows=2 width=8) > (actual time=742.529..742.569 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate > (cost=126097.50..126097.51 rows=1 width=8) > (actual time=727.876..727.876 rows=1 loops=3) > -> Parallel Seq Scan on test_deform > (cost=0.00..121930.80 rows=1666680 width=4) > (actual time=0.044..204.097 rows=1333333 loops=3) > Planning Time: 0.361 ms > JIT: > Functions: 4 > Generation Time: 2.840 ms > Inlining: false > Inlining Time: 0.000 ms > Optimization: false > Optimization Time: 0.751 ms > Emission Time: 6.436 ms > Execution Time: 749.827 ms > (16 rows) > > But at the same time on the bigger dataset (20M rows instead of 4M) the > very > same query gets better: > > =# explain analyze select sum(data45) from test_deform; > QUERY PLAN > ------------------------------------------------------------ > ------------------- > Finalize Aggregate > (cost=631482.92..631482.93 rows=1 width=8) > (actual time=2350.288..2350.288 rows=1 loops=1) > -> Gather > (cost=631482.71..631482.92 rows=2 width=8) > (actual time=2350.276..2350.279 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate > (cost=630482.71..630482.72 rows=1 width=8) > (actual time=2328.378..2328.379 rows=1 loops=3) > -> Parallel Seq Scan on test_deform > (cost=0.00..609649.37 rows=8333337 width=4) > (actual time=0.027..1175.960 rows=6666667 loops=3) > Planning Time: 0.600 ms > JIT: > Functions: 6 > Generation Time: 3.661 ms > Inlining: true > Inlining Time: 65.373 ms > Optimization: true > Optimization Time: 120.885 ms > Emission Time: 58.836 ms > Execution Time: 2543.280 ms > (16 rows) > > =# set jit_tuple_deforming to off; > > =# explain analyze select sum(data45) from test_deform; > QUERY PLAN > ------------------------------------------------------------ > ------------------- > Finalize Aggregate > (cost=631482.92..631482.93 rows=1 width=8) > (actual time=3616.977..3616.977 rows=1 loops=1) > -> Gather > (cost=631482.71..631482.92 rows=2 width=8) > (actual time=3616.959..3616.971 rows=3 loops=1) > Workers Planned: 2 > Workers Launched: 2 > -> Partial Aggregate > (cost=630482.71..630482.72 rows=1 width=8) > (actual time=3593.220..3593.220 rows=1 loops=3) > -> Parallel Seq Scan on test_deform > (cost=0.00..609649.37 rows=8333337 width=4) > (actual time=0.049..1027.353 rows=6666667 loops=3) > Planning Time: 0.149 ms > JIT: > Functions: 4 > Generation Time: 1.803 ms > Inlining: true > Inlining Time: 23.529 ms > Optimization: true > Optimization Time: 18.843 ms > Emission Time: 9.307 ms > Execution Time: 3625.674 ms > (16 rows) > > `perf diff` indeed shows that in the first case (with the 4M rows dataset) > the > jitted version has some noticeable delta for one call, and unfortunately > so far > I couldn't figure out which one exactly because of JIT (btw, who can > explain > how to see a correct full `perf report` in this case? Somehow `perf > inject --jit -o > perf.data.jitted` and jit_profiling_support didn't help). > > But since on the bigger dataset I've got expected results, maybe it's just > a > sign that JIT kicks in too early in this case and what's necessary is to > adjust > jit_above_cost/jit_optimize_above_cost/jit_inline_above_cost? >
maybe llvm does real compilation too late. It is too strange, because I though about JIT cost like initial (fixed) costs. Now, it looks so this cost is related to row numbers, and then the situation is much more complex. Regards Pavel