On 5/2/24 20:22, Tomas Vondra wrote:
>>
>>> For some of the opclasses it can regress (like the jsonb_path_ops). I
>>> don't think that's a major issue. Or more precisely, I'm not surprised
>>> by it. It'd be nice to be able to disable the parallel builds in these
>>> cases somehow, but I haven't thought about that.
>>
>> Do you know why it regresses?
>>
>
> No, but one thing that stands out is that the index is much smaller than
> the other columns/opclasses, and the compression does not save much
> (only about 5% for both phases). So I assume it's the overhead of
> writing and reading a bunch of GB of data without really gaining
> much from doing that.
>
I finally got to look into this regression, but I think I must have done
something wrong before because I can't reproduce it. These are the timings
I get now, if I rerun the benchmark:

     workers     trgm   tsvector    jsonb   jsonb (hash)
    -------------------------------------------------------
           0     1225        404      104             56
           1      772        180       57             60
           2      549        143       47             52
           3      426        127       43             50
           4      364        116       40             48
           5      323        111       38             46
           6      292        111       37             45

and the build time relative to the serial build (lower is better):

     workers     trgm   tsvector    jsonb   jsonb (hash)
    --------------------------------------------------------
           1      63%        45%      54%           108%
           2      45%        35%      45%            94%
           3      35%        31%      41%            89%
           4      30%        29%      38%            86%
           5      26%        28%      37%            83%
           6      24%        28%      35%            81%

So there's a small regression for the jsonb_path_ops opclass, but only
with one worker. After that, it gets a bit faster than the serial build.
While not a great speedup, it's far better than the earlier results that
showed maybe 40% regression.

I don't know what I did wrong before - maybe I had a build with extra
debug info or something like that? No idea why that would affect only one
of the opclasses. But this time I made doubly sure the results are
correct etc.

Anyway, I'm fairly happy with these results. I don't think it's
surprising there are cases where parallel build does not help much.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company