Hi, On 2020-01-11 22:32:45 -0500, Tom Lane wrote: > I wrote: > > I saw at this point that the remaining top spots were > > enlargeStringInfo and appendBinaryStringInfo, so I experimented > > with inlining them (again, see patch below). That *did* move > > the needle: down to 72691 ms, or 20% better than HEAD. > > Oh ... marking the test in the inline part of enlargeStringInfo() > as unlikely() helps quite a bit more: 66100 ms, a further 9% gain. > Might be over-optimizing for this particular case, perhaps > > (But ... I'm not finding these numbers to be super reproducible > across different ASLR layouts. So take it with a grain of salt.)
FWIW, I've also observed, in another thread (the node func generation thing [1]), that inlining enlargeStringInfo() helps a lot, especially when inlining some of its callers. Moving e.g. appendStringInfo() inline allows the compiler to sometimes optimize away the strlen. But if e.g. an inlined appendBinaryStringInfo() still calls enlargeStringInfo() unconditionally, successive appends cannot optimize away memory accesses for ->len/->data. For the case of send functions, we really ought to have at least pq_begintypsend(), enlargeStringInfo() and pq_endtypsend() inline. That way the compiler ought to be able to avoid repeatedly loading/storing ->len, after the initial initStringInfo() call. Might even make sense to also have initStringInfo() inline, because then the compiler would probably never actually materialize the StringInfoData (and would automatically have good aliasing information too). The commit referenced above is obviously quite WIP-ey, and contains things that should be split into separate commits. But I think it might be worth moving more functions into the header, like I've done in that commit. The commit also adds appendStringInfoU?Int(32,64) operations - I've unsuprisingly found these to be *considerably* faster than going through appendStringInfoString(). > but I think that's a reasonable marking given that we overallocate > the stringinfo buffer for most uses. Wonder if it's worth providing a function to initialize the stringinfo differently for the many cases where we have at least a very good idea of how long the string will be. It's sad to allocate 1kb just for e.g. int4send to send an integer plus length. Greetings, Andres Freund [1] https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commit;h=127e860cf65f50434e0bb97acbba4b0ea6f38cfd