Hi, On 2024-02-06 10:01:36 +0900, Michael Paquier wrote: > On Mon, Feb 05, 2024 at 10:21:18AM -0800, Andres Freund wrote: > > Have you benchmarked the performance effects of 2889fd23be5 ? I'd not at all > > be surprised if it lead to a measurable performance regression. > > Yes, I was looking at runtimes and some profiles around CopyOneRowTo() > to see the effects that this has yesterday. The principal point of > contention is CopyOneRowTo() where the callback is called once per > attribute, so more attributes stress it more.
Right. > If you have concerns about that, I'm OK to revert, I'm not wedded to > this level of control. Note that I've actually seen *better* > runtimes. I'm somewhat worried that handling the different formats at that level will make it harder to improve copy performance - it's quite attrociously slow right now. The more we reduce the per-row/field overhead, the more the dispatch overhead will matter. > [1]: https://www.postgresql.org/message-id/zbr6piwuvhdtf...@paquier.xyz > > > I think callbacks for individual attributes is the wrong approach - the > > dispatch needs to happen at a higher level, otherwise there are too many > > indirect function calls. > > Hmm. Do you have concerns about v13 posted on [2] then? As is I'm indeed not a fan. It imo doesn't make sense to have an indirect dispatch for *both* ->copy_attribute_out *and* ->CopyToOneRow. After all, when in ->CopyToOneRow for text, we could know that we need to call CopyAttributeOutText etc. > If yes, then I'd assume that this shuts down the whole thread or that it > needs a completely different approach, because we will multiply indirect > function calls that can control how data is generated for each row, which is > the original case that Sutou-san wanted to tackle. I think it could be rescued fairly easily - remove the dispatch via ->copy_attribute_out(). To avoid duplicating code you could use a static inline function that's used with constant arguments by both csv and text mode. I think it might also be worth ensuring that future patches can move branches like if (cstate->encoding_embeds_ascii) if (cstate->need_transcoding) into the choice of per-row callback. > The End, Start and In/OutFunc callbacks are called only once per query, so > these don't matter AFAIU. Right. Greetings, Andres Freund