Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-13 Thread Sutou Kouhei
Hi, In "Re: confusing / inefficient "need_transcoding" handling in copy" on Wed, 14 Feb 2024 06:56:16 +0900, Michael Paquier wrote: > We have a couple of non-ASCII characters in the tests, but I suspect > that this one will not be digested correctly everywher

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-13 Thread Michael Paquier
On Thu, Feb 08, 2024 at 05:25:01PM +0900, Sutou Kouhei wrote: > In <20240206222445.hzq22pb2nye7r...@awork3.anarazel.de> > "Re: confusing / inefficient "need_transcoding" handling in copy" on Tue, 6 > Feb 2024 14:24:45 -0800, > Andres Freund wrote: >

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Andres Freund
Hi, On 2024-02-09 09:36:28 +0900, Michael Paquier wrote: > On Thu, Feb 08, 2024 at 10:25:07AM +0200, Heikki Linnakangas wrote: > > There's no validation, just conversion. I'd suggest: > > > > "Set up encoding conversion info if the file and server encodings differ > > (see also

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Michael Paquier
On Thu, Feb 08, 2024 at 10:25:07AM +0200, Heikki Linnakangas wrote: > There's no validation, just conversion. I'd suggest: > > "Set up encoding conversion info if the file and server encodings differ > (see also pg_server_to_any)." > > Other than that, +1 Cool. I've used your wording and

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Michael Paquier
On Thu, Feb 08, 2024 at 05:29:46PM +0900, Sutou Kouhei wrote: > Oh, sorry. I missed the Michael's patch: > https://www.postgresql.org/message-id/flat/ZcR9Q9hJ8GedFSCd%40paquier.xyz#e73272b042a22befac7a95f7bcb4fb9a > > I withdraw my change. No problem. Thanks for caring about that. -- Michael

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Sutou Kouhei
Hi, In <20240208.172501.2177371292839763981@clear-code.com> "Re: confusing / inefficient "need_transcoding" handling in copy" on Thu, 08 Feb 2024 17:25:01 +0900 (JST), Sutou Kouhei wrote: > How about the following to avoid needless transcoding? Oh, sorry.

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Heikki Linnakangas
On 08/02/2024 09:05, Michael Paquier wrote: On Tue, Feb 06, 2024 at 02:24:45PM -0800, Andres Freund wrote: I think the code is just very confusing - there actually *is* verification of the encoding, it just happens at a different, earlier, layer, namely in copyfromparse.c: CopyConvertBuf()

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-08 Thread Sutou Kouhei
Hi, In <20240206222445.hzq22pb2nye7r...@awork3.anarazel.de> "Re: confusing / inefficient "need_transcoding" handling in copy" on Tue, 6 Feb 2024 14:24:45 -0800, Andres Freund wrote: > One unfortunate issue: We don't have any tests verifying that COPY FROM

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-07 Thread Michael Paquier
On Tue, Feb 06, 2024 at 02:24:45PM -0800, Andres Freund wrote: > I think the code is just very confusing - there actually *is* verification of > the encoding, it just happens at a different, earlier, layer, namely in > copyfromparse.c: CopyConvertBuf() which says: > /* >* If the file

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-06 Thread Andres Freund
Hi, On 2024-02-06 12:51:48 -0500, Tom Lane wrote: > Michael Paquier writes: > > On Mon, Feb 05, 2024 at 06:05:04PM -0800, Andres Freund wrote: > >> I haven't yet dug into the code history. One guess is that this should only > >> have been set this way for COPY FROM. > > > Looking the git

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-06 Thread Tom Lane
Michael Paquier writes: > On Mon, Feb 05, 2024 at 06:05:04PM -0800, Andres Freund wrote: >> I haven't yet dug into the code history. One guess is that this should only >> have been set this way for COPY FROM. > Looking the git history, this looks like an oversight of c61a2f58418e > that has

Re: confusing / inefficient "need_transcoding" handling in copy

2024-02-05 Thread Michael Paquier
On Mon, Feb 05, 2024 at 06:05:04PM -0800, Andres Freund wrote: > I don't really understand why we need to validate anything during COPY TO? > Which is good, because it turns out that we don't actually validate anything, > as pg_server_to_any() returns without doing anything if the encoding

confusing / inefficient "need_transcoding" handling in copy

2024-02-05 Thread Andres Freund
Hi, Looking at the profiles in [1], and similar profiles locally, made me wonder why a basic COPY TO shows pg_server_to_any() and the strlen() to compute the length of the to-be-converted string so heavily in profiles. Example profile, for [2]: - 88.11% 12.02% postgres postgres
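For readers skimming the thread, the cost discussed above can be illustrated with a small sketch. This is not the PostgreSQL code: encoding_id, convert_for_output(), compute_need_transcoding() and emit_value() are hypothetical names invented here. It only assumes what the thread states, namely that the conversion routine returns its input untouched when the encodings match, so calling it (and the strlen() needed to call it) for every value is wasted work in that case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encoding ID; not PostgreSQL's own type. */
typedef int encoding_id;

/* Counts how often the per-value conversion path is entered. */
static unsigned long conversion_calls = 0;

/*
 * Hypothetical stand-in for a server-to-file conversion routine.
 * As described in the thread for pg_server_to_any(), it hands back
 * its input unchanged when the two encodings are the same.
 */
static const char *
convert_for_output(const char *s, size_t len,
                   encoding_id server_enc, encoding_id file_enc)
{
    (void) len;
    conversion_calls++;
    if (server_enc == file_enc)
        return s;
    /* A real conversion would happen here; omitted in this sketch. */
    return s;
}

/* Decide once, before the row loop, whether conversion can ever be needed. */
static bool
compute_need_transcoding(encoding_id server_enc, encoding_id file_enc)
{
    return server_enc != file_enc;
}

/* Emit one value; skip both strlen() and the call when nothing can change. */
static const char *
emit_value(const char *value, bool need_transcoding,
           encoding_id server_enc, encoding_id file_enc)
{
    if (need_transcoding)
        return convert_for_output(value, strlen(value), server_enc, file_enc);
    return value;
}
```

With matching encodings the flag is false, so emit_value() returns the original pointer without ever touching strlen() or the conversion routine; with differing encodings every value goes through the conversion path.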