Re: Improvements in Copy From
At Fri, 11 Sep 2020 18:44:13 +1000, Peter Smith wrote in
> On Thu, Sep 10, 2020 at 9:21 PM vignesh C wrote:
> >
> > > Whether such a micro-optimisation is worth doing is another question.
> >
> > Yes, what you suggested can also be done, but even I have the same
> > question as you. Because we will reduce just one function call, the
> > eof check is present immediately in the function, Should we include
> > this or not?
>
> I expect the difference from my suggestion is too small to be measured.
> Probably it is not worth changing the already complicated code unless
> those changes can achieve something observable.
>
> ~~
>
> FYI, I ran a few performance tests BEFORE/AFTER applying your patch.
>
> Perf results for \COPY 5GB CSV file to UNLOGGED table.
>
> perf -a -g
> psql -d test -c "\copy tbl from '/my/path/data_5GB.csv' with (format csv);"
> perf report -g
>
> BEFORE
> #1 CopyReadLineText = 12.70%, CopyLoadRawBuf = 0.81%
> #2 CopyReadLineText = 12.54%, CopyLoadRawBuf = 0.81%
> #3 CopyReadLineText = 12.52%, CopyLoadRawBuf = 0.81%
>
> AFTER
> #1 CopyReadLineText = 12.55%, CopyLoadRawBuf = 1.20%
> #2 CopyReadLineText = 12.15%, CopyLoadRawBuf = 1.10%
> #3 CopyReadLineText = 13.11%, CopyLoadRawBuf = 1.24%
> #4 CopyReadLineText = 12.86%, CopyLoadRawBuf = 1.18%
>
> I didn't quite know how to interpret those results. It was opposite
> what I expected. Perhaps the original excessive CopyLoadRawBuf calls
> were so brief they could often avoid being sampled? Anyway, I hope you
> have a better understanding of perf than I do and can explain it.
>
> I then repeated/timed the same tests but without perf
>
> BEFORE
> #1 4min.36s
> #2 4min.45s
> #3 4min.43s
> #4 4min.34s
>
> AFTER
> #1 4min.41s
> #2 4min.37s
> #3 4min.34s
>
> As you can see, unfortunately, the patch gave no observable benefit
> for my test case.

That observation agrees with my assumption.

At Fri, 11 Sep 2020 15:58:04 +0900 (JST), Kyotaro Horiguchi wrote in
me> we should do that. On the contrary, if incoming data were
me> intermittently delayed for some reasons (heavy load of client or
me> in-between network), this patch would make things worse by waiting for
me> delayed bits before processing already received bits.

It seems that a slow network is enough to cause that behavior even
without any trouble.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Improvements in Copy From
On Thu, Sep 10, 2020 at 9:21 PM vignesh C wrote:
>
> > Whether such a micro-optimisation is worth doing is another question.
>
> Yes, what you suggested can also be done, but even I have the same
> question as you. Because we will reduce just one function call, the
> eof check is present immediately in the function, Should we include
> this or not?

I expect the difference from my suggestion is too small to be measured.
Probably it is not worth changing the already complicated code unless
those changes can achieve something observable.

~~

FYI, I ran a few performance tests BEFORE/AFTER applying your patch.

Perf results for \COPY 5GB CSV file to UNLOGGED table.

perf -a -g
psql -d test -c "\copy tbl from '/my/path/data_5GB.csv' with (format csv);"
perf report -g

BEFORE
#1 CopyReadLineText = 12.70%, CopyLoadRawBuf = 0.81%
#2 CopyReadLineText = 12.54%, CopyLoadRawBuf = 0.81%
#3 CopyReadLineText = 12.52%, CopyLoadRawBuf = 0.81%

AFTER
#1 CopyReadLineText = 12.55%, CopyLoadRawBuf = 1.20%
#2 CopyReadLineText = 12.15%, CopyLoadRawBuf = 1.10%
#3 CopyReadLineText = 13.11%, CopyLoadRawBuf = 1.24%
#4 CopyReadLineText = 12.86%, CopyLoadRawBuf = 1.18%

I didn't quite know how to interpret those results. It was opposite
what I expected. Perhaps the original excessive CopyLoadRawBuf calls
were so brief they could often avoid being sampled? Anyway, I hope you
have a better understanding of perf than I do and can explain it.

I then repeated/timed the same tests but without perf

BEFORE
#1 4min.36s
#2 4min.45s
#3 4min.43s
#4 4min.34s

AFTER
#1 4min.41s
#2 4min.37s
#3 4min.34s

As you can see, unfortunately, the patch gave no observable benefit
for my test case.

Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Improvements in Copy From
At Thu, 10 Sep 2020 21:55:27 +0300, Surafel Temesgen wrote in
> On Thu, Sep 10, 2020 at 1:17 PM vignesh C wrote:
> > > We have a patch for column matching feature [1] that may need a header
> > > line to be further processed. Even without that I think it is preferable to
> > > process the header line for nothing than adding those checks to the loop,
> > > performance-wise.
> >
> > I had seen that patch, I feel that change to match the header if the
> > header is specified can be addressed in this patch if that patch gets
> > committed first or vice versa. We are doing a lot of processing for
> > the data which we need not do anything. Shouldn't this be skipped if
> > not required. Similar check is present in NextCopyFromRawFields also
> > to skip header.
>
> The existing check is unavoidable but we can live better without the checks
> added by the patch. For very large files the loop may iterate millions of
> times if it is not in billion and I am sure doing the check that many times
> will incur noticeable performance degradation than further processing a
> single line.

FWIW, I thought the same thing seeing the additional if-conditions. It
gives more loss than gain.

For the first part, the patch reveals COPY_NEW_FE, which I don't think
to be a knowledge for the function, to CopyGetData. Considering that
that doesn't seem to offer noticeable performance gain, I don't think
we should do that. On the contrary, if incoming data were
intermittently delayed for some reasons (heavy load of client or
in-between network), this patch would make things worse by waiting for
delayed bits before processing already received bits.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Improvements in Copy From
On Thu, Sep 10, 2020 at 1:17 PM vignesh C wrote:
> > We have a patch for column matching feature [1] that may need a header
> > line to be further processed. Even without that I think it is preferable to
> > process the header line for nothing than adding those checks to the loop,
> > performance-wise.
>
> I had seen that patch, I feel that change to match the header if the
> header is specified can be addressed in this patch if that patch gets
> committed first or vice versa. We are doing a lot of processing for
> the data which we need not do anything. Shouldn't this be skipped if
> not required. Similar check is present in NextCopyFromRawFields also
> to skip header.

The existing check is unavoidable but we can live better without the checks
added by the patch. For very large files the loop may iterate millions of
times if it is not in billion and I am sure doing the check that many times
will incur noticeable performance degradation than further processing a
single line.

regards
Surafel
Re: Improvements in Copy From
On Wed, Sep 9, 2020 at 12:24 PM Peter Smith wrote:
>
> My basic understanding of first part of your patch is that by
> adjusting the "minread" it now allows it to loop multiple times
> internally within the CopyGetData rather than calling CopyLoadRawBuf
> for every N lines. There doesn't seem to be much change to what other
> code gets executed so the saving is essentially whatever is the cost
> of making 2 x function calls (CopyLoadRawBuf + CopyGetData) x N. Is
> that understanding correct?

Yes you are right, we will avoid the function calls and try to get as
many records as possible from the buffer & insert it to the relation.

> But with that change there seems to be opportunity for yet another
> tiny saving possible. IIUC, now you are processing a lot more data
> within the CopyGetData so it is now very likely that you will also
> gobble the COPY_NEW_FE's 'c' marker. So cstate->reached_eof will be
> set. So this means the calling code of CopyReadLineText may no longer
> need to call the CopyLoadRawBuf one last time just to discover there
> are no more bytes to read - something that it already knows if
> cstate->reached_eof == true.
>
> For example, with your change can't you also modify CopyReadLineText like
> below:
>
> BEFORE
> if (!CopyLoadRawBuf(cstate))
>     hit_eof = true;
>
> AFTER
> if (cstate->reached_eof)
> {
>     cstate->raw_buf[0] = '\0';
>     cstate->raw_buf_index = cstate->raw_buf_len = 0;
>     hit_eof = true;
> }
> else if (!CopyLoadRawBuf(cstate))
> {
>     hit_eof = true;
> }
>
> Whether such a micro-optimisation is worth doing is another question.

Yes, what you suggested can also be done, but even I have the same
question as you. Because we will reduce just one function call, the
eof check is present immediately in the function, Should we include
this or not?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Improvements in Copy From
On Mon, Sep 7, 2020 at 1:19 PM Surafel Temesgen wrote:
>
> Hi Vignesh
>
> On Wed, Jul 1, 2020 at 3:46 PM vignesh C wrote:
>>
>> Hi,
>>
>> While reviewing copy from I identified few improvements for copy from
>> that can be done :
>> a) copy from stdin copies lesser amount of data to buffer even though
>> space is available in buffer because minread was passed as 1 to
>> CopyGetData, Hence it only reads until the data read from libpq is
>> less than minread. This can be fixed by passing the actual space
>> available in buffer, this reduces the unnecessary frequent calls to
>> CopyGetData.
>
> why not applying the same optimization on file read ?

For file read this is already taken care as you can see from below code:

bytesread = fread(databuf, 1, maxread, cstate->copy_file);
if (ferror(cstate->copy_file))
    ereport(ERROR,
            (errcode_for_file_access(),
             errmsg("could not read from COPY file: %m")));
if (bytesread == 0)
    cstate->reached_eof = true;
break;

We do not have any condition to break unlike the case of stdin, we read
1 * maxread size of data, So no need to do anything for it.

>>
>> c) Copy from reads header line and do nothing for the header line, we
>> need not clear EOL & need not convert to server encoding for the
>> header line.
>
> We have a patch for column matching feature [1] that may need a header line
> to be further processed. Even without that I think it is preferable to
> process the header line for nothing than adding those checks to the loop,
> performance-wise.

I had seen that patch, I feel that change to match the header if the
header is specified can be addressed in this patch if that patch gets
committed first or vice versa. We are doing a lot of processing for
the data which we need not do anything. Shouldn't this be skipped if
not required. Similar check is present in NextCopyFromRawFields also
to skip header.

Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Improvements in Copy From
My basic understanding of first part of your patch is that by
adjusting the "minread" it now allows it to loop multiple times
internally within the CopyGetData rather than calling CopyLoadRawBuf
for every N lines. There doesn't seem to be much change to what other
code gets executed so the saving is essentially whatever is the cost
of making 2 x function calls (CopyLoadRawBuf + CopyGetData) x N. Is
that understanding correct?

But with that change there seems to be opportunity for yet another
tiny saving possible. IIUC, now you are processing a lot more data
within the CopyGetData so it is now very likely that you will also
gobble the COPY_NEW_FE's 'c' marker. So cstate->reached_eof will be
set. So this means the calling code of CopyReadLineText may no longer
need to call the CopyLoadRawBuf one last time just to discover there
are no more bytes to read - something that it already knows if
cstate->reached_eof == true.

For example, with your change can't you also modify CopyReadLineText like
below:

BEFORE
if (!CopyLoadRawBuf(cstate))
    hit_eof = true;

AFTER
if (cstate->reached_eof)
{
    cstate->raw_buf[0] = '\0';
    cstate->raw_buf_index = cstate->raw_buf_len = 0;
    hit_eof = true;
}
else if (!CopyLoadRawBuf(cstate))
{
    hit_eof = true;
}

Whether such a micro-optimisation is worth doing is another question.

---
Kind Regards,
Peter Smith.
Fujitsu Australia

On Sun, Aug 30, 2020 at 5:25 PM vignesh C wrote:
>
> On Thu, Aug 27, 2020 at 11:02 AM Peter Smith wrote:
> >
> > Hello.
> >
> > FYI - that patch has conflicts when applied.
> >
>
> Thanks for letting me know. Attached new patch which is rebased on top of
> head.
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
Re: Improvements in Copy From
Hi Vignesh

On Wed, Jul 1, 2020 at 3:46 PM vignesh C wrote:
> Hi,
>
> While reviewing copy from I identified few improvements for copy from
> that can be done :
> a) copy from stdin copies lesser amount of data to buffer even though
> space is available in buffer because minread was passed as 1 to
> CopyGetData, Hence it only reads until the data read from libpq is
> less than minread. This can be fixed by passing the actual space
> available in buffer, this reduces the unnecessary frequent calls to
> CopyGetData.

why not applying the same optimization on file read ?

> c) Copy from reads header line and do nothing for the header line, we
> need not clear EOL & need not convert to server encoding for the
> header line.

We have a patch for column matching feature [1] that may need a header line
to be further processed. Even without that I think it is preferable to
process the header line for nothing than adding those checks to the loop,
performance-wise.

[1]. https://www.postgresql.org/message-id/flat/caf1-j-0ptcwmeltswwgv2m70u26n4g33gpe1rckqqe6wvqd...@mail.gmail.com

regards
Surafel
Re: Improvements in Copy From
On Thu, Aug 27, 2020 at 11:02 AM Peter Smith wrote:
>
> Hello.
>
> FYI - that patch has conflicts when applied.
>

Thanks for letting me know. Attached new patch which is rebased on top of
head.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

From a343fe1f8fdf4293d2ef6841e243390b99f29e28 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Sun, 30 Aug 2020 12:31:12 +0530
Subject: [PATCH v2] Improvements in copy from.

There are couple of improvements for copy from in this patch which is
detailed below:
a) copy from stdin copies lesser amount of data to buffer even though space is
available in buffer because minread was passed as 1 to CopyGetData, fixed it by
passing the actual space available in buffer, this reduces the frequent call to
CopyGetData.
b) Copy from reads header line and does nothing for the read line, we need not
clear EOL & need not convert to server encoding for the header line.
---
 src/backend/commands/copy.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index db7d24a..c688baa 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -801,14 +801,18 @@ CopyLoadRawBuf(CopyState cstate)
 {
     int nbytes = RAW_BUF_BYTES(cstate);
     int inbytes;
+    int minread = 1;
 
     /* Copy down the unprocessed data if any. */
     if (nbytes > 0)
         memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
                 nbytes);
 
+    if (cstate->copy_dest == COPY_NEW_FE)
+        minread = RAW_BUF_SIZE - nbytes;
+
     inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
-                          1, RAW_BUF_SIZE - nbytes);
+                          minread, RAW_BUF_SIZE - nbytes);
     nbytes += inbytes;
     cstate->raw_buf[nbytes] = '\0';
     cstate->raw_buf_index = 0;
@@ -3917,7 +3921,7 @@ CopyReadLine(CopyState cstate)
             } while (CopyLoadRawBuf(cstate));
         }
     }
-    else
+    else if (!(cstate->cur_lineno == 0 && cstate->header_line))
     {
         /*
          * If we didn't hit EOF, then we must have transferred the EOL marker
@@ -3951,8 +3955,9 @@ CopyReadLine(CopyState cstate)
         }
     }
 
-    /* Done reading the line. Convert it to server encoding. */
-    if (cstate->need_transcoding)
+    /* Done reading the line. Convert it to server encoding if not header. */
+    if (cstate->need_transcoding &&
+        !(cstate->cur_lineno == 0 && cstate->header_line))
     {
         char *cvt;
 
--
1.8.3.1
Re: Improvements in Copy From
Hello.

FYI - that patch has conflicts when applied.

Kind Regards
Peter Smith
Fujitsu Australia.

On Thu, Aug 27, 2020 at 3:11 PM vignesh C wrote:
>
> On Tue, Jul 14, 2020 at 12:17 PM vignesh C wrote:
> >
> > On Tue, Jul 14, 2020 at 11:13 AM David Rowley wrote:
> > >
> > > On Tue, 14 Jul 2020 at 17:22, David Rowley wrote:
> > > >
> > > > On Thu, 2 Jul 2020 at 00:46, vignesh C wrote:
> > > > > b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
> > > > > that is not being used, it can be removed.
> > > >
> > > > This was raised in [1]. We decided not to remove it.
> > >
> > > I just added a comment to the function to mention why we want to keep
> > > the parameter. I hope that will save any wasted time proposing its
> > > removal in the future.
> > >
> > > FWIW, the function is inlined. Removing it will gain us nothing
> > > performance-wise anyway.
> > >
> > > David
> > >
> > > > [1]
> > > > https://www.postgresql.org/message-id/flat/CAKJS1f-A5aYvPHe10Wy9LjC4RzLsBrya8b2gfuQHFabhwZT_NQ%40mail.gmail.com#3bae9a84be253c527b0e621add0fbaef
> >
> > Thanks David for pointing it out, as this has been discussed and
> > concluded no point in discussing the same thing again. This patch has
> > a couple of other improvements which can still be taken forward. I
> > will remove this change and post a new patch to retain the other
> > issues that were fixed.
> >
>
> I have removed the changes that david had pointed out and retained the
> remaining changes. Attaching the patch for the same.
> Thoughts?
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
Re: Improvements in Copy From
On Tue, Jul 14, 2020 at 12:17 PM vignesh C wrote:
>
> On Tue, Jul 14, 2020 at 11:13 AM David Rowley wrote:
> >
> > On Tue, 14 Jul 2020 at 17:22, David Rowley wrote:
> > >
> > > On Thu, 2 Jul 2020 at 00:46, vignesh C wrote:
> > > > b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
> > > > that is not being used, it can be removed.
> > >
> > > This was raised in [1]. We decided not to remove it.
> >
> > I just added a comment to the function to mention why we want to keep
> > the parameter. I hope that will save any wasted time proposing its
> > removal in the future.
> >
> > FWIW, the function is inlined. Removing it will gain us nothing
> > performance-wise anyway.
> >
> > David
> >
> > > [1]
> > > https://www.postgresql.org/message-id/flat/CAKJS1f-A5aYvPHe10Wy9LjC4RzLsBrya8b2gfuQHFabhwZT_NQ%40mail.gmail.com#3bae9a84be253c527b0e621add0fbaef
>
> Thanks David for pointing it out, as this has been discussed and
> concluded no point in discussing the same thing again. This patch has
> a couple of other improvements which can still be taken forward. I
> will remove this change and post a new patch to retain the other
> issues that were fixed.
>

I have removed the changes that david had pointed out and retained the
remaining changes. Attaching the patch for the same.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

From fbafa5eaaa84028b3bbfb7cde0cbcc3963fd033a Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Tue, 14 Jul 2020 12:21:37 +0530
Subject: [PATCH] Improvements in copy from.

There are couple of improvements for copy from in this patch which is
detailed below:
a) copy from stdin copies lesser amount of data to buffer even though space is
available in buffer because minread was passed as 1 to CopyGetData, fixed it by
passing the actual space available in buffer, this reduces the frequent call to
CopyGetData.
b) Copy from reads header line and does nothing for the read line, we need not
clear EOL & need not convert to server encoding for the header line.
---
 src/backend/commands/copy.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 44da71c..bc27dfc 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -796,6 +796,7 @@ CopyLoadRawBuf(CopyState cstate)
 {
     int nbytes;
     int inbytes;
+    int minread = 1;
 
     if (cstate->raw_buf_index < cstate->raw_buf_len)
     {
@@ -807,8 +808,11 @@ CopyLoadRawBuf(CopyState cstate)
     else
         nbytes = 0; /* no data need be saved */
 
+    if (cstate->copy_dest == COPY_NEW_FE)
+        minread = RAW_BUF_SIZE - nbytes;
+
     inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
-                          1, RAW_BUF_SIZE - nbytes);
+                          minread, RAW_BUF_SIZE - nbytes);
     nbytes += inbytes;
     cstate->raw_buf[nbytes] = '\0';
     cstate->raw_buf_index = 0;
@@ -3869,7 +3873,7 @@ CopyReadLine(CopyState cstate)
             } while (CopyLoadRawBuf(cstate));
         }
     }
-    else
+    else if (!(cstate->cur_lineno == 0 && cstate->header_line))
     {
         /*
          * If we didn't hit EOF, then we must have transferred the EOL marker
@@ -3903,8 +3907,9 @@ CopyReadLine(CopyState cstate)
         }
     }
 
-    /* Done reading the line. Convert it to server encoding. */
-    if (cstate->need_transcoding)
+    /* Done reading the line. Convert it to server encoding if not header. */
+    if (cstate->need_transcoding &&
+        !(cstate->cur_lineno == 0 && cstate->header_line))
     {
         char *cvt;
 
--
1.8.3.1
Re: Improvements in Copy From
On Tue, Jul 14, 2020 at 11:13 AM David Rowley wrote:
>
> On Tue, 14 Jul 2020 at 17:22, David Rowley wrote:
> >
> > On Thu, 2 Jul 2020 at 00:46, vignesh C wrote:
> > > b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
> > > that is not being used, it can be removed.
> >
> > This was raised in [1]. We decided not to remove it.
>
> I just added a comment to the function to mention why we want to keep
> the parameter. I hope that will save any wasted time proposing its
> removal in the future.
>
> FWIW, the function is inlined. Removing it will gain us nothing
> performance-wise anyway.
>
> David
>
> > [1]
> > https://www.postgresql.org/message-id/flat/CAKJS1f-A5aYvPHe10Wy9LjC4RzLsBrya8b2gfuQHFabhwZT_NQ%40mail.gmail.com#3bae9a84be253c527b0e621add0fbaef

Thanks David for pointing it out, as this has been discussed and
concluded no point in discussing the same thing again. This patch has
a couple of other improvements which can still be taken forward. I
will remove this change and post a new patch to retain the other
issues that were fixed.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Re: Improvements in Copy From
On Tue, 14 Jul 2020 at 17:22, David Rowley wrote:
>
> On Thu, 2 Jul 2020 at 00:46, vignesh C wrote:
> > b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
> > that is not being used, it can be removed.
>
> This was raised in [1]. We decided not to remove it.

I just added a comment to the function to mention why we want to keep
the parameter. I hope that will save any wasted time proposing its
removal in the future.

FWIW, the function is inlined. Removing it will gain us nothing
performance-wise anyway.

David

> [1]
> https://www.postgresql.org/message-id/flat/CAKJS1f-A5aYvPHe10Wy9LjC4RzLsBrya8b2gfuQHFabhwZT_NQ%40mail.gmail.com#3bae9a84be253c527b0e621add0fbaef
Re: Improvements in Copy From
On Thu, 2 Jul 2020 at 00:46, vignesh C wrote:
> b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
> that is not being used, it can be removed.

This was raised in [1]. We decided not to remove it.

David

[1]
https://www.postgresql.org/message-id/flat/CAKJS1f-A5aYvPHe10Wy9LjC4RzLsBrya8b2gfuQHFabhwZT_NQ%40mail.gmail.com#3bae9a84be253c527b0e621add0fbaef
Re: Improvements in Copy From
On Wed, Jul 1, 2020 at 6:16 PM vignesh C wrote:
> Attached patch has the changes for the same.
> Thoughts?
>

Added a commitfest entry for this:
https://commitfest.postgresql.org/29/2642/

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Improvements in Copy From
Hi,

While reviewing copy from I identified few improvements for copy from
that can be done :
a) copy from stdin copies lesser amount of data to buffer even though
space is available in buffer because minread was passed as 1 to
CopyGetData, Hence it only reads until the data read from libpq is
less than minread. This can be fixed by passing the actual space
available in buffer, this reduces the unnecessary frequent calls to
CopyGetData.
b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter
that is not being used, it can be removed.
c) Copy from reads header line and do nothing for the header line, we
need not clear EOL & need not convert to server encoding for the
header line.

Attached patch has the changes for the same.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

From b86edd9bf4f4350598a0518996e64f40f13aea21 Mon Sep 17 00:00:00 2001
From: Vignesh C
Date: Wed, 1 Jul 2020 17:51:51 +0530
Subject: [PATCH] Improvements in copy from.

There are 3 improvements for copy from in this patch which is detailed below:
a) copy from stdin copies lesser amount of data to buffer even though space is
available in buffer because minread was passed as 1 to CopyGetData, fixed it by
passing the actual space available in buffer, this reduces the frequent call to
CopyGetData.
b) CopyMultiInsertInfoNextFreeSlot had an unused function parameter, this is not
being used, it has been removed.
c) Copy from reads header line and does nothing for the read line, we need not
clear EOL & need not convert to server encoding for the header line.
---
 src/backend/commands/copy.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3e199bd..dfb7d92 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -793,6 +793,7 @@ CopyLoadRawBuf(CopyState cstate)
 {
     int nbytes;
     int inbytes;
+    int minread = 1;
 
     if (cstate->raw_buf_index < cstate->raw_buf_len)
     {
@@ -804,8 +805,11 @@ CopyLoadRawBuf(CopyState cstate)
     else
         nbytes = 0; /* no data need be saved */
 
+    if (cstate->copy_dest == COPY_NEW_FE)
+        minread = RAW_BUF_SIZE - nbytes;
+
     inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
-                          1, RAW_BUF_SIZE - nbytes);
+                          minread, RAW_BUF_SIZE - nbytes);
     nbytes += inbytes;
     cstate->raw_buf[nbytes] = '\0';
     cstate->raw_buf_index = 0;
@@ -2603,8 +2607,7 @@ CopyMultiInsertInfoCleanup(CopyMultiInsertInfo *miinfo)
  * Callers must ensure that the buffer is not full.
  */
 static inline TupleTableSlot *
-CopyMultiInsertInfoNextFreeSlot(CopyMultiInsertInfo *miinfo,
-                                ResultRelInfo *rri)
+CopyMultiInsertInfoNextFreeSlot(ResultRelInfo *rri)
 {
     CopyMultiInsertBuffer *buffer = rri->ri_CopyMultiInsertBuffer;
     int nused = buffer->nused;
@@ -2969,8 +2972,7 @@ CopyFrom(CopyState cstate)
             Assert(resultRelInfo == target_resultRelInfo);
             Assert(insertMethod == CIM_MULTI);
 
-            myslot = CopyMultiInsertInfoNextFreeSlot(&multiInsertInfo,
-                                                     resultRelInfo);
+            myslot = CopyMultiInsertInfoNextFreeSlot(resultRelInfo);
         }
 
         /*
@@ -3117,8 +3119,7 @@ CopyFrom(CopyState cstate)
                 /* no other path available for partitioned table */
                 Assert(insertMethod == CIM_MULTI_CONDITIONAL);
 
-                batchslot = CopyMultiInsertInfoNextFreeSlot(&multiInsertInfo,
-                                                            resultRelInfo);
+                batchslot = CopyMultiInsertInfoNextFreeSlot(resultRelInfo);
 
                 if (map != NULL)
                     myslot = execute_attr_map_slot(map->attrMap, myslot,
@@ -3856,7 +3857,7 @@ CopyReadLine(CopyState cstate)
             } while (CopyLoadRawBuf(cstate));
         }
     }
-    else
+    else if (!(cstate->cur_lineno == 0 && cstate->header_line))
     {
         /*
          * If we didn't hit EOF, then we must have transferred the EOL marker
@@ -3890,8 +3891,9 @@ CopyReadLine(CopyState cstate)
         }
     }
 
-    /* Done reading the line. Convert it to server encoding. */
-    if (cstate->need_transcoding)
+    /* Done reading the line. Convert it to server encoding if not header. */
+    if (cstate->need_transcoding &&
+        !(cstate->cur_lineno == 0 && cstate->header_line))
     {
         char *cvt;
 
--
1.8.3.1