Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Thanks a lot for the merge. I did some tests and the master branch runs up to 15% faster than the last patch I tried (v22). Amazing! Cheers, Matthieu Garrigues On Tue, Mar 16, 2021 at 9:00 PM Andres Freund wrote: > > Hi, > > On 2021-03-05 21:35:59 -0300, Alvaro Herrera wrote: > > I'll take the weekend to think about the issue with conn->last_query and > > conn->queryclass that I mentioned yesterday; other than that detail my > > feeling is that this is committable, so I'll be looking at getting this > > pushed early next weeks, barring opinions from others. > > It is *very* exciting to see this being merged. Thanks for all the work > to all that contributed! > > Greetings, > > Andres Freund
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Hi, On 2021-03-05 21:35:59 -0300, Alvaro Herrera wrote: > I'll take the weekend to think about the issue with conn->last_query and > conn->queryclass that I mentioned yesterday; other than that detail my > feeling is that this is committable, so I'll be looking at getting this > pushed early next weeks, barring opinions from others. It is *very* exciting to see this being merged. Thanks for all the work to all that contributed! Greetings, Andres Freund
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-15, Justin Pryzby wrote: > Are you going to update the assertion ? > > +#if 0 > > > Assert((meta == META_NONE && varprefix == NULL) || > > >((meta == META_GSET || meta == META_ASET) && varprefix != > NULL)); > > +#endif > > Yeah, caught that just after sending. Here's a notpatch. -- Álvaro Herrera39°49'30"S 73°17'W "La virtud es el justo medio entre dos defectos" (Aristóteles) diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index ba7b35d83c..e69d43b26b 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -2823,13 +2823,12 @@ readCommandResponse(CState *st, MetaCommand meta, char *varprefix) int qrynum = 0; /* -* varprefix should be set only with \gset or \aset, and SQL commands do -* not need it. +* varprefix should be set only with \gset or \aset, and \endpipeline and +* SQL commands do not need it. */ -#if 0 Assert((meta == META_NONE && varprefix == NULL) || + ((meta == META_ENDPIPELINE) && varprefix == NULL) || ((meta == META_GSET || meta == META_ASET) && varprefix != NULL)); -#endif res = PQgetResult(st->con);
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Are you going to update the assertion ? +#if 0 Assert((meta == META_NONE && varprefix == NULL) || ((meta == META_GSET || meta == META_ASET) && varprefix != NULL)); +#endif
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Here's what seems a final version of the patch. I renamed one more function: PQsendPipeline is now PQpipelineSync. I also reworded the docs in a couple of places, added a few tests to the pgbench patch, and made it work. Note the pgbench results in pipeline mode: ./pgbench -r -Mextended -n -f /home/alvherre/Code/pgsql-build/pipeline/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_pipeline -c 100 -t1 pgbench (PostgreSQL) 14.0 transaction type: /home/alvherre/Code/pgsql-build/pipeline/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_pipeline scaling factor: 1 query mode: extended number of clients: 100 number of threads: 1 number of transactions per client: 1 number of transactions actually processed: 100/100 latency average = 2.316 ms initial connection time = 113.859 ms tps = 43182.438635 (without initial connection time) statement latencies in milliseconds: 0.000 \startpipeline 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 0.000 select 1; 1.624 \endpipeline If I just replace the \startpipeline and \endpipeline lines with BEGIN and COMMIT respectively, I get this: tps = 10220.259051 (without initial connection time) 0.830 begin; 0.765 select 1; 0.752 select 1; 0.753 select 1; 0.755 select 1; 0.754 select 1; 0.755 select 1; 0.757 select 1; 0.756 select 1; 0.756 select 1; 0.756 select 1; 0.750 commit; Yes, you could say that this is a ltle bit unfair -- but it seems quite impressive nonetheless. -- Álvaro Herrera39°49'30"S 73°17'W >From 5e4fdd5246d559caf0d75ad74001f09a48ec4c0e Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Mon, 15 Mar 2021 15:05:22 -0300 Subject: [PATCH v37 1/2] Implement pipeline mode in libpq MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pipeline mode in libpq lets an application avoid the Sync messages in the FE/BE protocol that are implicit in the old libpq API after each query. The application can then insert Sync at its leisure with a new libpq function PQpipelineSync. This can lead to substantial reductions in query latency. Co-authored-by: Craig Ringer Co-authored-by: Matthieu Garrigues Co-authored-by: Álvaro Herrera Reviewed-by: Andres Freund Reviewed-by: Aya Iwata Reviewed-by: Daniel Vérité Reviewed-by: David G. Johnston Reviewed-by: Justin Pryzby Reviewed-by: Kirk Jamison Reviewed-by: Michael Paquier Reviewed-by: Nikhil Sontakke Reviewed-by: Vaishnavi Prabakaran Reviewed-by: Zhihong Yu Discussion: https://postgr.es/m/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=b4dm...@mail.gmail.com Discussion: https://postgr.es/m/cajkzx4t5e-2cqe3dtv2r78dyfvz+in8py7a8marvlhs_pg7...@mail.gmail.com --- doc/src/sgml/libpq.sgml | 522 ++- doc/src/sgml/lobj.sgml|4 + doc/src/sgml/ref/pgbench.sgml | 21 + .../libpqwalreceiver/libpqwalreceiver.c |6 + src/bin/pg_amcheck/pg_amcheck.c |2 + src/interfaces/libpq/exports.txt |4 + src/interfaces/libpq/fe-connect.c | 37 +- src/interfaces/libpq/fe-exec.c| 717 - src/interfaces/libpq/fe-protocol3.c | 77 +- src/interfaces/libpq/libpq-fe.h | 21 +- src/interfaces/libpq/libpq-int.h | 60 +- src/test/modules/Makefile |1 + src/test/modules/libpq_pipeline/.gitignore|5 + src/test/modules/libpq_pipeline/Makefile | 20 + src/test/modules/libpq_pipeline/README|1 + .../modules/libpq_pipeline/libpq_pipeline.c | 1303 + .../libpq_pipeline/t/001_libpq_pipeline.pl| 28 + src/tools/msvc/Mkvcbuild.pm |9 +- src/tools/pgindent/typedefs.list |2 + 19 files changed, 2727 insertions(+), 113 deletions(-) create mode 100644 src/test/modules/libpq_pipeline/.gitignore create mode 100644 src/test/modules/libpq_pipeline/Makefile create mode 100644 src/test/modules/libpq_pipeline/README create mode 100644 src/test/modules/libpq_pipeline/libpq_pipeline.c create mode 100644 src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 910e9a81ea..be674fbaa9 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3180,6 +3180,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested b
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-11, Tom Lane wrote: > I think the changes in pqParseInput3() are broken. You should have > kept the else-structure as-is and inserted the check for "not really > idle" inside the else-clause that reports an error. As it stands, > after successfully processing an asynchronously-received error or > ParameterStatus message, the added code will cause us to return without > advancing inStart, creating an infinite loop of reprocessing that message. > > It's possible that we should redefine the way things happen so that if > we're waiting for another pipeline event, we should hold off processing > of async error & ParameterStatus; but in that case you should have added > the pre-emptive return ahead of that if/else structure, where the existing > "If not IDLE state, just wait ..." test is. I think I agree that holding off 'E' and 'S' messages when in between processing results for different queries in a pipeline, so keeping the original if/else structure is correct. An error would be correctly dealt with in the BUSY state immediately afterwards; and the fact that we pass 'inError' false at that point causes the wrong reaction (namely that the pipeline is not put in aborted state). I made a number of other changes: documentation adjustments per comments from David Johnston, some function renaming as previously noted, and added test code for PQsendDescribePrepared, PQsendDescribePortal. Also rebased on latest changes. I also absorbed one change that I already had when I submitted v35, but hadn't done "git add" on (which caused a compile failure for CF bot). -- Álvaro Herrera Valdivia, Chile diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 910e9a81ea..7b938c106c 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3180,6 +3180,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4926,6 +4953,479 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be + sent/received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + While the pipeline API was introduced in + PostgreSQL 14, it is a client-side feature + which doesn't require special server support, and works on any server + that supports the v3 extended query protocol. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch the connection +into pipeline mode, +which is done with . + can be used +to test whether pipeline mode is active. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using synchronous command execution functions +such as PQfn, +PQexec, +PQexecParams, +PQprepare, +PQexecPrepared, +PQdescribePrepared, +PQdescribePortal, +is an error condition. +Once all dispatched commands have had their results processed, and +the end pipeline result has been consumed, the application may return +to non-pipelined mode with . + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server,
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Alvaro Herrera writes: > [ v35-libpq-pipeline.patch ] I think the changes in pqParseInput3() are broken. You should have kept the else-structure as-is and inserted the check for "not really idle" inside the else-clause that reports an error. As it stands, after successfully processing an asynchronously-received error or ParameterStatus message, the added code will cause us to return without advancing inStart, creating an infinite loop of reprocessing that message. It's possible that we should redefine the way things happen so that if we're waiting for another pipeline event, we should hold off processing of async error & ParameterStatus; but in that case you should have added the pre-emptive return ahead of that if/else structure, where the existing "If not IDLE state, just wait ..." test is. My guess though is that we do need to process error messages in that state, so that the correct patch looks more like else { +/* + * We're notionally not-IDLE when in pipeline mode we have + * completed processing the results of one query and are waiting + * for the next one in the pipeline. In this case, as above, just + * wait. + */ +if (conn->asyncStatus == PGASYNC_IDLE && +conn->pipelineStatus != PQ_PIPELINE_OFF && +conn->cmd_queue_head != NULL) +return; + pqInternalNotice(&conn->noticeHooks, "message type 0x%02x arrived from server while idle", id); /* Discard the unexpected message */ conn->inCursor += msgLength; } It'd be appropriate to do more than nothing to the comment block above this if/else chain, too, because really that one ought to explain why we should consume ERROR when in pipeline state. (I've not looked at the rest of this patch, just scanned what you did in fe-protocol3.c, because I wondered if there would be any interaction with the where-to-advance-inStart changes I'm about to commit. Looks okay modulo this issue.) regards, tom lane
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-05, Alvaro Herrera wrote: > I'll take the weekend to think about the issue with conn->last_query and > conn->queryclass that I mentioned yesterday; other than that detail my > feeling is that this is committable, so I'll be looking at getting this > pushed early next weeks, barring opinions from others. It took longer than I expected, but it works well now. conn->last_query is gone; all commands, both in pipeline mode and in no-pipeline mode, go via the command queue. This is cleaner all around; we don't have to have the pipeline code "cheat" so that it looks like each command is "last" at each point. I have not absorbed David Johnston's latest doc suggestions yet. I'm going to give the code a last renaming pass, on the idea that the command queue is no longer exclusively for the pipeline mode, so some things need less exclusionary names. But functionality wise AFAICS this patch has the shape it ought to have. -- Álvaro Herrera39°49'30"S 73°17'W diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 910e9a81ea..0bffb92462 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3180,6 +3180,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4926,6 +4953,473 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be + sent/received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch the connection +into pipeline mode, +which is done with . + can be used +to test whether pipeline mode is active. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using synchronous command execution functions +such as PQfn, +PQexec, +PQexecParams, +PQprepare, +PQexecPrepared, +PQdescribePrepared, +PQdescribePortal, +is an error condition. +Once all dispatched commands have had their results processed, and +the end pipeline result has been consumed, the application may return +to non-pipelined mode with . + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + R
Re: PATCH: Batch/pipelining support for libpq
On 2021-Mar-03, 'Alvaro Herrera' wrote: > I wonder if it would make sense to get rid of conn->last_query > completely and just rely always on conn->cmd_queue_head, where the > normal (non-pipeline) would use the first entry of the command queue. > That might end up being simpler than pipeline mode "pretending" to take > over ->last_query. I experimented with this today and it appears to be a good change, though I haven't been able to make everything work correctly yet. -- Álvaro Herrera Valdivia, Chile "Entristecido, Wutra (canción de Las Barreras) echa a Freyr a rodar y a nosotros al mar"
Re: PATCH: Batch/pipelining support for libpq
On Wed, Mar 3, 2021 at 5:45 PM Justin Pryzby wrote: > I'm proposing some minor changes. > > Some additional tweaks/comments for the documentation with the edit proposed edits: (edit) + PQresultStatus, will report a Remove the comma (orig) + the failed operation are skipped entirely. The same behaviour holds if the We seem to use "behavior" more frequently (edit) + From the client perspective, after PQresultStatus Possessive "client's perspective" (edit) + its expected results queue. Based on available memory, results from the "its corresponding results queue" - to go along with this change: - of the order in which it sent queries and the expected results. + of the order in which it sent queries, to associate them with their + corresponding results. (orig) + pipeline mode. If the current statement isn't finished processing + or there are results pending for collection with Needs a comma after processing. "results pending for collection" reads oddly to me...not sure what would be better though... (edit) + + + The pipeline API was introduced in PostgreSQL 14. + Pipeline mode is a client-side feature which doesn't require server + support, and works on any server that supports the v3 extended query + protocol. + + This note seems like it should be placed either near the very beginning of the feature or incorporated into the feature introduction. (orig) + If any statement encounters an error, the server aborts the current +(-) transaction and skips processing commands in the pipeline until the + next synchronization point established by PQsendPipeline. I dislike "skips" here - even if it doesn't execute a command it still will place a result on the socket so that the client can have something to match up with the queries it sent, correct? + transaction and creates a PGRES_PIPELINE_ABORTED result for all commands in the pipeline until the David J.
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
v33 was indeed marked a pass by cfbot. However, it only *builds* the test program, it does not *run* it. I guess we'll have to wait for the buildfarm to tell us more. In the meantime, I implemented PQsendQuery() as callable in pipeline mode; it does that by using the extended-query protocol directly rather than sending 'Q' as in non-pipeline mode. I also adjusted the docs a little bit more. That's what you see here as v34. I'll take the weekend to think about the issue with conn->last_query and conn->queryclass that I mentioned yesterday; other than that detail my feeling is that this is committable, so I'll be looking at getting this pushed early next weeks, barring opinions from others. -- Álvaro Herrera Valdivia, Chile "El sabio habla porque tiene algo que decir; el tonto, porque tiene que decir algo" (Platon). diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..e3cd5c377b 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,473 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be + sent/received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch the connection +into pipeline mode, +which is done with . + can be used +to test whether pipeline mode is active. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using synchronous command execution functions +such as PQfn, +PQexec, +PQexecParams, +PQprepare, +PQexecPrepared, +PQdescribePrepared, +PQdescribePortal, +is an error condition. +Once all dispatched commands have had their results processed, and +the end pipeline result has been consumed, the application may return +to non-pipelined mode with . + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + Result processing is described below. + + + + The server executes statements, and returns results, in the order the + client sends them. The server will
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Hi, + \gset and \aset cannot be used + pipeline mode, since query results are not immediately 'used pipeline mode' -> 'used in pipeline mode' --- /dev/null +++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c @@ -0,0 +1,1144 @@ +/* + * src/test/modules/libpq_pipeline/libpq_pipeline.c Looks like license information is missing from the header. Cheers On Thu, Mar 4, 2021 at 1:40 PM Alvaro Herrera wrote: > On 2021-Mar-04, Alvaro Herrera wrote: > > > I don't know where do __WSAFDIsSet and __imp_select come from or what to > > do about them. Let's see if adding pgport and pgcommon fixes things. > > Indeed all those other problems were fixed and these remain. New > failure is: > > "C:\projects\postgresql\pgsql.sln" (default target) (1) -> > 6007"C:\projects\postgresql\libpq_pipeline.vcxproj" (default target) (55) > -> > 6008(Link target) -> > 6009 libpq_pipeline.obj : error LNK2019: unresolved external symbol > __WSAFDIsSet referenced in function test_pipelined_insert > [C:\projects\postgresql\libpq_pipeline.vcxproj] > 6010 libpq_pipeline.obj : error LNK2019: unresolved external symbol > __imp_select referenced in function test_pipelined_insert > [C:\projects\postgresql\libpq_pipeline.vcxproj] > 6011 .\Release\libpq_pipeline\libpq_pipeline.exe : fatal error LNK1120: 2 > unresolved externals [C:\projects\postgresql\libpq_pipeline.vcxproj] > > I did notice that isolationtester.c is using select(), and one > difference is that it includes which libpq_pipeline.c > does not -- and it also pulls in ws2_32.lib. Let's see if those two > changes fix things. > > -- > Álvaro Herrera Valdivia, Chile >
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-04, Alvaro Herrera wrote: > I don't know where do __WSAFDIsSet and __imp_select come from or what to > do about them. Let's see if adding pgport and pgcommon fixes things. Indeed all those other problems were fixed and these remain. New failure is: "C:\projects\postgresql\pgsql.sln" (default target) (1) -> 6007"C:\projects\postgresql\libpq_pipeline.vcxproj" (default target) (55) -> 6008(Link target) -> 6009 libpq_pipeline.obj : error LNK2019: unresolved external symbol __WSAFDIsSet referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 6010 libpq_pipeline.obj : error LNK2019: unresolved external symbol __imp_select referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 6011 .\Release\libpq_pipeline\libpq_pipeline.exe : fatal error LNK1120: 2 unresolved externals [C:\projects\postgresql\libpq_pipeline.vcxproj] I did notice that isolationtester.c is using select(), and one difference is that it includes which libpq_pipeline.c does not -- and it also pulls in ws2_32.lib. Let's see if those two changes fix things. -- Álvaro Herrera Valdivia, Chile diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..c87b0ce911 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + Result processing is described below. + + + + The server executes statements, and returns results,
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-04, Alvaro Herrera wrote: > v31. Got this: libpq_pipeline.obj : error LNK2019: unresolved external symbol __WSAFDIsSet referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 5019libpq_pipeline.obj : error LNK2019: unresolved external symbol __imp_select referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 5020libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_snprintf referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 5021libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_vfprintf referenced in function pg_fatal_impl [C:\projects\postgresql\libpq_pipeline.vcxproj] 5022libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_fprintf referenced in function pg_fatal_impl [C:\projects\postgresql\libpq_pipeline.vcxproj] 5023libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_printf referenced in function pg_fatal_impl [C:\projects\postgresql\libpq_pipeline.vcxproj] 5024libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_strerror referenced in function test_pipelined_insert [C:\projects\postgresql\libpq_pipeline.vcxproj] 5025libpq_pipeline.obj : error LNK2019: unresolved external symbol pg_strdup referenced in function main [C:\projects\postgresql\libpq_pipeline.vcxproj] 5026libpq_pipeline.obj : error LNK2019: unresolved external symbol pfree referenced in function test_singlerowmode [C:\projects\postgresql\libpq_pipeline.vcxproj] 5027libpq_pipeline.obj : error LNK2019: unresolved external symbol psprintf referenced in function test_singlerowmode [C:\projects\postgresql\libpq_pipeline.vcxproj] pg_snprintf, pg_vfprintf, pg_fprintf, pg_printf, pg_strerror are in pgport. pg_strdup and pfree, psprintf are in pgcommon. I don't know where do __WSAFDIsSet and __imp_select come from or what to do about them. Let's see if adding pgport and pgcommon fixes things. -- Álvaro Herrera39°49'30"S 73°17'W diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..c87b0ce911 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results t
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-04, Alvaro Herrera wrote: > I think the problem is that the project is called pipeline and not > test_libpq, so there's no match in the name. I'm going to rename the > whole thing to src/test/modules/libpq_pipeline/ and see if the msvc > tooling likes that better. v31. -- Álvaro Herrera Valdivia, Chile "I'm impressed how quickly you are fixing this obscure issue. I came from MS SQL and it would be hard for me to put into words how much of a better job you all are doing on [PostgreSQL]." Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg0.php diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..c87b0ce911 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + Result processing is described below. + + + + The server executes statements, and returns results, in the order the + client sends them. The server will begin executing the commands in the + pipeline immediately, not waiting for the end of the pipeline. + If any statement encounters an error, the server aborts the current + transaction and skips processing commands in the pipeline until the + next synchronization point established by PQsendPipeline. + (This remains true even if the commands in the pipeline would rollback + the transaction.) + Query processing resumes after the synchronization point. + + + + It's fine for one
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-04, Alvaro Herrera wrote: > v30 contains changes to hopefully make it build on MSVC. Hm, that didn't work -- appveyor still says: Project "C:\projects\postgresql\pgsql.sln" (1) is building "C:\projects\postgresql\pipeline.vcxproj" (75) on node 1 (default targets). PrepareForBuild: Creating directory ".\Release\pipeline\". Creating directory ".\Release\pipeline\pipeline.tlog\". InitializeBuildStatus: Creating ".\Release\pipeline\pipeline.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified. ClCompile: C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64\CL.exe /c /Isrc/include /Isrc/include/port/win32 /Isrc/include/port/win32_msvc /Zi /nologo /W3 /WX- /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D __WIN32__ /D WIN32_STACK_RLIMIT=4194304 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope /Fo".\Release\pipeline\\" /Fd".\Release\pipeline\vc120.pdb" /Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /errorReport:queue /MP src/test/modules/test_libpq/pipeline.c pipeline.c src/test/modules/test_libpq/pipeline.c(11): fatal error C1083: Cannot open include file: 'libpq-fe.h': No such file or directory [C:\projects\postgresql\pipeline.vcxproj] Done Building Project "C:\projects\postgresql\pipeline.vcxproj" (default targets) -- FAILED. Project "C:\projects\postgresql\pgsql.sln" (1) is building "C:\projects\postgresql\test_parser.vcxproj" (76) on node 1 (default targets). I think the problem is that the project is called pipeline and not test_libpq, so there's no match in the name. I'm going to rename the whole thing to src/test/modules/libpq_pipeline/ and see if the msvc tooling likes that better. -- Álvaro Herrera39°49'30"S 73°17'W
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
v30 contains changes to hopefully make it build on MSVC. -- Álvaro Herrera Valdivia, Chile diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..c16befa314 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + Result processing is described below. + + + + The server executes statements, and returns results, in the order the + client sends them. The server will begin executing the commands in the + pipeline immediately, not waiting for the end of the pipeline. + If any statement encounters an error, the server aborts the current + transaction and skips processing commands in the pipeline until the + next synchronization point established by PQsendPipeline. + (This remains true even if the commands in the pipeline would rollback + the transaction.) + Query processing resumes after the synchronization point. + + + + It's fine for one operation to depend on the results of a + prior one; for example, one query may define a table that the next + query in the same pipeline uses. Similarly, an application may + create a named prepared statement and execute it with later + statements in the same pipeline. + + + + +Processing Results + + + To process the result of one query in a pipeline, the application calls + PQgetResult repeatedly and handles each result + until PQgetResult re
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
> I think it's just because you forgot the patch. > https://www.postgresql.org/message-id/20210304142627.GA5978%40alvherre.pgsql -- Álvaro Herrera39°49'30"S 73°17'W diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 0553279314..c16befa314 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3173,6 +3173,33 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a +synchronization point in pipeline mode, requested by +. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that has +received an error from the server. PQgetResult +must be called repeatedly, and each time it will return this status code +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4919,6 +4946,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines, the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before it switches to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode, the application dispatches requests using + , + or its prepared-query sibling + . + These requests are queued on the client-side until flushed to the server; + this occurs when is used to + establish a synchronization point in the pipeline, + or when is called. + The functions , + , and + also work in pipeline mode. + Result processing is described below. + + + + The server executes statements, and returns results, in the order the + client sends them. The server will begin executing the commands in the + pipeline immediately, not waiting for the end of the pipeline. + If any statement encounters an error, the server aborts the current + transaction and skips processing commands in the pipeline until the + next synchronization point established by PQsendPipeline. + (This remains true even if the commands in the pipeline would rollback + the transaction.) + Query processing resumes after the synchronization point. + + + + It's fine for one operation to depend on the results of a + prior one; for example, one query may define a table that the next + query in the same pipeline uses. Similarly, an application may + create a named prepared statement and execute it with later + statements in the same pipeline. + + + + +Processing Results + + + To process the result of one query in a pipeline, the
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Thu, Mar 04, 2021 at 12:01:37PM -0300, Alvaro Herrera wrote: > Apparently, the archives system or the commitfest system is not picking > up new messages to the thread, so the CF app is trying to apply a > very old patch version. I'm not sure what's up with that. Thomas, any > clues on where to look? I think it's just because you forgot the patch. https://www.postgresql.org/message-id/20210304142627.GA5978%40alvherre.pgsql -- Justin
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Thu, Mar 4, 2021 at 8:01 PM Alvaro Herrera wrote: > Apparently, the archives system or the commitfest system is not picking > up new messages to the thread, so the CF app is trying to apply a > very old patch version. I'm not sure what's up with that. Thomas, any > clues on where to look? > > -- > Álvaro Herrera Valdivia, Chile > "Oh, great altar of passive entertainment, bestow upon me thy discordant > images > at such speed as to render linear thought impossible" (Calvin a la TV) > Hi Alvaro, The thread splits and CF app still has the previous thread. The old thread has the v18 as the latest patch which is failing. The new thread which ( https://www.postgresql.org/message-id/CAJkzx4T5E-2cQe3dtv2R78dYFvz%2Bin8PY7A8MArvLhs_pg75gg%40mail.gmail.com ) has the new patch. -- Ibrar Ahmed
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Apparently, the archives system or the commitfest system is not picking up new messages to the thread, so the CF app is trying to apply a very old patch version. I'm not sure what's up with that. Thomas, any clues on where to look? -- Álvaro Herrera Valdivia, Chile "Oh, great altar of passive entertainment, bestow upon me thy discordant images at such speed as to render linear thought impossible" (Calvin a la TV)
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2021-Mar-04, Ibrar Ahmed wrote: > The build is failing for this patch, can you please take a look at this? > > https://cirrus-ci.com/task/4568547922804736 > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.129221 > > > I am marking the patch "Waiting on Author" I don't know why you chose to respond to such an old message, but here's a rebase which also includes the workaround I suggested last night for the problem of the outdated query printed by error messages, as well as Justin's suggested fixes. (I also made a few tweaks to the TAP test). -- Álvaro Herrera Valdivia, Chile "No renuncies a nada. No te aferres a nada."
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Wed, Jul 15, 2020 at 12:18 AM Andres Freund wrote: > Hi, > > On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote: > > Totally unasked for, here's a rebase of this patch series. I didn't do > > anything other than rebasing to current master, solving a couple of very > > trivial conflicts, fixing some whitespace complaints by git apply, and > > running tests to verify everthing works. > > > > I don't foresee working on this at all, so if anyone is interested in > > seeing this feature in, I encourage them to read and address > > Horiguchi-san's feedback. > > Nor am I planning to do so, but I do think its a pretty important > improvement. > > > > > +/* > > + * PQrecyclePipelinedCommand > > + * Push a command queue entry onto the freelist. It must be a > dangling entry > > + * with null next pointer and not referenced by any other entry's > next pointer. > > + */ > > Dangling sounds a bit like it's already freed. > > > > > +/* > > + * PQbatchSendQueue > > + * End a batch submission by sending a protocol sync. The connection > will > > + * remain in batch mode and unavailable for new synchronous command > execution > > + * functions until all results from the batch are processed by the > client. > > I feel like the reference to the protocol sync is a bit too low level > for an external API. It should first document what the function does > from a user's POV. > > I think it'd also be good to document whether / whether not queries can > already have been sent before PQbatchSendQueue is called or not. > > > > +/* > > + * PQbatchProcessQueue > > + *In batch mode, start processing the next query in the queue. > > + * > > + * Returns 1 if the next query was popped from the queue and can > > + * be processed by PQconsumeInput, PQgetResult, etc. > > + * > > + * Returns 0 if the current query isn't done yet, the connection > > + * is not in a batch, or there are no more queries to process. > > + */ > > +int > > +PQbatchProcessQueue(PGconn *conn) > > +{ > > + PGcommandQueueEntry *next_query; > > + > > + if (!conn) > > + return 0; > > + > > + if (conn->batch_status == PQBATCH_MODE_OFF) > > + return 0; > > + > > + switch (conn->asyncStatus) > > + { > > + case PGASYNC_COPY_IN: > > + case PGASYNC_COPY_OUT: > > + case PGASYNC_COPY_BOTH: > > + printfPQExpBuffer(&conn->errorMessage, > > +libpq_gettext_noop("internal error, > COPY in batch mode")); > > + break; > > Shouldn't there be a return 0 here? > > > > > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass > != PGQUERY_SYNC) > > + { > > + /* > > + * In an aborted batch we don't get anything from the > server for each > > + * result; we're just discarding input until we get to the > next sync > > + * from the server. The client needs to know its queries > got aborted > > + * so we create a fake PGresult to return immediately from > > + * PQgetResult. > > + */ > > + conn->result = PQmakeEmptyPGresult(conn, > > + > PGRES_BATCH_ABORTED); > > + if (!conn->result) > > + { > > + printfPQExpBuffer(&conn->errorMessage, > > + > libpq_gettext("out of memory")); > > + pqSaveErrorResult(conn); > > + return 0; > > Is there any way an application can recover at this point? ISTM we'd be > stuck in the previous asyncStatus, no? > > > > +/* pqBatchFlush > > + * In batch mode, data will be flushed only when the out buffer reaches > the threshold value. > > + * In non-batch mode, data will be flushed all the time. > > + */ > > +static int > > +pqBatchFlush(PGconn *conn) > > +{ > > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= > OUTBUFFER_THRESHOLD)) > > + return(pqFlush(conn)); > > + return 0; /* Just to keep compiler quiet */ > > +} > > unnecessarily long line. > > > > +/* > > + * Connection's outbuffer threshold is set to 64k as it is safe > > + * in Windows as per comments in pqSendSome() API. > > + */ > > +#define OUTBUFFER_THRESHOLD 65536 > > I don't think the comment explains much. It's fine to send more than 64k > with pqSendSome(), they'll just be send with separate pgsecure_write() > invocations. And only on windows. > > It clearly makes sense to start sending out data at a certain > granularity to avoid needing unnecessary amounts of memory, and to make > more efficient use of latency / serer side compute. > > It's not implausible that 64k is the right amount for that, I just don't > think the explanation above is good. > > > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c > b/src/test/modules/test_libpq/testlibpqbatch.c > > new file mode 100644 > > index 00..4d6ba266e5 > > --- /dev/null > > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
Re: PATCH: Batch/pipelining support for libpq
On 2021-Mar-03, 'Alvaro Herrera' wrote: > On 2021-Mar-03, 'Alvaro Herrera' wrote: > > > This should obviously not occur. I'm trying to figure out how to repair > > it and not break everything again ... > > I think trying to set up the connection state so that the next query > appears in conn->last_query prior to PQgetResult being called again > leads to worse breakage. The simplest fix seems to make fe-protocol3.c > aware that in this case, the query is in conn->cmd_queue_head instead, > as in the attached patch. I wonder if it would make sense to get rid of conn->last_query completely and just rely always on conn->cmd_queue_head, where the normal (non-pipeline) would use the first entry of the command queue. That might end up being simpler than pipeline mode "pretending" to take over ->last_query. -- Álvaro Herrera39°49'30"S 73°17'W "La verdad no siempre es bonita, pero el hambre de ella sí"
Re: PATCH: Batch/pipelining support for libpq
I'm proposing some minor changes. -- Justin >From 35ed4dc1fc770834972396f7eeed142f6dabee88 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Wed, 3 Mar 2021 16:36:40 -0600 Subject: [PATCH 2/2] doc review --- doc/src/sgml/libpq.sgml| 67 +- src/interfaces/libpq/fe-exec.c | 2 +- src/test/modules/test_libpq/pipeline.c | 6 +-- 3 files changed, 38 insertions(+), 37 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index a34ecbb144..e2865a218b 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3130,8 +3130,9 @@ ExecStatusType PQresultStatus(const PGresult *res); PGRES_PIPELINE_SYNC -The PGresult represents a point in a -pipeline where a synchronization point has been established. +The PGresult represents a +synchronization point in pipeline mode, requested by +. This status occurs only when pipeline mode has been selected. @@ -3141,9 +3142,9 @@ ExecStatusType PQresultStatus(const PGresult *res); PGRES_PIPELINE_ABORTED -The PGresult represents a pipeline that's +The PGresult represents a pipeline that has received an error from the server. PQgetResult -must be called repeatedly, and it will return this status code, +must be called repeatedly, and each time it will return this status code until the end of the current pipeline, at which point it will return PGRES_PIPELINE_SYNC and normal processing can resume. @@ -4953,7 +4954,7 @@ int PQflush(PGconn *conn); Using Pipeline Mode -To issue pipelines the application must switch a connection into pipeline mode. +To issue pipelines, the application must switch a connection into pipeline mode. Enter pipeline mode with or test whether pipeline mode is active with . @@ -4975,7 +4976,7 @@ int PQflush(PGconn *conn); server will block trying to send results to the client from queries it has already processed. This only occurs when the client sends enough queries to fill both its output buffer and the server's receive -buffer before switching to processing input from the server, +buffer before it switches to processing input from the server, but it's hard to predict exactly when that will happen. @@ -5025,7 +5026,7 @@ int PQflush(PGconn *conn); Processing Results - To process the result of one pipeline query, the application calls + To process the result of one query in a pipeline, the application calls PQgetResult repeatedly and handles each result until PQgetResult returns null. The result from the next query in the pipeline may then be retrieved using @@ -5057,10 +5058,10 @@ int PQflush(PGconn *conn); PGresult types PGRES_PIPELINE_SYNC and PGRES_PIPELINE_ABORTED. PGRES_PIPELINE_SYNC is reported exactly once for each - PQsendPipeline call at the corresponding point in - the result stream. + PQsendPipeline after retrieving results for all + queries in the pipeline. PGRES_PIPELINE_ABORTED is emitted in place of a normal - result stream result for the first error and all subsequent results + stream result for the first error and all subsequent results except PGRES_PIPELINE_SYNC and null; see . @@ -5075,7 +5076,8 @@ int PQflush(PGconn *conn); application about the query currently being processed (except that PQgetResult returns null to indicate that we start returning the results of next query). The application must keep track - of the order in which it sent queries and the expected results. + of the order in which it sent queries, to associate them with their + corresponding results. Applications will typically use a state machine or a FIFO queue for this. @@ -5091,10 +5093,10 @@ int PQflush(PGconn *conn); - From the client perspective, after the client gets a - PGRES_FATAL_ERROR return from - PQresultStatus the pipeline is flagged as aborted. - libpq will report + From the client perspective, after PQresultStatus + returns PGRES_FATAL_ERROR, + the pipeline is flagged as aborted. + PQresultStatus, will report a PGRES_PIPELINE_ABORTED result for each remaining queued operation in an aborted pipeline. The result for PQsendPipeline is reported as @@ -5108,8 +5110,8 @@ int PQflush(PGconn *conn); - If the pipeline used an implicit transaction then operations that have - already executed are rolled back and operations that were queued for after + If the pipeline used an implicit transaction, then operations that have + already executed are rolled back and operations that we
Re: PATCH: Batch/pipelining support for libpq
On 2021-Mar-03, 'Alvaro Herrera' wrote: > This should obviously not occur. I'm trying to figure out how to repair > it and not break everything again ... I think trying to set up the connection state so that the next query appears in conn->last_query prior to PQgetResult being called again leads to worse breakage. The simplest fix seems to make fe-protocol3.c aware that in this case, the query is in conn->cmd_queue_head instead, as in the attached patch. -- Álvaro Herrera39°49'30"S 73°17'W diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c index 584922b340..b89f500902 100644 --- a/src/interfaces/libpq/fe-protocol3.c +++ b/src/interfaces/libpq/fe-protocol3.c @@ -962,9 +962,21 @@ pqGetErrorNotice3(PGconn *conn, bool isError) * Save the active query text, if any, into res as well; but only if we * might need it for an error cursor display, which is only true if there * is a PG_DIAG_STATEMENT_POSITION field. + * + * Note that in pipeline mode, we have not yet advanced the query pointer + * to the next query, so we have to look at that. */ - if (have_position && conn->last_query && res) - res->errQuery = pqResultStrdup(res, conn->last_query); + if (have_position && res) + { + if (conn->pipelineStatus != PQ_PIPELINE_OFF && + conn->asyncStatus == PGASYNC_IDLE) + { + if (conn->cmd_queue_head) +res->errQuery = pqResultStrdup(res, conn->cmd_queue_head->query); + } + else if (conn->last_query) + res->errQuery = pqResultStrdup(res, conn->last_query); + } /* * Now build the "overall" error message for PQresultErrorMessage.
Re: PATCH: Batch/pipelining support for libpq
On 2021-Mar-03, 'Alvaro Herrera' wrote: > I'm much more comfortable with this version, so I'm marking the patch as > Ready for Committer in case anybody else wants to look at this before I > push it. Actually, I just noticed a pretty serious problem, which is that when we report an error, we don't set "conn->errQuery" to the current query but to the *previous* query, leading to non-sensical error reports like $ ./pipeline pipeline_abort aborted pipeline... ERROR: no existe la función no_such_function(integer) LINE 1: INSERT INTO pq_pipeline_demo(itemno) VALUES ($1); ^ HINT: Ninguna función coincide en el nombre y tipos de argumentos. Puede ser necesario agregar conversión explícita de tipos. ok (Sorry about the Spanish there, but the gist of the problem is that we're positioning the error cursor on a query that succeeded prior to the query that raised the error.) This should obviously not occur. I'm trying to figure out how to repair it and not break everything again ... -- Álvaro Herrera Valdivia, Chile
Re: PATCH: Batch/pipelining support for libpq
On 2021-Mar-03, k.jami...@fujitsu.com wrote: > I tried applying this patch to test it on top of Iwata-san's libpq trace log > [1]. > In my environment, the compiler complains. > It seems that in libpqwalreceiver.c: libpqrcv_exec() > the switch for PQresultStatus needs to handle the > cases for PGRES_PIPELINE_END and PGRES_PIPELINE_ABORTED too. Yeah, I noticed this too and decided to handle these cases as errors. Here's v28, which includes that fix, and also gets rid of the PGASYNC_QUEUED state; I instead made the rest of the code work using the existing PGASYNC_IDLE state. This allows us get rid of a few cases where we had to report "internal error: IDLE state unexpected" in a few cases, and was rather pointless. With the current formulation, both pipeline mode and normal mode share pretty much all state. Also, I've renamed PGRES_PIPELINE_END to PGRES_PIPELINE_SYNC, because that's closer to what it means: that result state by no means that pipeline mode has ended. It only means that you called PQsendPipeline() so the server gives you a Sync message. I'm much more comfortable with this version, so I'm marking the patch as Ready for Committer in case anybody else wants to look at this before I push it. However: on errors, I think it's a bit odd that you get a NULL from PQgetResult after getting PGRES_PIPELINE_ABORTED. Maybe I should suppress that. I'm no longer wrestling with the NULL after PGRES_PIPELINE_END however, because that's not critical and you can not ask for it and things work fine. (Also, the test pipeline.c program in src/test/modules/test_libpq has a new "transaction" mode that is not exercised by the TAP program; I'll plug that in before commit.) -- Álvaro Herrera Valdivia, Chile diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index cc20b0f234..4ea8cb8c93 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3199,6 +3199,32 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_SYNC + + +The PGresult represents a point in a +pipeline where a synchronization point has been established. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_SYNC and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4957,6 +4983,498 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill both its output buffer and the server's receive +buffer before switching to processing input from the server, +but it's hard to predict exactly when that will happen. + + +
RE: PATCH: Batch/pipelining support for libpq
On Wed, Feb 17, 2021 8:14 AM (JST), Alvaro Herrera wrote: Hi Alvaro, > Here's a new version, where I've renamed everything to "pipeline". I think > the > docs could use some additional tweaks now in order to make a coherent > story on pipeline mode, how it can be used in a batched fashion, etc. I tried applying this patch to test it on top of Iwata-san's libpq trace log [1]. In my environment, the compiler complains. It seems that in libpqwalreceiver.c: libpqrcv_exec() the switch for PQresultStatus needs to handle the cases for PGRES_PIPELINE_END and PGRES_PIPELINE_ABORTED too. switch (PQresultStatus(pgres)) { ... Case PGRES_PIPELINE_ABORTED: ... Case PGRES_PIPELINE_ END: Regards, Kirk Jamison [1] https://www.postgresql.org/message-id/flat/1d650278-552a-4bd2-8411-a907fd54446c%40www.fastmail.com#32df02a4e2a08185508a2126e6e0caf1
Re: PATCH: Batch/pipelining support for libpq
On 2021-Feb-20, Craig Ringer wrote: > On Wed, 17 Feb 2021, 07:13 Alvaro Herrera, wrote: > > I > > think the docs could use some additional tweaks now in order to make a > > coherent story on pipeline mode, how it can be used in a batched > > fashion, etc. > > I'll do that soon and send an update. I started to do that, but was sidetracked having a look at error handling while checking one of the claims. So now it starts with this: Issuing Queries After entering pipeline mode, the application dispatches requests using , or its prepared-query sibling . These requests are queued on the client-side until flushed to the server; this occurs when is used to establish a synchronization point in the pipeline, or when is called. The functions , , and also work in pipeline mode. Result processing is described below. The server executes statements, and returns results, in the order the client sends them. The server will begin executing the commands in the pipeline immediately, not waiting for the end of the pipeline. If any statement encounters an error, the server aborts the current transaction and skips processing commands in the pipeline until the next synchronization point established by PQsendPipeline. (This remains true even if the commands in the pipeline would rollback the transaction.) Query processing resumes after the synchronization point. (Note I changed the wording that "the pipeline is ended by PQsendPipeline" to "a synchronization point is established". Is this easily understandable? On re-reading it, I'm not sure it's really an improvement.) BTW we don't explain why doesn't PQsendQuery work (to wit: because it uses "simple" query protocol and thus doesn't require the Sync message). I think we should either document that, or change things so that PQsendQuery no longer uses a 'Q' message when in pipeline mode; instead use extended query protocol. I don't see why we'd force people to use PQsendQueryParams when not necessary. BTW, in my experimentation, the sequence of PGresult that you get when handling a pipeline with errors don't make a lot of sense. I'll spend some more time on it. While at it, as for the PGresult sequence and NULL returns: I think for PQexec and maybe PQsendQuery, it makes sense to loop until PQgetResult returns NULL, because you never know how many results are you going to get. But for PQsendQueryParams, this is no longer true, because you can't send multiple queries in one query string. So the only way to get multiple results, is by using single-row mode. But that already has its own protocol for multiple results, namely to get a stream of PGRES_SINGLE_TUPLE terminated by a zero-rows PGRES_TUPLES_OK. So I'm not sure there is a strong need for the mandatory NULL result. -- Álvaro Herrera39°49'30"S 73°17'W
Re: PATCH: Batch/pipelining support for libpq
On Wed, 17 Feb 2021, 07:13 Alvaro Herrera, wrote: > Here's a new version, where I've renamed everything to "pipeline". Wow. Thanks. I > think the docs could use some additional tweaks now in order to make a > coherent story on pipeline mode, how it can be used in a batched > fashion, etc. > I'll do that soon and send an update. > >
Re: PATCH: Batch/pipelining support for libpq
Hi, Thanks for the response. On Fri, Feb 19, 2021 at 12:46 PM Alvaro Herrera wrote: > Hi, > > On 2021-Feb-19, Zhihong Yu wrote: > > > Hi, > > +static int pqBatchProcessQueue(PGconn *conn); > > > > I was suggesting changing: > > > > +int > > +PQexitBatchMode(PGconn *conn) > > > > to: > > > > +static int > > +PQexitBatchMode(PGconn *conn) > > I understand that, but what I'm saying is that it doesn't work. > pqBatchProcessQueue can be static because it's only used internally in > libpq, but PQexitBatchMode is supposed to be called by the client > application. > > -- > Álvaro Herrera Valdivia, Chile > "No necesitamos banderas > No reconocemos fronteras" (Jorge González) >
Re: PATCH: Batch/pipelining support for libpq
Hi, On 2021-Feb-19, Zhihong Yu wrote: > Hi, > +static int pqBatchProcessQueue(PGconn *conn); > > I was suggesting changing: > > +int > +PQexitBatchMode(PGconn *conn) > > to: > > +static int > +PQexitBatchMode(PGconn *conn) I understand that, but what I'm saying is that it doesn't work. pqBatchProcessQueue can be static because it's only used internally in libpq, but PQexitBatchMode is supposed to be called by the client application. -- Álvaro Herrera Valdivia, Chile "No necesitamos banderas No reconocemos fronteras" (Jorge González)
Re: PATCH: Batch/pipelining support for libpq
Hi, +static int pqBatchProcessQueue(PGconn *conn); I was suggesting changing: +int +PQexitBatchMode(PGconn *conn) to: +static int +PQexitBatchMode(PGconn *conn) Cheers On Fri, Feb 19, 2021 at 10:43 AM Alvaro Herrera wrote: > On 2021-Jan-21, Zhihong Yu wrote: > > > It seems '\\gset or \\aset is not ' would correspond to the check more > > closely. > > > > + if (my_command->argc != 1) > > + syntax_error(source, lineno, my_command->first_line, > > my_command->argv[0], > > > > It is possible that my_command->argc == 0 (where my_command->argv[0] > > shouldn't be accessed) ? > > No -- process_backslash_command reads the word and sets it as argv[0]. > > > + appendPQExpBufferStr(&conn->errorMessage, > > + libpq_gettext("cannot queue commands > > during COPY\n")); > > + return false; > > + break; > > > > Is the break necessary ? Similar comment for pqBatchProcessQueue(). > > Not necessary. I removed them. > > > +int > > +PQexitBatchMode(PGconn *conn) > > > > Since either 0 or 1 is returned, maybe change the return type to bool. > > I was kinda tempted to do that, but in reality a lot of libpq's API is > defined like that -- to return 0 (failure)/1 (success) as ints, not > bools. For example see PQsendQuery(). I'm not inclined to do different > for these new functions. (One curious case is PQsetvalue, which returns > int, and is documented as "returns zero for failure" and then uses > "return false"). > > > Also, the function operates on PGconn - should the function be static > > (pqBatchProcessQueue is) ? > > I don't understand this suggestion. How would an application call it, > if we make it static? > > Thanks > > -- > Álvaro Herrera Valdivia, Chile > "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; > jugar > al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en > La Feria de las Tinieblas, R. Bradbury) >
Re: PATCH: Batch/pipelining support for libpq
On 2021-Jan-21, Zhihong Yu wrote: > It seems '\\gset or \\aset is not ' would correspond to the check more > closely. > > + if (my_command->argc != 1) > + syntax_error(source, lineno, my_command->first_line, > my_command->argv[0], > > It is possible that my_command->argc == 0 (where my_command->argv[0] > shouldn't be accessed) ? No -- process_backslash_command reads the word and sets it as argv[0]. > + appendPQExpBufferStr(&conn->errorMessage, > + libpq_gettext("cannot queue commands > during COPY\n")); > + return false; > + break; > > Is the break necessary ? Similar comment for pqBatchProcessQueue(). Not necessary. I removed them. > +int > +PQexitBatchMode(PGconn *conn) > > Since either 0 or 1 is returned, maybe change the return type to bool. I was kinda tempted to do that, but in reality a lot of libpq's API is defined like that -- to return 0 (failure)/1 (success) as ints, not bools. For example see PQsendQuery(). I'm not inclined to do different for these new functions. (One curious case is PQsetvalue, which returns int, and is documented as "returns zero for failure" and then uses "return false"). > Also, the function operates on PGconn - should the function be static > (pqBatchProcessQueue is) ? I don't understand this suggestion. How would an application call it, if we make it static? Thanks -- Álvaro Herrera Valdivia, Chile "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en La Feria de las Tinieblas, R. Bradbury)
Re: PATCH: Batch/pipelining support for libpq
Hello, thanks for looking at this patch. On 2021-Feb-16, Zhihong Yu wrote: > + if (querymode == QUERY_SIMPLE) > + { > + commandFailed(st, "startpipeline", "cannot use pipeline mode > with the simple query protocol"); > + st->state = CSTATE_ABORTED; > + return CSTATE_ABORTED; > > I wonder why the st->state is only assigned for this if block. The state is > not set for other cases where CSTATE_ABORTED is returned. Yeah, that's a simple oversight. We don't need to set st->state, because the caller sets it to the value we return. > Should PQ_PIPELINE_OFF be returned for the following case ? > > +PQpipelineStatus(const PGconn *conn) > +{ > + if (!conn) > + return false; Yep. Thanks -- Álvaro Herrera39°49'30"S 73°17'W
Re: PATCH: Batch/pipelining support for libpq
Hi, + if (querymode == QUERY_SIMPLE) + { + commandFailed(st, "startpipeline", "cannot use pipeline mode with the simple query protocol"); + st->state = CSTATE_ABORTED; + return CSTATE_ABORTED; I wonder why the st->state is only assigned for this if block. The state is not set for other cases where CSTATE_ABORTED is returned. Should PQ_PIPELINE_OFF be returned for the following case ? +PQpipelineStatus(const PGconn *conn) +{ + if (!conn) + return false; Cheers On Tue, Feb 16, 2021 at 3:14 PM Alvaro Herrera wrote: > Here's a new version, where I've renamed everything to "pipeline". I > think the docs could use some additional tweaks now in order to make a > coherent story on pipeline mode, how it can be used in a batched > fashion, etc. > > Here's the renames I applied. It's mostly mechanical, except > PQbatchSendQueue is now PQsendPipeline: > > PQBatchStatus -> PGpipelineStatus (enum) > PQBATCH_MODE_OFF -> PQ_PIPELINE_OFF > PQBATCH_MODE_ON -> PQ_PIPELINE_ON > PQBATCH_MODE_ABORTED -> PQ_PIPELINE_ABORTED > PQbatchStatus -> PQpipelineStatus (function) > PQenterBatchMode -> PQenterPipelineMode > PQexitBatchMode -> PQexitPipelineMode > PQbatchSendQueue -> PQsendPipeline > PGRES_BATCH_END -> PGRES_PIPELINE_END > PGRES_BATCH_ABORTED -> PGRES_PIPELINE_ABORTED > > Also, PQbatchStatus(conn) returns enum PGpipelineStatus (it previously > returned int). > > I'm tempted to rename PGASYNC_QUEUED to PGASYNC_PIPELINE_IDLE (not sure > if PGASYNC_PIPELINE_READY fits better with the existing one). > > > In pgbench, I changed the metacommands to be \startpipeline and > \endpipeline. There's a failing Assert() there which I commented out; > needs fixed. > > -- > Álvaro Herrera39°49'30"S 73°17'W >
Re: PATCH: Batch/pipelining support for libpq
Here's a new version, where I've renamed everything to "pipeline". I think the docs could use some additional tweaks now in order to make a coherent story on pipeline mode, how it can be used in a batched fashion, etc. Here's the renames I applied. It's mostly mechanical, except PQbatchSendQueue is now PQsendPipeline: PQBatchStatus -> PGpipelineStatus (enum) PQBATCH_MODE_OFF -> PQ_PIPELINE_OFF PQBATCH_MODE_ON -> PQ_PIPELINE_ON PQBATCH_MODE_ABORTED -> PQ_PIPELINE_ABORTED PQbatchStatus -> PQpipelineStatus (function) PQenterBatchMode -> PQenterPipelineMode PQexitBatchMode -> PQexitPipelineMode PQbatchSendQueue -> PQsendPipeline PGRES_BATCH_END -> PGRES_PIPELINE_END PGRES_BATCH_ABORTED -> PGRES_PIPELINE_ABORTED Also, PQbatchStatus(conn) returns enum PGpipelineStatus (it previously returned int). I'm tempted to rename PGASYNC_QUEUED to PGASYNC_PIPELINE_IDLE (not sure if PGASYNC_PIPELINE_READY fits better with the existing one). In pgbench, I changed the metacommands to be \startpipeline and \endpipeline. There's a failing Assert() there which I commented out; needs fixed. -- Álvaro Herrera39°49'30"S 73°17'W diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index b7a82453f0..9f98257dbe 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3099,6 +3099,31 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_PIPELINE_END + + +The PGresult represents the end of a pipeline. +This status occurs only when pipeline mode has been selected. + + + + + + PGRES_PIPELINE_ABORTED + + +The PGresult represents a pipeline that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current pipeline, at which point it will return +PGRES_PIPELINE_END and normal processing can +resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4857,6 +4882,494 @@ int PQflush(PGconn *conn); + + Pipeline Mode + + + libpq + pipeline mode + + + + pipelining + in libpq + + + + batch mode + in libpq + + + + libpq pipeline mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the pipeline mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While pipeline mode provides a significant performance boost, writing + clients using the pipeline mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Pipeline mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Pipeline Mode + + +To issue pipelines the application must switch a connection into pipeline mode. +Enter pipeline mode with +or test whether pipeline mode is active with +. +In pipeline mode, only asynchronous operations +are permitted, and COPY is disallowed. +Using any synchronous command execution functions, +such as PQfn, or PQexec +and its sibling functions, is an error condition. + + + + + It is best to use pipeline mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill its output buffer and the server's receive +buffer before switching to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering pipeline mode the application dispatches requests using + normal asynchronous libpq functions, such as: + PQsendQueryParams, PQsendPrepare, + PQsendQueryPrepared, PQsendDescribePortal, + and PQsendDescribePrepared. + The asynchronous requests are followed by a + + call to mark the end of the pipeline. The client need not + call PQgetResult immediately after + dispatching each operation. + Result processing + is handled separately. + + + + The serve
Re: PATCH: Batch/pipelining support for libpq
On 2021-Feb-16, Craig Ringer wrote: > FWIW I'm also thinking of revising the docs to mostly use the term > "pipeline" instead of "batch". Use "pipelining and batching" in the chapter > topic, and mention "batch" in the index, and add a para that explains how > to run batches on top of pipelining, but otherwise use the term "pipeline" > not "batch". Hmm, this is a good point. It means I have a lot of API renaming to do. I'll get on it now and come back with a proposal. -- Álvaro Herrera Valdivia, Chile
Re: PATCH: Batch/pipelining support for libpq
On Tue, 16 Feb 2021 at 09:19, Craig Ringer wrote: > > So long as there is a way to "send A", "send B", "send C", "read results > from A", "send D", and there's a way for the application to associate some > kind of state (an application specific id or index, a pointer to an > application query-queue struct, whatever) so it can match queries to > results ... then I'm happy. > FWIW I'm also thinking of revising the docs to mostly use the term "pipeline" instead of "batch". Use "pipelining and batching" in the chapter topic, and mention "batch" in the index, and add a para that explains how to run batches on top of pipelining, but otherwise use the term "pipeline" not "batch". That's because pipelining isn't actually batching, and using it as a naïve batch interface will get you in trouble. If you PQsendQuery(...) a long list of queries without consuming results, you'll block unless you're in libpq's nonblocking-send mode. You'll then be deadlocked because the server can't send results until you consume some (tx buffer full) and can't consume queries until it can send some results, but you can't consume results because you're blocked on a send that'll never end. An actual batch interface where you can bind and send a long list of queries might be worthwhile, but should be addressed separately, and it'd be confusing if this pipelining interface claimed the term "batch". I'm not convinced enough application developers actually code directly against libpq for it to be worth creating a pretty batch interface where you can submit (say) an array of struct PQbatchEntry { const char * query, int nparams, ... } to a PQsendBatch(...) and let libpq handle the socket I/O for you. But if someone wants to add one later it'll be easier if we don't use the term "batch" for the pipelining feature. I'll submit a reworded patch in a bit.
Re: PATCH: Batch/pipelining support for libpq
On Thu, 11 Feb 2021 at 07:51, Alvaro Herrera wrote: > On 2021-Jan-21, Alvaro Herrera wrote: > > > As you can see in an XXX comment in the libpq test program, the current > > implementation has the behavior that PQgetResult() returns NULL after a > > batch is finished and has reported PGRES_BATCH_END. I don't know if > > there's a hard reason to do that, but I'd like to supress it because it > > seems weird and out of place. > > Hello Craig, a question for you since this API is your devising. I've > been looking at changing the way this works to prevent those NULL > returns from PQgetResult. That is, instead of having what seems like a > "NULL separator" of query results, you'd just get the PGRES_BATCH_END to > signify a batch end (not a NULL result after the BATCH_END); and the > normal PGRES_COMMAND_OK or PGRES_TUPLES_OK etc when the result of a > command has been sent. It's not working yet so I'm not sending an > updated patch, but I wanted to know if you had a rationale for including > this NULL return "separator" or was it just a convenience because of how > the code grew together. > The existing API for libpq actually specifies[1] that you should call PQgetResult() until it returns NULL: > After successfully calling PQsendQuery, call PQgetResult one or more times to obtain the results. PQsendQuery cannot be called again (on the same connection) until PQgetResult has returned a null pointer, indicating that the command is done. Similarly, in single-row mode, the existing API specifies that you should call PQgetResult() until it returns NULL. Also, IIRC the protocol already permits multiple result sets to be returned, and the caller cannot safely assume that a single PQsendQuery() will produce only a single result set. (I really should write a test extension that exercises that and how libpq reacts to it.) That's why I went for the NULL return. I think. It's been a while now so it's a bit fuzzy. I would definitely like an API that does not rely on testing for a NULL return. Especially since NULL return can be ambiguous in the context of row-at-a-time mode. New explicit enumerations for PGresult would make a lot more sense. So +1 from me for the general idea. I need to look at the patch as it has evolved soon too. Remember that the original patch I submitted for this was a 1-day weekend hack and proof of concept to show that libpq could be modified to support query pipelining (and thus batching too), so I could illustrate the performance benefits that could be attained by doing so. I'd been aware of the benefits and the protocol's ability to support it for some time thanks to working on PgJDBC, but couldn't get anyone interested without some C code to demonstrate it, so I wrote some. I am not going to argue that the API I added for it is ideal in any way, and happy to see improvements. The only change I would very strongly object to would be anything that turned this into a *batch* mode without query-pipelining support. If you have to queue all the queries up in advance then send them as a batch and wait for all the results, you miss out on a lot of the potential round-trip-time optimisations and you add initial latency. So long as there is a way to "send A", "send B", "send C", "read results from A", "send D", and there's a way for the application to associate some kind of state (an application specific id or index, a pointer to an application query-queue struct, whatever) so it can match queries to results ... then I'm happy. [1] https://www.postgresql.org/docs/current/libpq-async.html#LIBPQ-PQSENDQUERY
Re: PATCH: Batch/pipelining support for libpq
On 2021-Jan-21, Alvaro Herrera wrote: > As you can see in an XXX comment in the libpq test program, the current > implementation has the behavior that PQgetResult() returns NULL after a > batch is finished and has reported PGRES_BATCH_END. I don't know if > there's a hard reason to do that, but I'd like to supress it because it > seems weird and out of place. Hello Craig, a question for you since this API is your devising. I've been looking at changing the way this works to prevent those NULL returns from PQgetResult. That is, instead of having what seems like a "NULL separator" of query results, you'd just get the PGRES_BATCH_END to signify a batch end (not a NULL result after the BATCH_END); and the normal PGRES_COMMAND_OK or PGRES_TUPLES_OK etc when the result of a command has been sent. It's not working yet so I'm not sending an updated patch, but I wanted to know if you had a rationale for including this NULL return "separator" or was it just a convenience because of how the code grew together. Such a decision has obvious consequences for the test program (which right now expects that PQgetResult() returns NULL at each step); and naturally for libpq's internal state machine that controls how it all works. Thanks, -- Álvaro Herrera39°49'30"S 73°17'W
Re: PATCH: Batch/pipelining support for libpq
Hi, + commandFailed(st, "SQL", "\\gset and \\aset are not allowed in a batch section"); It seems '\\gset or \\aset is not ' would correspond to the check more closely. + if (my_command->argc != 1) + syntax_error(source, lineno, my_command->first_line, my_command->argv[0], It is possible that my_command->argc == 0 (where my_command->argv[0] shouldn't be accessed) ? + appendPQExpBufferStr(&conn->errorMessage, + libpq_gettext("cannot queue commands during COPY\n")); + return false; + break; Is the break necessary ? Similar comment for pqBatchProcessQueue(). +int +PQexitBatchMode(PGconn *conn) Since either 0 or 1 is returned, maybe change the return type to bool. Also, the function operates on PGconn - should the function be static (pqBatchProcessQueue is) ? Cheers On Thu, Jan 21, 2021 at 3:39 PM Alvaro Herrera wrote: > Thanks David Johnston and Daniel Vérité, I have incorporated your > changes into this patch, which is now v26. Also, it's been rebased on > current sources. > > I've been using the new PQtrace() stuff to verify the behavior of the > new feature. It's not perfect, but at least it doesn't crash > immediately as it did when I tried a few weeks ago. There are > imperfections that I think are due to bugs in the PQtrace > implementation, not in this patch. > > -- > Álvaro Herrera39°49'30"S 73°17'W > "El conflicto es el camino real hacia la unión" >
Re: PATCH: Batch/pipelining support for libpq
As you can see in an XXX comment in the libpq test program, the current implementation has the behavior that PQgetResult() returns NULL after a batch is finished and has reported PGRES_BATCH_END. I don't know if there's a hard reason to do that, but I'd like to supress it because it seems weird and out of place. -- Álvaro Herrera39°49'30"S 73°17'W
Re: PATCH: Batch/pipelining support for libpq
Thanks David Johnston and Daniel Vérité, I have incorporated your changes into this patch, which is now v26. Also, it's been rebased on current sources. I've been using the new PQtrace() stuff to verify the behavior of the new feature. It's not perfect, but at least it doesn't crash immediately as it did when I tried a few weeks ago. There are imperfections that I think are due to bugs in the PQtrace implementation, not in this patch. -- Álvaro Herrera39°49'30"S 73°17'W "El conflicto es el camino real hacia la unión" diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 2bb3bf77e4..d22b866677 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3099,6 +3099,30 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_BATCH_END + + +The PGresult represents the end of a batch. +This status occurs only when batch mode has been selected. + + + + + + PGRES_BATCH_ABORTED + + +The PGresult represents a batch that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current batch, at which point it will return +PGRES_BATCH_END and normal processing can resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4857,6 +4881,488 @@ int PQflush(PGconn *conn); + + Batch Mode + + + libpq + batch mode + + + + pipelining + in libpq + + + + libpq batch mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the batch mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While batch mode provides a significant performance boost, writing + clients using the batch mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Batch mode also generally consumes more memory on both the client and server, + though careful and aggressive management of the send/receive queue can mitigate + this. This applies whether or not the connection is in blocking or non-blocking + mode. + + + + Using Batch Mode + + +To issue batches the application must switch a connection into batch mode. +Enter batch mode with +or test whether batch mode is active with +. +In batch mode, only asynchronous operations +are permitted, and COPY is not recommended as it +may trigger failure in batch processing. Using any synchronous +command execution functions, such as PQfn, or +PQexec and its sibling functions, is an error +condition. + + + + + It is best to use batch mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill its output buffer and the server's receive +buffer before switching to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + + +Issuing Queries + + + After entering batch mode the application dispatches requests using + normal asynchronous libpq functions, such as: + PQsendQueryParams, PQsendPrepare, + PQsendQueryPrepared, PQsendDescribePortal, + and PQsendDescribePrepared. + The asynchronous requests are followed by a + + call to mark the end of the batch. The client need not + call PQgetResult immediately after + dispatching each operation. + Result processing + is handled separately. + + + + The server executes statements, and returns results, in the order the + client sends them. The server will begin executing the batch commands + immediately, not waiting for the end of the batch. + If any statement encounters an error the server aborts the current + transaction and skips processing the rest of the batch. + Query processing resumes after the end of the failed batch. + + + + It's fine for one operation to depend on the results of a + prior one. One query may define a table that the next query in the same + batch uses; similarly, an application may create a named prepared statement + then execute it with later statements in the same batch. + + + + +Processing Results + +
Re: PATCH: Batch/pipelining support for libpq
On 2020-Nov-23, Daniel Verite wrote: > Hi, > > Here's a new version with the pgbench support included. Thanks, incorporated into my local copy.
Re: PATCH: Batch/pipelining support for libpq
Hi, Here's a new version with the pgbench support included. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 9ce32fb39b..2a94f8f6b9 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3061,6 +3061,30 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_BATCH_END + + +The PGresult represents the end of a batch. +This status occurs only when batch mode has been selected. + + + + + + PGRES_BATCH_ABORTED + + +The PGresult represents a batch that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current batch, at which point it will return +PGRES_BATCH_END and normal processing can resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4819,6 +4843,482 @@ int PQflush(PGconn *conn); + + Batch Mode + + + libpq + batch mode + + + + pipelining + in libpq + + + + libpq batch mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the batch mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While batch mode provides a significant performance boost, writing + clients using the batch mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Using Batch Mode + + +To issue batches the application must switch a connection into batch mode. +Enter batch mode with +or test whether batch mode is active with +. +In batch mode, only asynchronous operations +are permitted, and COPY is not recommended as it +may trigger failure in batch processing. Using any synchronous +command execution functions such as PQfn, +PQexec or one of its sibling functions are error +conditions. + + + + + It is best to use batch mode with libpq in + non-blocking mode. If used + in blocking mode it is possible for a client/server deadlock to occur. + + +The client will block trying to send queries to the server, but the +server will block trying to send results to the client from queries +it has already processed. This only occurs when the client sends +enough queries to fill its output buffer and the server's receive +buffer before switching to processing input from the server, +but it's hard to predict exactly when that will happen. + + + + + Batch mode consumes more memory when send/receive is not done as required, + even in non-blocking mode. + + + + +Issuing Queries + + + After entering batch mode the application dispatches requests using + normal asynchronous libpq functions such as + PQsendQueryParams, PQsendPrepare, + PQsendQueryPrepared, PQsendDescribePortal, + PQsendDescribePrepared. + The asynchronous requests are followed by a + + call to mark the end of the batch. The client needs not + call PQgetResult immediately after + dispatching each operation. + Result processing + is handled separately. + + + + The server executes statements, and returns results, in the order the + client sends them. The server may begin executing the batch before all + commands in the batch are queued and the end of batch command is sent. + If any statement encounters an error the server aborts the current + transaction and skips processing the rest of the batch. + Query processing resumes after the end of the failed batch. + + + + It's fine for one operation to depend on the results of a + prior one. One query may define a table that the next query in the same + batch uses; similarly, an application may create a named prepared statement + then execute it with later statements in the same batch. + + + + +Processing Results + + + To process the result of one batch query, the application calls + PQgetResult repeatedly and handles each result + until PQgetResult returns null. + The result from the next batch query may then be retrieved using + PQgetResult again and the cycle repeated. + The application handles individual statement results as normal. + When the results of all the queries in the batch have been + returned, PQgetResult returns a result + containing the status value PGRES_BA
Re: PATCH: Batch/pipelining support for libpq
Hi On Wed, Nov 18, 2020, at 09:51, Alvaro Herrera wrote: > On 2020-Nov-14, Daniel Verite wrote: > > > The patch I posted in [1] was pretty simple, but at the time, query > > results were always discarded. Now that pgbench can instantiate > > variables from query results, a script can do: > > select 1 as var \gset > > select :var; > > This kind of sequence wouldn't work in batch mode since it > > sends queries before getting results of previous queries. > > > > So maybe \gset should be rejected when inside a batch section. > > Hah. > > Hacking pgbench extensively is beyond what I'm willing to do for this > feature at this time. Making \gset rejected in a batch section sounds > simple enough and supports \beginbatch et al sufficiently to compare > performance, so I'm OK with a patch that does that. That'd be a small > extension to your previous patch, if I understand correctly. > > If you or others want to send patches to extend batch support with > read-write tracking for variables, feel free, but I hereby declare that > I'm not taking immediate responsibility for getting them committed. I think minimal support is entirely sufficient initially. Andres
Re: PATCH: Batch/pipelining support for libpq
On 2020-Nov-14, Daniel Verite wrote: > The patch I posted in [1] was pretty simple, but at the time, query > results were always discarded. Now that pgbench can instantiate > variables from query results, a script can do: > select 1 as var \gset > select :var; > This kind of sequence wouldn't work in batch mode since it > sends queries before getting results of previous queries. > > So maybe \gset should be rejected when inside a batch section. Hah. Hacking pgbench extensively is beyond what I'm willing to do for this feature at this time. Making \gset rejected in a batch section sounds simple enough and supports \beginbatch et al sufficiently to compare performance, so I'm OK with a patch that does that. That'd be a small extension to your previous patch, if I understand correctly. If you or others want to send patches to extend batch support with read-write tracking for variables, feel free, but I hereby declare that I'm not taking immediate responsibility for getting them committed.
Re: PATCH: Batch/pipelining support for libpq
On Fri, Nov 13, 2020 at 5:38 PM Alvaro Herrera wrote: > Here's a v25. > > I made a few more changes to the docs per David's suggestions; I also > reordered the sections quite a bit. It's now like this: > * Batch Mode >* Using Batch Mode > * Issuing Queries > * Processing Results > * Error Handling > * Interleaving Result Processing and Query Dispatch > * Ending Batch Mode >* Functions Associated with Batch Mode >* When to Use Batching > Thanks! I like the new flow and changes. I've attached a patch that covers some missing commas and typos along with a few parts that made me pause. The impact of memory came out of nowhere under the non-blocking mode commentary. I took a stab at incorporating it more broadly. The "are error conditions" might be a well-known phrasing to those using libpq but that sentence reads odd to me. I did try to make the example listing flow a bit better and added a needed comma. Tried to clean up a few phrasings after that. The error handling part I'm not especially happy with but I think it's closer and more useful than just "emitted during error handling" - it gets emitted upon error, handling is a client concern. Seems odd to say the new feature was introduced in v14.0, the .0 seems ok to imply. I didn't actually fix it in the attached but "the PostgreSQL 14 version of libpq" is going to become outdated quickly, better just to drop it. "The batch API was introduced in PostgreSQL 14, but clients can use batches on servers supporting v3 of the extended query protocol, potentially going as far back as version 7.4." David J. v25-01-libpq-batch-dgj-doc-suggestions.patch Description: Binary data
Re: PATCH: Batch/pipelining support for libpq
Alvaro Herrera wrote: > I adapted the test code to our code style. I also removed the "timings" > stuff; I think that's something better left to pgbench. > > (I haven't looked at Daniel's pgbench stuff yet, but I will do that > next.) The patch I posted in [1] was pretty simple, but at the time, query results were always discarded. Now that pgbench can instantiate variables from query results, a script can do: select 1 as var \gset select :var; This kind of sequence wouldn't work in batch mode since it sends queries before getting results of previous queries. So maybe \gset should be rejected when inside a batch section. Or alternatively pgbench should collect results before a variable is reinjected into a query, thereby stalling the pipeline. To do this only when necessary, it would have to track read-write dependencies among variables, which seems overly complicated though. [1] https://www.postgresql.org/message-id/b4e34135-2bd9-4b8a-94ca-27d760da2...@manitou-mail.org Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
Re: PATCH: Batch/pipelining support for libpq
Here's a v25. I made a few more changes to the docs per David's suggestions; I also reordered the sections quite a bit. It's now like this: * Batch Mode * Using Batch Mode * Issuing Queries * Processing Results * Error Handling * Interleaving Result Processing and Query Dispatch * Ending Batch Mode * Functions Associated with Batch Mode * When to Use Batching To me as a reader, this makes more sense, but if you disagree, I think we should discuss further changes. (For example, maybe we should move the "Functions" section at the end?) The original had "When to Use Batching" at the start, but it seemed to me that the points it's making are not as critical as understanding what it is. I reworked the test program to better fit the TAP model; I found that if one test mecha failed in whatever way, the connection would be in a weird state and cause the next test to also fail. I changed so that it runs one test and exits; then the t/001_libpq_async.pl file (hmm, need to rename to 001_batch.pl I guess) calls it once for each test. I adapted the test code to our code style. I also removed the "timings" stuff; I think that's something better left to pgbench. (I haven't looked at Daniel's pgbench stuff yet, but I will do that next.) While looking at how the tests worked, I gave a hard stare at the new libpq code and cleaned it up also. There's a lot of minor changes, but nothing truly substantial. I moved the code around a lot, to keep things where grouped together they belong. I'm not 100% clear on things like PGconn->batch_status and how PGconn->asyncStatus works. Currently everything seems to work correctly, but I'm worried that because we add new status values to asyncStatus, some existing code might not be doing everything correctly (for example when changing to/from ASYNC_BUSY in some cases, are we 100% we shouldn't be changing to ASYNC_QUEUED?) While looking this over I noticed a thread from 2014[1] where Matt Newell tried to implement this stuff and apparently the main review comment he got before abandoning the patch was that the user would like a way to access the query that corresponded to each result. The current patch set does not address that need; the approach to this problem is that it's on the application's head to keep track of this. Honestly I don't understand why it would be otherwise ... I'm not sure that it makes sense to expect that the application is stupid enough that it doesn't keep track in which order it sent things, but bright enough to keep pointers to the queries it sent (??). So this seems okay to me. But added Heikki and Claudio to CC because of that old thread. [1] https://postgr.es/m/2059418.jZQvL3lH90@obsidian diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 73c6a6ea4d..60477ca85c 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3061,6 +3061,30 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_BATCH_END + + +The PGresult represents the end of a batch. +This status occurs only when batch mode has been selected. + + + + + + PGRES_BATCH_ABORTED + + +The PGresult represents a batch that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current batch, at which point it will return +PGRES_BATCH_END and normal processing can resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4819,6 +4843,482 @@ int PQflush(PGconn *conn); + + Batch Mode + + + libpq + batch mode + + + + pipelining + in libpq + + + + libpq batch mode allows applications to + send a query without having to read the result of the previously + sent query. Taking advantage of the batch mode, a client will wait + less for the server, since multiple queries/results can be sent/ + received in a single network transaction. + + + + While batch mode provides a significant performance boost, writing + clients using the batch mode is more complex because it involves + managing a queue of pending queries and finding which result + corresponds to which query in the queue. + + + + Using Batch Mode + + +To issue batches the application must switch a connection into batch mode. +Enter batch mode with +or test whether batch mode is active with +. +In batch mode, only asynchronous operations +are permitted, and COPY is not recommended as it +may trigger failure in batch processing. Using any synchronous +command execution functions such as PQfn, +PQexec or one of its sibling functions are error +conditions. + + + + + It is best to use batch mode
Re: PATCH: Batch/pipelining support for libpq
(Adding previous reviewers to CC) On 2020-Nov-03, David G. Johnston wrote: > Given the caveats around blocking mode connections why not just require > non-blocking mode, in a similar fashion to how synchronous functions are > disallowed? This is a very good question. Why indeed? Does anybody have a good answer to this? If not, I propose we just require that non-blocking mode is in use in order for batch mode to be used. I've been doing a review pass over this patch and have an updated version, which I intend to share later today (after I fix what appears to be a misunderstanding in the "singlerow" test in testlibpqbatch.c)
Re: PATCH: Batch/pipelining support for libpq
Hi David, Thanks for the feedback. I did rework a bit the doc based on your remarks. Here is the v24 patch. Matthieu Garrigues On Tue, Nov 3, 2020 at 6:21 PM David G. Johnston wrote: > > On Mon, Nov 2, 2020 at 8:58 AM Alvaro Herrera wrote: >> >> On 2020-Nov-02, Alvaro Herrera wrote: >> >> > In v23 I've gone over docs; discovered that PQgetResults docs were >> > missing the new values. Added those. No significant other changes yet. >> > > Just reading the documentation of this patch, haven't been following the > longer thread: > > Given the caveats around blocking mode connections why not just require > non-blocking mode, in a similar fashion to how synchronous functions are > disallowed? > > "Batched operations will be executed by the server in the order the client > sends them. The server will send the results in the order the statements > executed." > > Maybe: > > "The server executes statements, and returns results, in the order the client > sends them." > > Using two sentences and relying on the user to mentally link the two "in the > order" descriptions together seems to add unnecessary cognitive load. > > + The client interleaves result > + processing with sending batch queries, or for small batches may > + process all results after sending the whole batch. > > Suggest: "The client may choose to interleave result processing with sending > batch queries, or wait until the complete batch has been sent." > > I would expect to process the results of a batch only after sending the > entire batch to the server. That I don't have to is informative but knowing > when I should avoid doing so, and why, is informative as well. To the > extreme while you can use batch mode and interleave if you just poll > getResult after every command you will make the whole batch thing pointless. > Directing the reader from here to the section "Interleaving Result Processing > and Query Dispatch" seems worth considering. The dynamics of small sizes and > sockets remains a bit unclear as to what will break (if anything, or is it > just process memory on the server) if interleaving it not performed and sizes > are large. > > I would suggest placing commentary about "all transactions subsequent to a > failed transaction in a batch are ignored while previous completed > transactions are retained" in the "When to Use Batching". Something like > "Batching is less useful, and more complex, when a single batch contains > multiple transactions (see Error Handling)." > > My imagined use case would be to open a batch, start a transaction, send all > of its components, end the transaction, end the batch, check for batch > failure and if it doesn't fail have the option to easily continue without > processing individual pgResults (or if it does fail, have the option to > extract the first error pgResult and continue, ignoring the rest, knowing > that the transaction as a whole was reverted and the batch unapplied). I've > never interfaced with libpq directly. Though given how the existing C API > works what is implemented here seems consistent. > > The "queueing up queries into a pipeline to be executed as a batch on the > server" can be read as a client-side behavior where nothing is sent to the > server until the batch has been completed. Reading further it becomes clear > that all it basically is is a sever-side toggle that instructs the server to > continue processing incoming commands even while prior commands have their > results waiting to be ingested by the client. > > Batch seems like the user-visible term to describe this feature. Pipeline > seems like an implementation detail that doesn't need to be mentioned in the > documentation - especially given that pipeline doesn't get a mentioned beyond > the first two paragraphs of the chapter and never without being linked > directly to "batch". I would probably leave the indexterm and have a > paragraph describing that batching is implemented using a query pipeline so > that people with the implementation detail on their mind can find this > chapter, but the prose for the user should just stick to batching. > > Sorry, that all is a bit unfocused, but the documentation for the user of the > API could be cleaned up a bit and some more words spent on what trade-offs > are being made when using batching versus normal command-response processing. > That said, while I don't see all of this purely a matter of style I'm also > not seeing anything demonstrably wrong with the documentation at the moment. > Hopefully my perspective helps though, and depending on what happens next I > may try and make my thoughts more concrete with an actual patch. > > David J. > diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 9ce32fb39b..044e3a705f 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3061,6 +3061,30 @@ ExecStatusType PQresultStatus(const PGresult *res);
Re: PATCH: Batch/pipelining support for libpq
Hi, On 2020-11-03 10:42:34 -0300, Alvaro Herrera wrote: > It would definitely help if you (and others) could think about the API > being added: Does it fulfill the promises being made? Does it offer the > guarantees that real-world apps want to have? I'm not much of an > application writer myself -- particularly high-traffic apps that would > want to use this. Somewhere earlier in this thread there was a patch with support for batching in pgbench. I think it'd be good to refresh that. Both because it shows at least some real-world-lite usage of the feature and because we need a way to stress it to see whether it has unnecessary bottlenecks. Greetings, Andres Freund
Re: PATCH: Batch/pipelining support for libpq
On Mon, Nov 2, 2020 at 8:58 AM Alvaro Herrera wrote: > On 2020-Nov-02, Alvaro Herrera wrote: > > > In v23 I've gone over docs; discovered that PQgetResults docs were > > missing the new values. Added those. No significant other changes yet. > > Just reading the documentation of this patch, haven't been following the longer thread: Given the caveats around blocking mode connections why not just require non-blocking mode, in a similar fashion to how synchronous functions are disallowed? "Batched operations will be executed by the server in the order the client sends them. The server will send the results in the order the statements executed." Maybe: "The server executes statements, and returns results, in the order the client sends them." Using two sentences and relying on the user to mentally link the two "in the order" descriptions together seems to add unnecessary cognitive load. + The client interleaves result + processing with sending batch queries, or for small batches may + process all results after sending the whole batch. Suggest: "The client may choose to interleave result processing with sending batch queries, or wait until the complete batch has been sent." I would expect to process the results of a batch only after sending the entire batch to the server. That I don't have to is informative but knowing when I should avoid doing so, and why, is informative as well. To the extreme while you can use batch mode and interleave if you just poll getResult after every command you will make the whole batch thing pointless. Directing the reader from here to the section "Interleaving Result Processing and Query Dispatch" seems worth considering. The dynamics of small sizes and sockets remains a bit unclear as to what will break (if anything, or is it just process memory on the server) if interleaving it not performed and sizes are large. I would suggest placing commentary about "all transactions subsequent to a failed transaction in a batch are ignored while previous completed transactions are retained" in the "When to Use Batching". Something like "Batching is less useful, and more complex, when a single batch contains multiple transactions (see Error Handling)." My imagined use case would be to open a batch, start a transaction, send all of its components, end the transaction, end the batch, check for batch failure and if it doesn't fail have the option to easily continue without processing individual pgResults (or if it does fail, have the option to extract the first error pgResult and continue, ignoring the rest, knowing that the transaction as a whole was reverted and the batch unapplied). I've never interfaced with libpq directly. Though given how the existing C API works what is implemented here seems consistent. The "queueing up queries into a pipeline to be executed as a batch on the server" can be read as a client-side behavior where nothing is sent to the server until the batch has been completed. Reading further it becomes clear that all it basically is is a sever-side toggle that instructs the server to continue processing incoming commands even while prior commands have their results waiting to be ingested by the client. Batch seems like the user-visible term to describe this feature. Pipeline seems like an implementation detail that doesn't need to be mentioned in the documentation - especially given that pipeline doesn't get a mentioned beyond the first two paragraphs of the chapter and never without being linked directly to "batch". I would probably leave the indexterm and have a paragraph describing that batching is implemented using a query pipeline so that people with the implementation detail on their mind can find this chapter, but the prose for the user should just stick to batching. Sorry, that all is a bit unfocused, but the documentation for the user of the API could be cleaned up a bit and some more words spent on what trade-offs are being made when using batching versus normal command-response processing. That said, while I don't see all of this purely a matter of style I'm also not seeing anything demonstrably wrong with the documentation at the moment. Hopefully my perspective helps though, and depending on what happens next I may try and make my thoughts more concrete with an actual patch. David J.
Re: PATCH: Batch/pipelining support for libpq
I implemented a C++ async HTTP server using this new batch mode and it provides everything I needed to transparently batch sql requests. It gives a performance boost between x2 and x3 on this benchmark: https://www.techempower.com/benchmarks/#section=test&runid=3097dbae-5228-454c-ba2e-2055d3982790&hw=ph&test=query&a=2&f=zik0zj-zik0zj-zik0zj-zik0zj-zieepr-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj I'll ask other users interested in this to review the API. Matthieu Garrigues On Tue, Nov 3, 2020 at 4:56 PM Dave Cramer wrote: > > > > On Tue, 3 Nov 2020 at 08:42, Alvaro Herrera wrote: >> >> Hi Dave, >> >> On 2020-Nov-03, Dave Cramer wrote: >> >> > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera >> > wrote: >> > >> > > On 2020-Nov-02, Alvaro Herrera wrote: >> > > >> > > > In v23 I've gone over docs; discovered that PQgetResults docs were >> > > > missing the new values. Added those. No significant other changes >> > > > yet. >> > >> > Thanks for looking at this. >> > >> > What else does it need to get it in shape to apply? >> >> I want to go over the code in depth to grok the design more fully. >> >> It would definitely help if you (and others) could think about the API >> being added: Does it fulfill the promises being made? Does it offer the >> guarantees that real-world apps want to have? I'm not much of an >> application writer myself -- particularly high-traffic apps that would >> want to use this. As a driver author I would welcome your insight in >> these questions. >> > > I'm sort of in the same boat as you. While I'm closer to the client. I don't > personally write that much client code. > > I'd really like to hear from the users here. > > > Dave Cramer > www.postgres.rocks
Re: PATCH: Batch/pipelining support for libpq
On Tue, 3 Nov 2020 at 08:42, Alvaro Herrera wrote: > Hi Dave, > > On 2020-Nov-03, Dave Cramer wrote: > > > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera > wrote: > > > > > On 2020-Nov-02, Alvaro Herrera wrote: > > > > > > > In v23 I've gone over docs; discovered that PQgetResults docs were > > > > missing the new values. Added those. No significant other changes > yet. > > > > Thanks for looking at this. > > > > What else does it need to get it in shape to apply? > > I want to go over the code in depth to grok the design more fully. > > It would definitely help if you (and others) could think about the API > being added: Does it fulfill the promises being made? Does it offer the > guarantees that real-world apps want to have? I'm not much of an > application writer myself -- particularly high-traffic apps that would > want to use this. As a driver author I would welcome your insight in > these questions. > > I'm sort of in the same boat as you. While I'm closer to the client. I don't personally write that much client code. I'd really like to hear from the users here. Dave Cramer www.postgres.rocks
Re: PATCH: Batch/pipelining support for libpq
Hi Dave, On 2020-Nov-03, Dave Cramer wrote: > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera wrote: > > > On 2020-Nov-02, Alvaro Herrera wrote: > > > > > In v23 I've gone over docs; discovered that PQgetResults docs were > > > missing the new values. Added those. No significant other changes yet. > > Thanks for looking at this. > > What else does it need to get it in shape to apply? I want to go over the code in depth to grok the design more fully. It would definitely help if you (and others) could think about the API being added: Does it fulfill the promises being made? Does it offer the guarantees that real-world apps want to have? I'm not much of an application writer myself -- particularly high-traffic apps that would want to use this. As a driver author I would welcome your insight in these questions.
Re: PATCH: Batch/pipelining support for libpq
Alvaro, On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera wrote: > On 2020-Nov-02, Alvaro Herrera wrote: > > > In v23 I've gone over docs; discovered that PQgetResults docs were > > missing the new values. Added those. No significant other changes yet. > > > Thanks for looking at this. What else does it need to get it in shape to apply? Dave Cramer www.postgres.rocks
Re: PATCH: Batch/pipelining support for libpq
On 2020-Nov-02, Alvaro Herrera wrote: > In v23 I've gone over docs; discovered that PQgetResults docs were > missing the new values. Added those. No significant other changes yet. >From 3f305d05000981445fe4dbbb96c2e88ace89f439 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Mon, 26 Oct 2020 13:24:24 -0300 Subject: [PATCH v23] libpq batch --- doc/src/sgml/libpq.sgml | 505 doc/src/sgml/lobj.sgml|4 + .../libpqwalreceiver/libpqwalreceiver.c |3 + src/interfaces/libpq/exports.txt |4 + src/interfaces/libpq/fe-connect.c | 27 + src/interfaces/libpq/fe-exec.c| 630 +- src/interfaces/libpq/fe-protocol2.c |6 + src/interfaces/libpq/fe-protocol3.c | 16 +- src/interfaces/libpq/libpq-fe.h | 22 +- src/interfaces/libpq/libpq-int.h | 46 +- src/test/modules/Makefile |1 + src/test/modules/test_libpq/.gitignore|5 + src/test/modules/test_libpq/Makefile | 25 + src/test/modules/test_libpq/README|1 + .../modules/test_libpq/t/001_libpq_async.pl | 27 + src/test/modules/test_libpq/testlibpqbatch.c | 1013 + src/tools/msvc/Mkvcbuild.pm |3 +- src/tools/pgindent/typedefs.list |3 + 18 files changed, 2295 insertions(+), 46 deletions(-) create mode 100644 src/test/modules/test_libpq/.gitignore create mode 100644 src/test/modules/test_libpq/Makefile create mode 100644 src/test/modules/test_libpq/README create mode 100644 src/test/modules/test_libpq/t/001_libpq_async.pl create mode 100644 src/test/modules/test_libpq/testlibpqbatch.c diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 9ce32fb39b..839cadfaf0 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3061,6 +3061,30 @@ ExecStatusType PQresultStatus(const PGresult *res); + + + PGRES_BATCH_END + + +The PGresult represents the end of a batch. +This status occurs only when batch mode has been selected. + + + + + + PGRES_BATCH_ABORTED + + +The PGresult represents a batch that's +received an error from the server. PQgetResult +must be called repeatedly, and it will return this status code, +until the end of the current batch, at which point it will return +PGRES_BATCH_END and normal processing can resume. + + + + If the result status is PGRES_TUPLES_OK or @@ -4819,6 +4843,478 @@ int PQflush(PGconn *conn); + + Batch Mode and Query Pipelining + + + libpq + batch mode + + + + libpq + pipelining + + + + libpq supports queueing up queries into + a pipeline to be executed as a batch on the server. Batching queries + allows applications to avoid a client/server round-trip after each + query to get the results before issuing the next query. + + + + When to Use Batching + + +Much like asynchronous query mode, there is no performance disadvantage to +using batching and pipelining. It increases client application complexity +and extra caution is required to prevent client/server deadlocks, but +pipelining can sometimes offer considerable performance improvements. + + + +Batching is most useful when the server is distant, i.e., network latency +(ping time) is high, and also when many small operations +are being performed in rapid sequence. There is usually less benefit +in using batches when each query takes many multiples of the client/server +round-trip time to execute. A 100-statement operation run on a server +300ms round-trip-time away would take 30 seconds in network latency alone +without batching; with batching it may spend as little as 0.3s waiting for +results from the server. + + + +Use batches when your application does lots of small +INSERT, UPDATE and +DELETE operations that can't easily be transformed +into operations on sets or into a COPY operation. + + + +Batching is not useful when information from one operation is required by +the client to produce the next operation. In such cases, the client +must introduce a synchronization point and wait for a full client/server +round-trip to get the results it needs. However, it's often possible to +adjust the client design to exchange the required information server-side. +Read-modify-write cycles are especially good candidates; for example: + +BEGIN; +SELECT x FROM mytable WHERE id = 42 FOR UPDATE; +-- result: x=2 +-- client adds 1 to x: +UPDATE mytable SET x = 3 WHERE id = 42; +COMMIT; + +could be much more efficiently done with: + +
Re: PATCH: Batch/pipelining support for libpq
In v23 I've gone over docs; discovered that PQgetResults docs were missing the new values. Added those. No significant other changes yet.
Re: PATCH: Batch/pipelining support for libpq
I started reading this patch. As a starting point I decided to post an updated copy (v22), wherein I reverted the overwriting of src/include/port/linux.h with the win32.h contents (?) and the inclusion of in libpq-fe.h. If those are needed, some different solution will have to be found. Trivial other changes (pgindent etc); nothing of substance. With the PQtrace() patch I posted at [1] the added test program crashes -- I don't know if the fault lies in this patch or the other one. [1] https://postgr.es/m/20201026162313.GA22502@alvherre.pgsql >From 4d4e86c817599c7370b44c75e3caca9b4d3fea2d Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Mon, 26 Oct 2020 13:24:24 -0300 Subject: [PATCH v22] libpq batch --- doc/src/sgml/libpq.sgml | 468 doc/src/sgml/lobj.sgml|4 + .../libpqwalreceiver/libpqwalreceiver.c |3 + src/interfaces/libpq/exports.txt |4 + src/interfaces/libpq/fe-connect.c | 27 + src/interfaces/libpq/fe-exec.c| 630 ++- src/interfaces/libpq/fe-protocol2.c |6 + src/interfaces/libpq/fe-protocol3.c | 16 +- src/interfaces/libpq/libpq-fe.h | 22 +- src/interfaces/libpq/libpq-int.h | 46 +- src/test/modules/Makefile |1 + src/test/modules/test_libpq/.gitignore|5 + src/test/modules/test_libpq/Makefile | 25 + src/test/modules/test_libpq/README|1 + .../modules/test_libpq/t/001_libpq_async.pl | 27 + src/test/modules/test_libpq/testlibpqbatch.c | 1008 + src/tools/msvc/Mkvcbuild.pm |3 +- src/tools/pgindent/typedefs.list |3 + 18 files changed, 2253 insertions(+), 46 deletions(-) create mode 100644 src/test/modules/test_libpq/.gitignore create mode 100644 src/test/modules/test_libpq/Makefile create mode 100644 src/test/modules/test_libpq/README create mode 100644 src/test/modules/test_libpq/t/001_libpq_async.pl create mode 100644 src/test/modules/test_libpq/testlibpqbatch.c diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index de60281fcb..a768feffd3 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -4819,6 +4819,465 @@ int PQflush(PGconn *conn); + + Batch mode and query pipelining + + + libpq + batch mode + + + + libpq + pipelining + + + + libpq supports queueing up queries into + a pipeline to be executed as a batch on the server. Batching queries allows + applications to avoid a client/server round-trip after each query to get + the results before issuing the next query. + + + + When to use batching + + +Much like asynchronous query mode, there is no performance disadvantage to +using batching and pipelining. It increases client application complexity +and extra caution is required to prevent client/server deadlocks but +can sometimes offer considerable performance improvements. + + + +Batching is most useful when the server is distant, i.e. network latency +(ping time) is high, and when many small operations are being performed in +rapid sequence. There is usually less benefit in using batches when each +query takes many multiples of the client/server round-trip time to execute. +A 100-statement operation run on a server 300ms round-trip-time away would take +30 seconds in network latency alone without batching; with batching it may spend +as little as 0.3s waiting for results from the server. + + + +Use batches when your application does lots of small +INSERT, UPDATE and +DELETE operations that can't easily be transformed into +operations on sets or into a +COPY operation. + + + +Batching is not useful when information from one operation is required by the +client before it knows enough to send the next operation. The client must +introduce a synchronisation point and wait for a full client/server +round-trip to get the results it needs. However, it's often possible to +adjust the client design to exchange the required information server-side. +Read-modify-write cycles are especially good candidates; for example: + + BEGIN; + SELECT x FROM mytable WHERE id = 42 FOR UPDATE; + -- result: x=2 + -- client adds 1 to x: + UPDATE mytable SET x = 3 WHERE id = 42; + COMMIT; + +could be much more efficiently done with: + + UPDATE mytable SET x = x + 1 WHERE id = 42; + + + + + + The batch API was introduced in PostgreSQL 14.0, but clients using PostgresSQL 14.0 version of libpq can + use batches on server versions 7.4 and newer. Batching works on any server + that supports the v3 extended query protocol. + + + + + + + Using batch mode + + +To issue batches the application must switch +a connection into batch mode. Enter
Re: PATCH: Batch/pipelining support for libpq
This patch fixes compilation on windows and compilation of the documentation. Matthieu Garrigues On Thu, Oct 1, 2020 at 8:41 AM Matthieu Garrigues wrote: > > On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier wrote: > > > The documentation is failing to build, and the patch does not build > > correctly on Windows. Could you address that? > > -- > > Michael > > Yes I'm on it. > > -- > Matthieu diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 92556c7ce0..932561d0c5 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -4829,6 +4829,465 @@ int PQflush(PGconn *conn); + + Batch mode and query pipelining + + + libpq + batch mode + + + + libpq + pipelining + + + + libpq supports queueing up queries into + a pipeline to be executed as a batch on the server. Batching queries allows + applications to avoid a client/server round-trip after each query to get + the results before issuing the next query. + + + + When to use batching + + +Much like asynchronous query mode, there is no performance disadvantage to +using batching and pipelining. It increases client application complexity +and extra caution is required to prevent client/server deadlocks but +can sometimes offer considerable performance improvements. + + + +Batching is most useful when the server is distant, i.e. network latency +(ping time) is high, and when many small operations are being performed in +rapid sequence. There is usually less benefit in using batches when each +query takes many multiples of the client/server round-trip time to execute. +A 100-statement operation run on a server 300ms round-trip-time away would take +30 seconds in network latency alone without batching; with batching it may spend +as little as 0.3s waiting for results from the server. + + + +Use batches when your application does lots of small +INSERT, UPDATE and +DELETE operations that can't easily be transformed into +operations on sets or into a +COPY operation. + + + +Batching is not useful when information from one operation is required by the +client before it knows enough to send the next operation. The client must +introduce a synchronisation point and wait for a full client/server +round-trip to get the results it needs. However, it's often possible to +adjust the client design to exchange the required information server-side. +Read-modify-write cycles are especially good candidates; for example: + + BEGIN; + SELECT x FROM mytable WHERE id = 42 FOR UPDATE; + -- result: x=2 + -- client adds 1 to x: + UPDATE mytable SET x = 3 WHERE id = 42; + COMMIT; + +could be much more efficiently done with: + + UPDATE mytable SET x = x + 1 WHERE id = 42; + + + + + + The batch API was introduced in PostgreSQL 14.0, but clients using PostgresSQL 14.0 version of libpq can + use batches on server versions 7.4 and newer. Batching works on any server + that supports the v3 extended query protocol. + + + + + + + Using batch mode + + +To issue batches the application must switch +a connection into batch mode. Enter batch mode with PQenterBatchMode(conn) or test +whether batch mode is active with PQbatchStatus(conn). In batch mode only asynchronous operations are permitted, and +COPY is not recommended as it most likely will trigger failure in batch processing. +Using any synchronous command execution functions such as PQfn, +PQexec or one of its sibling functions are error conditions. +Functions allowed in batch mode are described in . + + + +The client uses libpq's asynchronous query functions to dispatch work, +marking the end of each batch with PQbatchSendQueue. +And to get results, it uses PQgetResult. It may eventually exit +batch mode with PQexitBatchMode once all results are +processed. + + + + + It is best to use batch mode with libpq in + non-blocking mode. If used in + blocking mode it is possible for a client/server deadlock to occur. The + client will block trying to send queries to the server, but the server will + block trying to send results from queries it has already processed to the + client. This only occurs when the client sends enough queries to fill its + output buffer and the server's receive buffer before switching to + processing input from the server, but it's hard to predict exactly when + that'll happen so it's best to always use non-blocking mode. + Batch mode consumes more memory when send/recv is not done as required even in non-blocking mode. + + + + +Issuing queries + + + After entering batch mode the application dispatches requests + using normal asynchronous libpq functions such as + PQsendQueryParams, PQsendPrepare, + PQsendQueryPrepared, PQsendDesc
Re: PATCH: Batch/pipelining support for libpq
On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier wrote: > The documentation is failing to build, and the patch does not build > correctly on Windows. Could you address that? > -- > Michael Yes I'm on it. -- Matthieu
Re: PATCH: Batch/pipelining support for libpq
On Mon, Sep 21, 2020 at 07:55:03PM +0200, Matthieu Garrigues wrote: > I merged PQbatchProcessQueue into PQgetResult. > One first init call to PQbatchProcessQueue was also required in > PQsendQueue to have > PQgetResult ready to read the first batch query. > > Tests and documentation are updated accordingly. The documentation is failing to build, and the patch does not build correctly on Windows. Could you address that? -- Michael signature.asc Description: PGP signature
Re: PATCH: Batch/pipelining support for libpq
Hi Dave, I merged PQbatchProcessQueue into PQgetResult. One first init call to PQbatchProcessQueue was also required in PQsendQueue to have PQgetResult ready to read the first batch query. Tests and documentation are updated accordingly. Matthieu Garrigues On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer wrote: > > > > On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues > wrote: >> >> Matthieu Garrigues >> >> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer >> wrote: >> >> >> > There was a comment upthread a while back that people should look at the >> > comments made in >> > https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp >> > by Horiguchi-San. >> > >> > From what I can tell this has not been addressed. The one big thing is the >> > use of PQbatchProcessQueue vs just putting it in PQgetResult. >> > >> > The argument is that adding PQbatchProcessQueue is unnecessary and just >> > adds another step. Looking at this, it seems like putting this inside >> > PQgetResult would get my vote as it leaves the interface unchanged. >> > >> >> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one >> thing: I'll keep PQgetResult returning null between the result of two >> batched query so the user >> can know which result comes from which query. > > > Fair enough. > > There may be other things in his comments that need to be addressed. That was > the big one that stuck out for me. > > Thanks for working on this! > > > Dave Cramer > www.postgres.rocks diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 92556c7ce0..15d0c03c89 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -4829,6 +4829,465 @@ int PQflush(PGconn *conn); + + Batch mode and query pipelining + + + libpq + batch mode + + + + libpq + pipelining + + + + libpq supports queueing up queries into + a pipeline to be executed as a batch on the server. Batching queries allows + applications to avoid a client/server round-trip after each query to get + the results before issuing the next query. + + + + When to use batching + + +Much like asynchronous query mode, there is no performance disadvantage to +using batching and pipelining. It increases client application complexity +and extra caution is required to prevent client/server deadlocks but +can sometimes offer considerable performance improvements. + + + +Batching is most useful when the server is distant, i.e. network latency +(ping time) is high, and when many small operations are being performed in +rapid sequence. There is usually less benefit in using batches when each +query takes many multiples of the client/server round-trip time to execute. +A 100-statement operation run on a server 300ms round-trip-time away would take +30 seconds in network latency alone without batching; with batching it may spend +as little as 0.3s waiting for results from the server. + + + +Use batches when your application does lots of small +INSERT, UPDATE and +DELETE operations that can't easily be transformed into +operations on sets or into a +COPY operation. + + + +Batching is not useful when information from one operation is required by the +client before it knows enough to send the next operation. The client must +introduce a synchronisation point and wait for a full client/server +round-trip to get the results it needs. However, it's often possible to +adjust the client design to exchange the required information server-side. +Read-modify-write cycles are especially good candidates; for example: + + BEGIN; + SELECT x FROM mytable WHERE id = 42 FOR UPDATE; + -- result: x=2 + -- client adds 1 to x: + UPDATE mytable SET x = 3 WHERE id = 42; + COMMIT; + +could be much more efficiently done with: + + UPDATE mytable SET x = x + 1 WHERE id = 42; + + + + + + The batch API was introduced in PostgreSQL 10.0, but clients using PostgresSQL 10.0 version of libpq can + use batches on server versions 7.4 and newer. Batching works on any server + that supports the v3 extended query protocol. + + + + + + + Using batch mode + + +To issue batches the application must switch +a connection into batch mode. Enter batch mode with PQenterBatchMode(conn) or test +whether batch mode is active with PQbatchStatus(conn). In batch mode only asynchronous operations are permitted, and +COPY is not recommended as it most likely will trigger failure in batch processing. +Using any synchronous command execution functions such as PQfn, +PQexec or one of its sibling functions are error conditions. +Functions allowed in batch mode are described in . + + + +The client uses libpq's asynchronous query functions to dispatch work, +marking the end of each batch with PQbatchSendQueue. +And
Re: PATCH: Batch/pipelining support for libpq
On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer wrote: > > > > On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues > wrote: >> >> Matthieu Garrigues >> >> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer >> wrote: >> >> >> > There was a comment upthread a while back that people should look at the >> > comments made in >> > https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp >> > by Horiguchi-San. >> > >> > From what I can tell this has not been addressed. The one big thing is the >> > use of PQbatchProcessQueue vs just putting it in PQgetResult. >> > >> > The argument is that adding PQbatchProcessQueue is unnecessary and just >> > adds another step. Looking at this, it seems like putting this inside >> > PQgetResult would get my vote as it leaves the interface unchanged. >> > >> >> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one >> thing: I'll keep PQgetResult returning null between the result of two >> batched query so the user >> can know which result comes from which query. > > > Fair enough. > > There may be other things in his comments that need to be addressed. That was > the big one that stuck out for me. > > Thanks for working on this! > Yes I already addressed the other things in the v19 patch: https://www.postgresql.org/message-id/flat/cajkzx4t5e-2cqe3dtv2r78dyfvz+in8py7a8marvlhs_pg7...@mail.gmail.com
Re: PATCH: Batch/pipelining support for libpq
On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues < matthieu.garrig...@gmail.com> wrote: > Matthieu Garrigues > > On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer > wrote: > >> > > There was a comment upthread a while back that people should look at the > comments made in > https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp > by Horiguchi-San. > > > > From what I can tell this has not been addressed. The one big thing is > the use of PQbatchProcessQueue vs just putting it in PQgetResult. > > > > The argument is that adding PQbatchProcessQueue is unnecessary and just > adds another step. Looking at this, it seems like putting this inside > PQgetResult would get my vote as it leaves the interface unchanged. > > > > Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one > thing: I'll keep PQgetResult returning null between the result of two > batched query so the user > can know which result comes from which query. > Fair enough. There may be other things in his comments that need to be addressed. That was the big one that stuck out for me. Thanks for working on this! Dave Cramer www.postgres.rocks
Re: PATCH: Batch/pipelining support for libpq
Matthieu Garrigues On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer wrote: >> > There was a comment upthread a while back that people should look at the > comments made in > https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp > by Horiguchi-San. > > From what I can tell this has not been addressed. The one big thing is the > use of PQbatchProcessQueue vs just putting it in PQgetResult. > > The argument is that adding PQbatchProcessQueue is unnecessary and just adds > another step. Looking at this, it seems like putting this inside PQgetResult > would get my vote as it leaves the interface unchanged. > Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one thing: I'll keep PQgetResult returning null between the result of two batched query so the user can know which result comes from which query.
Re: PATCH: Batch/pipelining support for libpq
On 2020-Sep-21, Dave Cramer wrote: Hello Dave, > I am looking for this in the commitfest and can't find it. However there is > an old commitfest entry > > https://commitfest.postgresql.org/13/1024/ > > Do you have the link for the new one ? Here you go: https://commitfest.postgresql.org/29/2724/ -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Batch/pipelining support for libpq
Dave Cramer www.postgres.rocks On Mon, 31 Aug 2020 at 11:46, Matthieu Garrigues < matthieu.garrig...@gmail.com> wrote: > Hi, > > It seems like this patch is nearly finished. I fixed all the remaining > issues. I'm also asking > a confirmation of the test scenarios you want to see in the next > version of the patch. > > > Hi, > > > > On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote: > > > Totally unasked for, here's a rebase of this patch series. I didn't do > > > anything other than rebasing to current master, solving a couple of > very > > > trivial conflicts, fixing some whitespace complaints by git apply, and > > > running tests to verify everthing works. > > > > > > I don't foresee working on this at all, so if anyone is interested in > > > seeing this feature in, I encourage them to read and address > > > Horiguchi-san's feedback. > > > > Nor am I planning to do so, but I do think its a pretty important > > improvement. > > > > > Fixed > > > > > > > > +/* > > > + * PQrecyclePipelinedCommand > > > + * Push a command queue entry onto the freelist. It must be a > dangling entry > > > + * with null next pointer and not referenced by any other entry's > next pointer. > > > + */ > > > > Dangling sounds a bit like it's already freed. > > > > > Fixed > > > > > > > > +/* > > > + * PQbatchSendQueue > > > + * End a batch submission by sending a protocol sync. The connection > will > > > + * remain in batch mode and unavailable for new synchronous command > execution > > > + * functions until all results from the batch are processed by the > client. > > > > I feel like the reference to the protocol sync is a bit too low level > > for an external API. It should first document what the function does > > from a user's POV. > > > > I think it'd also be good to document whether / whether not queries can > > already have been sent before PQbatchSendQueue is called or not. > > > Fixed > > > > > > > > > > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass > != PGQUERY_SYNC) > > > + { > > > + /* > > > + * In an aborted batch we don't get anything from the server for each > > > + * result; we're just discarding input until we get to the next sync > > > + * from the server. The client needs to know its queries got aborted > > > + * so we create a fake PGresult to return immediately from > > > + * PQgetResult. > > > + */ > > > + conn->result = PQmakeEmptyPGresult(conn, > > > + PGRES_BATCH_ABORTED); > > > + if (!conn->result) > > > + { > > > + printfPQExpBuffer(&conn->errorMessage, > > > + libpq_gettext("out of memory")); > > > + pqSaveErrorResult(conn); > > > + return 0; > > > > Is there any way an application can recover at this point? ISTM we'd be > > stuck in the previous asyncStatus, no? > > > > conn->result is null when malloc(sizeof(PGresult)) returns null. It's > very unlikely unless > the server machine is out of memory, so the server will probably be > unresponsive anyway. > > I'm leaving this as it is but if anyone has a solution simple to > implement I'll fix it. > > > > > > > > +/* pqBatchFlush > > > + * In batch mode, data will be flushed only when the out buffer > reaches the threshold value. > > > + * In non-batch mode, data will be flushed all the time. > > > + */ > > > +static int > > > +pqBatchFlush(PGconn *conn) > > > +{ > > > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= > OUTBUFFER_THRESHOLD)) > > > + return(pqFlush(conn)); > > > + return 0; /* Just to keep compiler quiet */ > > > +} > > > > unnecessarily long line. > > > Fixed > > > > > > +/* > > > + * Connection's outbuffer threshold is set to 64k as it is safe > > > + * in Windows as per comments in pqSendSome() API. > > > + */ > > > +#define OUTBUFFER_THRESHOLD 65536 > > > > I don't think the comment explains much. It's fine to send more than 64k > > with pqSendSome(), they'll just be send with separate pgsecure_write() > > invocations. And only on windows. > > > > It clearly makes sense to start sending out data at a certain > > granularity to avoid needing unnecessary amounts of memory, and to make > > more efficient use of latency / serer side compute. > > > > It's not implausible that 64k is the right amount for that, I just don't > > think the explanation above is good. > > > > Fixed > > > > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c > b/src/test/modules/test_libpq/testlibpqbatch.c > > > new file mode 100644 > > > index 00..4d6ba266e5 > > > --- /dev/null > > > +++ b/src/test/modules/test_libpq/testlibpqbatch.c > > > @@ -0,0 +1,1456 @@ > > > +/* > > > + * src/test/modules/test_libpq/testlibpqbatch.c > > > + * > > > + * > > > + * testlibpqbatch.c > > > + * Test of batch execution functionality > > > + */ > > > + > > > +#ifdef WIN32 > > > +#include > > > +#endif > > > > ISTM that this shouldn't be needed in a test program like this? > > Shouldn't libpq abstract all of this away? > > > > Fixed. > > > > > > +static void > > > +simple_batch(PGconn *conn) > > > +{ > > > > ISTM that
Re: PATCH: Batch/pipelining support for libpq
Alvaro, On Fri, 4 Sep 2020 at 17:26, Alvaro Herrera wrote: > On 2020-Aug-31, Matthieu Garrigues wrote: > > > It seems like this patch is nearly finished. I fixed all the remaining > > issues. I'm also asking a confirmation of the test scenarios you want > > to see in the next version of the patch. > > Hmm, apparently nobody noticed that this patch is not registered in the > current commitfest. > > Since it was submitted ahead of the deadline, I'm going to add it there. > (The patch has also undergone a lot of review already; it's certainly > not a newcomer.) > I am looking for this in the commitfest and can't find it. However there is an old commitfest entry https://commitfest.postgresql.org/13/1024/ Do you have the link for the new one ? Dave Cramer www.postgres.rocks > > >
Re: PATCH: Batch/pipelining support for libpq
On 2020-Aug-31, Matthieu Garrigues wrote: > It seems like this patch is nearly finished. I fixed all the remaining > issues. I'm also asking a confirmation of the test scenarios you want > to see in the next version of the patch. Hmm, apparently nobody noticed that this patch is not registered in the current commitfest. Since it was submitted ahead of the deadline, I'm going to add it there. (The patch has also undergone a lot of review already; it's certainly not a newcomer.) -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Batch/pipelining support for libpq
Hi, It seems like this patch is nearly finished. I fixed all the remaining issues. I'm also asking a confirmation of the test scenarios you want to see in the next version of the patch. > Hi, > > On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote: > > Totally unasked for, here's a rebase of this patch series. I didn't do > > anything other than rebasing to current master, solving a couple of very > > trivial conflicts, fixing some whitespace complaints by git apply, and > > running tests to verify everthing works. > > > > I don't foresee working on this at all, so if anyone is interested in > > seeing this feature in, I encourage them to read and address > > Horiguchi-san's feedback. > > Nor am I planning to do so, but I do think its a pretty important > improvement. > > Fixed > > > > +/* > > + * PQrecyclePipelinedCommand > > + * Push a command queue entry onto the freelist. It must be a dangling > > entry > > + * with null next pointer and not referenced by any other entry's next > > pointer. > > + */ > > Dangling sounds a bit like it's already freed. > > Fixed > > > > +/* > > + * PQbatchSendQueue > > + * End a batch submission by sending a protocol sync. The connection will > > + * remain in batch mode and unavailable for new synchronous command > > execution > > + * functions until all results from the batch are processed by the client. > > I feel like the reference to the protocol sync is a bit too low level > for an external API. It should first document what the function does > from a user's POV. > > I think it'd also be good to document whether / whether not queries can > already have been sent before PQbatchSendQueue is called or not. > Fixed > > > > > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != > > PGQUERY_SYNC) > > + { > > + /* > > + * In an aborted batch we don't get anything from the server for each > > + * result; we're just discarding input until we get to the next sync > > + * from the server. The client needs to know its queries got aborted > > + * so we create a fake PGresult to return immediately from > > + * PQgetResult. > > + */ > > + conn->result = PQmakeEmptyPGresult(conn, > > + PGRES_BATCH_ABORTED); > > + if (!conn->result) > > + { > > + printfPQExpBuffer(&conn->errorMessage, > > + libpq_gettext("out of memory")); > > + pqSaveErrorResult(conn); > > + return 0; > > Is there any way an application can recover at this point? ISTM we'd be > stuck in the previous asyncStatus, no? > conn->result is null when malloc(sizeof(PGresult)) returns null. It's very unlikely unless the server machine is out of memory, so the server will probably be unresponsive anyway. I'm leaving this as it is but if anyone has a solution simple to implement I'll fix it. > > > > +/* pqBatchFlush > > + * In batch mode, data will be flushed only when the out buffer reaches > > the threshold value. > > + * In non-batch mode, data will be flushed all the time. > > + */ > > +static int > > +pqBatchFlush(PGconn *conn) > > +{ > > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= > > OUTBUFFER_THRESHOLD)) > > + return(pqFlush(conn)); > > + return 0; /* Just to keep compiler quiet */ > > +} > > unnecessarily long line. > Fixed > > > +/* > > + * Connection's outbuffer threshold is set to 64k as it is safe > > + * in Windows as per comments in pqSendSome() API. > > + */ > > +#define OUTBUFFER_THRESHOLD 65536 > > I don't think the comment explains much. It's fine to send more than 64k > with pqSendSome(), they'll just be send with separate pgsecure_write() > invocations. And only on windows. > > It clearly makes sense to start sending out data at a certain > granularity to avoid needing unnecessary amounts of memory, and to make > more efficient use of latency / serer side compute. > > It's not implausible that 64k is the right amount for that, I just don't > think the explanation above is good. > Fixed > > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c > > b/src/test/modules/test_libpq/testlibpqbatch.c > > new file mode 100644 > > index 00..4d6ba266e5 > > --- /dev/null > > +++ b/src/test/modules/test_libpq/testlibpqbatch.c > > @@ -0,0 +1,1456 @@ > > +/* > > + * src/test/modules/test_libpq/testlibpqbatch.c > > + * > > + * > > + * testlibpqbatch.c > > + * Test of batch execution functionality > > + */ > > + > > +#ifdef WIN32 > > +#include > > +#endif > > ISTM that this shouldn't be needed in a test program like this? > Shouldn't libpq abstract all of this away? > Fixed. > > > +static void > > +simple_batch(PGconn *conn) > > +{ > > ISTM that all or at least several of these should include tests of > transactional behaviour with pipelining (e.g. using both implicit and > explicit transactions inside a single batch, using transactions across > batches, multiple explicit transactions inside a batch). > @Andres, just to make sure I understood, here is the test scenarios I'll add: Implicit and explicit multiple transactions: start batch:
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Hi, On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote: > Totally unasked for, here's a rebase of this patch series. I didn't do > anything other than rebasing to current master, solving a couple of very > trivial conflicts, fixing some whitespace complaints by git apply, and > running tests to verify everthing works. > > I don't foresee working on this at all, so if anyone is interested in > seeing this feature in, I encourage them to read and address > Horiguchi-san's feedback. Nor am I planning to do so, but I do think its a pretty important improvement. > +/* > + * PQrecyclePipelinedCommand > + * Push a command queue entry onto the freelist. It must be a dangling > entry > + * with null next pointer and not referenced by any other entry's next > pointer. > + */ Dangling sounds a bit like it's already freed. > +/* > + * PQbatchSendQueue > + * End a batch submission by sending a protocol sync. The connection will > + * remain in batch mode and unavailable for new synchronous command > execution > + * functions until all results from the batch are processed by the client. I feel like the reference to the protocol sync is a bit too low level for an external API. It should first document what the function does from a user's POV. I think it'd also be good to document whether / whether not queries can already have been sent before PQbatchSendQueue is called or not. > +/* > + * PQbatchProcessQueue > + *In batch mode, start processing the next query in the queue. > + * > + * Returns 1 if the next query was popped from the queue and can > + * be processed by PQconsumeInput, PQgetResult, etc. > + * > + * Returns 0 if the current query isn't done yet, the connection > + * is not in a batch, or there are no more queries to process. > + */ > +int > +PQbatchProcessQueue(PGconn *conn) > +{ > + PGcommandQueueEntry *next_query; > + > + if (!conn) > + return 0; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return 0; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > +libpq_gettext_noop("internal error, COPY in > batch mode")); > + break; Shouldn't there be a return 0 here? > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != > PGQUERY_SYNC) > + { > + /* > + * In an aborted batch we don't get anything from the server > for each > + * result; we're just discarding input until we get to the next > sync > + * from the server. The client needs to know its queries got > aborted > + * so we create a fake PGresult to return immediately from > + * PQgetResult. > + */ > + conn->result = PQmakeEmptyPGresult(conn, > + >PGRES_BATCH_ABORTED); > + if (!conn->result) > + { > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext("out of > memory")); > + pqSaveErrorResult(conn); > + return 0; Is there any way an application can recover at this point? ISTM we'd be stuck in the previous asyncStatus, no? > +/* pqBatchFlush > + * In batch mode, data will be flushed only when the out buffer reaches the > threshold value. > + * In non-batch mode, data will be flushed all the time. > + */ > +static int > +pqBatchFlush(PGconn *conn) > +{ > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= > OUTBUFFER_THRESHOLD)) > + return(pqFlush(conn)); > + return 0; /* Just to keep compiler quiet */ > +} unnecessarily long line. > +/* > + * Connection's outbuffer threshold is set to 64k as it is safe > + * in Windows as per comments in pqSendSome() API. > + */ > +#define OUTBUFFER_THRESHOLD 65536 I don't think the comment explains much. It's fine to send more than 64k with pqSendSome(), they'll just be send with separate pgsecure_write() invocations. And only on windows. It clearly makes sense to start sending out data at a certain granularity to avoid needing unnecessary amounts of memory, and to make more efficient use of latency / serer side compute. It's not implausible that 64k is the right amount for that, I just don't think the explanation above is good. > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c > b/src/test/modules/test_libpq/testlibpqbatch.c > new file mode 100644 > index 00..4d6ba266e5 > --- /dev/null > +++ b/src/test/modules/test_libpq/testlibpqbatch.c > @@ -0,0 +1,1456 @@ > +/* > + * src/test/modules/test_libpq/testlibpqbatch.c > + * > + * > + * testlibpqbatch.c > + * Test of batch execution functiona
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 2019-Sep-09, Amit Kapila wrote: > Thanks for picking up this. However, I noticed that previously > Horiguchi-San has given some comments on this patch [1] which doesn't > seem to be addressed or at least not all of them are addressed. It is > possible that you would have already addressed those, but in that > case, it would be good if you respond to his email as well. If those > are not addressed, then it will be good to address those. Totally unasked for, here's a rebase of this patch series. I didn't do anything other than rebasing to current master, solving a couple of very trivial conflicts, fixing some whitespace complaints by git apply, and running tests to verify everthing works. I don't foresee working on this at all, so if anyone is interested in seeing this feature in, I encourage them to read and address Horiguchi-san's feedback. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From c2de4e2a2cbc301b48a7946a9ee0b9acceae7233 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Fri, 10 Jul 2020 11:29:11 -0400 Subject: [PATCH v18 1/2] libpq batch support --- doc/src/sgml/libpq.sgml | 502 +++ doc/src/sgml/lobj.sgml| 4 + .../libpqwalreceiver/libpqwalreceiver.c | 3 + src/interfaces/libpq/exports.txt | 5 + src/interfaces/libpq/fe-connect.c | 27 + src/interfaces/libpq/fe-exec.c| 597 -- src/interfaces/libpq/fe-protocol2.c | 6 + src/interfaces/libpq/fe-protocol3.c | 15 +- src/interfaces/libpq/libpq-fe.h | 24 +- src/interfaces/libpq/libpq-int.h | 47 +- 10 files changed, 1186 insertions(+), 44 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index d1ccaa775a..ad168727c7 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -4826,6 +4826,500 @@ int PQflush(PGconn *conn); + + Batch mode and query pipelining + + + libpq + batch mode + + + + libpq + pipelining + + + + libpq supports queueing up queries into + a pipeline to be executed as a batch on the server. Batching queries allows + applications to avoid a client/server round-trip after each query to get + the results before issuing the next query. + + + + When to use batching + + +Much like asynchronous query mode, there is no performance disadvantage to +using batching and pipelining. It increases client application complexity +and extra caution is required to prevent client/server deadlocks but +can sometimes offer considerable performance improvements. + + + +Batching is most useful when the server is distant, i.e. network latency +(ping time) is high, and when many small operations are being performed in +rapid sequence. There is usually less benefit in using batches when each +query takes many multiples of the client/server round-trip time to execute. +A 100-statement operation run on a server 300ms round-trip-time away would take +30 seconds in network latency alone without batching; with batching it may spend +as little as 0.3s waiting for results from the server. + + + +Use batches when your application does lots of small +INSERT, UPDATE and +DELETE operations that can't easily be transformed into +operations on sets or into a +COPY operation. + + + +Batching is not useful when information from one operation is required by the +client before it knows enough to send the next operation. The client must +introduce a synchronisation point and wait for a full client/server +round-trip to get the results it needs. However, it's often possible to +adjust the client design to exchange the required information server-side. +Read-modify-write cycles are especially good candidates; for example: + + BEGIN; + SELECT x FROM mytable WHERE id = 42 FOR UPDATE; + -- result: x=2 + -- client adds 1 to x: + UPDATE mytable SET x = 3 WHERE id = 42; + COMMIT; + +could be much more efficiently done with: + + UPDATE mytable SET x = x + 1 WHERE id = 42; + + + + + + The batch API was introduced in PostgreSQL 10.0, but clients using PostgresSQL 10.0 version of libpq can + use batches on server versions 7.4 and newer. Batching works on any server + that supports the v3 extended query protocol. + + + + + + + Using batch mode + + +To issue batches the application must switch +a connection into batch mode. Enter batch mode with PQenterBatchMode(conn) or test +whether batch mode is active with PQbatchStatus(conn). In batch mode only asynchronous operations are permitted, and +COPY is not recommended as it most likely will trigger failure in batch processing. +Using any synchronous command execution functio
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Fri, Aug 30, 2019 at 7:06 PM Nikhil Sontakke wrote: > > Hi, > > > > This patch has been around for some time now, the last version fails to > > > apply cleanly and in-depth reviews have happened. I am moving that to > > > the next CF, waiting on its author. > > > > Unfortunately, nothing was changed since then, so there is already some > > amount > > of unaddressed review feedback. I'll move this patch to "Returned with > > feedback". > > > > Craig Ringer mentioned about this thread to me recently. > > This effort has seen decent reviews from Craig, Andres and Michael > already. So, I thought of refreshing it to work against latest master > HEAD. > Thanks for picking up this. However, I noticed that previously Horiguchi-San has given some comments on this patch [1] which doesn't seem to be addressed or at least not all of them are addressed. It is possible that you would have already addressed those, but in that case, it would be good if you respond to his email as well. If those are not addressed, then it will be good to address those. [1] - https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Hi, > > This patch has been around for some time now, the last version fails to > > apply cleanly and in-depth reviews have happened. I am moving that to > > the next CF, waiting on its author. > > Unfortunately, nothing was changed since then, so there is already some amount > of unaddressed review feedback. I'll move this patch to "Returned with > feedback". > Craig Ringer mentioned about this thread to me recently. This effort has seen decent reviews from Craig, Andres and Michael already. So, I thought of refreshing it to work against latest master HEAD. PFA, main patch as well as the test patch (I named the test patch v17 to be consistent with the main patch). The major grouse with the test patch AFAICS was the use of non-Windows compliant timersub() function. I have now used INSTR_TIME_SET_CURRENT/INSTR_TIME_SUBTRACT family of portable macros for the same. Please let me know on what we think of the above. Regards, Nikhil -- Nikhil Sontakke 2ndQuadrant - PostgreSQL Solutions for the Enterprise https://www.2ndQuadrant.com/ 0002-libpq_batch_tests_community_master.v17.patch Description: Binary data 0001-libpq_batch_support_commmunity_master.v17.patch Description: Binary data
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
> On Mon, Oct 1, 2018 at 8:34 AM Michael Paquier wrote: > > On Fri, Mar 23, 2018 at 02:18:09PM +0100, Daniel Verite wrote: > > There's a batch mode for pgbench in a patch posted in [1], > > with \beginbatch and \endbatch commands, but nothing > > for psql AFAICS. > > psql is more complicated because currently it uses a > > blocking PQexec() call at its core. Craig mentioned psql > > integration in [2] and [3]. > > This patch has been around for some time now, the last version fails to > apply cleanly and in-depth reviews have happened. I am moving that to > the next CF, waiting on its author. Unfortunately, nothing was changed since then, so there is already some amount of unaddressed review feedback. I'll move this patch to "Returned with feedback".
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Fri, Mar 23, 2018 at 02:18:09PM +0100, Daniel Verite wrote: > There's a batch mode for pgbench in a patch posted in [1], > with \beginbatch and \endbatch commands, but nothing > for psql AFAICS. > psql is more complicated because currently it uses a > blocking PQexec() call at its core. Craig mentioned psql > integration in [2] and [3]. This patch has been around for some time now, the last version fails to apply cleanly and in-depth reviews have happened. I am moving that to the next CF, waiting on its author. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Kyotaro HORIGUCHI wrote: > A disucssion on psql batch mode was held in another branch of > this thread. How do we treat that? There's a batch mode for pgbench in a patch posted in [1], with \beginbatch and \endbatch commands, but nothing for psql AFAICS. psql is more complicated because currently it uses a blocking PQexec() call at its core. Craig mentioned psql integration in [2] and [3]. Also a script can have inter-query dependencies such as in insert into table(...) returning id \gset update othertable set col= :id where ...; which is a problem in batch mode, as we don't want to send the update before the right value for :id is known. Whether we want to support these dependencies and how needs discussion. For instance we might not support them at all, or create a synchronization command that collects all results of queries sent so far, or do it implicitly when a variable is injected into a query... This looks like substantial work that might be best done separately from the libpq patch. [1] https://www.postgresql.org/message-id/b4e34135-2bd9-4b8a-94ca-27d760da2...@manitou-mail.org [2] https://www.postgresql.org/message-id/CAMsr+YGLhaDkjymLuNVQy4MrSKQoA=F1vO=aN8XQf30N=aq...@mail.gmail.com [3] https://www.postgresql.org/message-id/CAMsr+YE6BK4iAaQz=ny3xdnblhnnz_4tp-ptjqbnnpszmgo...@mail.gmail.com Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
Hello. At Fri, 12 Jan 2018 10:12:35 +1100, Vaishnavi Prabakaran wrote in > Corrected compilation error in documentation portion of patch with latest > postgres code. Attached the corrected patch. My understanding of this patch is that it intends to do the following things without changing the protocol. A. Refrain from sending sync message and socket flush during batching. B. Set required information to PGconn before receiving a result. Multi statement query does the a similar thing but a bit different. Queries are not needed to be built and mreged as a string in advance with this feature. Since I'm not sure what is the current stage of this patch and I haven't fully recognize the disucssion so far, I might make stupid or duplicate comments but would like to make some comments. - Previous comments A disucssion on psql batch mode was held in another branch of this thread. How do we treat that? - Interface complteness I think PQenter/exitBatchMode() is undoubtedly needed. PQbatchSendQueue() seems to be required to make sure to send queries before calling PQgetResult. PQbatchProcessQueue() doesn't seem essential. If there's no case where it is called without following PQgetResult, it can be included in PQgetResult. Client code may be simpler if it is not required. The latest patch has extern definition of PQbatchQueueCount, the body and doc of which seems to have been removed, or haven't exist since the beginning. - Some comments on the code -- PQbatchProcessQueue() PQbatchProcessQueue() returns 0 for several undistinguishable reasons. If asyncStatus is PGASYNC_BUSY at the time, the connection is out of sync and doesn't accept further operations. So it should be distinguished from the end-of-queue case. (It can reutrn an error result if combining with PQgetResult.) The function handles COPY_* in inconsistent way. It sets an error message to conn in hardly noticeable way but allows to go further. The document is telling as the follows. > COPY is not recommended as it most likely will trigger failure > in batch processing I don't think the desciription is informative, but the reason for the description would be the fact that we cannot detect a command leads to protocol failure before the failure actually happens. The test code suggests that that is the case where any command (other than Sync) following "COPY FROM" breaks the protocol. We get an error messages like below in the case. (The header part is of my test program.) > ABORTED: (PGRES_BATCH_ABORTED) ERROR: unexpected message type 0x42 during > COPY from stdin > CONTEXT: COPY t, line 1 I haven't investigated further, but it seems unable to restore the sync of connection until disconnection. It would need a side-channel for COPY_IN to avoid the failure but we don't have it now. The possible choices here are: A. Make the description more precisely, by adding the condition to cause the error and how it looks. B. The situation seems detectable in PGgetResult. Return an error result containing any meaningful error. But this doesn't stop the following protocol error messages. > ABORTED: (PGRES_FATAL_ERROR) COPY_IN cannot be followed by any command in a > batch > ABORTED: (PGRES_FATAL_ERROR) ERROR: 08P01: unexpected message type 0x42 > during COPY from stdin > ABORTED: (PGRES_BATCH_ABORTED) ERROR: unexpected message type 0x42 during > COPY from stdin > ABORTED: (PGRES_BATCH_ABORTED) ERROR: unexpected message type 0x42 during > COPY from stdin C. Cancel COPY on the server and return error to the client for the situation. This might let us be able to avoid unrecoverable desync of the protocol in the case. (I'm not sure this can be done cleanly.) D. Wait for a protocol change that can solve this problem. -- PQbatchFlush() > static int pqBatchFlush(Pgconn *conn) ... > if (...) > return(pqFlush(conn)); > return 0; /* Just to keep compiler quiet */ The comment in the last line is wrong. + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= OUTBUFFER_T Isn't this a bit too long? -- PGcommandQueueEntry The resason for the fact that it is a linked list seems to me to allow queueing and result processing happen alternately in one batch period. But I coundn't find a description about the behavior in the documentation. (State transition chart is required?) Aside from the above, the interface to handle the command queue seems to be divided into too-small pieces. PQsendQueryGuts() > el = PQmakePipelinedCommand(conn) > el->members = hoge; > .. > PQappendPipelinedCommand(conn, el) Isn't the following enough instead of the above? > PQappendPipelinedCommand(conn, conn->queryclass, conn->last_query); regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Fri, Jan 5, 2018 at 4:55 PM, Vaishnavi Prabakaran < vaishnaviprabaka...@gmail.com> wrote: > > > On Tue, Nov 28, 2017 at 12:57 PM, Michael Paquier < > michael.paqu...@gmail.com> wrote: > >> On Thu, Oct 5, 2017 at 9:58 AM, Vaishnavi Prabakaran >> wrote: >> > Thanks for the suggestion and, OK I will create a new patch in upcoming >> > commitfest with attached patch addressing above review comments. >> >> The patch still applies and there has been no updates for the last >> month, as well as no reviews. I am bumping it to next CF. > > > Thank you, I see the patch generates a compilation error due to usage of > "FALSE" with latest postgres code, Hence attaching the patch with > correction. > Corrected compilation error in documentation portion of patch with latest postgres code. Attached the corrected patch. Thanks & Regards, Vaishnavi, Fujitsu Australia. 0001-Pipelining-batch-support-for-libpq-code-v16.patch Description: Binary data
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Tue, Nov 28, 2017 at 12:57 PM, Michael Paquier wrote: > On Thu, Oct 5, 2017 at 9:58 AM, Vaishnavi Prabakaran > wrote: > > Thanks for the suggestion and, OK I will create a new patch in upcoming > > commitfest with attached patch addressing above review comments. > > The patch still applies and there has been no updates for the last > month, as well as no reviews. I am bumping it to next CF. Thank you, I see the patch generates a compilation error due to usage of "FALSE" with latest postgres code, Hence attaching the patch with correction. Thanks & Regards, Vaishnavi, Fujitsu Australia. 0001-Pipelining-batch-support-for-libpq-code-v15.patch Description: Binary data
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On Thu, Oct 5, 2017 at 9:58 AM, Vaishnavi Prabakaran wrote: > Thanks for the suggestion and, OK I will create a new patch in upcoming > commitfest with attached patch addressing above review comments. The patch still applies and there has been no updates for the last month, as well as no reviews. I am bumping it to next CF. -- Michael