Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 01/05/2014 01:56 PM, Craig Ringer wrote:


JDBC also has a statement batching interface. Right now PgJDBC just
unwraps the batch and runs each query individually. Any async-support
improvements server-side should probably consider the need of executing
a batch. The batch might be one PreparedStatement with many different
parameters, or it might be a series of unrelated statements. A way for
PgJDBC to run the batch without syncing with the server after each query
would be really helpful.


The server already supports this because it supports pipelining.

Anyway, I have now switched my application to COPY FROM STDIN where 
possible, and the increase in throughput is phenomenal—even though it is 
running completely locally.  I could squeeze out more round-trips if I 
had an asynchronous query facility in libpq.
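
For reference, a minimal sketch of that COPY FROM STDIN path using libpq's 
documented copy API (the table t(a int, b text) and the row data are 
placeholders):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/* Stream rows into table "t" with a single COPY; error handling kept
 * to the minimum needed to show the call sequence. */
static int bulk_load(PGconn *conn)
{
  PGresult *res = PQexec(conn, "COPY t (a, b) FROM STDIN");
  if (PQresultStatus(res) != PGRES_COPY_IN) {
    fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);
    return -1;
  }
  PQclear(res);

  const char *row = "1\tfoo\n";        /* one tab-separated line per row */
  if (PQputCopyData(conn, row, (int) strlen(row)) != 1 ||
      PQputCopyEnd(conn, NULL) != 1)
    return -1;

  res = PQgetResult(conn);             /* final status of the COPY */
  int ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;
  PQclear(res);
  return ok;
}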


Feedback in this thread was "we want something like this in libpq, but 
not the thing you proposed".  But there have been no concrete 
counter-proposals, and some of the responses did not take into account 
the inherent complexities of round-trip avoidance.  So I'm not sure how 
to move this topic forward.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Claudio Freire
On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer fwei...@redhat.com wrote:
 Feedback in this thread was "we want something like this in libpq, but not
 the thing you proposed".  But there have been no concrete counter-proposals,
 and some of the responses did not take into account the inherent
 complexities of round-trip avoidance.  So I'm not sure how to move this
 topic forward.

What exactly do you mean by "not taking into account"?

I do not believe you responded to my proposed interface. I'm not
requesting a change in roundtrip avoidance measures, just the
interface.




Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 01/03/2014 06:06 PM, Claudio Freire wrote:


Per-query expectations could be such a thing. And it can even work with PQexec:

PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
... 9 times...
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
do {
    // do something useful
} while (PQflush());

Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
and buffer what cannot be sent. PQASYNC_CORK would only buffer (only
send if the buffer is full). After any ASYNC call, PQflush would be
necessary (to flush the send queue and to consume the expected
responses), but I can imagine any synchronous call (PQexec,
PQsendQuery or whatever) could detect a non-empty buffer and just
do a blocking flush right there.


How would you consume results once they arrive?  I think this only 
covers the no-result case, and it has since come to my attention that 
the Node.js folks are looking for general async response processing.


(In order to implement your proposal, we'd have to add quite a few 
additional APIs to include the flag argument, but that's a minor detail.)


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 04/22/2014 07:03 PM, Claudio Freire wrote:

On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer fwei...@redhat.com wrote:

Feedback in this thread was "we want something like this in libpq, but not
the thing you proposed".  But there have been no concrete counter-proposals,
and some of the responses did not take into account the inherent
complexities of round-trip avoidance.  So I'm not sure how to move this
topic forward.


What exactly do you mean by "not taking into account"?


A couple of them were along the lines of "let's just send batches of 
queries and sync between batches".  This does not seem very helpful to 
me because sizing the batches is difficult, and the sizes could be quite 
large.



I do not believe you responded to my proposed interface. I'm not
requesting a change in roundtrip avoidance measures, just the
interface.


I've sent a reply now; I think I missed this part earlier.  Sorry.

--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Claudio Freire
On Tue, Apr 22, 2014 at 3:45 PM, Florian Weimer fwei...@redhat.com wrote:
 On 01/03/2014 06:06 PM, Claudio Freire wrote:

 Per-query expectations could be such a thing. And it can even work with
 PQexec:

 PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
 --read--
 PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
 --read--
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 ... 9 times...
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
 do {
 // do something useful
 } while (PQflush());

 Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
 and buffer what cannot be sent. PQASYNC_CORK would only buffer (only
 send if the buffer is full). After any ASYNC call, PQflush would be
 necessary (to flush the send queue and to consume the expected
 responses), but I can imagine any synchronous call (PQexec,
 PQsendQuery or whatever) could detect a non-empty buffer and just
 do a blocking flush right there.


 How would you consume results once they arrive?  I think this only covers
 the no-result case,

You could do PQEXPECT_ROWCOUNT for storing rowcounts (and supply a
pointer to a result buffer), or PQ_BUFFER_RESULTS and do the same. The
user would have to know beforehand the size of the result set (or an
upper bound of it), and readiness notification would also need to be
solved. There could also be PQEXPECT_DISCARD_RESULTS.

Alternatively, you could use a callback, node-style, and it would
solve everything (including readiness and error notification), but I'm
not sure about the portability of such a thing. Callbacks certainly
would be tricky when ABI compatibility has to be maintained. It would
however be a much better interface.
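
To make the callback idea concrete, here is a purely hypothetical sketch;
none of these names exist in libpq, they only illustrate the shape of the
interface being discussed:

/* Hypothetical: invoked once per received result, errback-style for
 * error results. */
typedef void (*PQresultHandler)(PGconn *conn, PGresult *res, void *arg);

/* Hypothetical: queue a query together with how to consume its reply. */
int PQsendQueryAsync(PGconn *conn, const char *query,
                     PQresultHandler handler, void *arg);

/* Hypothetical expectation-only variant: no callback, just a sanity
 * check that the server really sends nothing (or only a rowcount). */
int PQsendQueryExpect(PGconn *conn, const char *query,
                      int expectation /* PQEXPECT_NO_RESULT,
                                         PQEXPECT_ROWCOUNT, ... */);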

The pattern here is that the API needs to perform all the magic and
complex buffering and flushing; it should not be on the application
side.

 and it has since come to my attention that the Node.js
 folks are looking for general async response processing.

Node support would take a little more work. Specifically, for node to
work with this API, the non-blocking case has to be handled properly,
allowing node to wait on the FDs instead of requiring it to flush and
block on the event loop thread.

That means a buffer at least as large as the query parameters, which
should be no problem (but might be tricky to implement), and handling
the equivalent of EWOULDBLOCK at the PQexec(.., PQASYNC) calls.

In any case, on any specific connection, query processing is linear.
So you really need a result callback queue (however you implement
it, be it in the application or the API). What I propose is moving as
much as possible to the API, since it will be common to all users of the
async functionality, and it will make it possible to fix bugs in that
code centrally too.
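
A minimal sketch of such a queue, with invented names, just to pin the
idea down:

/* Hypothetical: one node per sent query, popped in FIFO order as
 * results arrive (query processing on a connection is linear). */
struct pending_reply {
    int expectation;                          /* e.g. EXPECT_NO_RESULT */
    void (*handler)(PGresult *res, void *arg);    /* optional callback */
    void *arg;
    struct pending_reply *next;
};

/* Sending appends a node; each received result (including an error
 * result) pops exactly one node and dispatches it, so the queue stays
 * aligned with the server's reply stream. */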

My earlier examples were all about discarding results, because that's
what enables the most throughput, and it covers lots of cases. But, as
has been mentioned in previous posts, rowcounts at the very least have
to be handled as well, so there's that. I guess we can throw in
generic result callbacks (and errbacks) if the ABI allows it, and it
will be a net win in clarity and simplicity.


On Tue, Apr 22, 2014 at 3:49 PM, Florian Weimer fwei...@redhat.com wrote:
 On 04/22/2014 07:03 PM, Claudio Freire wrote:

 On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer fwei...@redhat.com
 wrote:

 Feedback in this thread was "we want something like this in libpq, but
 not the thing you proposed".  But there have been no concrete
 counter-proposals, and some of the responses did not take into account
 the inherent complexities of round-trip avoidance.  So I'm not sure how
 to move this topic forward.


 What exactly do you mean by "not taking into account"?


 A couple of them were along the lines of "let's just send batches of queries
 and sync between batches".  This does not seem very helpful to me because
 sizing the batches is difficult, and the sizes could be quite large.

Not at all.

The proposal certainly has that in its examples, but the API can be
used with no explicit synchronization. That's what I was going for
when I suggested that other API calls could flush implicitly when
needed. If you never call flush, flush happens on its own when
necessary. If you never call the synchronous API, you never wait for
replies. Though when you need the results (as in, right now), you need
to flush explicitly; there's no way around that.

Also, feel free to consider all this mere opinion. I'm certainly not
the one implementing it ;-)



Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Craig Ringer
On 01/04/2014 01:06 AM, Claudio Freire wrote:
 You're forgetting ORM workloads.

I'm impressed that you've come up with an area where ORMs are beneficial ;-)

JDBC also has a statement batching interface. Right now PgJDBC just
unwraps the batch and runs each query individually. Any async-support
improvements server-side should probably consider the need of executing
a batch. The batch might be one PreparedStatement with many different
parameters, or it might be a series of unrelated statements. A way for
PgJDBC to run the batch without syncing with the server after each query
would be really helpful.

So would a way to BIND an array of parameters, so we could execute a
prepared statement once with multiple parameters and then sync up with
the server after all executions.

As for ORMs benefitting from this: Remember that Npgsql and PgJDBC don't
use libpq. So the libpq changes would only help ORMs based on things
like Python (psycopg2), Ruby (Pg gem), etc., where they're using libpq
wrapper drivers.

 Execute-many of prepared statements is another one, quite common.

That's the case I'd really love to see proper server-side batch support
for. BIND_MULTIPLE, EXECUTE.

 I'm not sure what would happen if one of the queries returned an
 error. If in a transaction, all the following queries would error out
 I'd imagine. If not, they would simply be executed blindly.. am I
 correct?

It's not just dealing with erroring out. Many ORMs look at the count of
rows affected to detect whether an operation conflicted with another
concurrent operation when optimistic concurrency control is in use. E.g.

UPDATE t SET x = 'fred' WHERE rowversion = 4;

will be seen to fail if it reports that it's affected zero rows. This
is one of the reasons ORM users have such serious problems with the
write side of our partitioning support - we discard affected row counts,
and many ORMs don't deal well with that.

At least in JDBC, executeBatch returns an array of rowcounts. So you
can't throw away affected row counts when running batches, they must be
returned to the client. Doesn't matter if it's a single protocol message
with a list of IDs, or a series of individual messages, though.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Craig Ringer
On 01/04/2014 01:22 AM, Merlin Moncure wrote:
 Long term, I'd rather see an optimized 'ORM flush' assemble the data
 into a structured data set (perhaps a JSON document) and pass it to
 some receiving routine that decomposed it into records.

The same is true on the input side. I'd much rather be sending an ORM
client a big JSON / YAML / whatever graph than a horrible,
duplication-filled chained LEFT JOIN projection like they currently rely
on. When they're not just doing n+1 selects, which is worse.

I think that's really a side-issue though. ORMs aren't going to change
in a hurry, and batching / fire-and-forget support is good for all sorts
of other jobs too.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Greg Stark
On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.

There are techniques for handling that, actually. For a better
explanation than I can give, see
http://kentonv.github.io/capnproto/rpc.html. It's mostly a language
feature, but it does require support from the protocol to be able to
reference data in earlier responses in your subsequent requests.

You can pull this trick by having an RPC server near the database, so
all the latency is handled by the RPC server and not the database
connection. But that seems to me an unnecessary complication. The
database protocol basically is an RPC layer already.


-- 
greg




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Florian Weimer

On 01/05/2014 03:11 PM, Greg Stark wrote:

On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:

I think Florian has a good point there, and the reason is this: what
you are talking about will be of exactly zero use to applications that
want to see the results of one query before launching the next.


There are techniques for handling that, actually. For a better
explanation than I can give, see
http://kentonv.github.io/capnproto/rpc.html. It's mostly a language
feature, but it does require support from the protocol to be able to
reference data in earlier responses in your subsequent requests.


Purely from a perspective of latency avoidance, sub-queries, WITH or 
stored procedures can achieve the same thing, and work even if the 
intermediate result has to undergo some transformation. :-)
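
For instance, a data-modifying WITH clause can chain a generated key into
a second statement in a single round trip; a sketch with made-up table
names (conn is an open PGconn *):

PQexec(conn,
       "WITH ins AS (INSERT INTO parent (name) VALUES ('x') RETURNING id) "
       "INSERT INTO child (parent_id) SELECT id FROM ins;");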


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Florian Weimer

On 01/04/2014 04:39 PM, Martijn van Oosterhout wrote:


Why switch between COPY commands when you could do it in one? For
example:

COPY table1(col1, col2, ...),
  table2(col1, col2, ...)
FROM STDIN WITH (tableoids);
tableoid1<tab>col1<tab>col2...
tableoid2<tab>...
...
\.


My original idea was to avoid any server-side changes, so that 
applications do not need fallback code for talking to old servers.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread David Fetter
On Fri, Jan 03, 2014 at 03:06:11PM -0200, Claudio Freire wrote:
 On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Claudio Freire klaussfre...@gmail.com writes:
  On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
  Loading data into the database isn't such an uncommon task.  Not
  everything is OLTP.
 
  Truly, but a sustained insert stream of 10 Mbps is certainly way
  beyond common non-OLTP loads. This is far more specific than non-OLTP.
 
  I think Florian has a good point there, and the reason is this: what
  you are talking about will be of exactly zero use to applications that
  want to see the results of one query before launching the next.  Which
  eliminates a whole lot of apps.  I suspect that almost the *only*
  common use case in which a stream of queries can be launched without
  feedback is going to be bulk data loading.  It's not clear at all
  that pipelining the PQexec code path is the way to better performance
  for that --- why not use COPY, instead?
 
 You're forgetting ORM workloads.
 
 ORMs can usually plan the inserts to be in a sequence that both requires
 no feedback (except the knowledge that the inserts were successful) and
 does not violate constraints.
 
 Flushing a whole object hierarchy, for instance, can be done without
 feedback. Not even serial columns need feedback, since many ORMs
 (SQLAlchemy, Hibernate) support allocation of ID sequences in batches
 (by issuing a proper SELECT nextval).

This is already doable using data-modifying WITH clauses.  Whether
ORMs are (or should be made) smart enough to take advantage of this is
probably out of the scope of this discussion, though.  For what it's
worth, I'm not in favor of orthogonality in languages, so +1 from me.
Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] RFC: Async query processing

2014-01-04 Thread Martijn van Oosterhout
On Fri, Jan 03, 2014 at 04:46:23PM +0100, Florian Weimer wrote:
 On 01/03/2014 04:20 PM, Tom Lane wrote:
 
 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.  Which
 eliminates a whole lot of apps.  I suspect that almost the *only*
 common use case in which a stream of queries can be launched without
 feedback is going to be bulk data loading.  It's not clear at all
 that pipelining the PQexec code path is the way to better performance
 for that --- why not use COPY, instead?
 
 The data I encounter has to be distributed across multiple tables.
 Switching between the COPY FROM commands would again need client-side
 buffering and heuristics for sizing these buffers.  Lengths of runs
 vary a lot in my case.

Why switch between COPY commands when you could do it in one? For
example:

COPY table1(col1, col2, ...),
 table2(col1, col2, ...)
FROM STDIN WITH (tableoids);
tableoid1<tab>col1<tab>col2...
tableoid2<tab>...
...
\.

There's no especially good reason why a COPY can only write to one
table. In this way you provide, per row, which table this row should go
to.

There's always the issue of generated primary keys. So you could, if
you wanted to, do:

COPY table1(col1, col2, ...) RETURNING pkey1,
 table2(col1, col2, ...)
FROM STDIN WITH (tableoids);
tableoid1<tab>col1<tab>col2...
tableoid2<tab>\K tableoid<tab>...
...
\.

So, like we have \N for NULL, a \K oid would be the value of the
RETURNING column for the table with that oid.

This may be far afield, but we have a streaming interface; there's no
reason why we can't extend it.

There's also the idea of WITH x AS (COPY ... FROM STDIN) which you
could do with a similar multiplexing of rows in one stream. This would
be extremely powerful together with MERGE.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Florian Weimer

On 01/02/2014 07:52 PM, Claudio Freire wrote:


No, because this doesn't scale automatically with the bandwidth-delay
product.  It also requires that the client buffers queries and their
parameters even though the network has to do that anyway.


Why not? I'm talking about transport-level packets, btw, not libpq
frames/whatever.

Yes, the network stack will sometimes do that. But it doesn't have
to do it. It does it sometimes, which is not the same.


The network inevitably buffers because the speed of light is not infinite.

Here's a concrete example.  Suppose the server is 100ms away, and you 
want to send data at a constant rate of 10 Mbps.  The server needs to 
acknowledge the data you sent, but this acknowledgment arrives after 200 
ms.  As a result, you've sent 2 Mbits before the acknowledgment arrives, 
so the network appears to have buffered 250 KB.  This effect can 
actually be used for data storage, called delay line memory, but it is 
somewhat out of fashion now.



And buffering algorithms are quite platform-dependent anyway, so it's
not the best idea to make libpq highly reliant on them.


That is why I think libpq needs to keep sending until the first response 
from the server arrives.  Batching a fixed number of INSERTs together in 
a single conceptual query does not achieve auto-tuning to the buffering 
characteristics of the path.
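
Under the proposed pipelining mode, that send loop might look roughly
like this (a sketch only: stock libpq rejects a second PQsendQuery
before the previous results are consumed, so the queueing behavior
assumed below is exactly what is being proposed; have_more_rows,
send_next_insert and wait_for_write_ready are hypothetical application
helpers):

PQsetnonblocking(conn, 1);
while (have_more_rows()) {
    send_next_insert(conn);           /* e.g. a PQsendQueryParams call */
    while (PQflush(conn) == 1)        /* 1 = output buffer not drained */
        wait_for_write_ready(PQsocket(conn));  /* select()/poll() wrapper */
    if (PQconsumeInput(conn) && !PQisBusy(conn))
        break;                        /* first server response arrived */
}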



So, trusting the network stack to do the quick start won't work. For
steady streams of queries, it will work. But not for short bursts,
which will be the most heavily used case, I believe (most apps create
short bursts of inserts, not continuous streams at full bandwidth).


Loading data into the database isn't such an uncommon task.  Not 
everything is OLTP.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Claudio Freire
On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
 On 01/02/2014 07:52 PM, Claudio Freire wrote:

 No, because this doesn't scale automatically with the bandwidth-delay
 product.  It also requires that the client buffers queries and their
 parameters even though the network has to do that anyway.


 Why not? I'm talking about transport-level packets, btw, not libpq
 frames/whatever.

 Yes, the network stack will sometimes do that. But it doesn't have
 to do it. It does it sometimes, which is not the same.


 The network inevitably buffers because the speed of light is not infinite.

 Here's a concrete example.  Suppose the server is 100ms away, and you want
 to send data at a constant rate of 10 Mbps.  The server needs to acknowledge
 the data you sent, but this acknowledgment arrives after 200 ms.  As a
 result, you've sent 2 Mbits before the acknowledgment arrives, so the
 network appears to have buffered 250 KB.  This effect can actually be used
 for data storage, called delay line memory, but it is somewhat out of
 fashion now.
...
 So, trusting the network stack to do the quick start won't work. For
 steady streams of queries, it will work. But not for short bursts,
 which will be the most heavily used case, I believe (most apps create
 short bursts of inserts, not continuous streams at full bandwidth).


 Loading data into the database isn't such an uncommon task.  Not everything
 is OLTP.

Truly, but a sustained insert stream of 10 Mbps is certainly way
beyond common non-OLTP loads. This is far more specific than non-OLTP.

Buffering will benefit the vast majority of applications that don't do
steady, sustained query streams. Which is the vast majority of
applications. An ORM doing a flush falls in this category, so it's an
overwhelmingly common case.

 And buffering algorithms are quite platform-dependent anyway, so it's
 not the best idea to make libpq highly reliant on them.


 That is why I think libpq needs to keep sending until the first response
 from the server arrives.  Batching a fixed number of INSERTs together in a
 single conceptual query does not achieve auto-tuning to the buffering
 characteristics of the path.

Not on its own, but it does improve throughput during slow start, which
benefits OLTP, which is a hugely common use case. As you say, the
network will then auto-tune when the query stream is consistent
enough, so what's the problem with explicitly buffering a little then?




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Tom Lane
Claudio Freire klaussfre...@gmail.com writes:
 On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
 Loading data into the database isn't such an uncommon task.  Not everything
 is OLTP.

 Truly, but a sustained insert stream of 10 Mbps is certainly way
 beyond common non-OLTP loads. This is far more specific than non-OLTP.

I think Florian has a good point there, and the reason is this: what
you are talking about will be of exactly zero use to applications that
want to see the results of one query before launching the next.  Which
eliminates a whole lot of apps.  I suspect that almost the *only*
common use case in which a stream of queries can be launched without
feedback is going to be bulk data loading.  It's not clear at all
that pipelining the PQexec code path is the way to better performance
for that --- why not use COPY, instead?

Or to put it another way, I don't subscribe to "if you build it, they
will come" for this proposed feature.  I think that making any use of
it would be so complex and error-prone that the vast majority of apps
won't bother.  Before we start adding a lot of complexity to libpq's
API and internals to support this, you need to make a better case
that there would be a significant number of users.

regards, tom lane




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Florian Weimer

On 01/03/2014 04:20 PM, Tom Lane wrote:


I think Florian has a good point there, and the reason is this: what
you are talking about will be of exactly zero use to applications that
want to see the results of one query before launching the next.  Which
eliminates a whole lot of apps.  I suspect that almost the *only*
common use case in which a stream of queries can be launched without
feedback is going to be bulk data loading.  It's not clear at all
that pipelining the PQexec code path is the way to better performance
for that --- why not use COPY, instead?


The data I encounter has to be distributed across multiple tables. 
Switching between the COPY FROM commands would again need client-side 
buffering and heuristics for sizing these buffers.  Lengths of runs vary 
a lot in my case.


I also want to use binary mode as far as possible to avoid the integer 
conversion overhead, but some columns use custom enum types and are 
better transferred in text mode.


Some INSERTs happen via stored procedures, to implement de-duplication.

These issues could be addressed by using temporary staging tables. 
However, when I did that in the past, this caused pg_shdepend bloat. 
Carefully reusing them when possible might avoid that.  Again, due to 
the variance in lengths of runs, the staging tables are not always 
beneficial.


I understand that pipelining introduces complexity.  But solving the 
issues described above is no picnic, either.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Merlin Moncure
On Fri, Jan 3, 2014 at 9:46 AM, Florian Weimer fwei...@redhat.com wrote:
 On 01/03/2014 04:20 PM, Tom Lane wrote:

 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.  Which
 eliminates a whole lot of apps.  I suspect that almost the *only*
 common use case in which a stream of queries can be launched without
 feedback is going to be bulk data loading.  It's not clear at all
 that pipelining the PQexec code path is the way to better performance
 for that --- why not use COPY, instead?


 The data I encounter has to be distributed across multiple tables. Switching
 between the COPY FROM commands would again need client-side buffering and
 heuristics for sizing these buffers.  Lengths of runs vary a lot in my case.
 
 I also want to use binary mode as far as possible to avoid the integer
 conversion overhead, but some columns use custom enum types and are better
 transferred in text mode.

 Some INSERTs happen via stored procedures, to implement de-duplication.

 These issues could be addressed by using temporary staging tables. However,
 when I did that in the past, this caused pg_shdepend bloat. Carefully
 reusing them when possible might avoid that.  Again, due to the variance in
 lengths of runs, the staging tables are not always beneficial.

 I understand that pipelining introduces complexity.  But solving the issues
 described above is no picnic, either.

Maybe consider using libpqtypes (http://libpqtypes.esilo.com/)?  It
transfers most everything in binary (enums notably are handled as
strings).  A typical usage of libpqtypes would be to arrange multiple
records into an array on the client, then hand them off to a stored
procedure on the server side (perhaps over an asynchronous call while
you assemble the next batch).  libpqtypes was written for C
applications with very high performance requirements (for
non-performance-critical cases we might use json instead).  In my
experience it's not too difficult to arrange an assembly/push loop
that amortizes the round-trip overhead to zero; it's not as efficient
as COPY but much more flexible, and it will blow away any scheme that
sends one row per query.
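
The same amortization pattern can be sketched with stock libpq alone,
sending one client-assembled batch per query as an array parameter
(table and data here are made up):

/* One round trip inserts a whole batch. */
const char *batch = "{10,20,30}";     /* array literal built client-side */
const char *params[1] = { batch };
PGresult *res = PQexecParams(conn,
    "INSERT INTO t (x) SELECT * FROM unnest($1::int4[])",
    1, NULL, params, NULL, NULL, 0);
if (PQresultStatus(res) != PGRES_COMMAND_OK)
    fprintf(stderr, "batch insert failed: %s", PQerrorMessage(conn));
PQclear(res);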

I agree with Tom that major changes to the libpq network stack are
probably not a good idea.

merlin




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Claudio Freire
On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Claudio Freire klaussfre...@gmail.com writes:
 On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
 Loading data into the database isn't such an uncommon task.  Not everything
 is OLTP.

 Truly, but a sustained insert stream of 10 Mbps is certainly way
 beyond common non-OLTP loads. This is far more specific than non-OLTP.

 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.  Which
 eliminates a whole lot of apps.  I suspect that almost the *only*
 common use case in which a stream of queries can be launched without
 feedback is going to be bulk data loading.  It's not clear at all
 that pipelining the PQexec code path is the way to better performance
 for that --- why not use COPY, instead?

You're forgetting ORM workloads.

ORMs can usually plan the inserts to be in a sequence that both requires
no feedback (except the knowledge that the inserts were successful) and
does not violate constraints.

Flushing a whole object hierarchy, for instance, can be done without
feedback. Not even serial columns need feedback, since many ORMs
(SQLAlchemy, Hibernate) support allocation of ID sequences in batches
(by issuing a proper SELECT nextval).

I agree that, with the proposed API, it's too error-prone to be
useful. But I also think that, if the API is simple and fool-proof
enough, it could be "build it and they will come". I know I'll be happy
to implement support in SQLAlchemy (since it will benefit me), if the
API resembles the proposition below (at least in simplicity).

Per-query expectations could be such a thing. And it can even work with PQexec:

PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
... 9 times...
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
do {
   // do something useful
} while (PQflush());

Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
and buffer what cannot be sent. PQASYNC_CORK would only buffer (only
send if the buffer is full). After any ASYNC call, PQflush would be
necessary (to flush the send queue and to consume the expected
responses), but I can imagine any synchronous call (PQexec,
PQsendQuery or whatever) could detect a non-empty buffer and just
do a blocking flush right there.

This can benefit many useful patterns. ORM flush is one, if there can
be preallocation of IDs (which I know at least SQLAlchemy and
Hibernate both support).

Execute-many of prepared statements is another one, quite common.

I'm not sure what would happen if one of the queries returned an
error. If in a transaction, all the following queries would error out,
I'd imagine. If not, they would simply be executed blindly... am I
correct?




Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Merlin Moncure
On Fri, Jan 3, 2014 at 11:06 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Claudio Freire klaussfre...@gmail.com writes:
 On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
 Loading data into the database isn't such an uncommon task.  Not everything
 is OLTP.

 Truly, but a sustained insert stream of 10 Mbps is certainly way
 beyond common non-OLTP loads. This is far more specific than non-OLTP.

 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.  Which
 eliminates a whole lot of apps.  I suspect that almost the *only*
 common use case in which a stream of queries can be launched without
 feedback is going to be bulk data loading.  It's not clear at all
 that pipelining the PQexec code path is the way to better performance
 for that --- why not use COPY, instead?

 You're forgetting ORM workloads.

 ORMs can usually plan the inserts to be in a sequence that both requires
 no feedback (except the knowledge that the inserts were successful) and
 does not violate constraints.
 
 Flushing a whole object hierarchy, for instance, can be done without
 feedback. Not even serial columns need feedback, since many ORMs
 (SQLAlchemy, Hibernate) support allocation of ID sequences in batches
 (by issuing a proper SELECT nextval).

 I agree that, with the proposed API, it's too error-prone to be
 useful. But I also think that, if the API is simple and fool-proof
 enough, it could be "build it and they will come". I know I'll be happy
 to implement support in SQLAlchemy (since it will benefit me), if the
 API resembles the proposition below (at least in simplicity).

 Per-query expectations could be such a thing. And it can even work with 
 PQexec:

 PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
 --read--
 PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
 --read--
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 ... 9 times...
 PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
 PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
 do {
// do something useful
 } while (PQflush());

 Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
 and buffer what cannot be sent. PQASYNC_CORK would only buffer (only
 send if the buffer is full). After any ASYNC call, PQflush would be
 necessary (to flush the send queue and to consume the expected
 responses), but I can imagine any synchronous call (PQexec,
 PQsendQuery or whatever) could detect a non-empty buffer and just
 do a blocking flush right there.

 This can benefit many useful patterns. ORM flush is one, if there can
 be preallocation of IDs (which I know at least SQLAlchemy and
 Hibernate both support).

 Execute-many of prepared statements is another one, quite common.

 I'm not sure what would happen if one of the queries returned an
 error. If in a transaction, all the following queries would error out,
 I'd imagine. If not, they would simply be executed blindly... am I
 correct?

Long term, I'd rather see an optimized 'ORM flush' assemble the data
into a structured data set (perhaps a JSON document) and pass it to
some receiving routine that decomposed it into records.  This is a
better way to do things on so many levels.  Maybe I'm an old cranky
guy yelling at pigeons, but I don't think the current approach that
many ORMs take is going to withstand the test of time.

merlin




Re: [HACKERS] RFC: Async query processing

2014-01-02 Thread Claudio Freire
On Wed, Dec 18, 2013 at 1:50 PM, Florian Weimer fwei...@redhat.com wrote:
 On 11/04/2013 02:51 AM, Claudio Freire wrote:

 On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer fwei...@redhat.com wrote:

 I would like to add truly asynchronous query processing to libpq,
 enabling command pipelining.  The idea is to allow applications to
 auto-tune to the bandwidth-delay product and reduce the number of
 context switches when running against a local server.

 ...

 If the application is not interested in intermediate query results, it
 would use something like this:

 ...

 If there is no need to exit from the loop early (say, because errors are
 expected to be extremely rare), the PQgetResultNoWait call can be left
 out.


 It doesn't seem wise to me to make such a distinction. It sounds like
 you're oversimplifying, and that's why you need modes: to overcome
 the evidently restrictive limits of the simplified interface. It would
 only be a matter of (a short) time before some other limitation
 requires some other mode.


 I need modes because I want to avoid unbounded buffering, which means that
 result data has to be consumed in the order queries are issued.
...
 In any case, I don't want to change the wire protocol, I just want to enable
 libpq clients to use more of its capabilities.

I believe you will at least need to use TCP_CORK or some advanced
socket options if you intend to decrease the number of packets without
changing the protocol.

Due to the interactive and synchronized nature of the protocol, TCP
will immediately send the first query in a packet, since it is ready
to do so. Buffering will only happen from the second query
onwards, and this won't benefit a two-query loop like the one in your
sample.
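
For illustration, corking on Linux looks like this (a sketch; TCP_CORK
is Linux-specific, conn is an open PGconn *, and error handling is
omitted):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int fd = PQsocket(conn);
int on = 1, off = 0;
setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
/* ... send the queries that should share a packet ... */
setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));  /* uncork: flush */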

As for expectations, they can be part of the connection object and not
the wire protocol, if you wish. The point I was making is that the
expectation should be part of the query call, since that's less
error-prone than setting a "discard results" mode. Think of it as
PQsendQueryParams with an extra async argument that defaults to
PQASYNC_NOT (i.e., sync). There you can tell libpq to expect either no
results, expect and discard them, or whatever. The benefit here is
simplified usage: your example code will be part of libpq, and thus all
this complexity will be hidden from users. Furthermore, libpq will do
the small sanity check of actually verifying that the server returns no
results when none are expected.

PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
bool more_data;
do {
    more_data = ...;
    if (more_data) {
        int ret = PQsendQueryParams(conn,
                                    "INSERT ... RETURNING ...", ...);
        if (ret == 0) {
            // handle low-level error
        }
    }
    // Consume all pending results.
    while (1) {
        PGresult *res;
        if (more_data) {
            res = PQgetResultNoWait(conn);
        } else {
            res = PQgetResult(conn);
        }


 Somehow, that code looks backwards. I mean, really backwards. Wouldn't
 that be !more_data?

 No, if more data is available to transfer to the server, the no-wait variant
 has to be used to avoid a needless synchronization with the server.

Ok, yeah. Now I get it. It's client-side more_data.

 In any case, pipelining like that, without a clear distinction in the
 wire protocol of which results pertain to which query, could be a
 recipe for trouble when subtle bugs, either in lib usage or
 implementation, mistakenly treat one query's result as another's.


 We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and
 pqParseInput3), the server is supposed to support it, and there is a lack of
 a clear tit-for-tat response mechanism anyway because of NOTIFY/LISTEN and
 the way certain errors are reported.

pqFlush doesn't seem overly related, since the API specifically states
that you cannot queue multiple PQsendQuery calls. It looks more like
low-level buffering, i.e. for when the command itself is larger than the
OS buffer and nonblocking operation requires multiple send() calls for
one PQsendQuery. Am I wrong?

 Instead of buffering the results, we could buffer the encoded command
 messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
 not block when it cannot send the (complete) command message, but store
 it in the connection object so that the subsequent PQgetResultNoWait and
 PQgetResult would send it.  This might work better with single-tuple
 result mode.  We cannot avoid buffering either multiple queries or
 multiple responses if we want to utilize the link bandwidth, or we'd
 risk deadlocks.


 This is a non-solution. Such an implementation, at least as described,
 would remove neither network latency nor context switches; it
 would be a purely API change with no externally visible behavior
 change.


 Ugh, why?

Oh, sorry. I had this elaborate answer prepared, but I just noticed
it's wrong: you do say "if it cannot send it right away".

Re: [HACKERS] RFC: Async query processing

2013-12-18 Thread Florian Weimer

On 11/04/2013 02:51 AM, Claudio Freire wrote:

On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer fwei...@redhat.com wrote:

I would like to add truly asynchronous query processing to libpq, enabling
command pipelining.  The idea is to allow applications to auto-tune to
the bandwidth-delay product and reduce the number of context switches when
running against a local server.

...

If the application is not interested in intermediate query results, it would
use something like this:

...

If there is no need to exit from the loop early (say, because errors are
expected to be extremely rare), the PQgetResultNoWait call can be left out.


It doesn't seem wise to me to make such a distinction. It sounds like
you're oversimplifying, and that's why you need modes: to overcome
the evidently restrictive limits of the simplified interface. It would
only be a matter of (a short) time before some other limitation
requires some other mode.


I need modes because I want to avoid unbounded buffering, which means that 
result data has to be consumed in the order queries are issued.



   PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
   bool more_data;
   do {
      more_data = ...;
      if (more_data) {
        int ret = PQsendQueryParams(conn,
                                    "INSERT ... RETURNING ...", ...);
        if (ret == 0) {
          // handle low-level error
        }
      }
      // Consume all pending results.
      while (1) {
        PGresult *res;
        if (more_data) {
          res = PQgetResultNoWait(conn);
        } else {
          res = PQgetResult(conn);
        }


Somehow, that code looks backwards. I mean, really backwards. Wouldn't
that be !more_data?


No, if more data is available to transfer to the server, the no-wait 
variant has to be used to avoid a needless synchronization with the server.



In any case, pipelining like that, without a clear distinction in the
wire protocol of which results pertain to which query, could be a
recipe for trouble when subtle bugs, either in lib usage or
implementation, mistakenly treat one query's result as another's.


We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and 
pqParseInput3), the server is supposed to support it, and there is a 
lack of a clear tit-for-tat response mechanism anyway because of 
NOTIFY/LISTEN and the way certain errors are reported.



Instead of buffering the results, we could buffer the encoded command
messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
not block when it cannot send the (complete) command message, but store it
in the connection object so that the subsequent PQgetResultNoWait and
PQgetResult would send it.  This might work better with single-tuple result
mode.  We cannot avoid buffering either multiple queries or multiple
responses if we want to utilize the link bandwidth, or we'd risk deadlocks.


This is a non-solution. Such an implementation, at least as described,
would remove neither network latency nor context switches; it
would be a purely API change with no externally visible behavior
change.


Ugh, why?


An effective solution must include multi-command packets. Without
knowing the wire protocol in detail, something like:

PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: SELECT  blah
BIND: args
EXECUTE with FETCH ALL

All in one packet, would be efficient and error-free (IMO).


No, because this doesn't scale automatically with the bandwidth-delay 
product.  It also requires that the client buffers queries and their 
parameters even though the network has to do that anyway.


In any case, I don't want to change the wire protocol, I just want to 
enable libpq clients to use more of its capabilities.


--
Florian Weimer / Red Hat Product Security Team




[HACKERS] RFC: Async query processing

2013-11-03 Thread Florian Weimer
I would like to add truly asynchronous query processing to libpq, 
enabling command pipelining.  The idea is to allow applications to 
auto-tune to the bandwidth-delay product and reduce the number of 
context switches when running against a local server.


Here's a sketch of what the interface would look like.  I believe it can 
be implemented in principle within the existing libpq structure, 
although the interaction with single-row mode looks complicated.


If the application is not interested in intermediate query results, it 
would use something like this:


  PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_DISCARD);
  PGresult *res = NULL;
  while (more_data()) {
    int ret = PQsendQueryParams(conn, "INSERT ...", ...);
    if (ret == 0) {
      // handle low-level error
    }
    // We can issue a second command.  There is no need to fetch
    // the result immediately.
    ret = PQsendQueryParams(conn, "INSERT ...", ...);
    if (ret == 0) {
      // handle low-level error
    }
    res = PQgetResultNoWait(conn);
    if (res != NULL) {
      // Only returns non-NULL in case of error.  Does not wait for
      // data to arrive from the server.  If there is an error,
      // drains all pending errors and reports only the first error.
      break;
    }
  }
  if (res == NULL) {
    // NULL means that all commands completed without error.
    // Again, drains all pending errors.
    // (Necessarily syncs with the server.)
    res = PQgetResult(conn);
  }
  PQsetsendAsyncMode(conn, oldMode);
  if (res != NULL) {
    // Handle error result.
    ...
    PQclear(res);
    throw_error();
  }

If there is no need to exit from the loop early (say, because errors are 
expected to be extremely rare), the PQgetResultNoWait call can be left out.


Internally, libpq will check in PQsendQueryParams (and similar 
functions) whether data is available for reading.  If yes, it is 
consumed, and discarded unless it is an error, in which case it is 
stored for later retrieval with PQgetResultNoWait or PQgetResult.  
(After the first error, all responses are discarded; this is necessary 
to avoid deadlocks.)  PQgetResultNoWait returns immediately if there is 
no stored error object.  If there is one, it syncs with the server 
(draining all pending results).  PQgetResult just drains all pending 
results, reporting the first error.
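
The non-blocking check itself can already be approximated with existing
libpq calls; a sketch of that part only (it does not implement the
error-draining semantics described above):

static PGresult *get_result_nowait(PGconn *conn)
{
  if (!PQconsumeInput(conn))   /* read whatever has arrived so far */
    return NULL;               /* (real code would report this error) */
  if (PQisBusy(conn))
    return NULL;               /* a complete result is not ready yet */
  return PQgetResult(conn);    /* guaranteed not to block here */
}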


The general expectation here is that commands are sent within a 
transaction block, so only the first error is interesting.  (It makes 
sense for fatal errors to overtake regular errors.)


If the programmer needs results, not just the errors, a different mode 
is needed:


  PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
  bool more_data;
  do {
    more_data = ...;
    if (more_data) {
      int ret = PQsendQueryParams(conn,
                                  "INSERT ... RETURNING ...", ...);
      if (ret == 0) {
        // handle low-level error
      }
    }
    // Consume all pending results.
    while (1) {
      PGresult *res;
      if (more_data) {
        res = PQgetResultNoWait(conn);
      } else {
        res = PQgetResult(conn);
      }
      if (res == NULL) {
        // No result so far.
        break;
      }

      // Result data is available.  Check if it is an error.
      switch (PQresultStatus(res)) {
      case PGRES_TUPLES_OK:
        // Process asynchronous result data.
        ...
        break;

      // Error handling.
      case PGRES_BAD_RESPONSE:
      case PGRES_NONFATAL_ERROR:
      case PGRES_FATAL_ERROR:
        // Store error somewhere.
        ...
        // Keep draining results, eventually
        // synchronizing with the server.
        more_data = false;
        break;
      }
      PQclear(res);
    }
  } while (more_data);
  PQsetsendAsyncMode(conn, oldMode);

(This suggests that PQgetResultNoWait should have a different name and a 
flag argument that tells it whether it is blocking or not.)
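
Something like this hypothetical prototype (names invented here, not part
of any proposal yet):

#define PQRESULT_WAIT    0
#define PQRESULT_NOWAIT  1
PGresult *PQgetResultFlags(PGconn *conn, int flags);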


In PQASYNC_RESULT mode, libpq has to build up a list of pending results 
each time it notices available data during a PQsendQueryParams 
operation.  The number of responses that need to be stored this way is 
not totally unbounded; it is related to what fits into the 
bandwidth-delay product of the link.  This, and the fairly elaborate 
processing logic, is the reason why I think it makes sense to have 
PQASYNC_DISCARD as well.


Instead of buffering the results, we could buffer the encoded command 
messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams 
would not block when it cannot send the (complete) command message, but 
store it in the connection object so that the subsequent 
PQgetResultNoWait and PQgetResult would send it.  This might work better 
with single-tuple result mode.  We cannot avoid buffering either 
multiple queries or multiple responses if we want to utilize the link 
bandwidth, or we'd risk deadlocks.


There'd also be the default PQASYNC_DEFAULT mode, where it is an error 
to call 

Re: [HACKERS] RFC: Async query processing

2013-11-03 Thread Claudio Freire
On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer fwei...@redhat.com wrote:
 I would like to add truly asynchronous query processing to libpq, enabling
 command pipelining.  The idea is to allow applications to auto-tune to
 the bandwidth-delay product and reduce the number of context switches when
 running against a local server.
...
 If the application is not interested in intermediate query results, it would
 use something like this:
...
 If there is no need to exit from the loop early (say, because errors are
 expected to be extremely rare), the PQgetResultNoWait call can be left out.

It doesn't seem wise to me to make such a distinction. It sounds like
you're oversimplifying, and that's why you need modes: to overcome
the evidently restrictive limits of the simplified interface. It would
only be a matter of (a short) time before some other limitation
requires some other mode.


   PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
   bool more_data;
   do {
      more_data = ...;
      if (more_data) {
        int ret = PQsendQueryParams(conn,
                                    "INSERT ... RETURNING ...", ...);
        if (ret == 0) {
          // handle low-level error
        }
      }
      // Consume all pending results.
      while (1) {
        PGresult *res;
        if (more_data) {
          res = PQgetResultNoWait(conn);
        } else {
          res = PQgetResult(conn);
        }

Somehow, that code looks backwards. I mean, really backwards. Wouldn't
that be !more_data?

In any case, pipelining like that, without a clear distinction in the
wire protocol of which results pertain to which query, could be a
recipe for trouble when subtle bugs, either in lib usage or
implementation, mistakenly treat one query's result as another's.

Notice that it's not an uncommon mistake, and this is much more likely
with such an unclear interface.


 Instead of buffering the results, we could buffer the encoded command
 messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
 not block when it cannot send the (complete) command message, but store it
 in the connection object so that the subsequent PQgetResultNoWait and
 PQgetResult would send it.  This might work better with single-tuple result
 mode.  We cannot avoid buffering either multiple queries or multiple
 responses if we want to utilize the link bandwidth, or we'd risk deadlocks.

This is a non-solution. Such an implementation, at least as described,
would remove neither network latency nor context switches; it
would be a purely API change with no externally visible behavior
change.

An effective solution must include multi-command packets. Without
knowing the wire protocol in detail, something like:

PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: SELECT  blah
BIND: args
EXECUTE with FETCH ALL

All in one packet would be efficient and error-free (IMO). This
precludes multiple result-containing commands, but it could hold
multiple result-less ones.

This could be better specified with expectations:

PARSE: INSERT blah
EXPECT: ParseComplete
BIND: args
EXECUTE
EXPECT: CommandComplete
PARSE: INSERT blah
EXPECT: ParseComplete
BIND: args
EXECUTE
EXPECT: CommandComplete
PARSE: SELECT blah
EXPECT: ParseComplete
BIND: args
EXECUTE

A failed expectation would discard the command buffer and send an
ExpectationFailed message.

API ways of performing that could vary. Perhaps a
PQqueueQueryExpectNoResults could build up a message buffer, flushed
with either an explicit call to a flush command or a PQsendQuery /
PQgetResult. Perhaps all of libpq could work with a message queue and
only flush it at specific points (PQgetResult, for instance; multiple
PQsendQuery calls could be turned into "expect CommandComplete").
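
A hypothetical usage sketch of that queued interface (the queue-related
names are invented for illustration; PQsendQuery and PQgetResult are the
real libpq calls):

PQqueueQueryExpectNoResults(conn, "INSERT INTO a VALUES (1)");
PQqueueQueryExpectNoResults(conn, "INSERT INTO a VALUES (2)");
PQsendQuery(conn, "SELECT count(*) FROM a");   /* implicit flush point */
PGresult *res = PQgetResult(conn);   /* only the SELECT yields a result */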

