Re: [HACKERS] Pipelining executions to postgresql server

Craig Ringer Mon, 03 Nov 2014 06:29:19 -0800

On 11/02/2014 09:27 PM, Mikko Tiihonen wrote:
> Is the following summary correct:
> - the network protocol supports pipelining


Yes.

All you have to do is *not* send a Sync message after each query. Be
aware that once a query fails, the server will discard all input until
the next Sync, so pipelining + autocommit doesn't make a ton of sense
for error handling reasons.
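
To make the error-handling point concrete, here is a toy model in Java of
how the server treats a pipelined message stream. It is only a sketch, not
real protocol code: the class name, the string labels, and the rule that a
label starting with "bad" fails are all assumptions made for illustration.
Only the message names ErrorResponse, CommandComplete, ReadyForQuery and
Sync come from the protocol itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not real protocol code) of a backend reading a pipelined stream. */
final class PipelineErrorModel {

    /**
     * Each input is either the literal "Sync" or a hypothetical query label.
     * Labels starting with "bad" stand in for statements that fail.
     */
    static List<String> run(List<String> messages) {
        List<String> replies = new ArrayList<>();
        boolean skipping = false; // true after an error, until the next Sync
        for (String msg : messages) {
            if (msg.equals("Sync")) {
                skipping = false; // Sync ends error recovery
                replies.add("ReadyForQuery");
            } else if (skipping) {
                // Discarded: the client gets no reply for this message.
            } else if (msg.startsWith("bad")) {
                skipping = true;
                replies.add("ErrorResponse(" + msg + ")");
            } else {
                replies.add("CommandComplete(" + msg + ")");
            }
        }
        return replies;
    }

    public static void main(String[] args) {
        // q3 is skipped because bad2 failed earlier in the same Sync group.
        System.out.println(run(Arrays.asList("q1", "bad2", "q3", "Sync", "q4", "Sync")));
    }
}
```

In this model q3 never produces a reply, so a client that pipelines has to
match replies to queries per Sync group and treat everything after the
failed query in that group as not executed.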

> - the server handles operations in order, starting the processing of next 
> operation only after fully processing the previous one - thus pipelining is 
> invisible to the server

As far as I know, yes. The server just doesn't care.

> - libpq driver does not support pipelining, but that is due to internal 
> limitations

Yep.

> - if proper error handling is done by the client then there is no reason why 
> pipelining could not be supported by any pg client

Indeed, and most should support it. Sending batches of related queries
would make things a LOT faster.

PgJDBC's batch support is currently write-oriented. There is no
fundamental reason it can't be expanded for reads. I've already written
a patch to do just that for the case of returning generated keys.

https://github.com/ringerc/pgjdbc/tree/batch-returning-support

and just need to rebase it so I can send a pull request for upstream
PgJDBC. It's already linked in the issues documenting the limitations in
batch support.


If you want to have more general support for batches that return rowsets
there's no fundamental technical reason why it can't be added. It just
requires some tedious refactoring of the driver to either:

- Sync and wait before it fills its *send* buffer, rather than trying
  to manage its receive buffer (the server send buffer), so it can
  reliably avoid deadlocks; or

- Do async I/O in a select()-like loop over a protocol state machine,
  so it can simultaneously read and write on the wire.
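
The first option can be sketched roughly as follows. This is a hypothetical
illustration in Java, not PgJDBC code: the class, its methods, the byte
budget, and the list standing in for the socket are all assumptions. The
idea is to count the bytes sent since the last Sync and force a Sync plus a
full read of the replies before that count would exceed a bound assumed to
be safe.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical send-side budgeting sketch; names and threshold are illustrative. */
final class SendBudgetBatcher {
    private final int maxUnsyncedBytes; // assumed safe bound on bytes sent since last Sync
    private int unsyncedBytes = 0;
    private final List<String> wire = new ArrayList<>(); // stand-in for the socket

    SendBudgetBatcher(int maxUnsyncedBytes) {
        this.maxUnsyncedBytes = maxUnsyncedBytes;
    }

    /** Queue one query, syncing first if it would push us over the budget. */
    void send(String query) {
        int size = query.getBytes(StandardCharsets.UTF_8).length;
        if (unsyncedBytes > 0 && unsyncedBytes + size > maxUnsyncedBytes) {
            syncAndDrain();
        }
        wire.add("QUERY " + query);
        unsyncedBytes += size;
    }

    /** In a real driver: send Sync, then block reading replies up to ReadyForQuery. */
    void syncAndDrain() {
        wire.add("SYNC+READ");
        unsyncedBytes = 0;
    }

    /** Flush whatever is still outstanding at the end of the batch. */
    void finish() {
        if (unsyncedBytes > 0) {
            syncAndDrain();
        }
    }

    List<String> wire() {
        return wire;
    }

    public static void main(String[] args) {
        SendBudgetBatcher b = new SendBudgetBatcher(10);
        b.send("aaaa");
        b.send("bbbb");
        b.send("cccc"); // would exceed 10 unsynced bytes, so a sync is forced first
        b.finish();
        System.out.println(b.wire());
    }
}
```

Because the client never has more than the budgeted amount of unread work
outstanding, it cannot end up blocked on a write while the server is blocked
sending replies nobody is reading. The cost is an extra round trip every
time the budget fills.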

I might need to do some of that myself soon, but it's a big (and
therefore error-prone) job I've so far avoided by making smaller, more
targeted changes.

Doing async I/O using Java nio channels is by far the better approach,
but also the more invasive one. The driver currently sends data on the
wire where it generates it and blocks to receive expected data.
Switching to send-side buffer management doesn't have the full
performance gains that doing bidirectional I/O via channels does,
though, and may be a significant performance _loss_ if you're sending
big queries but getting small replies.

For JDBC, the batch interface is the right place to do this, and IMO you
should not attempt to add pipelining outside that interface.
(Multiple open resultsets from portals, yes, but not pipelining of queries).


-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
