Re: Extensibility of the PostgreSQL wire protocol
On 3/4/21 7:38 PM, Hannu Krosing wrote: On Thu, Mar 4, 2021 at 9:55 PM Jan Wieck wrote: but the whole thing was developed that way from the beginning and it is working. I don't have a definitive date when that code will be presented. Kuntal or Prateek may be able to fill in more details. Are you really fully replacing the main loop, or are you running a second main loop in parallel in the same database server instance, perhaps as a separate TDS_postmaster backend ? Will the data still also be accessible "as postgres" via port 5432 when TDS/SQLServer support is active ? The individual backend (session) is running a different main loop. A libpq based client will still get the regular libpq and the original PostgresMain() behavior on port 5432. The default port for TDS is 1433 and with everything in place I can connect to the same database on that port with Microsoft's SQLCMD. The whole point of all this is to allow the postmaster to listen to more than just 5432 and have different communication protocols on those *additional* ports. Nothing is really *replaced*. The parts of the backend, that do actual socket communication, are just routed through function pointers so that an extension can change their behavior. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Mar 4, 2021 at 9:55 PM Jan Wieck wrote: > > Another possibility, and this is what is being used by the AWS team > implementing the TDS protocol for Babelfish, is to completely replace > the entire TCOP mainloop function PostgresMain(). I suspect this is the only reasonable way to do it for protocols which are not very close to libpq. > That is of course a > rather drastic move and requires a lot more coding on the extension > side, Not necessarily - if the new protocol is close to existing one, then it is copy/paste + some changes. If it is radically different, then trying to fit it into the current mainloop will be even harder than writing from scratch. And will very likely fail in the end anyway :) > but the whole thing was developed that way from the beginning and > it is working. I don't have a definitive date when that code will be > presented. Kuntal or Prateek may be able to fill in more details. Are you really fully replacing the main loop, or are you running a second main loop in parallel in the same database server instance, perhaps as a separate TDS_postmaster backend ? Will the data still also be accessible "as postgres" via port 5432 when TDS/SQLServer support is active ?
Re: Extensibility of the PostgreSQL wire protocol
On 3/3/21 2:43 PM, Peter Eisentraut wrote: I think, the way the abstractions are chosen in this patch, it is still very much tied to how the libpq protocol works. For example, there is a cancel key and a ready-for-query message. Some of the details of the simple and the extended query are exposed. So you could create a protocol that has a different way of encoding the payloads, as your telnet example does, but I don't believe that you could implement a competitor's protocol through this. Unless you have that done and want to show it? Correct, a lot of what this patch does is to allow a developer of such protocol extension to just "extend" what the server side libpq does, by building a wrapper around the function they are interested in. That doesn't change the protocol, but rather allows additional functionality like the telemetry data gathering, Fabrizio was talking about. The telnet_srv tutorial extension (which needs more documentation) is an example of how far one can go by replacing those funcitons, in that it actually implements a very different wire protocol. This one still fits into the regular libpq message flow. Another possibility, and this is what is being used by the AWS team implementing the TDS protocol for Babelfish, is to completely replace the entire TCOP mainloop function PostgresMain(). That is of course a rather drastic move and requires a lot more coding on the extension side, but the whole thing was developed that way from the beginning and it is working. I don't have a definitive date when that code will be presented. Kuntal or Prateek may be able to fill in more details. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
I have not looked at the actual patch, but does it allow you to set up its own channels to listen to ? For example if I'd want to set up a server to listen to incoming connections over QUIC [1] - a protocol which create a connection over UDP and allows clients to move to new IP addresses (among other things) then would the current extensibility proposal cover this ? Maybe a correct approach would be to just start up a separate "postmaster" to listen to a different protocol ? [1] https://en.wikipedia.org/wiki/QUIC Cheers Hannu
Re: Extensibility of the PostgreSQL wire protocol
I think, the way the abstractions are chosen in this patch, it is still very much tied to how the libpq protocol works. For example, there is a cancel key and a ready-for-query message. Some of the details of the simple and the extended query are exposed. So you could create a protocol that has a different way of encoding the payloads, as your telnet example does, but I don't believe that you could implement a competitor's protocol through this. Unless you have that done and want to show it?
Re: Extensibility of the PostgreSQL wire protocol
On 2/19/21 10:13 AM, Jan Wieck wrote: Give the function, that postmaster is calling to accept a connection when a server_fd is ready, a return code that it can use to tell postmaster "forget about it, don't fork or do anything else with it". This function is normally calling StreamConnection() before the postmaster then forks the backend. But it could instead hand over the socket to the pool background worker (I presume Jonah is transferring them from process to process via UDP packet). The pool worker is then launching the actual backends which receive a requesting client via the same socket transfer to perform one or more transactions, then hand the socket back to the pool worker. The function in question, which is StreamConnection() and with this patch can be replaced with an extension funtion via the fn_accept pointer, already has that capability. If StreamConnection() or its replacement returns a NULL pointer, the postmaster just skips calling BackendStartup(). So everything is already in place for the above to work. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On 2/22/21 1:01 PM, Tom Lane wrote: Yeah, and as I pointed out somewhere upthread, trying to replace the whole parser will just end in a maintenance nightmare. The constructs that the parser has to emit are complex, Postgres-specific, and constantly evolving. We are NOT going to promise any sort of cross version compatibility for parse trees. Absolutely agreed. We cannot promise that the parsetree generated in one version will work with the planner, optimizer and executor of the next. These types of projects will need to pay close attention and more importantly, develop their own regression test suites that detect when something has changed in core. That said, discussion about the parser hook should happen in the other thread. I don't even expect that we can guarantee that the functions I am trying to allow to be redirected for the wire protocol will be stable forever. libpq V4 may need to change some of the call signatures, which has happened before. For example, the function to send the command completion message to the frontend (tcop/dest.c EndCommand()) changed from 12 to 13. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On Mon, Feb 22, 2021 at 1:01 PM Tom Lane wrote: > Jan Wieck writes: > > As Jim Mlodgenski just posted in [0], having the ability to also extend > > and/or replace the parser will give them the ability to do just that. > > Yeah, and as I pointed out somewhere upthread, trying to replace the > whole parser will just end in a maintenance nightmare. The constructs > that the parser has to emit are complex, Postgres-specific, and > constantly evolving. We are NOT going to promise any sort of cross > version compatibility for parse trees. > Wholeheartedly agreed. Core should only ever maintain the hooks, never their usage. It's the responsibility of the extension author to maintain their code just as it is to manage their use of all other hook usages. Yes, it's sometimes a maintenance nightmare - but with great power comes great responsibility... as is anything loaded directly into the process. -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
Jan Wieck writes: > On 2/10/21 1:10 PM, Tom Lane wrote: >> What I'm actually more concerned about, in this whole line of development, >> is the follow-on requests that will surely occur to kluge up Postgres >> to make its behavior more like $whatever. As in "well, now that we >> can serve MySQL clients protocol-wise, can't we pretty please have a >> mode that makes the parser act more like MySQL". > Those requests will naturally follow. But I don't see it as the main > project's responsibility to satisfy them. It would be rather natural to > develop the two things together. The same developer or group of > developers, who are trying to connect a certain client, will want to > have other compatibility features. > As Jim Mlodgenski just posted in [0], having the ability to also extend > and/or replace the parser will give them the ability to do just that. Yeah, and as I pointed out somewhere upthread, trying to replace the whole parser will just end in a maintenance nightmare. The constructs that the parser has to emit are complex, Postgres-specific, and constantly evolving. We are NOT going to promise any sort of cross version compatibility for parse trees. regards, tom lane
Re: Extensibility of the PostgreSQL wire protocol
On 2/10/21 1:10 PM, Tom Lane wrote: What I'm actually more concerned about, in this whole line of development, is the follow-on requests that will surely occur to kluge up Postgres to make its behavior more like $whatever. As in "well, now that we can serve MySQL clients protocol-wise, can't we pretty please have a mode that makes the parser act more like MySQL". Those requests will naturally follow. But I don't see it as the main project's responsibility to satisfy them. It would be rather natural to develop the two things together. The same developer or group of developers, who are trying to connect a certain client, will want to have other compatibility features. As Jim Mlodgenski just posted in [0], having the ability to also extend and/or replace the parser will give them the ability to do just that. Regards, Jan [0] https://www.postgresql.org/message-id/CAB_5SReoPJAPO26Z8+WN6ugfBb2UDc3c21rRz9=bzibmcap...@mail.gmail.com -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On Mon, Feb 22, 2021 at 07:34:51AM -0500, Dave Cramer wrote: > On Fri, 19 Feb 2021 at 15:39, Álvaro Hernández wrote: > > > On 19/2/21 19:30, Jan Wieck wrote: > > > [...] > > > > > > I also am not sure if building a connection pool into a > > > background worker or postmaster is a good idea to begin with. > > > One of the important features of a pool is to be able to suspend > > > traffic and make the server completely idle to for example be > > > able to restart the postmaster without forcibly disconnecting > > > all clients. A pool built into a background worker cannot do > > > that. > > Yes, when did it become a good idea to put a connection pooler in > the backend??? It became a great idea when we noticed just how large and resource-intensive backends were, especially in light of applications' broad tendency to assume that they're free. While I agree that that's not a good assumption, it's one that's so common everywhere in computing that we really need to face up to the fact that it's not going away any time soon. Decoupling the parts that serve requests from the parts that execute queries also goes a long way toward things we've wanted for quite awhile like admission control systems and/or seamless zero-downtime upgrades. Separately, as the folks at AWS and elsewhere have mentioned, being able to pretend at some level to be a different RDBMS can only happen if we respond to its wire protocol. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: Extensibility of the PostgreSQL wire protocol
On 2/19/21 4:36 AM, Kuntal Ghosh wrote: On Thu, Feb 18, 2021 at 9:32 PM Jan Wieck wrote: Few comments in the extension code (although experimental): 1. In telnet_srv.c, + static intpe_port; .. + DefineCustomIntVariable("telnet_srv.port", + "Telnet server port.", + NULL, + &pe_port, + pe_port, + 1024, + 65536, + PGC_POSTMASTER, + 0, + NULL, + NULL, + NULL); The variable pe_port should be initialized to a value which is > 1024 and < 65536. Otherwise, the following assert will fail, TRAP: FailedAssertion("newval >= conf->min", File: "guc.c", Line: 5541, PID: 12100) 2. The function pq_putbytes shouldn't be used by anyone other than old-style COPY out. + pq_putbytes(msg, strlen(msg)); Otherwise, the following assert will fail in the same function: /* Should only be called by old-style COPY OUT */ Assert(DoingCopyOut); Attached are an updated patch and telnet_srv addressing the above problems. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services telnet_srv_2021-02-22_1.tgz Description: application/compressed-tar wire_proto_2021-02-22_1.diff.gz Description: application/gzip
Re: Extensibility of the PostgreSQL wire protocol
On 2/22/21 7:34 AM, Dave Cramer wrote: Yes, when did it become a good idea to put a connection pooler in the backend??? Dave Cramer www.postgres.rocks As Alvaro said, different pool purposes lead to different pool architecture and placement. However, the changes proposed here, aiming at the ability to load modified or entirely different wire protocol handlers, do not limit such connection pooling. To the contrary. Any connection pool, that wants to maintain more client connections than actual database backends, must know when it is appropriate to do so. Usually the right moment to break the current client-backend association is when the backend is outside a transaction block and waiting for the next client request. To do so pools cannot blindly shovel data back and forth. They need to scan one way or another for the backend's 'Z' message, sent in tcop/dest.c ReadyForQuery(), where the backend also reports the current transaction state. IOW the pool must follow the flow of libpq messages on all connections, message by message, row by row, just for the purpose of seeing that one, single bit. It is possible to transmit that information to the pool on a separate channel. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On Fri, 19 Feb 2021 at 15:39, Álvaro Hernández wrote: > > > On 19/2/21 19:30, Jan Wieck wrote: > > [...] > > > > I also am not sure if building a connection pool into a background > > worker or postmaster is a good idea to begin with. One of the > > important features of a pool is to be able to suspend traffic and make > > the server completely idle to for example be able to restart the > > postmaster without forcibly disconnecting all clients. A pool built > > into a background worker cannot do that. > > > > > Yes, when did it become a good idea to put a connection pooler in the backend??? Dave Cramer www.postgres.rocks
Re: Extensibility of the PostgreSQL wire protocol
On 19/2/21 19:30, Jan Wieck wrote: > [...] > > I also am not sure if building a connection pool into a background > worker or postmaster is a good idea to begin with. One of the > important features of a pool is to be able to suspend traffic and make > the server completely idle to for example be able to restart the > postmaster without forcibly disconnecting all clients. A pool built > into a background worker cannot do that. > > In my opinion, there are different reasons to use a connection pool, that lead to different placements of that connection pool on the architecture of the system. The ability of a pool to suspend (pause) traffic and apply live re-configurations is a very important one to implement high availability practices, transparent scaling, and others. But these poolers belong to middleware layers (as in different processes in different servers), where these pausing operations make complete sense. Connection poolers fronting the database have other specific missions, namely to control the fan-in of connections to the database. These connection poolers make sense being as close to the database as possible (ideally: embedded) but don't need to perform pause operations here. Álvaro -- Alvaro Hernandez --- OnGres
Re: Extensibility of the PostgreSQL wire protocol
On 19/2/21 14:48, Heikki Linnakangas wrote: > [...] > > I can see value in supporting different protocols. I don't like the > approach discussed in this thread, however. > > For example, there has been discussion elsewhere about integrating > connection pooling into the server itself. For that, you want to have > a custom process that listens for incoming connections, and launches > backends independently of the incoming connections. These hooks would > not help with that. > > Similarly, if you want to integrate a web server into the database > server, you probably also want some kind of connection pooling. A > one-to-one relationship between HTTP connections and backend processes > doesn't seem nice. While I'm far from an HTTP/2 expert and I may be wrong, from all I know HTTP/2 allows to create full-duplex, multiplexed, asynchronous channels. So multiple connections can be funneled through a single connection. This decreases the need and aim for connection pooling. It doesn't eliminate it completely, as you may have the channel busy if a single tenant is sending a lot of data; and you could not have more than one concurrent action from a single tenant. OTOH, given these HTTP/2 properties, a non-pooled HTTP/2 endpoint may provide already significant benefits due to the multiplexing capabilities. So definitely we don't need to consider an HTTP endpoint would entail a 1:1 mapping between connections and backend processes. Álvaro -- Alvaro Hernandez --- OnGres
Re: Extensibility of the PostgreSQL wire protocol
> On 19 Feb 2021, at 19:30, Jan Wieck wrote: > > An "extended" libpq protocol could allow the pool to give clients a unique > ID. The protocol handler would then maintain maps with the SQL of prepared > statements and what the client thinks their prepared statement name is. Or, the connection pooler could support a different wire protocol that has some form of client cookies and could let the client hold on to an opaque token to present back with every query and use that to route to the right backend with a prepared statement for that client (or match the appropriate cached p statement from the cache), even across client disconnections. > Most of that would of course be possible on the pool side itself. But the > internal structure of pgbouncer isn't suitable for that. It is very > lightweight and for long SQL queries may never have the complete 'P' message > in memory. It would also not have direct access to security related > information like the search path, which would require extra round trips > between the pool and the backend to retrieve it. > > So while not suitable to create a built in pool by itself, loadable wire > protocols can definitely help with connection pooling. I think loadable wire protocols will have a positive effect on developing more sophisticated connection poolers. > I also am not sure if building a connection pool into a background worker or > postmaster is a good idea to begin with. One of the important features of a > pool is to be able to suspend traffic and make the server completely idle to > for example be able to restart the postmaster without forcibly disconnecting > all clients. Agreed. Going even further, a connection pooler supporting a protocol like quic (where the notion of connection is decoupled from the actual socket connection) could help a lot with balancing load between servers and data centers, which also would not be convenient for the actual Postgres to do with present architecture. (And here, too, a pluggable wire protocol would help with keeping tabs on individual backends). -- Damir
Re: Extensibility of the PostgreSQL wire protocol
On 2/19/21 12:18 PM, Damir Simunic wrote: On 19 Feb 2021, at 14:48, Heikki Linnakangas wrote: For example, there has been discussion elsewhere about integrating connection pooling into the server itself. For that, you want to have a custom process that listens for incoming connections, and launches backends independently of the incoming connections. These hooks would not help with that. Not clear how the connection polling in the core is linked to discussing pluggable wire protocols. It isn't per se. But there are things pluggable wire protocols can help with in regards to connection pooling. For example a connection pool like pgbouncer can be configured to switch client-backend association on a transaction level. It therefore scans the traffic for the in transaction state. This however only works if an application uses identical session states across all connections in a pool. The JDBC driver for example only really prepares PreparedStatements after a number of executions and then assigns a name based on a counter to them. So it is neither guaranteed that a certain backend has the same statements prepared, nor that they are named the same. Therefore JDBC based applications cannot use PreparedStatements through pgbouncer in transaction mode. An "extended" libpq protocol could allow the pool to give clients a unique ID. The protocol handler would then maintain maps with the SQL of prepared statements and what the client thinks their prepared statement name is. So when a client sends a P packet, the protocol handler would lookup the mapping and see if it already has that statement prepared. Just add the mapping info or actually create a new statement entry in the maps. These maps are of course shared across backends. So if then another client sends bind+execute and the backend doesn't have a plan for that query, it would internally create one. There are security implications here, so things like the search path might have to be part of the maps, but those are implementation details. At the end this would allow a project like pgbouncer to create an extended version of libpq protocol that caters to the very special needs of that pool. Most of that would of course be possible on the pool side itself. But the internal structure of pgbouncer isn't suitable for that. It is very lightweight and for long SQL queries may never have the complete 'P' message in memory. It would also not have direct access to security related information like the search path, which would require extra round trips between the pool and the backend to retrieve it. So while not suitable to create a built in pool by itself, loadable wire protocols can definitely help with connection pooling. I also am not sure if building a connection pool into a background worker or postmaster is a good idea to begin with. One of the important features of a pool is to be able to suspend traffic and make the server completely idle to for example be able to restart the postmaster without forcibly disconnecting all clients. A pool built into a background worker cannot do that. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
> On 19 Feb 2021, at 14:48, Heikki Linnakangas wrote: > > For example, there has been discussion elsewhere about integrating connection > pooling into the server itself. For that, you want to have a custom process > that listens for incoming connections, and launches backends independently of > the incoming connections. These hooks would not help with that. > Not clear how the connection polling in the core is linked to discussing pluggable wire protocols. > Similarly, if you want to integrate a web server into the database server, > you probably also want some kind of connection pooling. A one-to-one > relationship between HTTP connections and backend processes doesn't seem nice. > HTTP/2 is just a protocol, not unlike fe/be that has a one-to-one relationship to backend processes as it stands. It shuttles data back and forth in query/response exchanges, and happens to be used by web servers and web browsers, among other things. My mentioning of it was simply an example I can speak of from experience, as opposed to speculating. Could have brought up any other wire protocol if I had experience with it, say MQTT. To make it clear, “a pluggable wire protocol” as discussed here is a set of rules that defines how data is transmitted: what the requests and responses are, and how is the data laid out on the wire, what to do in case of error, etc. Nothing to do with a web server; why would one want to integrate it in the database, anyway? The intended contribution to the discussion of big picture of pluggable wire protocols is that there are significant use cases where the protocol choice is restricted on the client side, and allowing a pluggable wire protocol on the server side brings tangible benefits in performance and architectural simplification. That’s all. The rest were supporting facts that hopefully can also serve as a counterpoint to “pluggable wire protocol is primarily useful to make Postgres pretend to be Mysql." Protocol conversion HTTP/2<—>FE/BE on the connection pooler already brings a lot of the mentioned benefits, and I’m satisfied with it. Beyond that I’m simply supporting the idea of pluggable protocols as experience so far allows me to see advantages that might sound theoretical to someone who never tried this scenario in production. Glad to offer a couple of examples where I see potential for performance gains for having such a wire protocol pluggable in the core. Let me know if you want me to elaborate. > Querying multiple databases over a single connection is not possible with the > approach taken here. Indeed, querying multiple databases over a single connection is something you need a proxy for and a different client protocol from fe/be. No need to mix that with the talk about pluggable wire protocol. My mentioning of it was in the sense “a lot of LoB backend code is nothing more than a bloated protocol converter that happens to also allow connecting to multiple databases from a single client connection => letting the client speak to the database [trough a proxy in this case] removed the bloated source of latency but kept the advantages.” -- Damir
Re: Extensibility of the PostgreSQL wire protocol
On 2/19/21 8:48 AM, Heikki Linnakangas wrote: I can see value in supporting different protocols. I don't like the approach discussed in this thread, however. For example, there has been discussion elsewhere about integrating connection pooling into the server itself. For that, you want to have a custom process that listens for incoming connections, and launches backends independently of the incoming connections. These hooks would not help with that. The two are not mutually exclusive. You are right that the current proposal would not help with that type of built in connection pool, but it may be extended to that. Give the function, that postmaster is calling to accept a connection when a server_fd is ready, a return code that it can use to tell postmaster "forget about it, don't fork or do anything else with it". This function is normally calling StreamConnection() before the postmaster then forks the backend. But it could instead hand over the socket to the pool background worker (I presume Jonah is transferring them from process to process via UDP packet). The pool worker is then launching the actual backends which receive a requesting client via the same socket transfer to perform one or more transactions, then hand the socket back to the pool worker. All of that would still require a protocol extension that has special messages for "here is a client socket for you" and "you can have that back". I would recommend this approach: write a separate program that sits between the client and PostgreSQL, speaking custom protocol to the client, and libpq to the backend. And then move that program into a background worker process. That is a classic protocol converting proxy. It has been done in the past with not really good results, both performance wise as with respect to protocol completeness. Regards, Jan -- Jan Wieck Principle Database Engineer Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
On Fri, Feb 19, 2021 at 8:48 AM Heikki Linnakangas wrote: > With the hooks that exist today, would it possible to write a background > worker that listens on a port, instead of postmaster? Can you launch > backends from a background worker? And communicate the backend processes > using a shared memory message queue (see pqmq.c). > Yes. That's similar to how mine work: A background worker that acts as a listener for the new protocol which then sets up a new dynamic background worker on accept(), waits for its creation, passes the fd to the new background worker, and sits in a while (!got_sigterm) loop checking the socket for activity and running the protocol similar to postmaster. I haven't looked at the latest connection pooling patches but, in general, connection pooling is an abstract issue and should be usable for any type of connection as, realistically, it's just an event loop and state problem - it shouldn't be protocol specific. I would recommend this approach: write a separate program that sits > between the client and PostgreSQL, speaking custom protocol to the > client, and libpq to the backend. And then move that program into a > background worker process. > Doing protocol conversion between libpq and a different protocol works, but is slow. My implementations were originally all proxies that worked outside the database, then I moved them inside, then I replaced all the libpq code with SPI-related calls. > > In a recent case, letting the browser talk directly to the database > > allowed me to get rid of a ~100k-sloc .net backend and all the > > complexity and infrastructure that goes with > > coding/testing/deploying/maintaining it, while keeping all the > > positives: per-query compression/data conversion, querying multiple > > databases over a single connection, session cookies, etc. Deployment > > is trivial compared to what was before. Latency is down 2x-4x across > > the board. > > Querying multiple databases over a single connection is not possible > with the approach taken here. Not sure about the others things you listed. > Accessing multiple databases from the same backend is problematic overall - I didn't solve that in my implementations either. IIRC, once a bgworker is attached to a specific database, it's basically stuck with that database. -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
On 19/02/2021 14:29, Damir Simunic wrote: On 11 Feb 2021, at 16:06, Tom Lane wrote: Maybe there is some useful thing that can be accomplished here, but we need to consider the bigger picture rather than believing (without proof) that a few hook variables will be enough to do anything. Pluggable wire protocol is a game-changer on its own. The bigger picture is that a right protocol choice enables large-scale architectural simplifications for whole classes of production applications. For browser-based applications (lob, saas, e-commerce), having the database server speak the browser protocol enables architectures without backend application code. This in turn leads to significant reductions of latency, complexity, and application development time. And it’s not just lack of backend code: one also profits from all the existing infrastructure like per-query compression/format choice, browser connection management, sse, multiple streams, prioritization, caching/cdns, etc. Don’t know if you’d consider it as a proof, yet I am seeing 2x to 4x latency reduction in production applications from protocol conversion to http/2. My present solution is a simple connection pooler I built on top of Nginx transforming the tcp stream as it passes through. I can see value in supporting different protocols. I don't like the approach discussed in this thread, however. For example, there has been discussion elsewhere about integrating connection pooling into the server itself. For that, you want to have a custom process that listens for incoming connections, and launches backends independently of the incoming connections. These hooks would not help with that. Similarly, if you want to integrate a web server into the database server, you probably also want some kind of connection pooling. A one-to-one relationship between HTTP connections and backend processes doesn't seem nice. With the hooks that exist today, would it possible to write a background worker that listens on a port, instead of postmaster? Can you launch backends from a background worker? And communicate the backend processes using a shared memory message queue (see pqmq.c). I would recommend this approach: write a separate program that sits between the client and PostgreSQL, speaking custom protocol to the client, and libpq to the backend. And then move that program into a background worker process. In a recent case, letting the browser talk directly to the database allowed me to get rid of a ~100k-sloc .net backend and all the complexity and infrastructure that goes with coding/testing/deploying/maintaining it, while keeping all the positives: per-query compression/data conversion, querying multiple databases over a single connection, session cookies, etc. Deployment is trivial compared to what was before. Latency is down 2x-4x across the board. Querying multiple databases over a single connection is not possible with the approach taken here. Not sure about the others things you listed. - Heikki
Re: Extensibility of the PostgreSQL wire protocol
Thank you Kuntal, On Fri, Feb 19, 2021 at 4:36 AM Kuntal Ghosh wrote: > On Thu, Feb 18, 2021 at 9:32 PM Jan Wieck wrote: > > > Few comments in the extension code (although experimental): > > 1. In telnet_srv.c, > + static intpe_port; > .. > + DefineCustomIntVariable("telnet_srv.port", > + "Telnet server > port.", > + NULL, > + &pe_port, > + pe_port, > + 1024, > + 65536, > + PGC_POSTMASTER, > + 0, > + NULL, > + NULL, > + NULL); > > The variable pe_port should be initialized to a value which is > 1024 > and < 65536. Otherwise, the following assert will fail, > TRAP: FailedAssertion("newval >= conf->min", File: "guc.c", Line: > 5541, PID: 12100) > > Right, forgot to turn on Asserts. > 2. The function pq_putbytes shouldn't be used by anyone other than > old-style COPY out. > + pq_putbytes(msg, strlen(msg)); > Otherwise, the following assert will fail in the same function: > /* Should only be called by old-style COPY OUT */ > Assert(DoingCopyOut); > I would argue that the Assert needs to be changed. It is obvious that the Assert in place is meant to guard against direct usage of pg_putbytes() in an attempt to force all code to use pq_putmessage() instead. This is good when speaking libpq wire protocol since all messages there are prefixed with a one byte message type. It does not apply to other protocols. I propose to create another global boolean IsNonLibpqFrontend which the protocol extension will set to true when accepting the connection and the above then will change to Assert(DoingCopyOut || IsNonLibpqFrontend); Regards, Jan > > -- > Thanks & Regards, > Kuntal Ghosh > Amazon Web Services > -- Jan Wieck
Re: Extensibility of the PostgreSQL wire protocol
> On 11 Feb 2021, at 16:06, Tom Lane wrote: > > Maybe there is some useful thing that can be accomplished here, but we > need to consider the bigger picture rather than believing (without proof) > that a few hook variables will be enough to do anything. > > regards, tom lane > Pluggable wire protocol is a game-changer on its own. The bigger picture is that a right protocol choice enables large-scale architectural simplifications for whole classes of production applications. For browser-based applications (lob, saas, e-commerce), having the database server speak the browser protocol enables architectures without backend application code. This in turn leads to significant reductions of latency, complexity, and application development time. And it’s not just lack of backend code: one also profits from all the existing infrastructure like per-query compression/format choice, browser connection management, sse, multiple streams, prioritization, caching/cdns, etc. Don’t know if you’d consider it as a proof, yet I am seeing 2x to 4x latency reduction in production applications from protocol conversion to http/2. My present solution is a simple connection pooler I built on top of Nginx transforming the tcp stream as it passes through. In a recent case, letting the browser talk directly to the database allowed me to get rid of a ~100k-sloc .net backend and all the complexity and infrastructure that goes with coding/testing/deploying/maintaining it, while keeping all the positives: per-query compression/data conversion, querying multiple databases over a single connection, session cookies, etc. Deployment is trivial compared to what was before. Latency is down 2x-4x across the board. Having some production experience with this approach, I can see how http/2-speaking Postgres would further reduce latency, processing cost, and time-to-interaction for applications. A similar case can be made for IoT where one would want to plug an iot-optimized protocol. Again, most of the benefit is possible with a protocol-converting proxy, but there are additional non-trivial performance gains to be had if the database server speaks the right protocol. While not the only use cases, I’d venture a guess these represent a sizable chunk of what Postgres is used for today, and will be used even more for, so the positive impact of a pluggable protocol would be significant. -- Damir
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Feb 18, 2021 at 9:32 PM Jan Wieck wrote: > And, here is how it looks with the following configuration: telnet_srv.port = 1433 telnet_srv.listen_addresses = '*' telnet localhost 1433 master Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. PostgreSQL Telnet Interface database name: postgres username: kuntal password: changeme > select 1; ?column? 1 SELECT 1 > select 1/0; Message: ERROR - division by zero Few comments in the extension code (although experimental): 1. In telnet_srv.c, + static intpe_port; .. + DefineCustomIntVariable("telnet_srv.port", + "Telnet server port.", + NULL, + &pe_port, + pe_port, + 1024, + 65536, + PGC_POSTMASTER, + 0, + NULL, + NULL, + NULL); The variable pe_port should be initialized to a value which is > 1024 and < 65536. Otherwise, the following assert will fail, TRAP: FailedAssertion("newval >= conf->min", File: "guc.c", Line: 5541, PID: 12100) 2. The function pq_putbytes shouldn't be used by anyone other than old-style COPY out. + pq_putbytes(msg, strlen(msg)); Otherwise, the following assert will fail in the same function: /* Should only be called by old-style COPY OUT */ Assert(DoingCopyOut); -- Thanks & Regards, Kuntal Ghosh Amazon Web Services
Re: Extensibility of the PostgreSQL wire protocol
Attached are a first patch and a functioning extension that implements a telnet protocol server. The extension needs to be loaded via shared_preload_libraries and configured for a port number and listen_addresses as follows: shared_preload_libraries = 'telnet_srv' telnet_srv.listen_addresses = '*' telnet_srv.port = 54323 It is incomplete in that it doesn't address things like the COPY protocol. But it is enough to give a more detailed idea of what this interface will look like and what someone would do to implement their own protocol or extend an existing one. The overall idea here is to route all functions, that communicate with the frontend, through function pointers that hang off of MyProcPort. Since we are performing socket communication in them I believe one extra function pointer indirection is unlikely to have significant performance impact. Best Regards, Jan On behalf of Amazon Web Services On Sun, Feb 14, 2021 at 12:36 PM Dave Cramer wrote: > > > On Thu, 11 Feb 2021 at 09:28, Robert Haas wrote: > >> On Wed, Feb 10, 2021 at 2:04 PM Tom Lane wrote: >> > That is a spot-on definition of where I do NOT want to end up. Hooks >> > everywhere and enormous extensions that break anytime we change anything >> > in the core. It's not really clear that anybody is going to find that >> > more maintainable than a straight fork, except to the extent that it >> > enables the erstwhile forkers to shove some of their work onto the PG >> > community. >> >> +1. >> >> Making the lexer and parser extensible seems desirable to me. It would >> be beneficial not only for companies like EDB and Amazon that might >> want to extend the grammar in various ways, but also for extension >> authors. However, it's vastly harder than Jan's proposal to make the >> wire protocol pluggable. The wire protocol is pretty well-isolated >> from the rest of the system. As long as you can get queries out of the >> packets the client sends and package up the results to send back, it's >> all good. > > > I would have to disagree that the wire protocol is well-isolated. Sending > and receiving are not in a single file > The codes are not even named constants so trying to find a specific one is > difficult. > > Anything that would clean this up would be a benefit > > > That being said, I'm not in favor of transferring maintenance work to >> the community for this set of hooks any more than I am for something >> on the parsing side. In general, I'm in favor of as much extensibility >> as we can reasonably create, but with a complicated proposal like this >> one, the community should expect to be able to get something out of >> it. And so far what I hear Jan saying is that these hooks could in >> theory be used for things other than Amazon's proprietary efforts and >> those things could in theory bring benefits to the community, but >> there are no actual plans to do anything with this that would benefit >> anyone other than Amazon. Which seems to bring us right back to >> expecting the community to maintain things for the benefit of >> third-party forks. >> > > if this proposal brought us the ability stream results that would be a > huge plus! > > Dave Cramer > www.postgres.rocks > >> >> -- Jan Wieck wire_proto_2021-02-18_1.diff.gz Description: application/gzip telnet_srv_2021-02-18_1.tgz Description: application/compressed-tar
Re: Extensibility of the PostgreSQL wire protocol
On Thu, 11 Feb 2021 at 09:28, Robert Haas wrote: > On Wed, Feb 10, 2021 at 2:04 PM Tom Lane wrote: > > That is a spot-on definition of where I do NOT want to end up. Hooks > > everywhere and enormous extensions that break anytime we change anything > > in the core. It's not really clear that anybody is going to find that > > more maintainable than a straight fork, except to the extent that it > > enables the erstwhile forkers to shove some of their work onto the PG > > community. > > +1. > > Making the lexer and parser extensible seems desirable to me. It would > be beneficial not only for companies like EDB and Amazon that might > want to extend the grammar in various ways, but also for extension > authors. However, it's vastly harder than Jan's proposal to make the > wire protocol pluggable. The wire protocol is pretty well-isolated > from the rest of the system. As long as you can get queries out of the > packets the client sends and package up the results to send back, it's > all good. I would have to disagree that the wire protocol is well-isolated. Sending and receiving are not in a single file The codes are not even named constants so trying to find a specific one is difficult. Anything that would clean this up would be a benefit That being said, I'm not in favor of transferring maintenance work to > the community for this set of hooks any more than I am for something > on the parsing side. In general, I'm in favor of as much extensibility > as we can reasonably create, but with a complicated proposal like this > one, the community should expect to be able to get something out of > it. And so far what I hear Jan saying is that these hooks could in > theory be used for things other than Amazon's proprietary efforts and > those things could in theory bring benefits to the community, but > there are no actual plans to do anything with this that would benefit > anyone other than Amazon. Which seems to bring us right back to > expecting the community to maintain things for the benefit of > third-party forks. > if this proposal brought us the ability stream results that would be a huge plus! Dave Cramer www.postgres.rocks > >
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Feb 11, 2021 at 12:07 PM Tom Lane wrote: > > Robert Haas writes: > > On Thu, Feb 11, 2021 at 9:42 AM Jonah H. Harris wrote: > >> As Jan said in his last email, they're not proposing all the different > >> aspects needed. In fact, nothing has actually been proposed yet. This > >> is an entirely philosophical debate. I don't even know what's being > >> proposed at this point - I just know it *could* be useful. Let's just > >> wait and see what is actually proposed before shooting it down, yes? > > > I don't think I'm trying to shoot anything down, because as I said, I > > like extensibility and am generally in favor of it. Rather, I'm > > expressing a concern which seems to me to be justified, based on what > > was posted. I'm sorry that my tone seems to have aggravated you, but > > it wasn't intended to do so. > > Likewise, the point I was trying to make is that a "pluggable wire > protocol" is only a tiny part of what would be needed to have a credible > MySQL, Oracle, or whatever clone. There are large semantic differences > from those products; there are maintenance issues arising from the fact > that we whack structures like parse trees around all the time; and so on. > Maybe there is some useful thing that can be accomplished here, but we > need to consider the bigger picture rather than believing (without proof) > that a few hook variables will be enough to do anything. > Just to don't miss the point, creating a compat protocol to mimic others (TDS, MySQL, etc) is just one use case. There are other use cases to make wire protocol extensible, for example for telemetry I can use some hooks to propagate context [1] and get more detailed tracing information about the negotiation between frontend and backend and being able to implement a truly query tracing tool, for example. Another use case is extending the current protocol to, for example, send more information about query execution on CommandComplete command instead of just the number of affected rows. About the HTTP protocol I think PG should have it, maybe pure HTTP (no REST, just HTTP) because it's the most interoperable. Performance can still be very good with HTTP2, and you have a huge ecosystem of tools and proxies (like Envoy) that would do wonders with this. You could safely query a db from a web page (passing through proxies that would do auth, TLS, etc). Or maybe a higher performing gRPC version (which is also HTTP2 and is amazing), but this makes it a bit more difficult to query from a web page. In either case, context propagation is already built-in, and in a standard way. Regards, [1] https://www.w3.org/TR/trace-context/ -- Fabrízio de Royes Mello PostgreSQL Developer at OnGres Inc. - https://ongres.com
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 11:04 AM Tom Lane wrote: > "Jonah H. Harris" writes: > > On Wed, Feb 10, 2021 at 1:10 PM Tom Lane wrote: > >> ... If we start having > >> modes for MySQL identifier quoting, Oracle outer join syntax, yadda > >> yadda, it's going to be way more of a maintenance nightmare than some > >> hook functions. So if we accept any patch along this line, I want to > >> drive a hard stake in the ground that the answer to that sort of thing > >> will be NO. > > > Actually, a substantial amount can be done with hooks. For Oracle, which > is > > substantially harder than MySQL, I have a completely separate parser that > > generates a PG-compatible parse tree packaged up as an extension. To > handle > > autonomous transactions, database links, hierarchical query conversion, > > hints, and some execution-related items requires core changes. > > That is a spot-on definition of where I do NOT want to end up. Hooks > everywhere and enormous extensions that break anytime we change anything > in the core. It's not really clear that anybody is going to find that > more maintainable than a straight fork, except to the extent that it > enables the erstwhile forkers to shove some of their work onto the PG > community. > > My feeling about this is if you want to use Oracle, go use Oracle. > Don't ask PG to take on a ton of maintenance issues so you can have > a frankenOracle. > PostgreSQL over the last decade spent a considerable amount of time allowing it to become extensible outside of core. We are now useful in workloads nobody would have considered in 2004 or 2008. The more extensibility we add, the LESS we maintain. It is a lot easier to maintain an API than it is an entire kernel. When I look at all the interesting features coming from the ecosystem, they are all built on the hooks that this community worked so hard to create. This idea is an extension of that and a result of the community's success. The more extensible we make PostgreSQL, the more the hacker community can innovate without damaging the PostgreSQL reputation as a rock solid database system. Features like these only enable the entire community to innovate. Is the real issue that the more extensible PostgreSQL is, the more boring it will become? JD > > regards, tom lane > > >
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Feb 11, 2021 at 10:29 AM Andrew Dunstan wrote: > > On 2/11/21 10:06 AM, Tom Lane wrote: > > Robert Haas writes: > >> On Thu, Feb 11, 2021 at 9:42 AM Jonah H. Harris > wrote: > >>> As Jan said in his last email, they're not proposing all the different > >>> aspects needed. In fact, nothing has actually been proposed yet. This > >>> is an entirely philosophical debate. I don't even know what's being > >>> proposed at this point - I just know it *could* be useful. Let's just > >>> wait and see what is actually proposed before shooting it down, yes? > >> I don't think I'm trying to shoot anything down, because as I said, I > >> like extensibility and am generally in favor of it. Rather, I'm > >> expressing a concern which seems to me to be justified, based on what > >> was posted. I'm sorry that my tone seems to have aggravated you, but > >> it wasn't intended to do so. > > Likewise, the point I was trying to make is that a "pluggable wire > > protocol" is only a tiny part of what would be needed to have a credible > > MySQL, Oracle, or whatever clone. There are large semantic differences > > from those products; there are maintenance issues arising from the fact > > that we whack structures like parse trees around all the time; and so on. > > Maybe there is some useful thing that can be accomplished here, but we > > need to consider the bigger picture rather than believing (without proof) > > that a few hook variables will be enough to do anything. > > > > Yeah. I think we'd need a fairly fully worked implementation to see > where it goes. Is Amazon going to release (under TPL) its TDS > implementation of this? That might go a long way to convincing me this > is worth considering. > > Everything is planned to be released under the Apache 2.0 license so people are free to do with it as they choose.
Re: Extensibility of the PostgreSQL wire protocol
On 2/11/21 10:06 AM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Feb 11, 2021 at 9:42 AM Jonah H. Harris >> wrote: >>> As Jan said in his last email, they're not proposing all the different >>> aspects needed. In fact, nothing has actually been proposed yet. This >>> is an entirely philosophical debate. I don't even know what's being >>> proposed at this point - I just know it *could* be useful. Let's just >>> wait and see what is actually proposed before shooting it down, yes? >> I don't think I'm trying to shoot anything down, because as I said, I >> like extensibility and am generally in favor of it. Rather, I'm >> expressing a concern which seems to me to be justified, based on what >> was posted. I'm sorry that my tone seems to have aggravated you, but >> it wasn't intended to do so. > Likewise, the point I was trying to make is that a "pluggable wire > protocol" is only a tiny part of what would be needed to have a credible > MySQL, Oracle, or whatever clone. There are large semantic differences > from those products; there are maintenance issues arising from the fact > that we whack structures like parse trees around all the time; and so on. > Maybe there is some useful thing that can be accomplished here, but we > need to consider the bigger picture rather than believing (without proof) > that a few hook variables will be enough to do anything. Yeah. I think we'd need a fairly fully worked implementation to see where it goes. Is Amazon going to release (under TPL) its TDS implementation of this? That might go a long way to convincing me this is worth considering. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Extensibility of the PostgreSQL wire protocol
Robert Haas writes: > On Thu, Feb 11, 2021 at 9:42 AM Jonah H. Harris > wrote: >> As Jan said in his last email, they're not proposing all the different >> aspects needed. In fact, nothing has actually been proposed yet. This >> is an entirely philosophical debate. I don't even know what's being >> proposed at this point - I just know it *could* be useful. Let's just >> wait and see what is actually proposed before shooting it down, yes? > I don't think I'm trying to shoot anything down, because as I said, I > like extensibility and am generally in favor of it. Rather, I'm > expressing a concern which seems to me to be justified, based on what > was posted. I'm sorry that my tone seems to have aggravated you, but > it wasn't intended to do so. Likewise, the point I was trying to make is that a "pluggable wire protocol" is only a tiny part of what would be needed to have a credible MySQL, Oracle, or whatever clone. There are large semantic differences from those products; there are maintenance issues arising from the fact that we whack structures like parse trees around all the time; and so on. Maybe there is some useful thing that can be accomplished here, but we need to consider the bigger picture rather than believing (without proof) that a few hook variables will be enough to do anything. regards, tom lane
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Feb 11, 2021 at 9:42 AM Jonah H. Harris wrote: > I'm quite sure I said I'd open source my MySQL implementation, which allows > Postgres to appear to MySQL clients as a MySQL/MariaDB server. This is > neither proprietary nor Amazon-related and makes Postgres substantially more > useful for a large number of applications. OK. There's stuff to think about there, too: do we want that in contrib? Is it in good enough shape to be in contrib even if we did? If it's not in contrib, how do we incorporate it into, say, the buildfarm, so that we know if we break something? Is it actively maintained and stable, so that if it needs adjustment for upstream changes we can count on that getting addressed in a timely fashion? I don't know the answers to these questions and am not trying to prejudge, but I think they are important and relevant questions. > As Jan said in his last email, they're not proposing all the different > aspects needed. In fact, nothing has actually been proposed yet. This is an > entirely philosophical debate. I don't even know what's being proposed at > this point - I just know it *could* be useful. Let's just wait and see what > is actually proposed before shooting it down, yes? I don't think I'm trying to shoot anything down, because as I said, I like extensibility and am generally in favor of it. Rather, I'm expressing a concern which seems to me to be justified, based on what was posted. I'm sorry that my tone seems to have aggravated you, but it wasn't intended to do so. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Extensibility of the PostgreSQL wire protocol
On Thu, Feb 11, 2021 at 9:28 AM Robert Haas wrote: > That being said, I'm not in favor of transferring maintenance work to > the community for this set of hooks any more than I am for something > on the parsing side. In general, I'm in favor of as much extensibility > as we can reasonably create, but with a complicated proposal like this > one, the community should expect to be able to get something out of > it. And so far what I hear Jan saying is that these hooks could in > theory be used for things other than Amazon's proprietary efforts and > those things could in theory bring benefits to the community, but > there are no actual plans to do anything with this that would benefit > anyone other than Amazon. Which seems to bring us right back to > expecting the community to maintain things for the benefit of > third-party forks. > I'm quite sure I said I'd open source my MySQL implementation, which allows Postgres to appear to MySQL clients as a MySQL/MariaDB server. This is neither proprietary nor Amazon-related and makes Postgres substantially more useful for a large number of applications. As Jan said in his last email, they're not proposing all the different aspects needed. In fact, nothing has actually been proposed yet. This is an entirely philosophical debate. I don't even know what's being proposed at this point - I just know it *could* be useful. Let's just wait and see what is actually proposed before shooting it down, yes? -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 2:04 PM Tom Lane wrote: > That is a spot-on definition of where I do NOT want to end up. Hooks > everywhere and enormous extensions that break anytime we change anything > in the core. It's not really clear that anybody is going to find that > more maintainable than a straight fork, except to the extent that it > enables the erstwhile forkers to shove some of their work onto the PG > community. +1. Making the lexer and parser extensible seems desirable to me. It would be beneficial not only for companies like EDB and Amazon that might want to extend the grammar in various ways, but also for extension authors. However, it's vastly harder than Jan's proposal to make the wire protocol pluggable. The wire protocol is pretty well-isolated from the rest of the system. As long as you can get queries out of the packets the client sends and package up the results to send back, it's all good. The parser, on the other hand, is not at all well-isolated from the rest of the system. There's a LOT of code that knows a whole lot of stuff about the structure of parse trees, so your variant parser can't produce parse trees for new kinds of DDL, or for new query constructs. And if it parsed some completely different syntax where, say, joins were not explicit, it would still have to figure out how to represent them in a way that looked just like it came out of the regular parser -- otherwise, parse analysis and query planning and so forth are not going to work, unless you go and change a lot of other code too, and I don't really have any idea how we could solve that, even in theory. But that kind of thing just isn't a problem for the proposal on this thread. That being said, I'm not in favor of transferring maintenance work to the community for this set of hooks any more than I am for something on the parsing side. In general, I'm in favor of as much extensibility as we can reasonably create, but with a complicated proposal like this one, the community should expect to be able to get something out of it. And so far what I hear Jan saying is that these hooks could in theory be used for things other than Amazon's proprietary efforts and those things could in theory bring benefits to the community, but there are no actual plans to do anything with this that would benefit anyone other than Amazon. Which seems to bring us right back to expecting the community to maintain things for the benefit of third-party forks. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 11:43 AM Robert Haas wrote: > On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > > Our current plan is to create a new set of API calls and hooks that > allow to register additional wire protocols. The existing backend libpq > implementation will be modified to register itself using the new API. This > will serve as a proof of concept as well as ensure that the API definition > is not slanted towards a specific protocol. It is also similar to the way > table access methods and compression methods are added. > > If we're going to end up with an open source implementation of > something useful in contrib or whatever, then I think this is fine. > But, if not, then we're just making it easier for Amazon to do > proprietary stuff without getting any benefit for the open-source > project. In fact, in that case PostgreSQL would ensure have to somehow > ensure that the hooks don't get broken without having any code that > actually uses them, so not only would the project get no benefit, but > it would actually incur a small tax. I wouldn't say that's an > absolutely show-stopper, but it definitely isn't my first choice. > At this very moment there are several parts to this. One is the hooks to make wire protocols into loadable modules, which is what this effort is about. Another is the TDS protocol as it is being implemented for Babelfish and third is the Babelfish extension itself. Both will require additional hooks and APIs I am not going to address here. I consider them not material to my effort. As for making the wire protocol itself expandable I really see a lot of potential outside of what Amazon wants here. And I would not be advertising it if it would be for Babelfish alone. As I laid out, just the ability for a third party to add additional messages for special connection pool support would be enough to make it useful. There also have been discussions in the JDBC subproject to combine certain messages into one single message. Why not allow the JDBC project to develop their own, JDBC-optimized backend side? Last but not least, what would be wrong with listening for MariaDB clients? I am planning on a follow up project to this, demoting libpq itself to just another loadable protocol. Just the way procedural languages are all on the same level because that is how I developed the loadable, procedural language handler all those years ago. Considering how spread out and quite frankly unorganized our wire protocol handling is, this is not a small order. Regards, Jan > > -- > Robert Haas > EDB: http://www.enterprisedb.com > -- Jan Wieck
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 2:04 PM Tom Lane wrote: > That is a spot-on definition of where I do NOT want to end up. Hooks > everywhere and enormous extensions that break anytime we change anything > in the core. It's not really clear that anybody is going to find that > more maintainable than a straight fork, except to the extent that it > enables the erstwhile forkers to shove some of their work onto the PG > community. > Given the work over the last few major releases to make several other aspects of Postgres pluggable, how is implementing a pluggable protocol API any different? To me, this sounds more like a philosophical disagreement with how people could potentially use Postgres than a technical one. My point is only that, using current PG functionality, I could equally write a pluggable storage interface for my Oracle and InnoDB data file readers/writers, which would similarly allow for the creation of a Postgres franken-Oracle by extension only. I don't think anyone is asking for hooks for all the things I mentioned - a pluggable transaction manager, for example, doesn't make much sense. But, when it comes to having actually done this vs. posited about its usefulness, I'd say it has some merit and doesn't really introduce that much complexity or maintenance overhead to core - whether the extensions still work properly is up to the extension authors... isn't that the whole point of extensions? -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
"Jonah H. Harris" writes: > On Wed, Feb 10, 2021 at 1:10 PM Tom Lane wrote: >> ... If we start having >> modes for MySQL identifier quoting, Oracle outer join syntax, yadda >> yadda, it's going to be way more of a maintenance nightmare than some >> hook functions. So if we accept any patch along this line, I want to >> drive a hard stake in the ground that the answer to that sort of thing >> will be NO. > Actually, a substantial amount can be done with hooks. For Oracle, which is > substantially harder than MySQL, I have a completely separate parser that > generates a PG-compatible parse tree packaged up as an extension. To handle > autonomous transactions, database links, hierarchical query conversion, > hints, and some execution-related items requires core changes. That is a spot-on definition of where I do NOT want to end up. Hooks everywhere and enormous extensions that break anytime we change anything in the core. It's not really clear that anybody is going to find that more maintainable than a straight fork, except to the extent that it enables the erstwhile forkers to shove some of their work onto the PG community. My feeling about this is if you want to use Oracle, go use Oracle. Don't ask PG to take on a ton of maintenance issues so you can have a frankenOracle. regards, tom lane
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 1:10 PM Tom Lane wrote: > What I'm actually more concerned about, in this whole line of development, > is the follow-on requests that will surely occur to kluge up Postgres > to make its behavior more like $whatever. As in "well, now that we > can serve MySQL clients protocol-wise, can't we pretty please have a > mode that makes the parser act more like MySQL". If we start having > modes for MySQL identifier quoting, Oracle outer join syntax, yadda > yadda, it's going to be way more of a maintenance nightmare than some > hook functions. So if we accept any patch along this line, I want to > drive a hard stake in the ground that the answer to that sort of thing > will be NO. > Actually, a substantial amount can be done with hooks. For Oracle, which is substantially harder than MySQL, I have a completely separate parser that generates a PG-compatible parse tree packaged up as an extension. To handle autonomous transactions, database links, hierarchical query conversion, hints, and some execution-related items requires core changes. But, the protocol and parsing can definitely be done with hooks. And, as was mentioned previously, this isn't tied directly to emulating another database - it would enable us to support an HTTP-ish interface directly in the server as an extension as well. A lot of this can be done with background worker extensions now, which is how my stuff was primarily architected, but it's hacky when it comes to areas where the items Jan discussed could clean things up and make them more pluggable. Assuming we're going to keep to that, though, it seems like people > doing this sort of thing will inevitably end up with a fork anyway. > So maybe we should just not bother with the first step either. > Perhaps I'm misunderstanding you, but I wouldn't throw this entire idea out (which enables a substantial addition of extensible functionality with a limited set of touchpoints) on the premise of future objections. -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
Robert Haas writes: > If we're going to end up with an open source implementation of > something useful in contrib or whatever, then I think this is fine. > But, if not, then we're just making it easier for Amazon to do > proprietary stuff without getting any benefit for the open-source > project. In fact, in that case PostgreSQL would ensure have to somehow > ensure that the hooks don't get broken without having any code that > actually uses them, so not only would the project get no benefit, but > it would actually incur a small tax. I wouldn't say that's an > absolutely show-stopper, but it definitely isn't my first choice. As others noted, a test module could be built to add some coverage here. What I'm actually more concerned about, in this whole line of development, is the follow-on requests that will surely occur to kluge up Postgres to make its behavior more like $whatever. As in "well, now that we can serve MySQL clients protocol-wise, can't we pretty please have a mode that makes the parser act more like MySQL". If we start having modes for MySQL identifier quoting, Oracle outer join syntax, yadda yadda, it's going to be way more of a maintenance nightmare than some hook functions. So if we accept any patch along this line, I want to drive a hard stake in the ground that the answer to that sort of thing will be NO. Assuming we're going to keep to that, though, it seems like people doing this sort of thing will inevitably end up with a fork anyway. So maybe we should just not bother with the first step either. regards, tom lane
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 11:43 AM Robert Haas wrote: > On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > > Our current plan is to create a new set of API calls and hooks that > allow to register additional wire protocols. The existing backend libpq > implementation will be modified to register itself using the new API. This > will serve as a proof of concept as well as ensure that the API definition > is not slanted towards a specific protocol. It is also similar to the way > table access methods and compression methods are added. > > If we're going to end up with an open source implementation of > something useful in contrib or whatever, then I think this is fine. > But, if not, then we're just making it easier for Amazon to do > proprietary stuff without getting any benefit for the open-source > project. In fact, in that case PostgreSQL would ensure have to somehow > ensure that the hooks don't get broken without having any code that > actually uses them, so not only would the project get no benefit, but > it would actually incur a small tax. I wouldn't say that's an > absolutely show-stopper, but it definitely isn't my first choice. > Agreed on adding substantial hooks if they're not likely to be used. While I haven't yet seen AWS' implementation or concrete proposal, given the people involved, I assume it's fairly similar to how I implemented it. Assuming that's correct and it doesn't require substantial redevelopment, I'd certainly open-source my MySQL-compatible protocol and parser implementation. From my perspective, it would be awesome if these could be done as extensions. While I'm not planning to open source it as of yet, for my Oracle-compatible stuff, I don't think I'd be able to do anything other than the protocol as an extension given the core-related changes similar to what EDB has to do. I don't think there's any easy way to get around that. But, for the protocol and any type of simple translation to Postgres' dialect, I think that could easily be hook-based. -- Jonah H. Harris
Re: Extensibility of the PostgreSQL wire protocol
On Wed, Feb 10, 2021 at 1:43 PM Robert Haas wrote: > > On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > > Our current plan is to create a new set of API calls and hooks that allow to register additional wire protocols. The existing backend libpq implementation will be modified to register itself using the new API. This will serve as a proof of concept as well as ensure that the API definition is not slanted towards a specific protocol. It is also similar to the way table access methods and compression methods are added. > > If we're going to end up with an open source implementation of > something useful in contrib or whatever, then I think this is fine. > But, if not, then we're just making it easier for Amazon to do > proprietary stuff without getting any benefit for the open-source > project. In fact, in that case PostgreSQL would ensure have to somehow > ensure that the hooks don't get broken without having any code that > actually uses them, so not only would the project get no benefit, but > it would actually incur a small tax. I wouldn't say that's an > absolutely show-stopper, but it definitely isn't my first choice. As far I understood Jan's proposal is to add enough hooks on PostgreSQL to enable us to extend the wire protocol and add a contrib module as an example (maybe TDS, HTTP or just adding new capabilities to current implementation). Regards, -- Fabrízio de Royes Mello PostgreSQL Developer at OnGres Inc. - https://ongres.com
Re: Extensibility of the PostgreSQL wire protocol
On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > Our current plan is to create a new set of API calls and hooks that allow to > register additional wire protocols. The existing backend libpq implementation > will be modified to register itself using the new API. This will serve as a > proof of concept as well as ensure that the API definition is not slanted > towards a specific protocol. It is also similar to the way table access > methods and compression methods are added. If we're going to end up with an open source implementation of something useful in contrib or whatever, then I think this is fine. But, if not, then we're just making it easier for Amazon to do proprietary stuff without getting any benefit for the open-source project. In fact, in that case PostgreSQL would ensure have to somehow ensure that the hooks don't get broken without having any code that actually uses them, so not only would the project get no benefit, but it would actually incur a small tax. I wouldn't say that's an absolutely show-stopper, but it definitely isn't my first choice. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Extensibility of the PostgreSQL wire protocol
Hi Jonah, On Mon, Jan 25, 2021 at 10:18 AM Jonah H. Harris wrote: > On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > >> The following is a request for discussion and comments, not a refined >> proposal accompanied by a working patch. >> > > After implementing this three different ways inside the backend over the > years, I landed on almost this identical approach for handling the MySQL, > TDS, MongoDB, and Oracle protocols for NEXTGRES. > Could any of that be open sourced? It would be an excellent addition to add one of those as example code. Regards, Jan > > Initially, each was implemented as an background worker extension which > had to handle its own networking, passing the fd off to new > protocol-specific connections, etc. This worked, but duplicate a good > amount of logic. It would be great to have a standard, loadable, way to add > support for a new protocol. > > -- > Jonah H. Harris > > -- Jan Wieck
Re: Extensibility of the PostgreSQL wire protocol
On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck wrote: > The following is a request for discussion and comments, not a refined > proposal accompanied by a working patch. > After implementing this three different ways inside the backend over the years, I landed on almost this identical approach for handling the MySQL, TDS, MongoDB, and Oracle protocols for NEXTGRES. Initially, each was implemented as an background worker extension which had to handle its own networking, passing the fd off to new protocol-specific connections, etc. This worked, but duplicate a good amount of logic. It would be great to have a standard, loadable, way to add support for a new protocol. -- Jonah H. Harris