Ideas about a better API for postgres_fdw remote estimates
In <1116564.1593813...@sss.pgh.pa.us> I wrote:
> I wonder whether someday we ought to invent a new API that's more suited to postgres_fdw's needs than EXPLAIN is. It's not like the remote planner doesn't know the number we want; it just fails to include it in EXPLAIN.

I've been thinking about this a little more, and I'd like to get some ideas down on electrons before they vanish.

The current method for postgres_fdw to obtain remote estimates is to issue an EXPLAIN command to the remote server and then decipher the result. This has just one big advantage, which is that it works against existing, even very old, remote PG versions. In every other way it's pretty awful: it involves a lot of cycles on the far end to create output details we don't really care about, it requires a fair amount of logic to parse that output, and we can't get some details that we *do* care about (such as the total size of the foreign table, as per the other discussion).

We can do better. I don't propose removing the existing logic, because being able to work against old remote PG versions seems pretty useful. But we could probe at connection start for whether the remote server has support for a better way, and then use that way if available.

What should the better way look like? I suggest the following:

* Rather than adding a core-server feature, the remote-end part of the new API should be a function installed by an extension (either postgres_fdw itself, or a new extension "postgres_fdw_remote" or the like). One attraction of this approach is that it'd be conceivable to back-port the new code into existing PG releases by updating the extension. Also there'd be room for multiple versions of the support. The connection-start probe could be of the form "does this function exist in pg_proc?".

* I'm imagining the function being of the form

    function pg_catalog.postgres_fdw_support(query text) returns something

where the input is still the text of a query we're considering issuing, and the output is some structure that contains the items of EXPLAIN-like data we need, but not the items we don't. The implementation of the function would run the query through parse/plan, then pick out the data we want and return that.

* We could do a lot worse than to have the "structure" be JSON. This'd allow structured, labeled data to be returned; it would not be too difficult to construct, even in PG server versions predating the addition of JSON logic to the core; and the receiving postgres_fdw extension could use the core's JSON logic to parse the data.

* The contents of the structure need to be designed with forethought for extensibility, but this doesn't seem hard if it's all a collection of labeled fields. We can just say that the recipient must ignore fields it doesn't recognize. Once a given field has been defined, we can't change its contents, but we can introduce new fields as needed. Note that I would not be in favor of putting an overall version number within the structure; that's way too coarse-grained.

I'm not planning to do anything about these ideas myself, at least not in the short term. But perhaps somebody else would like to run with them.

			regards, tom lane
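To make the extensibility rule concrete, here is a sketch (in Python, standing in for the receiving side's JSON handling) of how postgres_fdw might consume such a structure. The field names are purely illustrative assumptions, not a proposed API:

```python
import json

# Hypothetical payload from the proposed remote support function; the
# field names here are invented for illustration only.
payload = json.dumps({
    "startup_cost": 0.29,
    "total_cost": 8.30,
    "rows": 1,
    "width": 244,
    "future_field": "silently ignored by older clients",
})

def parse_remote_estimate(text):
    """Pick out the fields we know, ignoring unrecognized ones
    (the "recipient must ignore unknown fields" rule above)."""
    known = {"startup_cost", "total_cost", "rows", "width"}
    data = json.loads(text)
    return {k: v for k, v in data.items() if k in known}

print(parse_remote_estimate(payload))
```

Because unknown fields are dropped rather than rejected, a newer remote server can add fields without breaking older clients, which is what makes a per-structure version number unnecessary.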
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> In <1116564.1593813...@sss.pgh.pa.us> I wrote:
> > I wonder whether someday we ought to invent a new API that's more suited to postgres_fdw's needs than EXPLAIN is. It's not like the remote planner doesn't know the number we want; it just fails to include it in EXPLAIN.
>
> I've been thinking about this a little more, and I'd like to get some ideas down on electrons before they vanish.
>
> The current method for postgres_fdw to obtain remote estimates is to issue an EXPLAIN command to the remote server and then decipher the result. This has just one big advantage, which is that it works against existing, even very old, remote PG versions. In every other way it's pretty awful: it involves a lot of cycles on the far end to create output details we don't really care about, it requires a fair amount of logic to parse that output, and we can't get some details that we *do* care about (such as the total size of the foreign table, as per the other discussion).
>
> We can do better. I don't propose removing the existing logic, because being able to work against old remote PG versions seems pretty useful. But we could probe at connection start for whether the remote server has support for a better way, and then use that way if available.

I agree we can, and should, try to do better here.

> What should the better way look like? I suggest the following:
>
> * Rather than adding a core-server feature, the remote-end part of the new API should be a function installed by an extension (either postgres_fdw itself, or a new extension "postgres_fdw_remote" or the like). One attraction of this approach is that it'd be conceivable to back-port the new code into existing PG releases by updating the extension. Also there'd be room for multiple versions of the support. The connection-start probe could be of the form "does this function exist in pg_proc?".
>
> * I'm imagining the function being of the form
>
>     function pg_catalog.postgres_fdw_support(query text) returns something
>
> where the input is still the text of a query we're considering issuing, and the output is some structure that contains the items of EXPLAIN-like data we need, but not the items we don't. The implementation of the function would run the query through parse/plan, then pick out the data we want and return that.
>
> * We could do a lot worse than to have the "structure" be JSON. This'd allow structured, labeled data to be returned; it would not be too difficult to construct, even in PG server versions predating the addition of JSON logic to the core; and the receiving postgres_fdw extension could use the core's JSON logic to parse the data.

I also tend to agree with using JSON for this.

> * The contents of the structure need to be designed with forethought for extensibility, but this doesn't seem hard if it's all a collection of labeled fields. We can just say that the recipient must ignore fields it doesn't recognize. Once a given field has been defined, we can't change its contents, but we can introduce new fields as needed. Note that I would not be in favor of putting an overall version number within the structure; that's way too coarse-grained.

This also makes sense to me.

> I'm not planning to do anything about these ideas myself, at least not in the short term. But perhaps somebody else would like to run with them.

I'm trying to figure out why it makes more sense to use 'postgres_fdw_support(query text)', which would still do parse/plan and return EXPLAIN-like data, rather than having:

    EXPLAIN (FORMAT JSON, FDW true) query ...

(Or, perhaps better, individual boolean options for whatever stuff we want to ask for, or to exclude if we don't want it, so that other tools could use this...)

Thanks,

Stephen
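For comparison, the four numbers postgres_fdw ultimately needs are already present in today's EXPLAIN (FORMAT JSON) output, under the top-level "Plan" object. A sketch of pulling them out — the JSON document below is abbreviated from real EXPLAIN output, which carries many more keys per node:

```python
import json

# Abbreviated EXPLAIN (FORMAT JSON) output: a one-element array whose
# "Plan" object holds the top-level estimates.
explain_json = """
[{"Plan": {"Node Type": "Seq Scan",
           "Relation Name": "tenk1",
           "Startup Cost": 0.00,
           "Total Cost": 470.00,
           "Plan Rows": 10000,
           "Plan Width": 244}}]
"""

def top_level_estimate(doc):
    """Return (startup_cost, total_cost, rows, width) from the
    top-level Plan node of an EXPLAIN (FORMAT JSON) document."""
    plan = json.loads(doc)[0]["Plan"]
    return (plan["Startup Cost"], plan["Total Cost"],
            plan["Plan Rows"], plan["Plan Width"])

print(top_level_estimate(explain_json))
```

An FDW-oriented EXPLAIN option along the lines Stephen suggests would presumably emit just these fields and skip the rest of the tree.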
Re: Ideas about a better API for postgres_fdw remote estimates
Stephen Frost writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> * Rather than adding a core-server feature, the remote-end part of the new API should be a function installed by an extension (either postgres_fdw itself, or a new extension "postgres_fdw_remote" or the like).

> I'm trying to figure out why it makes more sense to use 'postgres_fdw_support(query text)', which would still do parse/plan and return EXPLAIN-like data, rather than having:
> EXPLAIN (FORMAT JSON, FDW true) query ...

I see a couple of reasons not to do it like that:

1. This is specific to postgres_fdw. Some other extension might want some other data, and different versions of postgres_fdw might want different data. So putting it into core seems like the wrong thing.

2. Wedging this into EXPLAIN would be quite ugly, because (at least as I envision it) the output would have just about nothing to do with any existing EXPLAIN output.

3. We surely would not back-patch a core change like this. OTOH, if the added infrastructure is in an extension, somebody might want to back-patch that (even if unofficially). This argument falls to the ground of course if we're forced to make any core changes to be able to get at the data we need; but I'm not sure that will be needed.

			regards, tom lane
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> * Rather than adding a core-server feature, the remote-end part of the new API should be a function installed by an extension (either postgres_fdw itself, or a new extension "postgres_fdw_remote" or the like).
>
> > I'm trying to figure out why it makes more sense to use 'postgres_fdw_support(query text)', which would still do parse/plan and return EXPLAIN-like data, rather than having:
> > EXPLAIN (FORMAT JSON, FDW true) query ...
>
> I see a couple of reasons not to do it like that:
>
> 1. This is specific to postgres_fdw. Some other extension might want some other data, and different versions of postgres_fdw might want different data. So putting it into core seems like the wrong thing.

Another extension or use-case might want exactly the same information too though. In a way, we'd be 'hiding' that information from other potential users unless they want to install their own extension, which is a pretty big leap. Are we sure this information wouldn't be at all interesting to pgAdmin4 or explain.depesz.com?

> 2. Wedging this into EXPLAIN would be quite ugly, because (at least as I envision it) the output would have just about nothing to do with any existing EXPLAIN output.

This is a better argument for not making it part of EXPLAIN, though I don't really feel like I've got a decent idea of what you are suggesting the output *would* look like, so it's difficult for me to agree (or disagree) about this particular point.

> 3. We surely would not back-patch a core change like this. OTOH, if the added infrastructure is in an extension, somebody might want to back-patch that (even if unofficially). This argument falls to the ground of course if we're forced to make any core changes to be able to get at the data we need; but I'm not sure that will be needed.

Since postgres_fdw is part of core and core's release cycle, and the packagers manage the extensions from core in a way that they have to match up, this argument doesn't hold any weight with me. For this to be a viable argument, we would need to segregate extensions from core and give them their own release cycle and clear indication of which extension versions work with which PG major versions, etc. I'm actually generally in support of *that* idea- and with that, would agree with your point 3 above, but that's not the reality of today.

Thanks,

Stephen
Re: Ideas about a better API for postgres_fdw remote estimates
Stephen Frost writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> 2. Wedging this into EXPLAIN would be quite ugly, because (at least as I envision it) the output would have just about nothing to do with any existing EXPLAIN output.

> This is a better argument for not making it part of EXPLAIN, though I don't really feel like I've got a decent idea of what you are suggesting the output *would* look like, so it's difficult for me to agree (or disagree) about this particular point.

Per postgres_fdw's get_remote_estimate(), the only data we use right now is the startup_cost, total_cost, rows and width estimates from the top-level Plan node. That's available immediately from the Plan tree, meaning that basically *nothing* of the substantial display effort expended by explain.c and ruleutils.c is of any value. So the level-zero implementation of this would be to run the parser and planner, format those four numbers into a JSON object (which would require little more infrastructure than sprintf), and return that. Sure, we could make that into some kind of early-exit path in explain.c, but I think it'd be a pretty substantial wart, especially since it'd mean that none of the other EXPLAIN options are sensible in combination with this one.

Further down the road, we might want to rethink the whole idea of completely constructing a concrete Plan. We could get the data we need at the list-of-Paths stage. Even more interesting, we could (with very little more work) return data about multiple Paths, so that the client could find out, for example, the costs of sorted and unsorted output without paying two network round trips to discover that. That'd definitely require changes in the core planner, since it has no API to stop at that point. And it's even less within the charter of EXPLAIN.

I grant your point that there might be other users for this besides postgres_fdw, but that doesn't mean it must be a core feature.

>> 3. We surely would not back-patch a core change like this. OTOH, if the added infrastructure is in an extension, somebody might want to back-patch that (even if unofficially).

> Since postgres_fdw is part of core and core's release cycle, and the packagers manage the extensions from core in a way that they have to match up, this argument doesn't hold any weight with me.

Certainly only v14 (or whenever) and later postgres_fdw would be able to *use* this data. The scenario I'm imagining is that somebody wants to be able to use that client against an older remote server, and is willing to install some simple extension on the remote server to do so. Perhaps this scenario is not worth troubling over, but I don't think it's entirely far-fetched.

			regards, tom lane
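For contrast with that level-zero JSON approach, the existing code path has to scrape the four numbers out of the first line of textual EXPLAIN output; get_remote_estimate() does this with sscanf on the parenthesized cost clause, approximated here in Python:

```python
import re

# Roughly what postgres_fdw's get_remote_estimate() does today:
# pull startup cost, total cost, rows and width out of the first
# line of textual EXPLAIN output.
line = "Seq Scan on tenk1  (cost=0.00..470.00 rows=10000 width=244)"

def scrape_costs(explain_line):
    m = re.search(r"\(cost=([\d.]+)\.\.([\d.]+) rows=(\d+) width=(\d+)\)",
                  explain_line)
    if m is None:
        raise ValueError("could not interpret EXPLAIN output")
    return (float(m.group(1)), float(m.group(2)),
            int(m.group(3)), int(m.group(4)))

print(scrape_costs(line))
```

The scraping itself is cheap; the expense Tom describes is on the remote side, in producing the human-readable line in the first place.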
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> 2. Wedging this into EXPLAIN would be quite ugly, because (at least as I envision it) the output would have just about nothing to do with any existing EXPLAIN output.
>
> > This is a better argument for not making it part of EXPLAIN, though I don't really feel like I've got a decent idea of what you are suggesting the output *would* look like, so it's difficult for me to agree (or disagree) about this particular point.
>
> Per postgres_fdw's get_remote_estimate(), the only data we use right now is the startup_cost, total_cost, rows and width estimates from the top-level Plan node. That's available immediately from the Plan tree, meaning that basically *nothing* of the substantial display effort expended by explain.c and ruleutils.c is of any value. So the level-zero

The 'display effort' you're referring to, when using JSON format with explain, is basically to format the results into JSON and return them- which is what you're suggesting this mode would do anyway, no..? If the remote side 'table' is actually a view that's complicated then having a way to get just the top-level information (and excluding the rest) sounds like it'd be useful, and perhaps excluding that other info doesn't really fit into EXPLAIN's mandate, but that's also much less common.

> implementation of this would be to run the parser and planner, format those four numbers into a JSON object (which would require little more infrastructure than sprintf), and return that. Sure, we could make that into some kind of early-exit path in explain.c, but I think it'd be a pretty substantial wart, especially since it'd mean that none of the other EXPLAIN options are sensible in combination with this one.

That EXPLAIN has options that only make sense in combination with certain other options isn't anything new- BUFFERS makes no sense without ANALYZE, etc.

> Further down the road, we might want to rethink the whole idea of completely constructing a concrete Plan. We could get the data we need at the list-of-Paths stage. Even more interesting, we could (with very little more work) return data about multiple Paths, so that the client could find out, for example, the costs of sorted and unsorted output without paying two network round trips to discover that. That'd definitely require changes in the core planner, since it has no API to stop at that point. And it's even less within the charter of EXPLAIN.

I have to admit that I'm not really sure how we could make it work, but having a way to get multiple paths returned by EXPLAIN would certainly be interesting to a lot of users. Certainly it's easier to see how we could get at that info in a postgres_fdw-specific function, and be able to understand how to deal with it there and what could be done, but once it's there I wonder if other tools might see that and possibly even build on it because it'd be the only way to get that kind of info, which certainly wouldn't be ideal.

> I grant your point that there might be other users for this besides postgres_fdw, but that doesn't mean it must be a core feature.

That postgres_fdw is an extension is almost as much of a wart as anything being discussed here, and suggesting that things added to postgres_fdw aren't 'core features' seems akin to ignoring the forest for the trees- consider that, today, there isn't even an option to install only the core server from the PGDG repos (at least for Debian / Ubuntu, not sure if the RPMs have caught up to that yet, but they probably should). The 'postgresql-12' .deb includes all the extensions that are part of the core git repo, because they're released and maintained just the same as the core server and, from a practical perspective, to run a decent PG system you really should have them installed, so why bother having a separate package?

> >> 3. We surely would not back-patch a core change like this. OTOH, if the added infrastructure is in an extension, somebody might want to back-patch that (even if unofficially).
>
> > Since postgres_fdw is part of core and core's release cycle, and the packagers manage the extensions from core in a way that they have to match up, this argument doesn't hold any weight with me.
>
> Certainly only v14 (or whenever) and later postgres_fdw would be able to *use* this data. The scenario I'm imagining is that somebody wants to be able to use that client against an older remote server, and is willing to install some simple extension on the remote server to do so. Perhaps this scenario is not worth troubling over, but I don't think it's entirely far-fetched.

I definitely don't think that such an extension should be maintained outside of core, and I seriously doubt any of our packagers would be anxious to build an independent package for this to be usable in older servers.
Re: Ideas about a better API for postgres_fdw remote estimates
Stephen Frost writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> Per postgres_fdw's get_remote_estimate(), the only data we use right now is the startup_cost, total_cost, rows and width estimates from the top-level Plan node. That's available immediately from the Plan tree, meaning that basically *nothing* of the substantial display effort expended by explain.c and ruleutils.c is of any value. So the level-zero

> The 'display effort' you're referring to, when using JSON format with explain, is basically to format the results into JSON and return them- which is what you're suggesting this mode would do anyway, no..?

Not hardly. Spend some time studying ruleutils.c sometime --- reverse-compiling a plan is *expensive*. For instance, we have to look up the names of all the operators used in the query quals, decide what needs quoting, decide what needs parenthesization, etc. There's also a fun little bit that assigns unique aliases to each table appearing in the query, which from memory is at least O(N^2) and maybe worse. (Admittedly, shipped queries are usually not so complicated that N would be large.) And by the way, we're also starting up the executor, even if you didn't say ANALYZE.

A little bit of fooling with "perf" suggests that when explaining a pretty simple bitmapscan query --- I used

    EXPLAIN SELECT * FROM tenk1 WHERE unique1 > 9995

which ought to be somewhat representative of what postgres_fdw needs --- only about half of the runtime is spent within pg_plan_query, and the other half is spent on explain.c + ruleutils.c formatting work. So while getting rid of that overhead wouldn't be an earthshattering improvement, I think it'd be worthwhile.

>> Further down the road, we might want to rethink the whole idea of completely constructing a concrete Plan. We could get the data we need at the list-of-Paths stage. Even more interesting, we could (with very little more work) return data about multiple Paths, so that the client could find out, for example, the costs of sorted and unsorted output without paying two network round trips to discover that.

> I have to admit that I'm not really sure how we could make it work, but having a way to get multiple paths returned by EXPLAIN would certainly be interesting to a lot of users. Certainly it's easier to see how we could get at that info in a postgres_fdw-specific function, and be able to understand how to deal with it there and what could be done, but once it's there I wonder if other tools might see that and possibly even build on it because it'd be the only way to get that kind of info, which certainly wouldn't be ideal.

Yeah, thinking about it as a function that inspects partial planner results, it might be useful for other purposes besides postgres_fdw. As I said before, I don't think this necessarily has to be bundled as part of postgres_fdw. That still doesn't make it part of EXPLAIN.

> That postgres_fdw is an extension is almost as much of a wart as anything being discussed here and suggesting that things added to postgres_fdw aren't 'core features' seems akin to ignoring the forest for the trees-

I think we just had this discussion in another thread. The fact that postgres_fdw is an extension is a feature, not a bug, because (a) it means that somebody could implement their own version if they wanted it to act differently; and (b) it keeps us honest about whether the APIs needed by an FDW are accessible from outside core. I think moving postgres_fdw into core would be a large step backwards.

			regards, tom lane
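If the remote side ever did return data about multiple Paths, the client-side choice Tom describes could look roughly like this; the wire format below is entirely hypothetical:

```python
import json

# Entirely hypothetical wire format: the remote support function returns
# estimates for several candidate Paths in one round trip, so the client
# can weigh sorted vs. unsorted output without a second query.
reply = json.dumps({"paths": [
    {"pathkeys": [],          "startup_cost": 0.0,  "total_cost": 470.0},
    {"pathkeys": ["unique1"], "startup_cost": 0.29, "total_cost": 512.0},
]})

def cheapest_path(doc, need_sorted_on=None):
    """Pick the cheapest returned path, optionally requiring that its
    output already be sorted on a given column."""
    paths = json.loads(doc)["paths"]
    if need_sorted_on is not None:
        paths = [p for p in paths if need_sorted_on in p["pathkeys"]]
    return min(paths, key=lambda p: p["total_cost"])

print(cheapest_path(reply)["total_cost"])
print(cheapest_path(reply, need_sorted_on="unique1")["total_cost"])
```

The point of the one-round-trip design is visible here: the local planner can cost both the pre-sorted remote path and "unsorted remote path plus local sort" from a single reply.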
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> Per postgres_fdw's get_remote_estimate(), the only data we use right now is the startup_cost, total_cost, rows and width estimates from the top-level Plan node. That's available immediately from the Plan tree, meaning that basically *nothing* of the substantial display effort expended by explain.c and ruleutils.c is of any value. So the level-zero
>
> > The 'display effort' you're referring to, when using JSON format with explain, is basically to format the results into JSON and return them- which is what you're suggesting this mode would do anyway, no..?
>
> Not hardly. Spend some time studying ruleutils.c sometime --- reverse-compiling a plan is *expensive*. For instance, we have to look up the names of all the operators used in the query quals, decide what needs quoting, decide what needs parenthesization, etc.

Ah, alright, that makes more sense then.

> There's also a fun little bit that assigns unique aliases to each table appearing in the query, which from memory is at least O(N^2) and maybe worse. (Admittedly, shipped queries are usually not so complicated that N would be large.) And by the way, we're also starting up the executor, even if you didn't say ANALYZE.
>
> A little bit of fooling with "perf" suggests that when explaining a pretty simple bitmapscan query --- I used
> EXPLAIN SELECT * FROM tenk1 WHERE unique1 > 9995
> which ought to be somewhat representative of what postgres_fdw needs --- only about half of the runtime is spent within pg_plan_query, and the other half is spent on explain.c + ruleutils.c formatting work. So while getting rid of that overhead wouldn't be an earthshattering improvement, I think it'd be worthwhile.

Sure.

> >> Further down the road, we might want to rethink the whole idea of completely constructing a concrete Plan. We could get the data we need at the list-of-Paths stage. Even more interesting, we could (with very little more work) return data about multiple Paths, so that the client could find out, for example, the costs of sorted and unsorted output without paying two network round trips to discover that.
>
> > I have to admit that I'm not really sure how we could make it work, but having a way to get multiple paths returned by EXPLAIN would certainly be interesting to a lot of users. Certainly it's easier to see how we could get at that info in a postgres_fdw-specific function, and be able to understand how to deal with it there and what could be done, but once it's there I wonder if other tools might see that and possibly even build on it because it'd be the only way to get that kind of info, which certainly wouldn't be ideal.
>
> Yeah, thinking about it as a function that inspects partial planner results, it might be useful for other purposes besides postgres_fdw. As I said before, I don't think this necessarily has to be bundled as part of postgres_fdw. That still doesn't make it part of EXPLAIN.

Providing it as a function rather than through EXPLAIN does make a bit more sense if we're going to skip things like the lookups you mention above. I'm still inclined to have it be a part of core rather than having it as postgres_fdw though. I'm not completely against it being part of postgres_fdw... but I would think that would really be appropriate if it's actually using something in postgres_fdw, but if everything that it's doing is part of core and nothing related specifically to the postgres FDW, then having it as part of core makes more sense to me. Also, having it as part of core would make it more appropriate for other tools to look at, and adding that kind of inspection capability for partial planner results could be very interesting for tools like pgAdmin and such.
> > That postgres_fdw is an extension is almost as much of a wart as anything being discussed here and suggesting that things added to postgres_fdw aren't 'core features' seems akin to ignoring the forest for the trees-
>
> I think we just had this discussion in another thread. The fact that postgres_fdw is an extension is a feature, not a bug, because (a) it means that somebody could implement their own version if they wanted it to act differently; and (b) it keeps us honest about whether the APIs needed by an FDW are accessible from outside core. I think moving postgres_fdw into core would be a large step backwards.

I'm not looking to change it today, as that ship has sailed, but while having FDWs as a general capability that can be implemented by extensions is certainly great and I'd love to see more of that (even better would be more of those that are well maintained and cared for by this community of folks), requiring users to install an extension into every database where they want to query another PG server
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Jul 6, 2020 at 11:28:28AM -0400, Stephen Frost wrote:
> > Yeah, thinking about it as a function that inspects partial planner results, it might be useful for other purposes besides postgres_fdw. As I said before, I don't think this necessarily has to be bundled as part of postgres_fdw. That still doesn't make it part of EXPLAIN.
>
> Providing it as a function rather than through EXPLAIN does make a bit more sense if we're going to skip things like the lookups you mention above. I'm still inclined to have it be a part of core rather than having it as postgres_fdw though. I'm not completely against it being part of postgres_fdw... but I would think that would really be appropriate if it's actually using something in postgres_fdw, but if everything that it's doing is part of core and nothing related specifically to the postgres FDW, then having it as part of core makes more sense to me. Also, having it as part of core would make it more appropriate for other tools to look at and adding that kind of inspection capability for partial planner results could be very interesting for tools like pgAdmin and such.

I agree the statistics extraction should probably be part of core.

There is the goal of FDWs returning data, and returning the data quickly. I think we can require all-new FDW servers to get improved performance. I am not even clear if we have a full understanding of the performance characteristics of FDWs yet. I know Tomas did some research on its DML behavior, but other than that, I haven't seen much.

On a related note, I have wished to be able to see all the costs associated with plans not chosen, and I think others would like that as well. Getting multiple costs for a query goes in that direction.

--
  Bruce Momjian  https://momjian.us
  EnterpriseDB   https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
On 7/14/20 6:32 AM, Bruce Momjian wrote:
> On Mon, Jul 6, 2020 at 11:28:28AM -0400, Stephen Frost wrote:
> > > Yeah, thinking about it as a function that inspects partial planner results, it might be useful for other purposes besides postgres_fdw. As I said before, I don't think this necessarily has to be bundled as part of postgres_fdw. That still doesn't make it part of EXPLAIN.
> >
> > Providing it as a function rather than through EXPLAIN does make a bit more sense if we're going to skip things like the lookups you mention above. I'm still inclined to have it be a part of core rather than having it as postgres_fdw though. I'm not completely against it being part of postgres_fdw... but I would think that would really be appropriate if it's actually using something in postgres_fdw, but if everything that it's doing is part of core and nothing related specifically to the postgres FDW, then having it as part of core makes more sense to me. Also, having it as part of core would make it more appropriate for other tools to look at and adding that kind of inspection capability for partial planner results could be very interesting for tools like pgAdmin and such.
>
> I agree the statistics extraction should probably be part of core.
>
> There is the goal of FDWs returning data, and returning the data quickly. I think we can require all-new FDW servers to get improved performance. I am not even clear if we have a full understanding of the performance characteristics of FDWs yet. I know Tomas did some research on its DML behavior, but other than that, I haven't seen much.
>
> On a related note, I have wished to be able to see all the costs associated with plans not chosen, and I think others would like that as well. Getting multiple costs for a query goes in that direction.
During the implementation of sharding-related improvements I noticed that if we use a lot of foreign partitions, we get bad plans because vacuum doesn't update statistics for foreign tables. That is done by the ANALYZE command, but it is a very expensive operation for a foreign table. The problem with statistics is demonstrated by the TAP test in the first attached patch.

I implemented some FDW + pg core machinery to reduce the weight of the problem. The ANALYZE command on a foreign table executes a query on the foreign server that extracts the statistics tuple, serializes it into a JSON-formatted string, and returns it to the caller. The caller deserializes this string, generates statistics for the foreign table, and updates it. The second patch is a proof of concept.

This patch speeds up the ANALYZE command and keeps statistics on a foreign table relevant after an autovacuum operation. Its effectiveness depends on the relevance of the statistics on the remote server, but still.

--
regards,
Andrey Lepikhov
Postgres Professional

>From 329954981959ee3fc97e52266c89a436d02ddf5e Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov"
Date: Tue, 4 Aug 2020 09:29:37 +0500
Subject: [PATCH 1/2] TAP-Test on bad statistics.

Add dummy postgres_fdw_stat() routine. It will return stat tuples for each input attribute.
---
 contrib/postgres_fdw/Makefile                   |  4 ++--
 contrib/postgres_fdw/expected/foreign_stat.out  | 18 ++
 .../postgres_fdw/postgres_fdw--1.0--1.1.sql     | 11 +++
 contrib/postgres_fdw/postgres_fdw.c             | 17 +
 contrib/postgres_fdw/postgres_fdw.control       |  2 +-
 contrib/postgres_fdw/sql/foreign_stat.sql       | 10 ++
 6 files changed, 59 insertions(+), 3 deletions(-)
 create mode 100644 contrib/postgres_fdw/expected/foreign_stat.out
 create mode 100644 contrib/postgres_fdw/postgres_fdw--1.0--1.1.sql
 create mode 100644 contrib/postgres_fdw/sql/foreign_stat.sql

diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index ee8a80a392..15a6f6c353 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -14,9 +14,9 @@ PG_CPPFLAGS = -I$(libpq_srcdir)
 SHLIB_LINK_INTERNAL = $(libpq)
 
 EXTENSION = postgres_fdw
-DATA = postgres_fdw--1.0.sql
+DATA = postgres_fdw--1.0.sql postgres_fdw--1.0--1.1.sql
 
-REGRESS = postgres_fdw
+REGRESS = postgres_fdw foreign_stat
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/postgres_fdw/expected/foreign_stat.out b/contrib/postgres_fdw/expected/foreign_stat.out
new file mode 100644
index 00..28a470bccc
--- /dev/null
+++ b/contrib/postgres_fdw/expected/foreign_stat.out
@@ -0,0 +1,18 @@
+CREATE TABLE ltable (a int, b real);
+CREATE FOREIGN TABLE ftable (a int) server loopback options (table_name 'ltable');
+INSERT INTO ltable (a, b) (SELECT *, 1.01 FROM generate_series(1, 1E4));
+VACUUM;
+-- Check statistic interface routine
+SELECT * FROM postgres_fdw_stat('public', 'test', 'a');
+ postgres_fdw_stat
+---
+
+(1 row)
+
+EXPLAIN (TIMING OFF, SUMMARY OFF, COSTS ON, ANALYZE)
+SELECT * FROM ftable;
+ QUERY PLAN
+-
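The round trip Andrey describes (serialize the statistics tuple to a json string on the remote side, then deserialize and validate it locally) can be sketched as follows. This is an illustrative Python sketch only: the field names mirror pg_statistic columns, but the patch's actual wire format may differ.

```python
import json

# Hypothetical wire format for one column's statistics; the field names
# mimic pg_statistic columns (stanullfrac, stawidth, stadistinct), but
# this is a sketch of the idea, not the patch's actual output.
def serialize_stats(attname, stanullfrac, stawidth, stadistinct, slots):
    payload = {
        "attname": attname,
        "stanullfrac": stanullfrac,
        "stawidth": stawidth,
        "stadistinct": stadistinct,
        "slots": slots,  # e.g. MCV lists, histogram bounds
    }
    return json.dumps(payload)

def deserialize_stats(wire):
    stats = json.loads(wire)
    # The local side must validate types before touching pg_statistic.
    if not isinstance(stats["stawidth"], int):
        raise ValueError("stawidth must be an integer")
    return stats

# Remote side serializes, local side deserializes and validates.
wire = serialize_stats("a", 0.0, 4, -1.0,
                       [{"kind": "histogram", "values": [1, 5000, 10000]}])
stats = deserialize_stats(wire)
```

The real patch additionally has to cope with anyarray values and OID references, which is where most of the complexity lives (see the discussion further down the thread).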
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings, * Andrey Lepikhov (a.lepik...@postgrespro.ru) wrote: > During the implementation of sharding related improvements i noticed that if > we use a lot of foreign partitions, we have bad plans because of vacuum > don't update statistics of foreign tables.This is done by the ANALYZE > command, but it is very expensive operation for foreign table. > Problem with statistics demonstrates with TAP-test from the first patch in > attachment. Yes, the way we handle ANALYZE today for FDWs is pretty terrible, since we stream the entire table across to do it. > I implemented some FDW + pg core machinery to reduce weight of the problem. > The ANALYZE command on foreign table executes query on foreign server that > extracts statistics tuple, serializes it into json-formatted string and > returns to the caller. The caller deserializes this string, generates > statistics for this foreign table and update it. The second patch is a > proof-of-concept. Isn't this going to create a version dependency that we'll need to deal with..? What if a newer major version has some kind of improved ANALYZE command, in terms of what it looks at or stores, and it's talking to an older server? When I was considering the issue with ANALYZE and FDWs, I had been thinking it'd make sense to just change the query that's built in deparseAnalyzeSql() to have a TABLESAMPLE clause, but otherwise run in more-or-less the same manner as today. If we don't like the available TABLESAMPLE methods then we could add a new one that's explicitly the 'right' sample for an ANALYZE call and use that when it's available on the remote side. Not sure if it'd make any sense for ANALYZE itself to start using that same TABLESAMPLE code, but maybe? Not that I think it'd be much of an issue if it's independent either, with appropriate comments to note that we should probably try to make them match up for the sake of FDWs. 
> This patch speedup analyze command and provides statistics relevance on a > foreign table after autovacuum operation. Its effectiveness depends on > relevance of statistics on the remote server, but still. If we do decide to go down this route, wouldn't it mean we'd have to solve the problem of what to do when it's a 9.6 foreign server being queried from a v12 server and dealing with any difference in the statistics structures of the two? Seems like we would... in which case I would say that we should pull that bit out and make it general, and use it for pg_upgrade too, which would benefit a great deal from having the ability to upgrade stats between major versions also. That's a much bigger piece to take on, of course, but seems to be what's implied with this approach for the FDW. Thanks, Stephen signature.asc Description: PGP signature
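The TABLESAMPLE idea above amounts to changing the query string that deparseAnalyzeSql() builds. A rough sketch follows; the function below is a stand-in with an invented signature (the real deparseAnalyzeSql() lives in contrib/postgres_fdw/deparse.c), and the percentage computation is an illustrative assumption:

```python
# Hypothetical sketch of deparseAnalyzeSql() with a TABLESAMPLE clause:
# sample_pct is derived from the ANALYZE target row count and the remote
# table's estimated row count, capped at 100%.
def deparse_analyze_sql(relname, columns, target_rows, est_rows):
    sample_pct = min(100.0, 100.0 * target_rows / max(est_rows, 1))
    cols = ", ".join(columns)
    # SYSTEM sampling is cheap (block-level); BERNOULLI is more uniform
    # but reads every block. Choosing between them is the open question.
    return (f"SELECT {cols} FROM {relname} "
            f"TABLESAMPLE SYSTEM ({sample_pct:.2f})")

# 30k target rows from a ~1M-row remote table: sample about 3%.
sql = deparse_analyze_sql("public.ltable", ["a", "b"], 30000, 1000000)
```

Compared with today's full-table fetch, only the sampled blocks cross the wire, which is the whole point of Stephen's suggestion.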
Re: Ideas about a better API for postgres_fdw remote estimates
Stephen Frost writes: > Isn't this going to create a version dependency that we'll need to deal > with..? What if a newer major version has some kind of improved ANALYZE > command, in terms of what it looks at or stores, and it's talking to an > older server? Yeah, this proposal is a nonstarter unless it can deal with the remote server being a different PG version with different stats. Years ago (when I was still at Salesforce, IIRC, so ~5 years) we had some discussions about making it possible for pg_dump and/or pg_upgrade to propagate stats data forward to the new database. There is at least one POC patch in the archives for doing that by dumping the stats data wrapped in a function call, where the target database's version of the function would be responsible for adapting the data if necessary, or maybe just discarding it if it couldn't adapt. We seem to have lost interest but it still seems like something worth pursuing. I'd guess that if such infrastructure existed it could be helpful for this. > When I was considering the issue with ANALYZE and FDWs, I had been > thinking it'd make sense to just change the query that's built in > deparseAnalyzeSql() to have a TABLESAMPLE clause, but otherwise run in > more-or-less the same manner as today. +1, that seems like something worth doing in any case, since even if we do get somewhere with the present idea it would only work for new remote servers. TABLESAMPLE would work pretty far back (9.5, looks like). regards, tom lane
Re: Ideas about a better API for postgres_fdw remote estimates
On 8/29/20 9:22 PM, Stephen Frost wrote:

I implemented some FDW + pg core machinery to reduce weight of the problem. The ANALYZE command on foreign table executes query on foreign server that extracts statistics tuple, serializes it into json-formatted string and returns to the caller. The caller deserializes this string, generates statistics for this foreign table and update it. The second patch is a proof-of-concept.

Isn't this going to create a version dependency that we'll need to deal with..? What if a newer major version has some kind of improved ANALYZE command, in terms of what it looks at or stores, and it's talking to an older server? When I was considering the issue with ANALYZE and FDWs, I had been thinking it'd make sense to just change the query that's built in deparseAnalyzeSql() to have a TABLESAMPLE clause, but otherwise run in more-or-less the same manner as today. If we don't like the available TABLESAMPLE methods then we could add a new one that's explicitly the 'right' sample for an ANALYZE call and use that when it's available on the remote side. Not sure if it'd make any sense for ANALYZE itself to start using that same TABLESAMPLE code, but maybe? Not that I think it'd be much of an issue if it's independent either, with appropriate comments to note that we should probably try to make them match up for the sake of FDWs.

This approach does not contradict your idea. This is a lightweight opportunity to reduce the cost of analysis if we have a set of servers with actual versions of the system catalog and fdw.

This patch speedup analyze command and provides statistics relevance on a foreign table after autovacuum operation. Its effectiveness depends on relevance of statistics on the remote server, but still.

If we do decide to go down this route, wouldn't it mean we'd have to solve the problem of what to do when it's a 9.6 foreign server being queried from a v12 server and dealing with any difference in the statistics structures of the two?
Seems like we would... in which case I would say that we should pull that bit out and make it general, and use it for pg_upgrade too, which would benefit a great deal from having the ability to upgrade stats between major versions also. That's a much bigger piece to take on, of course, but seems to be what's implied with this approach for the FDW.

Thank you for this use case. We can add a "version" field to the statistics string (btree uses versioning too). As you can see, in this patch we are only trying to get statistics; if for some reason this does not work out, then we go along the difficult route. Moreover, I believe this strategy should only work if we analyze a relation implicitly. If the user executes the analysis explicitly by the command "ANALYZE ", we need to perform a fair analysis of the table.

-- 
regards, Andrey Lepikhov
Postgres Professional
Re: Ideas about a better API for postgres_fdw remote estimates
On 8/29/20 9:50 PM, Tom Lane wrote:

Years ago (when I was still at Salesforce, IIRC, so ~5 years) we had some discussions about making it possible for pg_dump and/or pg_upgrade to propagate stats data forward to the new database. There is at least one POC patch in the archives for doing that by dumping the stats data wrapped in a function call, where the target database's version of the function would be responsible for adapting the data if necessary, or maybe just discarding it if it couldn't adapt. We seem to have lost interest but it still seems like something worth pursuing. I'd guess that if such infrastructure existed it could be helpful for this.

Thanks for this helpful feedback. I found several threads related to the problem [1-3]. I agreed that this task needs an API for serialization/deserialization of statistics:

pg_load_relation_statistics(json_string text);
pg_get_relation_statistics(relname text);

We can use a version number for resolving conflicts between different statistics implementations. The "load" function will validate the values[] anyarray while deserializing the input json string to the datatype of the relation column. Maybe I have not seen all of the problems of this task?

1. https://www.postgresql.org/message-id/flat/724322880.K8vzik8zPz%40abook
2. https://www.postgresql.org/message-id/flat/CAAZKuFaWdLkK8eozSAooZBets9y_mfo2HS6urPAKXEPbd-JLCA%40mail.gmail.com
3. https://www.postgresql.org/message-id/flat/GNELIHDDFBOCMGBFGEFOOEOPCBAA.chriskl%40familyhealth.com.au

-- 
regards, Andrey Lepikhov
Postgres Professional
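The version number Andrey proposes, combined with the adapt-or-discard behavior from the POC Tom describes, could guard the load function against version skew roughly like this. Everything here (function names, field names, the single supported version) is an assumption for illustration, not a committed design:

```python
import json

SUPPORTED_STATS_VERSIONS = {1}  # assumption: one known payload version

def load_relation_statistics(json_string, local_coltypes):
    """Sketch of the proposed pg_load_relation_statistics(): accept only
    payload versions we understand, check each column against the local
    table definition, and discard anything that doesn't fit (the caller
    would then fall back to a plain ANALYZE). Field names hypothetical."""
    payload = json.loads(json_string)
    if payload.get("version") not in SUPPORTED_STATS_VERSIONS:
        return False  # unknown version: discard rather than misinstall
    for col in payload.get("columns", []):
        if col["attname"] not in local_coltypes:
            return False  # local/remote column mismatch
        # here the real function would validate the values[] anyarray
        # against local_coltypes[col["attname"]] before writing
        # pg_statistic
    return True

ok = load_relation_statistics(
    json.dumps({"version": 1, "columns": [{"attname": "a"}]}),
    {"a": "int4"})
```

The important property is that an unrecognized payload degrades to "no stats imported" rather than to corrupt stats, which is what makes cross-version use (9.6 remote, v12 local, or pg_upgrade) tolerable.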
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov wrote:
>
> Thanks for this helpful feedback.
>
> I found several threads related to the problem [1-3].
> I agreed that this task needs to implement an API for
> serialization/deserialization of statistics:
> pg_load_relation_statistics(json_string text);
> pg_get_relation_statistics(relname text);
> We can use a version number for resolving conflicts with different
> statistics implementations.
> "Load" function will validate the values[] anyarray while deserializing
> the input json string to the datatype of the relation column.
>

This is a valuable feature. Analysing a foreign table by fetching rows from the foreign server isn't very efficient. In fact the current FDW API for doing that forces that inefficiency by requiring the FDW to return a sample of rows that will be analysed by the core. That's why I see that your patch introduces a new API to get foreign rel stats. I don't think there's any point in maintaining these two APIs just for analysing a table. Instead we should have only one FDW API which will do whatever it wants and return statistics that can be understood by the core, and let the core install them in the catalogs. I believe that's doable. In the case of PostgreSQL it could get the stats available as is from the foreign server, convert them into a form that the core understands and return that.

The patch introduces a new function postgres_fdw_stat() which will be available only from version 14 onwards. Can we use row_to_json(), which is available in all the supported versions, instead?

In the case of some other foreign server, an FDW will be responsible for returning statistics in a form that the core will understand. It may fetch rows from the foreign server or be a bit smarter and fetch the statistics and convert them. This also means that FDWs will have to deal with the statistics format that the core understands and thus will need changes in their code with every version in the worst case.
But AFAIR, PostgreSQL supports different forms of statistics, so the problem may not remain that severe if FDWs and the core agree on some bare minimum format that the core supports for long.

I think the patch has some other problems, such as that it works only for regular tables on the foreign server, but a foreign table can point to any relation, like a materialized view, partitioned table or a foreign table on the foreign server, all of which have statistics associated with them. I didn't look closely, but it does not consider that the foreign table may not have all the columns from the relation on the foreign server, or may have different names. But I think those problems are kind of secondary. We have to agree on the design first.

-- 
Best Wishes, Ashutosh Bapat
Re: Ideas about a better API for postgres_fdw remote estimates
On Sat, Aug 29, 2020 at 12:50:59PM -0400, Tom Lane wrote: > Stephen Frost writes: > > Isn't this going to create a version dependency that we'll need to deal > > with..? What if a newer major version has some kind of improved ANALYZE > > command, in terms of what it looks at or stores, and it's talking to an > > older server? > > Yeah, this proposal is a nonstarter unless it can deal with the remote > server being a different PG version with different stats. > > Years ago (when I was still at Salesforce, IIRC, so ~5 years) we had > some discussions about making it possible for pg_dump and/or pg_upgrade > to propagate stats data forward to the new database. There is at least > one POC patch in the archives for doing that by dumping the stats data > wrapped in a function call, where the target database's version of the > function would be responsible for adapting the data if necessary, or > maybe just discarding it if it couldn't adapt. We seem to have lost > interest but it still seems like something worth pursuing. I'd guess > that if such infrastructure existed it could be helpful for this. I don't think there was enough value to do statistics migration just for pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might have enough demand to justify the required work and maintenance. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings, * Bruce Momjian (br...@momjian.us) wrote: > On Sat, Aug 29, 2020 at 12:50:59PM -0400, Tom Lane wrote: > > Stephen Frost writes: > > > Isn't this going to create a version dependency that we'll need to deal > > > with..? What if a newer major version has some kind of improved ANALYZE > > > command, in terms of what it looks at or stores, and it's talking to an > > > older server? > > > > Yeah, this proposal is a nonstarter unless it can deal with the remote > > server being a different PG version with different stats. > > > > Years ago (when I was still at Salesforce, IIRC, so ~5 years) we had > > some discussions about making it possible for pg_dump and/or pg_upgrade > > to propagate stats data forward to the new database. There is at least > > one POC patch in the archives for doing that by dumping the stats data > > wrapped in a function call, where the target database's version of the > > function would be responsible for adapting the data if necessary, or > > maybe just discarding it if it couldn't adapt. We seem to have lost > > interest but it still seems like something worth pursuing. I'd guess > > that if such infrastructure existed it could be helpful for this. > > I don't think there was enough value to do statistics migration just for > pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might > have enough demand to justify the required work and maintenance. Not sure that it really matters much, but I disagree with the assessment that there wasn't enough value to do it for pg_upgrade; I feel that it just hasn't been something that's had enough people interested in working on it, which isn't the same thing. If a good patch showed up tomorrow, with someone willing to spend time on it, I definitely think it'd be something we should include even if it's just for pg_upgrade. A solution that works for both pg_upgrade and the postgres FDW would be even better, of course. Thanks, Stephen signature.asc Description: PGP signature
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 12:19:52PM -0400, Stephen Frost wrote: > * Bruce Momjian (br...@momjian.us) wrote: > > I don't think there was enough value to do statistics migration just for > > pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might > > have enough demand to justify the required work and maintenance. > > Not sure that it really matters much, but I disagree with the assessment > that there wasn't enough value to do it for pg_upgrade; I feel that it > just hasn't been something that's had enough people interested in > working on it, which isn't the same thing. I am not sure what point you are trying to make, but if it had enough value, wouldn't people work on it, or are you saying that it had enough value, but people didn't realize it, so didn't work on it? I guess I can see that. For me, it was the maintenance burden that always scared me from getting involved since it would be the rare case where pg_upgrade would have to be modified for perhaps every major release. > If a good patch showed up tomorrow, with someone willing to spend time > on it, I definitely think it'd be something we should include even if > it's just for pg_upgrade. A solution that works for both pg_upgrade and > the postgres FDW would be even better, of course. Yep, see above. The problem isn't mostly the initial patch, but someone who is going to work on it and test it for every major release, potentially forever. Frankly, this is a pg_dump feature, rather than something pg_upgrade should be doing, because not having to run ANALYZE after restoring a dump is also a needed feature. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings, * Bruce Momjian (br...@momjian.us) wrote: > On Mon, Aug 31, 2020 at 12:19:52PM -0400, Stephen Frost wrote: > > * Bruce Momjian (br...@momjian.us) wrote: > > > I don't think there was enough value to do statistics migration just for > > > pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might > > > have enough demand to justify the required work and maintenance. > > > > Not sure that it really matters much, but I disagree with the assessment > > that there wasn't enough value to do it for pg_upgrade; I feel that it > > just hasn't been something that's had enough people interested in > > working on it, which isn't the same thing. > > I am not sure what point you are trying to make, but if it had enough > value, wouldn't people work on it, or are you saying that it had enough > value, but people didn't realize it, so didn't work on it? I guess I > can see that. For me, it was the maintenance burden that always scared > me from getting involved since it would be the rare case where > pg_upgrade would have to be modified for perhaps every major release. The point I was making was that it has value and people did realize it but there's only so many resources to go around when it comes to hacking on PG and therefore it simply hasn't been done yet. There's a big difference between "yes, we all agree that would be good to have, but no one has had time to work on it" and "we don't think this is worth having because of the maintenance work it'd require." The latter shuts down anyone thinking of working on it, which is why I said anything. > > If a good patch showed up tomorrow, with someone willing to spend time > > on it, I definitely think it'd be something we should include even if > > it's just for pg_upgrade. A solution that works for both pg_upgrade and > > the postgres FDW would be even better, of course. > > Yep, see above. 
The problem isn't mostly the initial patch, but someone > who is going to work on it and test it for every major release, > potentially forever. Frankly, this is a pg_dump feature, rather than > something pg_upgrade should be doing, because not having to run ANALYZE > after restoring a dump is also a needed feature. I tend to agree with it being more of a pg_dump issue- but that also shows that your assessment above doesn't actually fit, because we definitely change pg_dump every release. Consider that if someone wants to add some new option to CREATE TABLE, which gets remembered in the catalog, they have to make sure that pg_dump support is added for that. If we added the statistics export/import to pg_dump, someone changing those parts of the system would also be expected to update pg_dump to manage those changes, including working with older versions of PG. Thanks, Stephen signature.asc Description: PGP signature
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 12:56:21PM -0400, Stephen Frost wrote: > Greetings, > * Bruce Momjian (br...@momjian.us) wrote: > The point I was making was that it has value and people did realize it > but there's only so many resources to go around when it comes to hacking > on PG and therefore it simply hasn't been done yet. > > There's a big difference between "yes, we all agree that would be good > to have, but no one has had time to work on it" and "we don't think this > is worth having because of the maintenance work it'd require." The > latter shuts down anyone thinking of working on it, which is why I said > anything. I actually don't know which statement above is correct, because of the "forever" maintenance. > I tend to agree with it being more of a pg_dump issue- but that also > shows that your assessment above doesn't actually fit, because we > definitely change pg_dump every release. Consider that if someone wants > to add some new option to CREATE TABLE, which gets remembered in the > catalog, they have to make sure that pg_dump support is added for that. > If we added the statistics export/import to pg_dump, someone changing > those parts of the system would also be expected to update pg_dump to > manage those changes, including working with older versions of PG. Yes, very true, but technically any change in any aspect of the statistics system would require modification of the statistics dump, which usually isn't required for most feature changes. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
Greetings, * Bruce Momjian (br...@momjian.us) wrote: > On Mon, Aug 31, 2020 at 12:56:21PM -0400, Stephen Frost wrote: > > The point I was making was that it has value and people did realize it > > but there's only so many resources to go around when it comes to hacking > > on PG and therefore it simply hasn't been done yet. > > > > There's a big difference between "yes, we all agree that would be good > > to have, but no one has had time to work on it" and "we don't think this > > is worth having because of the maintenance work it'd require." The > > latter shuts down anyone thinking of working on it, which is why I said > > anything. > > I actually don't know which statement above is correct, because of the > "forever" maintenance. I can understand not being sure which is correct, and we can all have different points of view on it too, but that's a much softer stance than what I, at least, understood from your up-thread comment which was- > I don't think there was enough value to do statistics migration just > for pg_upgrade [...] That statement came across to me as saying the latter statement above. Perhaps that wasn't what you intended it to, in which case it's good to have the discussion and clarify it, for others who might be following this thread and wondering if they should consider working on this area of the code or not. > Yes, very true, but technically any change in any aspect of the > statistics system would require modification of the statistics dump, > which usually isn't required for most feature changes. Feature work either requires changes to pg_dump, or not. 
I agree that features which don't require pg_dump changes are definitionally less work than features which do (presuming the rest of the feature is the same in both cases) but that isn't a justification to not have pg_dump support in cases where it's expected- we just don't currently expect it for statistics (which is a rather odd exception when you consider that nearly everything else that ends up in the catalog tables is included).

For my part, at least, I'd like to see us change that expectation, for a number of reasons:

- pg_upgrade could leverage it and reduce downtime and/or confusion for users who are upgrading and dealing with poor statistics or no statistics for however long after the upgrade
- Tables restored wouldn't require an ANALYZE to get reasonable queries against them
- Debugging query plans would be a lot less guess-work or having to ask the user to export the statistics by hand from the catalog and then having to hand-hack them in to try and reproduce what's happening, particularly when re-running an analyze ends up giving different results, which isn't uncommon for edge cases
- The postgres_fdw would be able to leverage this, as discussed earlier on in this thread
- Logical replication could potentially leverage the existing stats and not require ANALYZE to be done after an import, leading to more predictable query plans on the replica

I suspect there's probably other benefits than the ones above, but these all seem pretty valuable to me.

Thanks,

Stephen
Re: Ideas about a better API for postgres_fdw remote estimates
Stephen Frost writes: > Feature work either requires changes to pg_dump, or not. I agree that > features which don't require pg_dump changes are definitionally less > work than features which do (presuming the rest of the feature is the > same in both cases) but that isn't a justification to not have pg_dump > support in cases where it's expected- we just don't currently expect it > for statistics (which is a rather odd exception when you consider that > nearly everything else that ends up in the catalog tables is included). > For my part, at least, I'd like to see us change that expectation, for a > number of reasons: Yeah. I think that originally we expected that the definition of the stats might change fast enough that porting them cross-version would be problematic. Subsequent experience has shown that they don't actually change any faster than any other aspect of the catalogs. So, while I do think we must have a plan for how to cope when/if the definition changes, I don't buy Bruce's argument that it's going to require more maintenance effort than any other part of the system does. regards, tom lane
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 01:26:59PM -0400, Stephen Frost wrote: > * Bruce Momjian (br...@momjian.us) wrote: > > I actually don't know which statement above is correct, because of the > > "forever" maintenance. > > I can understand not being sure which is correct, and we can all have > different points of view on it too, but that's a much softer stance than > what I, at least, understood from your up-thread comment which was- > > > I don't think there was enough value to do statistics migration just > > for pg_upgrade [...] > > That statement came across to me as saying the latter statement above. > Perhaps that wasn't what you intended it to, in which case it's good to > have the discussion and clarify it, for others who might be following > this thread and wondering if they should consider working on this area > of the code or not. I concluded that based on the fact that pg_upgrade has been used for years and there has been almost no work on statistics upgrades. > > Yes, very true, but technically any change in any aspect of the > > statistics system would require modification of the statistics dump, > > which usually isn't required for most feature changes. > > Feature work either requires changes to pg_dump, or not. I agree that > features which don't require pg_dump changes are definitionally less > work than features which do (presuming the rest of the feature is the > same in both cases) but that isn't a justification to not have pg_dump > support in cases where it's expected- we just don't currently expect it > for statistics (which is a rather odd exception when you consider that > nearly everything else that ends up in the catalog tables is included). Agreed, but the big difference is that you can change most SQL commands easily, e.g. system catalog changes, without any pg_dump changes, unless you change the SQL API, while any statistics storage change would potentially need pg_dump adjustments. 
And once you start doing it, you had better keep doing it for every major release or there will be major complaints. > For my part, at least, I'd like to see us change that expectation, for a > number of reasons: Yes, there are certainly more uses now than we used to have. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 01:53:01PM -0400, Tom Lane wrote: > Stephen Frost writes: > > Feature work either requires changes to pg_dump, or not. I agree that > > features which don't require pg_dump changes are definitionally less > > work than features which do (presuming the rest of the feature is the > > same in both cases) but that isn't a justification to not have pg_dump > > support in cases where it's expected- we just don't currently expect it > > for statistics (which is a rather odd exception when you consider that > > nearly everything else that ends up in the catalog tables is included). > > > For my part, at least, I'd like to see us change that expectation, for a > > number of reasons: > > Yeah. I think that originally we expected that the definition of the > stats might change fast enough that porting them cross-version would be > problematic. Subsequent experience has shown that they don't actually > change any faster than any other aspect of the catalogs. So, while > I do think we must have a plan for how to cope when/if the definition > changes, I don't buy Bruce's argument that it's going to require more > maintenance effort than any other part of the system does. Well, my point is that even bucket/calculation/data text representation changes could affect dumping statistics, and that is kind of rare for other changes we make. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
On Mon, Aug 31, 2020 at 05:46:22PM -0400, Bruce Momjian wrote: > On Mon, Aug 31, 2020 at 01:53:01PM -0400, Tom Lane wrote: > > Stephen Frost writes: > > > Feature work either requires changes to pg_dump, or not. I agree that > > > features which don't require pg_dump changes are definitionally less > > > work than features which do (presuming the rest of the feature is the > > > same in both cases) but that isn't a justification to not have pg_dump > > > support in cases where it's expected- we just don't currently expect it > > > for statistics (which is a rather odd exception when you consider that > > > nearly everything else that ends up in the catalog tables is included). > > > > > For my part, at least, I'd like to see us change that expectation, for a > > > number of reasons: > > > > Yeah. I think that originally we expected that the definition of the > > stats might change fast enough that porting them cross-version would be > > problematic. Subsequent experience has shown that they don't actually > > change any faster than any other aspect of the catalogs. So, while > > I do think we must have a plan for how to cope when/if the definition > > changes, I don't buy Bruce's argument that it's going to require more > > maintenance effort than any other part of the system does. > > Well, my point is that even bucket/calculation/data text respresentation > changes could affect dumping statistics, and that is kind of rare for > other changes we make. And I have been hoping someone would prove me wrong all these years, but it hasn't happened yet. It is possible we have hit a tipping point where the work is worth it, and I hope that is the case. I am just explaining why I think it has not happened yet. -- Bruce Momjian https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Ideas about a better API for postgres_fdw remote estimates
On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
> On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov wrote:
>> I agreed that this task needs to implement an API for
>> serialization/deserialization of statistics:
>> pg_load_relation_statistics(json_string text);
>> pg_get_relation_statistics(relname text);
>> We can use a version number for resolving conflicts with different
>> statistics implementations. The "load" function will validate the
>> values[] anyarray while deserializing the input JSON string to the
>> datatype of the relation column.
>
> This is a valuable feature. Analysing a foreign table by fetching rows
> from the foreign server isn't very efficient. In fact the current FDW
> API for doing that forces that inefficiency by requiring the FDW to
> return a sample of rows that will be analysed by the core. That's why
> I see that your patch introduces a new API to get foreign rel stats. I
> don't think there's any point in maintaining these two APIs just for
> analysing a table. Instead we should have only one FDW API which will
> do whatever it wants and return statistics that can be understood by
> the core, and let the core install them in the catalogs. I believe
> that's doable.

I think the same.

> In case of PostgreSQL it could get the stats available as is from the
> foreign server, convert them into a form that the core understands and
> return that. The patch introduces a new function postgres_fdw_stat()
> which will be available only from version 14 onwards. Can we use
> row_to_json(), which is available in all the supported versions,
> instead?

I started from here. But we need to convert the starelid, staop[] and
stacoll[] OIDs into a portable format. Also we need to explicitly
specify the type of each values[] array. And no one guarantees that the
anyarray values[] can't contain an array of complex-type values,
containing OIDs, that can't be correctly converted to database objects
on another server... These considerations required me to add the new
postgres_fdw_stat() routine, which can be moved into the core.

> In case of some other foreign server, an FDW will be responsible for
> returning statistics in a form that the core will understand. It may
> fetch rows from the foreign server, or be a bit smarter and fetch the
> statistics and convert them.

I don't think I fully understood your idea. Please explain in more
detail if possible.

> This also means that FDWs will have to deal with the statistics format
> that the core understands and thus will need changes in their code
> with every version in the worst case. But AFAIR, PostgreSQL supports
> different forms of statistics, so the problem may not remain that
> severe if FDWs and the core agree on some bare-minimum format that the
> core supports for long.

I don't think an FDW needs to know anything about the internals of
statistics. It only needs to execute a query like
"SELECT extract_statistics(namespace.relation);" and apply the text
representation with a function call like this:
store_statistics(const char *stat);
All validation and pg_statistic update operations will be performed in
the core.

> I think the patch has some other problems, like it works only for
> regular tables on the foreign server, but a foreign table can be
> pointing to any relation, like a materialized view, partitioned table
> or a foreign table on the foreign server, all of which have statistics
> associated with them.

Ok. It was implemented for discussion and testing, and as a base for
development.

> I didn't look closely, but it does not consider that the foreign table
> may not have all the columns from the relation on the foreign server,
> or may have different names.

Here we get the full statistics from the remote server and extract
statistics only for the columns included in the tuple descriptor of the
foreign table.

> But I think those problems are kind of secondary. We have to agree on
> the design first.

+1. I only want to point out the following. In previous threads
statistics were converted row by row. I want to suggest serializing all
statistics tuples for the relation into a single JSON string. In the
apply phase we can filter out unneeded attributes.

-- 
regards,
Andrey Lepikhov
Postgres Professional
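To make the serialization idea above concrete, here is a minimal Python sketch of the shape such a single-JSON-document dump could take, with OIDs replaced by portable identifiers (operator = namespace + name + operand types; collation = namespace + name + encoding) and a version field for format evolution. All field names here are illustrative assumptions, not the patch's actual format:

```python
import json

def portable_operator(nsp, name, left_type, right_type):
    """Server-independent identifier for a pg_operator entry: its namespace,
    name, and the (namespace, typename) pairs of its operands, instead of an
    OID that differs between servers. Hypothetical layout."""
    return {"nsp": nsp, "name": name, "left": left_type, "right": right_type}

def portable_collation(nsp, name, encoding):
    """Server-independent identifier for a pg_collation entry."""
    return {"nsp": nsp, "name": name, "encoding": encoding}

def serialize_relation_stats(relname, slots, version=1):
    """Serialize all statistics slots for one relation into a single JSON
    string (rather than row by row), as suggested above."""
    return json.dumps({"version": version, "relation": relname, "slots": slots})

def deserialize_relation_stats(payload):
    """Apply side: reject unknown format versions; unneeded attributes can
    simply be filtered out of the slot dictionaries here."""
    doc = json.loads(payload)
    if doc["version"] != 1:
        raise ValueError("unsupported statistics format version")
    return doc
```

For example, the int4 equality operator would be encoded as `portable_operator("pg_catalog", "=", ("pg_catalog", "int4"), ("pg_catalog", "int4"))`, which the receiving server can resolve back to a local OID by catalog lookup.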
Re: Ideas about a better API for postgres_fdw remote estimates
On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
> On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov wrote:
>> Thanks for this helpful feedback.
> I think the patch has some other problems, like it works only for
> regular tables on the foreign server, but a foreign table can be
> pointing to any relation, like a materialized view, partitioned table
> or a foreign table on the foreign server, all of which have statistics
> associated with them. I didn't look closely, but it does not consider
> that the foreign table may not have all the columns from the relation
> on the foreign server, or may have different names. But I think those
> problems are kind of secondary. We have to agree on the design first.

In accordance with the discussion I made some changes in the patch:
1. The extract-statistics routine moved into the core.
2. Serialized stats contain a 'version' field to indicate the format of
   the statistics received.
3. ANALYZE and VACUUM ANALYZE use this approach only in the case of
   implicit analysis of the relation.

I am currently keeping the limitation of using this approach for regular
relations only, because I haven't studied the specifics of other types
of relations. But I don't know of any reason to keep this limit in the
future.

The patch in the attachment is very raw. I publish it for further
substantive discussion.

-- 
regards,
Andrey Lepikhov
Postgres Professional

From 9cfd9b8a43691f1dacd0967dcc32fdf8414ddb56 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov"
Date: Tue, 4 Aug 2020 09:29:37 +0500
Subject: [PATCH] Pull statistics for a foreign table from the remote server.

Add the extract_relation_statistics() routine that converts statistics
on the relation into JSON format. All OIDs - starelid, staop[],
stacoll[] - are converted to a portable representation. An operator is
uniquely defined by a set of features: namespace, operator name, left
operand namespace and name, right operand namespace and name. A
collation is uniquely defined by namespace, collation name and encoding.
The new FDW API routine GetForeignRelStat() implements access to this
machinery and returns a JSON string to the caller. This function is
called by the ANALYZE command (without an explicit relation name) as an
attempt to reduce the cost of updating statistics. If the attempt fails,
ANALYZE goes the expensive way.

Add this feature into VACUUM ANALYZE and autovacuum.

In accordance with discussion [1] move the extract_relation_statistics()
routine into the core.

ToDo: tests on custom operators and collations.

1. https://www.postgresql.org/message-id/flat/1155731.1593832096%40sss.pgh.pa.us
---
 contrib/postgres_fdw/Makefile                 |   2 +-
 contrib/postgres_fdw/deparse.c                |   8 +
 .../postgres_fdw/expected/foreign_stat.out    | 112 +++
 contrib/postgres_fdw/postgres_fdw.c           |  49 ++
 contrib/postgres_fdw/postgres_fdw.h           |   1 +
 contrib/postgres_fdw/sql/foreign_stat.sql     |  46 +
 src/backend/commands/analyze.c                | 794 ++++++++++++++++++
 src/backend/commands/vacuum.c                 |  13 +-
 src/backend/utils/adt/json.c                  |   6 +
 src/backend/utils/cache/lsyscache.c           | 167 ++++
 src/include/catalog/pg_proc.dat               |   3 +
 src/include/catalog/pg_statistic.h            |   1 +
 src/include/foreign/fdwapi.h                  |   2 +
 src/include/utils/json.h                      |   1 +
 src/include/utils/lsyscache.h                 |   8 +
 15 files changed, 1211 insertions(+), 2 deletions(-)
 create mode 100644 contrib/postgres_fdw/expected/foreign_stat.out
 create mode 100644 contrib/postgres_fdw/sql/foreign_stat.sql

diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index ee8a80a392..a5a838b8fc 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -16,7 +16,7 @@ SHLIB_LINK_INTERNAL = $(libpq)
 
 EXTENSION = postgres_fdw
 DATA = postgres_fdw--1.0.sql
-REGRESS = postgres_fdw
+REGRESS = postgres_fdw foreign_stat
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index ad37a74221..e63cb4982f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2053,6 +2053,14 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel)
 	appendStringInfo(buf, "::pg_catalog.regclass) / %d", BLCKSZ);
 }
 
+void
+deparseGetStatSql(StringInfo buf, Relation rel)
+{
+	appendStringInfo(buf, "SELECT * FROM extract_relation_statistics('");
+	deparseRelation(buf, rel);
+	appendStringInfoString(buf, "');");
+}
+
 /*
  * Construct SELECT statement to acquire sample rows of given relation.
  *
diff --git a/contrib/postgres_fdw/expected/foreign_stat.out b/contrib/postgres_fdw/expected/foreign_stat.out
new file mode 100644
index 00..46fd2f8427
--- /dev/null
+++ b/contrib/postgres_fdw/expected/foreign_stat.out
@@ -0,0 +1,112 @@
+CREATE TABLE ltable (a int, b real);
+CREATE FOREIGN TABLE ftable (a int) server loopback options (table_name 'ltable');
+VACUUM ANALYZE;
+-- Check statistic interface r
Re: Ideas about a better API for postgres_fdw remote estimates
On Thu, 3 Sep 2020 at 10:44, Andrey V. Lepikhov wrote:
> On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
> > On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov wrote:
> >> Thanks for this helpful feedback.
> > I think the patch has some other problems, like it works only for
> > regular tables on the foreign server, but a foreign table can be
> > pointing to any relation, like a materialized view, partitioned table
> > or a foreign table on the foreign server, all of which have
> > statistics associated with them. I didn't look closely, but it does
> > not consider that the foreign table may not have all the columns from
> > the relation on the foreign server, or may have different names. But
> > I think those problems are kind of secondary. We have to agree on the
> > design first.
>
> In accordance with the discussion I made some changes in the patch:
> 1. The extract-statistics routine moved into the core.

The bulk of the patch implements the statistics conversion to and from
JSON format. I am still not sure whether we need all of that code here.
Can we re-use the pg_stats view? That is already converting some of the
OIDs to names. I agree about anyarray, but if that's a problem here,
it's also a problem for the pg_stats view, isn't it?

If we can reduce the stats-handling code to a minimum, or use it for
some other purpose as well, e.g. a pg_stats enhancement, the code
changes required will be far less compared to the value that this patch
provides.

-- 
Best Wishes,
Ashutosh
Re: Ideas about a better API for postgres_fdw remote estimates
On Thu, Sep 03, 2020 at 10:14:41AM +0500, Andrey V. Lepikhov wrote:
> On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
> > On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov wrote:
> >> Thanks for this helpful feedback.
> > I think the patch has some other problems, like it works only for
> > regular tables on the foreign server, but a foreign table can be
> > pointing to any relation, like a materialized view, partitioned table
> > or a foreign table on the foreign server, all of which have
> > statistics associated with them. I didn't look closely, but it does
> > not consider that the foreign table may not have all the columns from
> > the relation on the foreign server, or may have different names. But
> > I think those problems are kind of secondary. We have to agree on the
> > design first.
>
> In accordance with the discussion I made some changes in the patch:
> 1. The extract-statistics routine moved into the core.
> 2. Serialized stats contain a 'version' field to indicate the format
>    of the statistics received.
> 3. ANALYZE and VACUUM ANALYZE use this approach only in the case of
>    implicit analysis of the relation.
>
> I am currently keeping the limitation of using this approach for
> regular relations only, because I haven't studied the specifics of
> other types of relations. But I don't know of any reason to keep this
> limit in the future.
>
> The patch in the attachment is very raw. I publish it for further
> substantive discussion.

Thanks for working on this. I briefly looked at the patch today, and I
have some comments/feedback:

1) I wonder why deparseGetStatSql looks so different from e.g.
deparseAnalyzeSizeSql - no deparseStringLiteral on the relname, no cast
to pg_catalog.regclass, function name not qualified with pg_catalog, ...

2) I'm a bit annoyed by the amount of code added to analyze.c only to
support output/input in JSON format. I'm no expert, but I don't recall
EXPLAIN needing this much new stuff (OTOH it just produces JSON, it does
not need to read it). Maybe we also need to process a wider range of
data types here. But the code is almost perfectly undocumented :-(

3) Why do we need to change vacuum_rel this way?

4) I wonder if we actually want/need to simply output pg_statistic data
verbatim like this. Is postgres_fdw actually going to benefit from it? I
kinda doubt that, and my assumption was that we'd return only a small
subset of the data, needed by get_remote_estimate.

This has a couple of issues. Firstly, it requires knowledge of what the
stakind constants in pg_statistic mean and how to interpret them - but
OK, it's true that does not change very often (or at all). Secondly, it
entirely ignores extended statistics - OK, we might extract those too,
but it's going to be much more complex. And finally, it entirely ignores
costing on the remote node. Surely we can't just apply the local
random_page_cost or whatever, because those may be entirely different.
And we don't know if the remote is going to use an index etc.

So is extracting data from pg_statistic the right approach?

5) I doubt it's enough to support relnames - we also need to estimate
joins, so this needs to support plain queries, I think. At least that's
what Tom envisioned in his postgres_fdw_support(query text) proposal.

6) I see you've included a version number in the data - why not to just
check

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
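On the stakind point in 4): the constants are stable but implicit knowledge. As a reference, here is a sketch of the slot kinds as defined in src/include/catalog/pg_statistic.h (values believed correct as of current releases; worth double-checking against the header), plus the kind of slot lookup a consumer of serialized pg_statistic data would need to perform:

```python
# stakind constants from src/include/catalog/pg_statistic.h. A consumer of
# serialized pg_statistic data must know these to interpret each slot.
STATISTIC_KIND_MCV = 1                     # most common values + frequencies
STATISTIC_KIND_HISTOGRAM = 2               # equi-depth histogram bounds
STATISTIC_KIND_CORRELATION = 3             # physical vs. logical order correlation
STATISTIC_KIND_MCELEM = 4                  # most common elements (arrays, tsvector)
STATISTIC_KIND_DECHIST = 5                 # distinct-elements-count histogram
STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM = 6  # range length histogram
STATISTIC_KIND_BOUNDS_HISTOGRAM = 7        # range bounds histogram

def find_slot(slots, kind):
    """Pick the slot with the requested stakind from a column's statistic
    slots (pg_statistic holds up to five per column), or None if absent -
    the lookup the receiving side would perform before applying the data.
    Each slot is modeled here as a (stakind, stavalues, stanumbers) triple."""
    for stakind, stavalues, stanumbers in slots:
        if stakind == kind:
            return stavalues, stanumbers
    return None
```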
Re: Ideas about a better API for postgres_fdw remote estimates
On 9/4/20 6:23 PM, Ashutosh Bapat wrote:
> On Thu, 3 Sep 2020 at 10:44, Andrey V. Lepikhov
> <a.lepik...@postgrespro.ru> wrote:
>> On 8/31/20 6:19 PM, Ashutosh Bapat wrote:
>>> On Mon, Aug 31, 2020 at 3:36 PM Andrey V. Lepikhov
>>> <a.lepik...@postgrespro.ru> wrote:
>>>> Thanks for this helpful feedback.
>>> I think the patch has some other problems, like it works only for
>>> regular tables on the foreign server, but a foreign table can be
>>> pointing to any relation, like a materialized view, partitioned
>>> table or a foreign table on the foreign server, all of which have
>>> statistics associated with them. I didn't look closely, but it does
>>> not consider that the foreign table may not have all the columns
>>> from the relation on the foreign server, or may have different
>>> names. But I think those problems are kind of secondary. We have to
>>> agree on the design first.
>>
>> In accordance with the discussion I made some changes in the patch:
>> 1. The extract-statistics routine moved into the core.
>
> The bulk of the patch implements the statistics conversion to and from
> JSON format. I am still not sure whether we need all of that code
> here.

Yes, I'm sure we'll replace it with something. Right now I want to
discuss the format of the statistics dump. Remember that a statistics
dump is needed not only for the FDW; it is also needed for pg_dump. And
in the dump will be placed something like this:
'SELECT store_relation_statistics(rel, serialized_stat)'

My reasons for using JSON:
* it has conversion infrastructure like json_build_object();
* it is a flexible, readable format that can be useful in text dumps of
  relations.

> Can we re-use the pg_stats view? That is already converting some of
> the OIDs to names. I agree about anyarray, but if that's a problem
> here, it's also a problem for the pg_stats view, isn't it?

Right now, I don't know if it is possible to unambiguously convert the
pg_stats information to a pg_statistic tuple.

> If we can reduce the stats-handling code to a minimum, or use it for
> some other purpose as well, e.g. a pg_stats enhancement, the code
> changes required will be far less compared to the value that this
> patch provides.

+1

-- 
regards,
Andrey Lepikhov
Postgres Professional
Re: Ideas about a better API for postgres_fdw remote estimates
On Fri, 4 Sep 2020 at 20:27, Tomas Vondra wrote:
> 4) I wonder if we actually want/need to simply output pg_statistic data
> verbatim like this. Is postgres_fdw actually going to benefit from it?
> I kinda doubt that, and my assumption was that we'd return only a small
> subset of the data, needed by get_remote_estimate.
>
> This has a couple of issues. Firstly, it requires knowledge of what the
> stakind constants in pg_statistic mean and how to interpret them - but
> OK, it's true that does not change very often (or at all). Secondly, it
> entirely ignores extended statistics - OK, we might extract those too,
> but it's going to be much more complex. And finally, it entirely
> ignores costing on the remote node. Surely we can't just apply the
> local random_page_cost or whatever, because those may be entirely
> different. And we don't know if the remote is going to use an index
> etc.
>
> So is extracting data from pg_statistic the right approach?

There are two different problems, which ultimately might converge.

1. If use_remote_estimate = false - more generally, if querying costs
from the foreign server for costing paths is impractical - we want to
use local estimates and try to come up with costs. For that purpose we
keep some statistics locally, and the user is expected to refresh them
periodically by running ANALYZE on the foreign table. This patch is
about (a) doing this efficiently, without requiring every row to be
fetched from the foreign server, and (b) doing it automatically through
autovacuum, without the user firing ANALYZE. I think this also answers
your question about vacuum_rel() above.

2. How to efficiently extract costs from an EXPLAIN plan when
use_remote_estimate is true. That's the subject of some nearby thread. I
think you are referring to that problem here. Hence your next point.

Using EXPLAIN to get costs from the foreign server isn't efficient. It
increases planning time a lot; sometimes planning time exceeds execution
time. If usage of foreign tables becomes more and more common, this
isn't ideal. I think we should move towards a model in which the
optimizer can decide whether a subtree involving a foreign server should
be evaluated locally or on the foreign server without the help of the
foreign server. One way to do it (I am not saying that this is the only
or the best way) is to estimate the cost of the foreign query locally,
based on the information available locally about the foreign server and
foreign table. This might mean that we have to get that information from
the foreign server, cache it locally and use it several times, including
the indexes on the foreign table, the values of various costs etc.
Though this approach doesn't solve all of those problems, it's one step
forward, and it makes the current scenario also efficient.

I agree that the patch needs some work though, esp. the code dealing
with serialization and deserialization of statistics.

-- 
Best Wishes,
Ashutosh
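For problem 1 above, postgres_fdw already contains the embryo of such local costing: when use_remote_estimate is off, its estimate_path_cost_size() costs a foreign scan from locally cached statistics plus the per-connection overheads fdw_startup_cost (default 100) and fdw_tuple_cost (default 0.01). A simplified Python sketch of that shape - not the exact formula in the source - illustrates why fresh local statistics matter:

```python
def local_foreign_scan_cost(pages, tuples, retrieved_rows,
                            seq_page_cost=1.0, cpu_tuple_cost=0.01,
                            fdw_startup_cost=100.0, fdw_tuple_cost=0.01):
    """Cost a foreign sequential scan without contacting the remote server:
    the presumed remote seqscan cost (driven by locally cached pages/tuples
    statistics) plus connection-startup and per-row transfer overheads.
    Simplified sketch of postgres_fdw's behaviour when
    use_remote_estimate = false; not the exact source formula."""
    run_cost = seq_page_cost * pages + cpu_tuple_cost * tuples
    startup_cost = fdw_startup_cost
    total_cost = startup_cost + run_cost + fdw_tuple_cost * retrieved_rows
    return startup_cost, total_cost
```

Note that the remote server's actual random_page_cost and available indexes are invisible to this model - exactly the objection raised upthread - which is why caching richer remote metadata locally is attractive.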
Re: Ideas about a better API for postgres_fdw remote estimates
On Tue, Sep 08, 2020 at 05:55:09PM +0530, Ashutosh Bapat wrote:
> On Fri, 4 Sep 2020 at 20:27, Tomas Vondra wrote:
> > 4) I wonder if we actually want/need to simply output pg_statistic
> > data verbatim like this. Is postgres_fdw actually going to benefit
> > from it? I kinda doubt that, and my assumption was that we'd return
> > only a small subset of the data, needed by get_remote_estimate.
> >
> > This has a couple of issues. Firstly, it requires knowledge of what
> > the stakind constants in pg_statistic mean and how to interpret them
> > - but OK, it's true that does not change very often (or at all).
> > Secondly, it entirely ignores extended statistics - OK, we might
> > extract those too, but it's going to be much more complex. And
> > finally, it entirely ignores costing on the remote node. Surely we
> > can't just apply the local random_page_cost or whatever, because
> > those may be entirely different. And we don't know if the remote is
> > going to use an index etc.
> >
> > So is extracting data from pg_statistic the right approach?
>
> There are two different problems, which ultimately might converge.
>
> 1. If use_remote_estimate = false - more generally, if querying costs
> from the foreign server for costing paths is impractical - we want to
> use local estimates and try to come up with costs. For that purpose we
> keep some statistics locally, and the user is expected to refresh them
> periodically by running ANALYZE on the foreign table. This patch is
> about (a) doing this efficiently, without requiring every row to be
> fetched from the foreign server, and (b) doing it automatically
> through autovacuum, without the user firing ANALYZE. I think this also
> answers your question about vacuum_rel() above.
>
> 2. How to efficiently extract costs from an EXPLAIN plan when
> use_remote_estimate is true. That's the subject of some nearby thread.
> I think you are referring to that problem here. Hence your next point.

I think that was the topic of *this* thread as started by Tom, but I now
realize Andrey steered it in the direction to allow re-using remote
stats. Which seems useful too, but it confused me a bit.

> Using EXPLAIN to get costs from the foreign server isn't efficient. It
> increases planning time a lot; sometimes planning time exceeds
> execution time. If usage of foreign tables becomes more and more
> common, this isn't ideal. I think we should move towards a model in
> which the optimizer can decide whether a subtree involving a foreign
> server should be evaluated locally or on the foreign server without
> the help of the foreign server. One way to do it (I am not saying that
> this is the only or the best way) is to estimate the cost of the
> foreign query locally, based on the information available locally
> about the foreign server and foreign table. This might mean that we
> have to get that information from the foreign server, cache it locally
> and use it several times, including the indexes on the foreign table,
> the values of various costs etc. Though this approach doesn't solve
> all of those problems, it's one step forward, and it makes the current
> scenario also efficient.

True, but that project is way more ambitious than providing a simple API
for postgres_fdw to obtain the estimates more efficiently.

> I agree that the patch needs some work though, esp. the code dealing
> with serialization and deserialization of statistics.

I think there's a bunch of open questions, e.g. what to do with extended
statistics - for example, what should happen when the extended
statistics object is defined only on the local/remote server, or when
the definitions don't match? This probably is not an issue for "regular"
stats, because that seems pretty stable, but for extended stats there
are differences between versions.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Ideas about a better API for postgres_fdw remote estimates
On Wed, 9 Sep 2020 at 02:35, Tomas Vondra wrote:
> I think that was the topic of *this* thread as started by Tom, but I
> now realize Andrey steered it in the direction to allow re-using
> remote stats. Which seems useful too, but it confused me a bit.

I didn't realize that the nearby thread I am mentioning is actually this
thread :). Sorry.

> > Using EXPLAIN to get costs from the foreign server isn't efficient.
> > It increases planning time a lot; sometimes planning time exceeds
> > execution time. If usage of foreign tables becomes more and more
> > common, this isn't ideal. I think we should move towards a model in
> > which the optimizer can decide whether a subtree involving a foreign
> > server should be evaluated locally or on the foreign server without
> > the help of the foreign server. One way to do it (I am not saying
> > that this is the only or the best way) is to estimate the cost of
> > the foreign query locally, based on the information available
> > locally about the foreign server and foreign table. This might mean
> > that we have to get that information from the foreign server, cache
> > it locally and use it several times, including the indexes on the
> > foreign table, the values of various costs etc. Though this approach
> > doesn't solve all of those problems, it's one step forward, and it
> > makes the current scenario also efficient.
>
> True, but that project is way more ambitious than providing a simple
> API for postgres_fdw to obtain the estimates more efficiently.

Doing all of that is a big project. But what this patch aims at is a
small subset which makes statistics collection efficient and automatic.
So, just for that, we should consider it.

> > I agree that the patch needs some work though, esp. the code dealing
> > with serialization and deserialization of statistics.
>
> I think there's a bunch of open questions, e.g. what to do with
> extended statistics - for example, what should happen when the
> extended statistics object is defined only on the local/remote server,
> or when the definitions don't match? This probably is not an issue for
> "regular" stats, because that seems pretty stable, but for extended
> stats there are differences between versions.

If it is defined on the foreign server but not the local server, there
is no need to fetch it from the foreign server. The other way round is
the tricky case. We could mark the extended statistics object invalid if
it's not defined on the foreign server or the definition is different.
We would have to document it that way. I think that should serve most of
the cases.

-- 
Best Wishes,
Ashutosh
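The policy sketched in the last message - import extended statistics only when both servers define the object identically, otherwise treat the local object as invalid - could be made concrete along these lines. A hypothetical Python sketch (names and the (kinds, columns) definition model are illustrative assumptions, not code from the patch):

```python
def ext_stats_importable(local_def, remote_def):
    """Decide whether remote extended statistics can be applied locally.
    A definition is modeled as (kinds, columns), e.g.
    ({"ndistinct", "dependencies"}, ("a", "b")). Import only when both
    servers define the object identically; on any mismatch, or when the
    object exists on only one side, the local object would be left
    unfilled/marked invalid, per the policy described above."""
    if remote_def is None:      # defined only locally: nothing to fetch
        return False
    if local_def is None:       # defined only remotely: no local object to fill
        return False
    local_kinds, local_cols = local_def
    remote_kinds, remote_cols = remote_def
    return local_kinds == remote_kinds and local_cols == remote_cols
```

Cross-version differences in extended-statistics kinds fold into the same check: a kind unknown to one side simply makes the definitions unequal.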