Re: [HACKERS] [POC] FETCH limited by bytes.
On Tue, Feb 2, 2016 at 10:42 PM, Corey Huinkerwrote: >> I don't see how. There really is no declaration in there for a >> variable called server. > > Absolutely correct. My only guess is that it was from failing to make clean > after a checkout/re-checkout. A good reason to have even boring regression > tests. > > Regression tests added. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Wed, Feb 3, 2016 at 11:17 AM, Ashutosh Bapatwrote: > There seems to be double successive assignment to fdw_private in recent > commit. Here's patch to remove the first one. Committed along with a fix for another problem I noted along the way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
There seems to be double successive assignment to fdw_private in recent commit. Here's patch to remove the first one. On Wed, Feb 3, 2016 at 7:40 PM, Robert Haaswrote: > On Tue, Feb 2, 2016 at 10:42 PM, Corey Huinker > wrote: > >> I don't see how. There really is no declaration in there for a > >> variable called server. > > > > Absolutely correct. My only guess is that it was from failing to make > clean > > after a checkout/re-checkout. A good reason to have even boring > regression > > tests. > > > > Regression tests added. > > Committed. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index d5c0383..c95ac05 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -1009,22 +1009,20 @@ postgresGetForeignPlan(PlannerInfo *root, * expressions to be sent as parameters. */ initStringInfo(); deparseSelectStmtForRel(, root, baserel, remote_conds, best_path->path.pathkeys, _attrs, _list); /* * Build the fdw_private list that will be available to the executor. * Items in the list must match enum FdwScanPrivateIndex, above. */ - fdw_private = list_make2(makeString(sql.data), -retrieved_attrs); fdw_private = list_make3(makeString(sql.data), retrieved_attrs, makeInteger(fpinfo->fetch_size)); /* * Create the ForeignScan node from target list, filtering expressions, * remote parameter expressions, and FDW private information. * * Note that the remote parameter expressions are stored in the fdw_exprs * field of the finished plan node; we can't keep them in private state -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Wed, Jan 27, 2016 at 11:19 PM, Corey Huinkerwrote: > The possible tests might be: > - creating a server/table with fetch_size set > - altering an existing server/table to have fetch_size set > - verifying that the value exists in pg_foreign_server.srvoptions and > pg_foreign_table.ftoptions. > - attempting to set fetch_size to a non-integer value I'd add a test that does one of the first two and skip the others. I'm not wedded to that exact thing; that's just a suggestion. > Enjoy. I'd enjoy it more if, heh heh, it compiled. postgres_fdw.c:2642:16: error: use of undeclared identifier 'server' foreach(lc, server->options) ^ ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro 'foreach' for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell)) ^ 1 error generated. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
> > > postgres_fdw.c:2642:16: error: use of undeclared identifier 'server' > foreach(lc, server->options) > Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on that with the tests suggested. > ^ > ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro > 'foreach' > for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell)) > ^ Didn't modify this file on this or any other work of mine. Possible garbage from a git pull. Will investigate.
Re: [HACKERS] [POC] FETCH limited by bytes.
On Tue, Feb 2, 2016 at 5:57 PM, Corey Huinkerwrote: >> postgres_fdw.c:2642:16: error: use of undeclared identifier 'server' >> foreach(lc, server->options) > > > Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on > that with the tests suggested. I don't see how. There really is no declaration in there for a variable called server. >> ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro >> 'foreach' >> for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell)) >> ^ > > Didn't modify this file on this or any other work of mine. Possible garbage > from a git pull. Will investigate. This is context information for the same error, not a separate problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
> > > I don't see how. There really is no declaration in there for a > variable called server. > Absolutely correct. My only guess is that it was from failing to make clean after a checkout/re-checkout. A good reason to have even boring regression tests. Regression tests added. diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 2390e61..e28cf77 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -3951,3 +3951,63 @@ QUERY: CREATE FOREIGN TABLE t5 ( OPTIONS (schema_name 'import_source', table_name 't5'); CONTEXT: importing foreign table "t5" ROLLBACK; +BEGIN; +CREATE SERVER fetch101 FOREIGN DATA WRAPPER postgres_fdw OPTIONS( fetch_size '101' ); +SELECT count(*) +FROM pg_foreign_server +WHERE srvname = 'fetch101' +AND srvoptions @> array['fetch_size=101']; + count +--- + 1 +(1 row) + +ALTER SERVER fetch101 OPTIONS( SET fetch_size '202' ); +SELECT count(*) +FROM pg_foreign_server +WHERE srvname = 'fetch101' +AND srvoptions @> array['fetch_size=101']; + count +--- + 0 +(1 row) + +SELECT count(*) +FROM pg_foreign_server +WHERE srvname = 'fetch101' +AND srvoptions @> array['fetch_size=202']; + count +--- + 1 +(1 row) + +CREATE FOREIGN TABLE table3 ( x int ) SERVER fetch101 OPTIONS ( fetch_size '3' ); +SELECT COUNT(*) +FROM pg_foreign_table +WHERE ftrelid = 'table3'::regclass +AND ftoptions @> array['fetch_size=3']; + count +--- + 1 +(1 row) + +ALTER FOREIGN TABLE table3 OPTIONS ( SET fetch_size '6'); +SELECT COUNT(*) +FROM pg_foreign_table +WHERE ftrelid = 'table3'::regclass +AND ftoptions @> array['fetch_size=3']; + count +--- + 0 +(1 row) + +SELECT COUNT(*) +FROM pg_foreign_table +WHERE ftrelid = 'table3'::regclass +AND ftoptions @> array['fetch_size=6']; + count +--- + 1 +(1 row) + +ROLLBACK; diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 86a5789..f89de2f 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -131,6 +131,17 @@ postgres_fdw_validator(PG_FUNCTION_ARGS) /* check list syntax, warn about uninstalled extensions */ (void) ExtractExtensionList(defGetString(def), true); } + else if (strcmp(def->defname, "fetch_size") == 0) + { + int fetch_size; + + fetch_size = strtol(defGetString(def), NULL,10); + if (fetch_size <= 0) +ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("%s requires a non-negative integer value", +def->defname))); + } } PG_RETURN_VOID(); @@ -162,6 +173,9 @@ InitPgFdwOptions(void) /* updatable is available on both server and table */ {"updatable", ForeignServerRelationId, false}, {"updatable", ForeignTableRelationId, false}, + /* fetch_size is available on both server and table */ + {"fetch_size", ForeignServerRelationId, false}, + {"fetch_size", ForeignTableRelationId, false}, {NULL, InvalidOid, false} }; diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 2ab85f6..8f72ff7 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -68,7 +68,9 @@ enum FdwScanPrivateIndex /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ - FdwScanPrivateRetrievedAttrs + FdwScanPrivateRetrievedAttrs, + /* Integer representing the desired fetch_size */ + FdwScanPrivateFetchSize }; /* @@ -126,6 +128,8 @@ typedef struct PgFdwScanState /* working memory contexts */ MemoryContext batch_cxt; /* context holding current batch of tuples */ MemoryContext temp_cxt; /* context for per-tuple temporary data */ + + int fetch_size; /* number of tuples per fetch */ } PgFdwScanState; /* @@ -380,6 +384,7 @@ postgresGetForeignRelSize(PlannerInfo *root, fpinfo->fdw_startup_cost = DEFAULT_FDW_STARTUP_COST; fpinfo->fdw_tuple_cost = DEFAULT_FDW_TUPLE_COST; fpinfo->shippable_extensions = NIL; + fpinfo->fetch_size = 100; foreach(lc, fpinfo->server->options) { @@ -394,16 +399,17 @@ postgresGetForeignRelSize(PlannerInfo *root, else if (strcmp(def->defname, "extensions") == 0) fpinfo->shippable_extensions = ExtractExtensionList(defGetString(def), false); + else if (strcmp(def->defname, "fetch_size") == 0) + fpinfo->fetch_size = strtol(defGetString(def), NULL,10); } foreach(lc, fpinfo->table->options) { DefElem*def = (DefElem *) lfirst(lc); if (strcmp(def->defname, "use_remote_estimate") == 0) - { fpinfo->use_remote_estimate = defGetBoolean(def); - break;/* only need the one value */ - } + else if (strcmp(def->defname, "fetch_size") == 0) + fpinfo->fetch_size = strtol(defGetString(def), NULL,10); } /* @@ -1012,6 +1018,9 @@ postgresGetForeignPlan(PlannerInfo *root, */ fdw_private =
Re: [HACKERS] [POC] FETCH limited by bytes.
Corey Huinker wrote: > Enjoy. Didn't actually look into the patch but looks like there's progress here and this might be heading for commit. Moving to next one. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Mon, Jan 25, 2016 at 4:18 PM, Corey Huinkerwrote: > Revised in patch v3: > * get_option() and get_fetch_size() removed, fetch_size searches added to > existing loops. > * Move fetch_size <= 0 tests into postgres_fdw_validator() routine in > option.c > * DEBUG1 message removed, never intended that to live beyond the proof of > concept. > * Missing regression test mentioned in makefile de-mentioned, as there's > nothing to see without the DEBUG1 message. > * Multi-line comment shrunk Looks pretty good. You seem to have added a blank line at the end of postgres_fdw.c, which should be stripped back out. > I'm not too keen on having *no* new regression tests, but defer to your > judgement. So... I'm not *opposed* to regression tests. I just don't see a reasonable way to write one. If you've got an idea, I'm all ears. > Still not sure what you mean by remote execution options. But it might be > simpler now that the patch is closer to your expectations. I'm talking about the documentation portion of the patch, which regrettably seems to have disappeared between v2 and v3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
> > > Looks pretty good. You seem to have added a blank line at the end of > postgres_fdw.c, which should be stripped back out. > Done. > > > I'm not too keen on having *no* new regression tests, but defer to your > > judgement. > > So... I'm not *opposed* to regression tests. I just don't see a > reasonable way to write one. If you've got an idea, I'm all ears. > The possible tests might be: - creating a server/table with fetch_size set - altering an existing server/table to have fetch_size set - verifying that the value exists in pg_foreign_server.srvoptions and pg_foreign_table.ftoptions. - attempting to set fetch_size to a non-integer value None of which are very interesting, and none of which actually test what fetch_size was actually used. But I'm happy to add any of those that seem worthwhile. > > Still not sure what you mean by remote execution options. But it might be > > simpler now that the patch is closer to your expectations. > > I'm talking about the documentation portion of the patch, which > regrettably seems to have disappeared between v2 and v3. > Ah, didn't realize you were speaking about the documentation, and since that section was new, I wasn't familiar with the name. Moved. ...and not sure why the doc change didn't make it into the last patch, but it's in this one. I also moved the paragraph starting "When using the extensions option, *it is the user's responsibility* that..." such that it is in the same varlistentry as "extensions", though maybe that change should be delegated to the patch that created the extensions option. Enjoy. diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 86a5789..f89de2f 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -131,6 +131,17 @@ postgres_fdw_validator(PG_FUNCTION_ARGS) /* check list syntax, warn about uninstalled extensions */ (void) ExtractExtensionList(defGetString(def), true); } + else if (strcmp(def->defname, "fetch_size") == 0) + { + int fetch_size; + + fetch_size = strtol(defGetString(def), NULL,10); + if (fetch_size <= 0) +ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("%s requires a non-negative integer value", +def->defname))); + } } PG_RETURN_VOID(); @@ -162,6 +173,9 @@ InitPgFdwOptions(void) /* updatable is available on both server and table */ {"updatable", ForeignServerRelationId, false}, {"updatable", ForeignTableRelationId, false}, + /* fetch_size is available on both server and table */ + {"fetch_size", ForeignServerRelationId, false}, + {"fetch_size", ForeignTableRelationId, false}, {NULL, InvalidOid, false} }; diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 374faf5..f71bf61 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -68,7 +68,9 @@ enum FdwScanPrivateIndex /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ - FdwScanPrivateRetrievedAttrs + FdwScanPrivateRetrievedAttrs, + /* Integer representing the desired fetch_size */ + FdwScanPrivateFetchSize }; /* @@ -126,6 +128,8 @@ typedef struct PgFdwScanState /* working memory contexts */ MemoryContext batch_cxt; /* context holding current batch of tuples */ MemoryContext temp_cxt; /* context for per-tuple temporary data */ + + int fetch_size; /* number of tuples per fetch */ } PgFdwScanState; /* @@ -380,6 +384,7 @@ postgresGetForeignRelSize(PlannerInfo *root, fpinfo->fdw_startup_cost = DEFAULT_FDW_STARTUP_COST; fpinfo->fdw_tuple_cost = DEFAULT_FDW_TUPLE_COST; fpinfo->shippable_extensions = NIL; + fpinfo->fetch_size = 100; foreach(lc, fpinfo->server->options) { @@ -394,16 +399,17 @@ postgresGetForeignRelSize(PlannerInfo *root, else if (strcmp(def->defname, "extensions") == 0) fpinfo->shippable_extensions = ExtractExtensionList(defGetString(def), false); + else if (strcmp(def->defname, "fetch_size") == 0) + fpinfo->fetch_size = strtol(defGetString(def), NULL,10); } foreach(lc, fpinfo->table->options) { DefElem*def = (DefElem *) lfirst(lc); if (strcmp(def->defname, "use_remote_estimate") == 0) - { fpinfo->use_remote_estimate = defGetBoolean(def); - break;/* only need the one value */ - } + else if (strcmp(def->defname, "fetch_size") == 0) + fpinfo->fetch_size = strtol(defGetString(def), NULL,10); } /* @@ -1069,6 +1075,9 @@ postgresGetForeignPlan(PlannerInfo *root, */ fdw_private = list_make2(makeString(sql.data), retrieved_attrs); + fdw_private = list_make3(makeString(sql.data), + retrieved_attrs, + makeInteger(fpinfo->fetch_size)); /* * Create the ForeignScan node from target list, filtering expressions, @@ -1147,6 +1156,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags) FdwScanPrivateSelectSql));
Re: [HACKERS] [POC] FETCH limited by bytes.
On Mon, Jan 25, 2016 at 1:21 PM, Corey Huinkerwrote: >> - We could consider folding fetch_size into "Remote Execution >> Options", but maybe that's too clever. > > If you care to explain, I'm listening. Otherwise I'm going forward with the > other suggestions you've made. It's just a little unfortunate to have multiple sections with only a single option in each. It would be nice to avoid that somehow. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Thu, Dec 31, 2015 at 1:10 AM, Corey Huinkerwrote: > Here's a re-based patch. Notable changes since the last patch are: > > - some changes had to move to postgres_fdw.h > - the regression tests are in their own script fetch_size.sql / > fetch_size.out. that should make things easier for the reviewer(s), even if > we later merge it into postgres_fdw.sql/.out. > - I put braces around a multi-line error ereport(). That might be a style > violation, but the statement seemed confusing without it. > - The fetch sql is reported as a DEBUG1 level ereport(), and confirms that > server level options are respected, and table level options are used in > favor of server-level options. > > I left the DEBUG1-level message in this patch so that others can see the > output, but I assume we would omit that for production code, with > corresponding deletions from fetch_size.sql and fetch_size.out. Review: - There is an established idiom in postgres_fdw for how to extract data from lists of DefElems, and this patch does something else instead. Please make it look like postgresIsForeignRelUpdatable, postgresGetForeignRelSize, and postgresImportForeignSchema instead of inventing something new. (Note that your approach would require multiple passes over the list if you needed information on multiple options, whereas the existing approach does not.) - I think the comment in InitPgFdwOptions() could be reduced to a one-line comment similar to those already present. - I would reduce the debug level on the fetch command to something lower than DEBUG1, or drop it entirely. If you keep it, you need to fix the formatting so that the line breaks look like other ereport() calls. - We could consider folding fetch_size into "Remote Execution Options", but maybe that's too clever. I would not worry about trying to get this into the regression tests. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
> > > Review: > > - There is an established idiom in postgres_fdw for how to extract > data from lists of DefElems, and this patch does something else > instead. Please make it look like postgresIsForeignRelUpdatable, > postgresGetForeignRelSize, and postgresImportForeignSchema instead of > inventing something new. (Note that your approach would require > multiple passes over the list if you needed information on multiple > options, whereas the existing approach does not.) > I will look into that. The original patch pre-dates import foreign schema, so I'm not surprised that I missed the established pattern. > > - I think the comment in InitPgFdwOptions() could be reduced to a > one-line comment similar to those already present. > Aye. > - I would reduce the debug level on the fetch command to something > lower than DEBUG1, or drop it entirely. If you keep it, you need to > fix the formatting so that the line breaks look like other ereport() > calls. > As I mentioned, I plan to drop it entirely. It was only there to show a reviewer that it works as advertised. There's not much to see without it. > - We could consider folding fetch_size into "Remote Execution > Options", but maybe that's too clever. > If you care to explain, I'm listening. Otherwise I'm going forward with the other suggestions you've made. > > I would not worry about trying to get this into the regression tests. > > Happy to hear that.
Re: [HACKERS] [POC] FETCH limited by bytes.
On Mon, Jan 25, 2016 at 3:22 PM, Robert Haaswrote: > On Mon, Jan 25, 2016 at 1:21 PM, Corey Huinker > wrote: > >> - We could consider folding fetch_size into "Remote Execution > >> Options", but maybe that's too clever. > > > > If you care to explain, I'm listening. Otherwise I'm going forward with > the > > other suggestions you've made. > > It's just a little unfortunate to have multiple sections with only a > single option in each. It would be nice to avoid that somehow. > > Revised in patch v3: * get_option() and get_fetch_size() removed, fetch_size searches added to existing loops. * Move fetch_size <= 0 tests into postgres_fdw_validator() routine in option.c * DEBUG1 message removed, never intended that to live beyond the proof of concept. * Missing regression test mentioned in makefile de-mentioned, as there's nothing to see without the DEBUG1 message. * Multi-line comment shrunk (There's a v2 patch that is prior to the change to postgres_fdw_validator() in option.c, but in retrospect that's not interesting to you). I'm not too keen on having *no* new regression tests, but defer to your judgement. Still not sure what you mean by remote execution options. But it might be simpler now that the patch is closer to your expectations. v3.Make-fetch_size-settable-per-server-and-table.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
> > >> I believe it won't be needed as a regression test but DEBUGn >> messages could be usable to see "what was sent to the remote >> side". It can be shown to the console so easily done in >> regression test. See the deocumentation for client_min_messages >> for details. (It could receive far many messages then expected, >> though, and maybe fragile for changing in other part.) >> >> regards, >> >> -- >> Kyotaro Horiguchi >> NTT Open Source Software Center >> >> >> > Ok, I'll see what debug-level messages reveal. > > > Here's a re-based patch. Notable changes since the last patch are: - some changes had to move to postgres_fdw.h - the regression tests are in their own script fetch_size.sql / fetch_size.out. that should make things easier for the reviewer(s), even if we later merge it into postgres_fdw.sql/.out. - I put braces around a multi-line error ereport(). That might be a style violation, but the statement seemed confusing without it. - The fetch sql is reported as a DEBUG1 level ereport(), and confirms that server level options are respected, and table level options are used in favor of server-level options. I left the DEBUG1-level message in this patch so that others can see the output, but I assume we would omit that for production code, with corresponding deletions from fetch_size.sql and fetch_size.out. diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile index 3543312..5d50b30 100644 --- a/contrib/postgres_fdw/Makefile +++ b/contrib/postgres_fdw/Makefile @@ -10,7 +10,7 @@ SHLIB_LINK = $(libpq) EXTENSION = postgres_fdw DATA = postgres_fdw--1.0.sql -REGRESS = postgres_fdw +REGRESS = postgres_fdw fetch_size ifdef USE_PGXS PG_CONFIG = pg_config diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 380ac80..f482dcd 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -162,6 +162,12 @@ InitPgFdwOptions(void) /* updatable is available on both server and table */ {"updatable", ForeignServerRelationId, false}, {"updatable", ForeignTableRelationId, false}, + /* + * fetch_size is available on both server and table, the table setting + * overrides the server setting. + */ + {"fetch_size", ForeignServerRelationId, false}, + {"fetch_size", ForeignTableRelationId, false}, {NULL, InvalidOid, false} }; diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index f501c6c..d2595f6 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -68,7 +68,9 @@ enum FdwScanPrivateIndex /* SQL statement to execute remotely (as a String node) */ FdwScanPrivateSelectSql, /* Integer list of attribute numbers retrieved by the SELECT */ - FdwScanPrivateRetrievedAttrs + FdwScanPrivateRetrievedAttrs, + /* Integer representing the desired fetch_size */ + FdwScanPrivateFetchSize }; /* @@ -126,6 +128,8 @@ typedef struct PgFdwScanState /* working memory contexts */ MemoryContext batch_cxt; /* context holding current batch of tuples */ MemoryContext temp_cxt; /* context for per-tuple temporary data */ + + int fetch_size; /* number of tuples per fetch */ } PgFdwScanState; /* @@ -303,6 +307,8 @@ static HeapTuple make_tuple_from_result_row(PGresult *res, List *retrieved_attrs, MemoryContext temp_context); static void conversion_error_callback(void *arg); +static int get_fetch_size(ForeignTable *table, + ForeignServer *server); /* @@ -371,6 +377,8 @@ postgresGetForeignRelSize(PlannerInfo *root, /* Look up foreign-table catalog info. */ fpinfo->table = GetForeignTable(foreigntableid); fpinfo->server = GetForeignServer(fpinfo->table->serverid); + /* Look up any table-specific fetch size */ + fpinfo->fetch_size = get_fetch_size(fpinfo->table,fpinfo->server); /* * Extract user-settable option values. Note that per-table setting of @@ -1069,6 +1077,9 @@ postgresGetForeignPlan(PlannerInfo *root, */ fdw_private = list_make2(makeString(sql.data), retrieved_attrs); + fdw_private = list_make3(makeString(sql.data), + retrieved_attrs, + makeInteger(fpinfo->fetch_size)); /* * Create the ForeignScan node from target list, filtering expressions, @@ -1147,6 +1158,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags) FdwScanPrivateSelectSql)); fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private, FdwScanPrivateRetrievedAttrs); + fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private, + FdwScanPrivateFetchSize)); /* Create contexts for batches of tuples and per-tuple temp workspace. */ fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt, @@ -2275,15 +2288,13 @@ fetch_more_data(ForeignScanState *node) { PGconn *conn = fsstate->conn; char sql[64]; - int fetch_size; int numrows; int i; - /* The fetch size is arbitrary, but shouldn't be enormous. */ - fetch_size = 100; -
Re: [HACKERS] [POC] FETCH limited by bytes.
>Have you got numbers showing any actual performance win for postgres_fdw? For JDBC purposes, it would be nice to have an ability of asking backend "to stop fetching if produced more than X MiB of response data". For small table (4 int4 fields), having decent fetchSize (~1000) makes result processing 7 times faster than with fetchSize of 50 rows (14 ms -> 2 ms for 2000 rows). Here are the measurements: [1] and [2]. Note: it is not required to precisely follow given "max fetch bytes" limit. It would be enough just to stop after certain amount of data was sent. The whole thing of using limited fetch size is to avoid running out of memory at client side. I do not think developers care how many rows is fetched at once. It they do, they should rather use "limit X" SQL syntax. Do you have a suggestion for such a use case? For fixed-size data types, JDBC driver can estimate "max sane fetch size", however: 1) In order to know data types, a roundtrip is required. This means the first fetch must be conservative, thus small queries would be penalized. 2) For variable length types there is no way to estimate "sane number of rows", except of using "average row size of already received data". This is not reliable, especially if the first rows have nulls, and subsequent ones contain non-empty strings. [1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473 [2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387 Vladimir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Fri, Dec 25, 2015 at 12:31 AM, Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp> wrote: > Hello, > > At Thu, 24 Dec 2015 18:31:42 -0500, Corey Huinker> wrote in
Re: [HACKERS] [POC] FETCH limited by bytes.
On Sat, Dec 26, 2015 at 6:16 AM, Vladimir Sitnikov < sitnikov.vladi...@gmail.com> wrote: > >Have you got numbers showing any actual performance win for postgres_fdw? > > For JDBC purposes, it would be nice to have an ability of asking > backend "to stop fetching if produced more than X MiB of response > data". > For small table (4 int4 fields), having decent fetchSize (~1000) makes > result processing 7 times faster than with fetchSize of 50 rows (14 ms > -> 2 ms for 2000 rows). > Here are the measurements: [1] and [2]. > > Note: it is not required to precisely follow given "max fetch bytes" > limit. It would be enough just to stop after certain amount of data > was sent. > The whole thing of using limited fetch size is to avoid running out of > memory at client side. > I do not think developers care how many rows is fetched at once. It > they do, they should rather use "limit X" SQL syntax. > > Do you have a suggestion for such a use case? > > For fixed-size data types, JDBC driver can estimate "max sane fetch > size", however: > 1) In order to know data types, a roundtrip is required. This means > the first fetch must be conservative, thus small queries would be > penalized. > 2) For variable length types there is no way to estimate "sane number > of rows", except of using "average row size of already received data". > This is not reliable, especially if the first rows have nulls, and > subsequent ones contain non-empty strings. > > [1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473 > [2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387 > > Vladimir > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > I believe that Kyotaro proposed something like that, wherein the FDW would be more adaptive based on the amount of memory available, and fetch a number of rows that, by its estimation, would fit in the memory available. I don't know the progress of that patch. This patch is a far less complicated solution and puts the burden on the DBA to figure out approximately how many rows would fit in memory based on the average row size, and set the per-table option accordingly. If it is later determined that the rows are now too heavy to fit into the space allotted, the fetch size can be altered for that table as needed.
Re: [HACKERS] [POC] FETCH limited by bytes.
>and fetch a number of rows that, by its estimation, would fit in the memory >available What's wrong with having size limit in the first place? It seems to make much more sense. Vladimir -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Wed, Dec 23, 2015 at 3:08 PM, Jim Nasbywrote: > On 12/23/15 12:15 PM, Corey Huinker wrote: > >> That's fair. I'm still at a loss for how to show that the fetch size was >> respected by the remote server, suggestions welcome! >> > > A combination of repeat() and generate_series()? > > I'm guessing it's not that obvious and that I'm missing something; can you > elaborate? > -- > Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX > Experts in Analytics, Data Architecture and PostgreSQL > Data in Trouble? Get it in Treble! http://BlueTreble.com > I'll try. So the basic test of whether the FDW respected the fetch limit is this: 1. create foreign server using postgres_fdw, create foreign table. 2. run a query against that table. it works. great. 3. alter server set fetch size option to 101 (or any number different from 100) 4. run same query against the table. the server side should show that the result set was fetched in 101 row chunks[1]. 5. alter table set fetch size option to 102 (or any number different from 100 and the number you picked in #3) 6. run same query against the table. the server side should show that the result set was fetched in 102 row chunks[1]. The parts marked [1] are the problem...because the way I know it works is looking at the query console on the remote redshift cluster where the query column reads "FETCH 101 in c1" or somesuch rather than the query text. That's great, *I* know it works, but I don't know how capture that information from a vanilla postgres server, and I don't know if we can do the regression with a loopback connection, or if we'd need to set up a second pg instance for the regression test scaffolding.
Re: [HACKERS] [POC] FETCH limited by bytes.
Hello, At Thu, 24 Dec 2015 18:31:42 -0500, Corey Huinkerwrote in
Re: [HACKERS] [POC] FETCH limited by bytes.
On Mon, Sep 21, 2015 at 1:52 PM, Corey Huinkerwrote: > Attached is the patch / diff against current master. I have moved this patch to next CF, there is a new patch, but no reviews. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Wed, Dec 23, 2015 at 8:43 AM, Michael Paquierwrote: > On Mon, Sep 21, 2015 at 1:52 PM, Corey Huinker > wrote: > > Attached is the patch / diff against current master. > > I have moved this patch to next CF, there is a new patch, but no reviews. > -- > Michael > That's fair. I'm still at a loss for how to show that the fetch size was respected by the remote server, suggestions welcome!
Re: [HACKERS] [POC] FETCH limited by bytes.
On 12/23/15 12:15 PM, Corey Huinker wrote: That's fair. I'm still at a loss for how to show that the fetch size was respected by the remote server, suggestions welcome! A combination of repeat() and generate_series()? I'm guessing it's not that obvious and that I'm missing something; can you elaborate? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Fri, Sep 4, 2015 at 2:45 AM, Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp> wrote: > Hello, > > > > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, > > > int eflags) > ... > > > > + def = get_option(table->options, "fetch_size"); > > > > > I don't think it's a good idea to make such checks at runtime - and > > > either way it's somethign that should be reported back using an > > > ereport(), not an elog. > > > > > Also, it seems somewhat wrong to determine this at execution > > > time. Shouldn't this rather be done when creating the foreign scan > node? > > > And be a part of the scan state? > > > > I agree, that was my original plan, but I wasn't familiar enough with the > > FDW architecture to know where the table struct and the scan struct were > > both exposed in the same function. > > > > What I submitted incorporated some of Kyotaro's feedback (and working > > patch) to my original incomplete patch. > > Sorry, it certainly shouldn't be a good place to do such thing. I > easily selected the place in order to avoid adding new similar > member in multiple existing structs (PgFdwRelationInfo and > PgFdwScanState). > > Having a new member fetch_size is added in PgFdwRelationInfo and > PgFdwScanState, I think postgresGetForeignRelSize is the best > place to do that, from the point that it collects basic > information needed to calculate scan costs. > > # Fetch sizes of foreign join would be the future issue.. > > > typedef struct PgFdwRelationInfo > > { > ... > + intfetch_size; /* fetch size for this remote table */ > > > > postgreGetForeignRelSize() > > { > ... > > fpinfo->table = GetForeignTable(foreigntableid); > > fpinfo->server = GetForeignServer(fpinfo->table->serverid); > > > + def = get_option(table->options, "fetch_size"); > + .. > + fpinfo->fetch_size = strtod(defGetString... > > Also it is doable in postgresGetForeignPlan and doing there > removes redundant copy of fetch_size from PgFdwRelation to > PgFdwScanState but theoretical basis would be weak. > > regards, > > > > On 2015-02-27 13:50:22 -0500, Corey Huinker wrote: > > > > +static DefElem* > > > > +get_option(List *options, char *optname) > > > > +{ > > > > + ListCell *lc; > > > > + > > > > + foreach(lc, options) > > > > + { > > > > + DefElem *def = (DefElem *) lfirst(lc); > > > > + > > > > + if (strcmp(def->defname, optname) == 0) > > > > + return def; > > > > + } > > > > + return NULL; > > > > +} > > > > > > > > > > /* > > > >* Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state > stays > > > NULL. > > > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, > > > int eflags) > > > > server = GetForeignServer(table->serverid); > > > > user = GetUserMapping(userid, server->serverid); > > > > > > > > + /* Reading table options */ > > > > + fsstate->fetch_size = -1; > > > > + > > > > + def = get_option(table->options, "fetch_size"); > > > > + if (!def) > > > > + def = get_option(server->options, "fetch_size"); > > > > + > > > > + if (def) > > > > + { > > > > + fsstate->fetch_size = strtod(defGetString(def), NULL); > > > > + if (fsstate->fetch_size < 0) > > > > + elog(ERROR, "invalid fetch size for foreign > table > > > \"%s\"", > > > > + get_rel_name(table->relid)); > > > > + } > > > > + else > > > > + fsstate->fetch_size = 100; > > > > > > I don't think it's a good idea to make such checks at runtime - and > > > either way it's somethign that should be reported back using an > > > ereport(), not an elog. > > > > > Also, it seems somewhat wrong to determine this at execution > > > time. Shouldn't this rather be done when creating the foreign scan > node? > > > And be a part of the scan state? > > > > I agree, that was my original plan, but I wasn't familiar enough with the > > FDW architecture to know where the table struct and the scan struct were > > both exposed in the same function. > > > > What I submitted incorporated some of Kyotaro's feedback (and working > > patch) to my original incomplete patch. > > > > > Have you thought about how this option should cooperate with join > > > pushdown once implemented? > > > > > > > I hadn't until now, but I think the only sensible thing would be to > > disregard table-specific settings once a second foreign table is > detected, > > and instead consider only the server-level setting. > > > > I suppose one could argue that if ALL the tables in the join had the same > > table-level setting, we should go with that, but I think that would be > > complicated, expensive, and generally a good argument for changing the > > server setting instead. > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > > Ok, with some guidance from RhodiumToad (thanks!) I was able to get
Re: [HACKERS] [POC] FETCH limited by bytes.
Hello, > > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, > > int eflags) ... > > > + def = get_option(table->options, "fetch_size"); > > > I don't think it's a good idea to make such checks at runtime - and > > either way it's somethign that should be reported back using an > > ereport(), not an elog. > > > Also, it seems somewhat wrong to determine this at execution > > time. Shouldn't this rather be done when creating the foreign scan node? > > And be a part of the scan state? > > I agree, that was my original plan, but I wasn't familiar enough with the > FDW architecture to know where the table struct and the scan struct were > both exposed in the same function. > > What I submitted incorporated some of Kyotaro's feedback (and working > patch) to my original incomplete patch. Sorry, it certainly shouldn't be a good place to do such thing. I easily selected the place in order to avoid adding new similar member in multiple existing structs (PgFdwRelationInfo and PgFdwScanState). Having a new member fetch_size is added in PgFdwRelationInfo and PgFdwScanState, I think postgresGetForeignRelSize is the best place to do that, from the point that it collects basic information needed to calculate scan costs. # Fetch sizes of foreign join would be the future issue.. > typedef struct PgFdwRelationInfo > { ... + intfetch_size; /* fetch size for this remote table */ > postgreGetForeignRelSize() > { ... > fpinfo->table = GetForeignTable(foreigntableid); > fpinfo->server = GetForeignServer(fpinfo->table->serverid); > + def = get_option(table->options, "fetch_size"); + .. + fpinfo->fetch_size = strtod(defGetString... Also it is doable in postgresGetForeignPlan and doing there removes redundant copy of fetch_size from PgFdwRelation to PgFdwScanState but theoretical basis would be weak. regards, > > On 2015-02-27 13:50:22 -0500, Corey Huinker wrote: > > > +static DefElem* > > > +get_option(List *options, char *optname) > > > +{ > > > + ListCell *lc; > > > + > > > + foreach(lc, options) > > > + { > > > + DefElem *def = (DefElem *) lfirst(lc); > > > + > > > + if (strcmp(def->defname, optname) == 0) > > > + return def; > > > + } > > > + return NULL; > > > +} > > > > > > > /* > > >* Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays > > NULL. > > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, > > int eflags) > > > server = GetForeignServer(table->serverid); > > > user = GetUserMapping(userid, server->serverid); > > > > > > + /* Reading table options */ > > > + fsstate->fetch_size = -1; > > > + > > > + def = get_option(table->options, "fetch_size"); > > > + if (!def) > > > + def = get_option(server->options, "fetch_size"); > > > + > > > + if (def) > > > + { > > > + fsstate->fetch_size = strtod(defGetString(def), NULL); > > > + if (fsstate->fetch_size < 0) > > > + elog(ERROR, "invalid fetch size for foreign table > > \"%s\"", > > > + get_rel_name(table->relid)); > > > + } > > > + else > > > + fsstate->fetch_size = 100; > > > > I don't think it's a good idea to make such checks at runtime - and > > either way it's somethign that should be reported back using an > > ereport(), not an elog. > > > Also, it seems somewhat wrong to determine this at execution > > time. Shouldn't this rather be done when creating the foreign scan node? > > And be a part of the scan state? > > I agree, that was my original plan, but I wasn't familiar enough with the > FDW architecture to know where the table struct and the scan struct were > both exposed in the same function. > > What I submitted incorporated some of Kyotaro's feedback (and working > patch) to my original incomplete patch. > > > Have you thought about how this option should cooperate with join > > pushdown once implemented? > > > > I hadn't until now, but I think the only sensible thing would be to > disregard table-specific settings once a second foreign table is detected, > and instead consider only the server-level setting. > > I suppose one could argue that if ALL the tables in the join had the same > table-level setting, we should go with that, but I think that would be > complicated, expensive, and generally a good argument for changing the > server setting instead. -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
On Wed, Sep 2, 2015 at 9:08 AM, Andres Freundwrote: > On 2015-02-27 13:50:22 -0500, Corey Huinker wrote: > > +static DefElem* > > +get_option(List *options, char *optname) > > +{ > > + ListCell *lc; > > + > > + foreach(lc, options) > > + { > > + DefElem *def = (DefElem *) lfirst(lc); > > + > > + if (strcmp(def->defname, optname) == 0) > > + return def; > > + } > > + return NULL; > > +} > > > > /* > >* Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays > NULL. > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, > int eflags) > > server = GetForeignServer(table->serverid); > > user = GetUserMapping(userid, server->serverid); > > > > + /* Reading table options */ > > + fsstate->fetch_size = -1; > > + > > + def = get_option(table->options, "fetch_size"); > > + if (!def) > > + def = get_option(server->options, "fetch_size"); > > + > > + if (def) > > + { > > + fsstate->fetch_size = strtod(defGetString(def), NULL); > > + if (fsstate->fetch_size < 0) > > + elog(ERROR, "invalid fetch size for foreign table > \"%s\"", > > + get_rel_name(table->relid)); > > + } > > + else > > + fsstate->fetch_size = 100; > > I don't think it's a good idea to make such checks at runtime - and > either way it's somethign that should be reported back using an > ereport(), not an elog. > Also, it seems somewhat wrong to determine this at execution > time. Shouldn't this rather be done when creating the foreign scan node? > And be a part of the scan state? > I agree, that was my original plan, but I wasn't familiar enough with the FDW architecture to know where the table struct and the scan struct were both exposed in the same function. What I submitted incorporated some of Kyotaro's feedback (and working patch) to my original incomplete patch. > > Have you thought about how this option should cooperate with join > pushdown once implemented? > I hadn't until now, but I think the only sensible thing would be to disregard table-specific settings once a second foreign table is detected, and instead consider only the server-level setting. I suppose one could argue that if ALL the tables in the join had the same table-level setting, we should go with that, but I think that would be complicated, expensive, and generally a good argument for changing the server setting instead.
Re: [HACKERS] [POC] FETCH limited by bytes.
On 2015-02-27 13:50:22 -0500, Corey Huinker wrote: > +static DefElem* > +get_option(List *options, char *optname) > +{ > + ListCell *lc; > + > + foreach(lc, options) > + { > + DefElem *def = (DefElem *) lfirst(lc); > + > + if (strcmp(def->defname, optname) == 0) > + return def; > + } > + return NULL; > +} > /* >* Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays NULL. > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, int > eflags) > server = GetForeignServer(table->serverid); > user = GetUserMapping(userid, server->serverid); > > + /* Reading table options */ > + fsstate->fetch_size = -1; > + > + def = get_option(table->options, "fetch_size"); > + if (!def) > + def = get_option(server->options, "fetch_size"); > + > + if (def) > + { > + fsstate->fetch_size = strtod(defGetString(def), NULL); > + if (fsstate->fetch_size < 0) > + elog(ERROR, "invalid fetch size for foreign table > \"%s\"", > + get_rel_name(table->relid)); > + } > + else > + fsstate->fetch_size = 100; I don't think it's a good idea to make such checks at runtime - and either way it's somethign that should be reported back using an ereport(), not an elog. Also, it seems somewhat wrong to determine this at execution time. Shouldn't this rather be done when creating the foreign scan node? And be a part of the scan state? Have you thought about how this option should cooperate with join pushdown once implemented? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] FETCH limited by bytes.
Attached is a diff containing the original (working) patch from my (incomplete) patch, plus regression test changes and documentation changes. While it's easy to regression-test the persistence of the fetch_size options, I am confounded as to how we would show that the fetch_size setting was respected. I've seen it with my own eyes viewing the query log on redshift, but I see no way to automate that. Suggestions welcome. On Fri, Feb 6, 2015 at 3:11 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I agree with you. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normal highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1 as the query, which shows that per-table overrides also work. I'd be happy to stop here, but Kyotaro might feel differently. This is enough in its own way, of course. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly. The users including me will be happy with it when the users know how to determin the fetch size. Especially the remote tables are very few or the configuration will be enough stable. On widely distributed systems, it would be far difficult to tune fetch size manually for every foreign tables, so finally it would be left at the default and safe size, it's 100 or so. This is the similar discussion about max_wal_size on another thread. Calculating fetch size is far tougher for users than setting expected memory usage, I think. But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way *at table creation time*, not query time. We cannot know the real length of the text type data in advance, other than that, even defined as char(n), the n is the theoretically(or in-design) maximum size for the field but in the most cases the mean length of the real data would be far small than that. For that reason, calculating the ratio at the table creation time seems to be difficult. However, I agree to the Tom's suggestion that the changes in FETCH statement is defenitly ugly, especially the overhead argument is prohibitive even for me:( Thanks to Kyotaro and Bruce Momjian for their help. Not at all. regardes, At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker corey.huin...@gmail.com wrote in CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN= p6mxmg7s86c...@mail.gmail.com I applied this patch to REL9_4_STABLE, and I was able to connect to a foreign database (redshift, actually). the basic outline of the test is below, names changed to protect my employment. create extension if not exists postgres_fdw; create server redshift_server foreign data wrapper postgres_fdw options ( host 'some.hostname.ext', port '5439', dbname 'redacted', fetch_size '150' ); create user mapping for public server redshift_server options ( user 'redacted_user', password 'comeonyouarekiddingright' ); create foreign table redshift_tab150 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema' ); create foreign table redshift_tab151 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema', fetch_size '151' ); -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp150; -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp151; For those of you that aren't familiar with Redshift, it's a columnar database that seems to be a fork of postgres 8.cough. You can connect to it with modern libpq programs (psql, psycopg2, etc). Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normal highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1
Re: [HACKERS] [POC] FETCH limited by bytes.
Hello, Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I agree with you. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normal highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1 as the query, which shows that per-table overrides also work. I'd be happy to stop here, but Kyotaro might feel differently. This is enough in its own way, of course. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly. The users including me will be happy with it when the users know how to determin the fetch size. Especially the remote tables are very few or the configuration will be enough stable. On widely distributed systems, it would be far difficult to tune fetch size manually for every foreign tables, so finally it would be left at the default and safe size, it's 100 or so. This is the similar discussion about max_wal_size on another thread. Calculating fetch size is far tougher for users than setting expected memory usage, I think. But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way *at table creation time*, not query time. We cannot know the real length of the text type data in advance, other than that, even defined as char(n), the n is the theoretically(or in-design) maximum size for the field but in the most cases the mean length of the real data would be far small than that. For that reason, calculating the ratio at the table creation time seems to be difficult. However, I agree to the Tom's suggestion that the changes in FETCH statement is defenitly ugly, especially the overhead argument is prohibitive even for me:( Thanks to Kyotaro and Bruce Momjian for their help. Not at all. regardes, At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker corey.huin...@gmail.com wrote in CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN=p6mxmg7s86c...@mail.gmail.com I applied this patch to REL9_4_STABLE, and I was able to connect to a foreign database (redshift, actually). the basic outline of the test is below, names changed to protect my employment. create extension if not exists postgres_fdw; create server redshift_server foreign data wrapper postgres_fdw options ( host 'some.hostname.ext', port '5439', dbname 'redacted', fetch_size '150' ); create user mapping for public server redshift_server options ( user 'redacted_user', password 'comeonyouarekiddingright' ); create foreign table redshift_tab150 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema' ); create foreign table redshift_tab151 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema', fetch_size '151' ); -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp150; -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp151; For those of you that aren't familiar with Redshift, it's a columnar database that seems to be a fork of postgres 8.cough. You can connect to it with modern libpq programs (psql, psycopg2, etc). Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normal highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1 as the query, which shows that per-table overrides also work. I'd be happy to stop here, but Kyotaro might feel differently. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly. But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way *at table creation time*, not query time. Thanks to Kyotaro and Bruce Momjian for their help. On Mon,
Re: [HACKERS] [POC] FETCH limited by bytes.
I applied this patch to REL9_4_STABLE, and I was able to connect to a foreign database (redshift, actually). the basic outline of the test is below, names changed to protect my employment. create extension if not exists postgres_fdw; create server redshift_server foreign data wrapper postgres_fdw options ( host 'some.hostname.ext', port '5439', dbname 'redacted', fetch_size '150' ); create user mapping for public server redshift_server options ( user 'redacted_user', password 'comeonyouarekiddingright' ); create foreign table redshift_tab150 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema' ); create foreign table redshift_tab151 ( colspecs ) server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema', fetch_size '151' ); -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp150; -- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight. select count(*) from redshift_ccp151; For those of you that aren't familiar with Redshift, it's a columnar database that seems to be a fork of postgres 8.cough. You can connect to it with modern libpq programs (psql, psycopg2, etc). Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well. I was able to confirm that the first query showed FETCH 150 FROM c1 as the query, which is normal highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation. I was also able to confirm that the second query showed FETCH 151 FROM c1 as the query, which shows that per-table overrides also work. I'd be happy to stop here, but Kyotaro might feel differently. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly. But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way *at table creation time*, not query time. Thanks to Kyotaro and Bruce Momjian for their help. On Mon, Feb 2, 2015 at 2:27 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hmm, somehow I removed some recipients, especially the list. Sorry for the duplicate. - Sorry, I've been back. Thank you for the comment. Do you have any insight into where I would pass the custom row fetches from the table struct to the scan struct? Yeah it's one simple way to tune it, if the user knows the appropreate value. Last year I was working on a patch to postgres_fdw where the fetch_size could be set at the table level and the server level. I was able to get the settings parsed and they would show up in pg_foreign_table and pg_foreign_servers. Unfortunately, I'm not very familiar with how foreign data wrappers work, so I wasn't able to figure out how to get these custom values passed from the PgFdwRelationInfo struct into the query's PgFdwScanState struct. Directly answering, the states needed to be shared among several stages are holded within fdw_private. Your new variable fpinfo-fetch_size can be read in postgresGetForeignPlan. It newly creates another fdw_private. You can pass your values to ForeignPlan making it hold the value there. Finally, you will get it in postgresBeginForeginScan and can set it into PgFdwScanState. I bring this up only because it might be a simpler solution, in that the table designer could set the fetch size very high for narrow tables, and lower or default for wider tables. It's also a very clean syntax, just another option on the table and/or server creation. My incomplete patch is attached. However, the fetch_size is not needed by planner (so far), so we can simply read the options in postgresBeginForeignScan() and set into PgFdwScanState. This runs once per exection. Finally, the attached patch will work as you intended. What do you think about this? regards, -- Kyotaro Horiguchi NTT Open Source Software Center On Tue, Jan 27, 2015 at 4:24 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Thank you for the comment. The automatic way to determin the fetch_size looks become too much for the purpose. An example of non-automatic way is a new foreign table option like 'fetch_size' but this exposes the inside too much... Which do you think is preferable? Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 24503.1421943...@sss.pgh.pa.us Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the
Re: [HACKERS] [POC] FETCH limited by bytes.
Hmm, somehow I removed some recipients, especially the list. Sorry for the duplicate. - Sorry, I've been back. Thank you for the comment. Do you have any insight into where I would pass the custom row fetches from the table struct to the scan struct? Yeah it's one simple way to tune it, if the user knows the appropreate value. Last year I was working on a patch to postgres_fdw where the fetch_size could be set at the table level and the server level. I was able to get the settings parsed and they would show up in pg_foreign_table and pg_foreign_servers. Unfortunately, I'm not very familiar with how foreign data wrappers work, so I wasn't able to figure out how to get these custom values passed from the PgFdwRelationInfo struct into the query's PgFdwScanState struct. Directly answering, the states needed to be shared among several stages are holded within fdw_private. Your new variable fpinfo-fetch_size can be read in postgresGetForeignPlan. It newly creates another fdw_private. You can pass your values to ForeignPlan making it hold the value there. Finally, you will get it in postgresBeginForeginScan and can set it into PgFdwScanState. I bring this up only because it might be a simpler solution, in that the table designer could set the fetch size very high for narrow tables, and lower or default for wider tables. It's also a very clean syntax, just another option on the table and/or server creation. My incomplete patch is attached. However, the fetch_size is not needed by planner (so far), so we can simply read the options in postgresBeginForeignScan() and set into PgFdwScanState. This runs once per exection. Finally, the attached patch will work as you intended. What do you think about this? regards, -- Kyotaro Horiguchi NTT Open Source Software Center On Tue, Jan 27, 2015 at 4:24 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Thank you for the comment. The automatic way to determin the fetch_size looks become too much for the purpose. An example of non-automatic way is a new foreign table option like 'fetch_size' but this exposes the inside too much... Which do you think is preferable? Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 24503.1421943...@sss.pgh.pa.us Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the discuttion on async fetching on postgres_fdw, FETCH with data-size limitation would be useful to get memory usage stability of postgres_fdw. Is such a feature and syntax could be allowed to be added? This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw? The API is a rush work to make the path for the new parameter (but, yes, I did too much for the purpose that use from postgres_fdw..) and it can be any saner syntax but it's not the time to do so yet. The data-size limitation, any size to limit, would give significant gain especially for small sized rows. This patch began from the fact that it runs about twice faster when fetch size = 1 than 100. http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyot...@lab.ntt.co.jp I took exec times to get 1M rows from localhost via postgres_fdw and it showed the following numbers. =# SELECT a from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.75s 10060 6.2s 6000 (0.006) 1 60 2.7s 60 (0.6 ) 3 60 2.2s180 (2.0 ) 6 60 2.4s360 (4.0 ) =# SELECT a, b, c from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 204 12 s 20400 (0.02 ) 1000 204 10 s 204000 (0.2 ) 1 204 5.8s204 (2) 2 204 5.9s408 (4) =# SELECT a, b, d from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 1356 17 s 135600 (0.136) 1000 1356 15 s1356000 (1.356) 1475 1356 13 s2000100 (2.0 ) 2950 1356 13 s4000200 (4.0 ) The definitions of the environment are the following. CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres'); CREATE USER MAPPING FOR PUBLIC SERVER sv1; CREATE TABLE lt1 (a int, b timestamp, c text, d text); CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text) SERVER sv1 OPTIONS (table_name 'lt1'); INSERT INTO lt1
Re: [HACKERS] [POC] FETCH limited by bytes.
Thank you for the comment. The automatic way to determin the fetch_size looks become too much for the purpose. An example of non-automatic way is a new foreign table option like 'fetch_size' but this exposes the inside too much... Which do you think is preferable? Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 24503.1421943...@sss.pgh.pa.us Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the discuttion on async fetching on postgres_fdw, FETCH with data-size limitation would be useful to get memory usage stability of postgres_fdw. Is such a feature and syntax could be allowed to be added? This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw? The API is a rush work to make the path for the new parameter (but, yes, I did too much for the purpose that use from postgres_fdw..) and it can be any saner syntax but it's not the time to do so yet. The data-size limitation, any size to limit, would give significant gain especially for small sized rows. This patch began from the fact that it runs about twice faster when fetch size = 1 than 100. http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyot...@lab.ntt.co.jp I took exec times to get 1M rows from localhost via postgres_fdw and it showed the following numbers. =# SELECT a from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.75s 10060 6.2s 6000 (0.006) 1 60 2.7s 60 (0.6 ) 3 60 2.2s180 (2.0 ) 6 60 2.4s360 (4.0 ) =# SELECT a, b, c from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 204 12 s 20400 (0.02 ) 1000 204 10 s 204000 (0.2 ) 1 204 5.8s204 (2) 2 204 5.9s408 (4) =# SELECT a, b, d from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 1356 17 s 135600 (0.136) 1000 1356 15 s1356000 (1.356) 1475 1356 13 s2000100 (2.0 ) 2950 1356 13 s4000200 (4.0 ) The definitions of the environment are the following. CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres'); CREATE USER MAPPING FOR PUBLIC SERVER sv1; CREATE TABLE lt1 (a int, b timestamp, c text, d text); CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text) SERVER sv1 OPTIONS (table_name 'lt1'); INSERT INTO lt1 (SELECT a, now(), repeat('x', 128), repeat('x', 1280) FROM generate_series(0, 99) a); The avg row size is alloced_mem/fetch_size and the alloced_mem is the sum of HeapTuple[fetch_size] and (HEAPTUPLESIZE + tup-t_len) for all stored tuples in the receiver side, fetch_more_data() in postgres_fdw. They are about 50% gain for the smaller tuple size and 25% for the larger. They looks to be optimal at where alloced_mem is around 2MB by the reason unknown to me. Anyway the difference seems to be significant. Even if we wanted to do something like this, I strongly object to measuring size by heap_compute_data_size. That's not a number that users would normally have any direct knowledge of; nor does it have anything at all to do with the claimed use-case, where what you'd really need to measure is bytes transmitted down the wire. (The difference is not small: for instance, toasted values would likely still be toasted at the point where you're measuring.) Sure. Finally, the attached patch #1 which does the following things. - Sender limits the number of tuples using the sum of the net length of the column values to be sent, not including protocol overhead. It is calculated in the added function slot_compute_attr_size(), using raw length for compressed values. - postgres_fdw calculates fetch limit bytes by the following formula, MAX_FETCH_MEM - MAX_FETCH_SIZE * (estimated overhead per tuple); The result of the patch is as follows. MAX_FETCH_MEM = 2MiB and MAX_FETCH_SIZE = 3. fetch_size, avg row size(*1), time, max alloced_mem/fetch(Mbytes) (auto) 60 2.4s 108 ( 1.08) (auto)204 7.3s536400 ( 0.54) (auto) 1356 15 s430236 ( 0.43) This is meaningfully fast but the patch looks too big and the meaning of the new parameter is hard to understand..:( On the other hand the cause of the displacements of alloced_mem shown above is per-tuple overhead, the sum of which is unknown before execution.
Re: [HACKERS] [POC] FETCH limited by bytes.
Last year I was working on a patch to postgres_fdw where the fetch_size could be set at the table level and the server level. I was able to get the settings parsed and they would show up in pg_foreign_table and pg_foreign_servers. Unfortunately, I'm not very familiar with how foreign data wrappers work, so I wasn't able to figure out how to get these custom values passed from the PgFdwRelationInfo struct into the query's PgFdwScanState struct. I bring this up only because it might be a simpler solution, in that the table designer could set the fetch size very high for narrow tables, and lower or default for wider tables. It's also a very clean syntax, just another option on the table and/or server creation. My incomplete patch is attached. On Tue, Jan 27, 2015 at 4:24 AM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Thank you for the comment. The automatic way to determin the fetch_size looks become too much for the purpose. An example of non-automatic way is a new foreign table option like 'fetch_size' but this exposes the inside too much... Which do you think is preferable? Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 24503.1421943...@sss.pgh.pa.us Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the discuttion on async fetching on postgres_fdw, FETCH with data-size limitation would be useful to get memory usage stability of postgres_fdw. Is such a feature and syntax could be allowed to be added? This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw? The API is a rush work to make the path for the new parameter (but, yes, I did too much for the purpose that use from postgres_fdw..) and it can be any saner syntax but it's not the time to do so yet. The data-size limitation, any size to limit, would give significant gain especially for small sized rows. This patch began from the fact that it runs about twice faster when fetch size = 1 than 100. http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyot...@lab.ntt.co.jp I took exec times to get 1M rows from localhost via postgres_fdw and it showed the following numbers. =# SELECT a from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.75s 10060 6.2s 6000 (0.006) 1 60 2.7s 60 (0.6 ) 3 60 2.2s180 (2.0 ) 6 60 2.4s360 (4.0 ) =# SELECT a, b, c from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 204 12 s 20400 (0.02 ) 1000 204 10 s 204000 (0.2 ) 1 204 5.8s204 (2) 2 204 5.9s408 (4) =# SELECT a, b, d from ft1; fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1) (local)0.8s 100 1356 17 s 135600 (0.136) 1000 1356 15 s1356000 (1.356) 1475 1356 13 s2000100 (2.0 ) 2950 1356 13 s4000200 (4.0 ) The definitions of the environment are the following. CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres'); CREATE USER MAPPING FOR PUBLIC SERVER sv1; CREATE TABLE lt1 (a int, b timestamp, c text, d text); CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text) SERVER sv1 OPTIONS (table_name 'lt1'); INSERT INTO lt1 (SELECT a, now(), repeat('x', 128), repeat('x', 1280) FROM generate_series(0, 99) a); The avg row size is alloced_mem/fetch_size and the alloced_mem is the sum of HeapTuple[fetch_size] and (HEAPTUPLESIZE + tup-t_len) for all stored tuples in the receiver side, fetch_more_data() in postgres_fdw. They are about 50% gain for the smaller tuple size and 25% for the larger. They looks to be optimal at where alloced_mem is around 2MB by the reason unknown to me. Anyway the difference seems to be significant. Even if we wanted to do something like this, I strongly object to measuring size by heap_compute_data_size. That's not a number that users would normally have any direct knowledge of; nor does it have anything at all to do with the claimed use-case, where what you'd really need to measure is bytes transmitted down the wire. (The difference is not small: for instance, toasted values would likely still be toasted at the point where you're measuring.) Sure. Finally, the attached patch #1 which does the following things. - Sender limits the number of tuples using the sum of the net length of
Re: [HACKERS] [POC] FETCH limited by bytes.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the discuttion on async fetching on postgres_fdw, FETCH with data-size limitation would be useful to get memory usage stability of postgres_fdw. Is such a feature and syntax could be allowed to be added? This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw? Even if we wanted to do something like this, I strongly object to measuring size by heap_compute_data_size. That's not a number that users would normally have any direct knowledge of; nor does it have anything at all to do with the claimed use-case, where what you'd really need to measure is bytes transmitted down the wire. (The difference is not small: for instance, toasted values would likely still be toasted at the point where you're measuring.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers