Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Sat, Sep 25, 2021 at 10:59 PM Tom Lane wrote: > Etsuro Fujita writes: > > On Sat, Sep 25, 2021 at 4:11 AM Tom Lane wrote: > >> Longer-term, it seems like we really have to be able to represent > >> the notion of a remote column that has an "unknown" collation (that > >> is, one that doesn't match any local collation, or at least is not > >> known to do so). > > > +1 > > > In addition, a) we should detect whether local “default” matches > > remote “default”, > > If we had a way to do that, most of the problem here wouldn't exist. > I don't believe we can do it reliably. (Maybe we could put it on > the user to tell us, say via a foreign-server property?) Yeah, I was thinking we could get it from a server option. Also, I was thinking this bit might be back-patchable independently of the solution mentioned above. Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Etsuro Fujita writes: > On Sat, Sep 25, 2021 at 4:11 AM Tom Lane wrote: >> As a short-term answer, I propose that we apply (and back-patch) the >> attached documentation changes. > The attached patch looks good to me. I've pushed that, and marked the current CF entry as returned with feedback. I'm not sure how soon I might get around to trying the idea of an explicit "unknown" collation ... if anyone wants to take a stab at that, feel free. regards, tom lane
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Etsuro Fujita writes: > On Sat, Sep 25, 2021 at 4:11 AM Tom Lane wrote: >> Longer-term, it seems like we really have to be able to represent >> the notion of a remote column that has an "unknown" collation (that >> is, one that doesn't match any local collation, or at least is not >> known to do so). > +1 > In addition, a) we should detect whether local “default” matches > remote “default”, If we had a way to do that, most of the problem here wouldn't exist. I don't believe we can do it reliably. (Maybe we could put it on the user to tell us, say via a foreign-server property?) regards, tom lane
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Sat, Sep 25, 2021 at 4:11 AM Tom Lane wrote: > Etsuro Fujita writes: > > One thing I noticed is that collatable operators/functions sent to the > > remote might also cause an unexpected result when the default > > collations are not compatible. Consider this example (even with your > > patch): > > ... > > where ft1 is a foreign table with an integer column c1. As shown > > above, the sort using the collatable function chr() is performed > > remotely, so the select query might produce the result in an > > unexpected sort order when the default collations are not compatible. > > I don't think there's anything really new there --- it's still assuming > that COLLATE "default" means the same locally and remotely. I thought that the example showed that we would need to specify a collation per-operation, not only per-foreign-table-column, like “ORDER BY chr(c1) COLLATE “foo”” where “foo” is the actual name of a local collation matching the local server’s default collation, when the default collation doesn’t match the remote server’s default collation, to avoid pushing down operations incorrectly as in the example. > As a short-term answer, I propose that we apply (and back-patch) the > attached documentation changes. The attached patch looks good to me. > Longer-term, it seems like we really have to be able to represent > the notion of a remote column that has an "unknown" collation (that > is, one that doesn't match any local collation, or at least is not > known to do so). +1 > A rough sketch for making this happen is: > > 1. Create a built-in "unknown" entry in pg_collation. Insert some > hack or other to prevent this from being applied to any real, local > column; but allow foreign-table columns to have it. > > 2. Apply mods, probably fairly similar to my patch, that prevent > postgres_fdw from believing that "unknown" matches any local > collation. (Hm, actually maybe no special code change will be > needed here, once "unknown" has its own OID?) > > 3. Change postgresImportForeignSchema so that it can substitute > the "unknown" collation at need. The exact rules for this could > be debated depending on whether you'd rather prioritize safety or > ease-of-use, but I think at least we should use "unknown" whenever > import_collate is turned off. Perhaps there should be an option > to substitute it for remote "default" as well. (Further down the > road, perhaps that could be generalized to allow a user-controlled > mapping from remote to local collations.) In addition, a) we should detect whether local “default” matches remote “default”, and b) if not, we should prevent pushing down sort/comparison operations using collatable functions/operators like “ORDER BY chr(c1)” in the example (and pushing down those operations on foreign-table columns labeled with “COLLATE default” if such labeling is allowed)? Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Etsuro Fujita writes: > One thing I noticed is that collatable operators/functions sent to the > remote might also cause an unexpected result when the default > collations are not compatible. Consider this example (even with your > patch): > ... > where ft1 is a foreign table with an integer column c1. As shown > above, the sort using the collatable function chr() is performed > remotely, so the select query might produce the result in an > unexpected sort order when the default collations are not compatible. I don't think there's anything really new there --- it's still assuming that COLLATE "default" means the same locally and remotely. As a short-term answer, I propose that we apply (and back-patch) the attached documentation changes. Longer-term, it seems like we really have to be able to represent the notion of a remote column that has an "unknown" collation (that is, one that doesn't match any local collation, or at least is not known to do so). My previous patch essentially makes "default" act that way, but conflating "unknown" with "default" has too many downsides. A rough sketch for making this happen is: 1. Create a built-in "unknown" entry in pg_collation. Insert some hack or other to prevent this from being applied to any real, local column; but allow foreign-table columns to have it. 2. Apply mods, probably fairly similar to my patch, that prevent postgres_fdw from believing that "unknown" matches any local collation. (Hm, actually maybe no special code change will be needed here, once "unknown" has its own OID?) 3. Change postgresImportForeignSchema so that it can substitute the "unknown" collation at need. The exact rules for this could be debated depending on whether you'd rather prioritize safety or ease-of-use, but I think at least we should use "unknown" whenever import_collate is turned off. Perhaps there should be an option to substitute it for remote "default" as well. (Further down the road, perhaps that could be generalized to allow a user-controlled mapping from remote to local collations.) Anyway, I think I should withdraw the upthread patch; we don't want to go that way. regards, tom lane diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index bf95da9721..dbc11694a0 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -95,8 +95,8 @@ referenced columns of the remote table. Although postgres_fdw is currently rather forgiving about performing data type conversions at need, surprising semantic anomalies may arise when types or collations do - not match, due to the remote server interpreting WHERE clauses - slightly differently from the local server. + not match, due to the remote server interpreting query conditions + differently from the local server. @@ -537,6 +537,17 @@ OPTIONS (ADD password_required 'false'); need to turn this off if the remote server has a different set of collation names than the local server does, which is likely to be the case if it's running on a different operating system. + If you do so, however, there is a very severe risk that the imported + table columns' collations will not match the underlying data, resulting + in anomalous query behavior. + + + + Even when this parameter is set to true, importing + columns whose collation is the remote server's default can be risky. + They will be imported with COLLATE "default", which + will select the local server's default collation, which could be + different.
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Fri, Sep 10, 2021 at 8:42 PM Etsuro Fujita wrote: > On Fri, Sep 10, 2021 at 1:00 AM Tom Lane wrote: > > Etsuro Fujita writes: > > > Having said that, I think another option for this would be to left the > > > code as-is; assume that 1) the foreign var has "COLLATE default”, not > > > an unknown collation, when labeled with "COLLATE default”, and 2) > > > "COLLATE default” on the local database matches "COLLATE default” on > > > the remote database. > > > > The fundamental complaint that started this thread was exactly that > > assumption (2) isn't safe. So it sounds to me like you're proposing > > that we do nothing, which isn't a great answer either. I suppose > > we could try documenting our way out of this, but people will > > continue to get bit because they won't read or won't understand > > the limitation. > > Yeah, but I think it’s the user’s responsibility to make sure that the > local and remote default collations match if labeling collatable > columns with “COLLATE default” when defining foreign tables manually > IMO. One thing I noticed is that collatable operators/functions sent to the remote might also cause an unexpected result when the default collations are not compatible. Consider this example (even with your patch): explain verbose select chr(c1) from ft1 order by chr(c1); QUERY PLAN Foreign Scan on public.ft1 (cost=100.00..212.91 rows=2925 width=32) Output: chr(c1) Remote SQL: SELECT c1 FROM public.t1 ORDER BY chr(c1) ASC NULLS LAST (3 rows) where ft1 is a foreign table with an integer column c1. As shown above, the sort using the collatable function chr() is performed remotely, so the select query might produce the result in an unexpected sort order when the default collations are not compatible. ISTM that we rely heavily on assumption (2). Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Fri, Sep 10, 2021 at 1:00 AM Tom Lane wrote: > Etsuro Fujita writes: > > Having said that, I think another option for this would be to left the > > code as-is; assume that 1) the foreign var has "COLLATE default”, not > > an unknown collation, when labeled with "COLLATE default”, and 2) > > "COLLATE default” on the local database matches "COLLATE default” on > > the remote database. > > The fundamental complaint that started this thread was exactly that > assumption (2) isn't safe. So it sounds to me like you're proposing > that we do nothing, which isn't a great answer either. I suppose > we could try documenting our way out of this, but people will > continue to get bit because they won't read or won't understand > the limitation. Yeah, but I think it’s the user’s responsibility to make sure that the local and remote default collations match if labeling collatable columns with “COLLATE default” when defining foreign tables manually IMO. > I'd be happier if we had a way to check whether the local and remote > default collations are compatible. But it seems like that's a big ask, > especially in cross-operating-system situations. Agreed. Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Etsuro Fujita writes: > On Thu, Sep 2, 2021 at 5:42 AM Tom Lane wrote: >> The real reason that this hasn't gotten committed is that I remain >> pretty uncomfortable about whether it's an acceptable solution to >> the problem. Suddenly asking people to plaster COLLATE clauses >> on all their textual remote columns seems like a big compatibility >> gotcha. > I think so too. Yeah :-(. It seems like a very unpleasant change. > Having said that, I think another option for this would be to left the > code as-is; assume that 1) the foreign var has "COLLATE default”, not > an unknown collation, when labeled with "COLLATE default”, and 2) > "COLLATE default” on the local database matches "COLLATE default” on > the remote database. The fundamental complaint that started this thread was exactly that assumption (2) isn't safe. So it sounds to me like you're proposing that we do nothing, which isn't a great answer either. I suppose we could try documenting our way out of this, but people will continue to get bit because they won't read or won't understand the limitation. I'd be happier if we had a way to check whether the local and remote default collations are compatible. But it seems like that's a big ask, especially in cross-operating-system situations. regards, tom lane
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Thu, Sep 2, 2021 at 5:42 AM Tom Lane wrote: > The real reason that this hasn't gotten committed is that I remain > pretty uncomfortable about whether it's an acceptable solution to > the problem. Suddenly asking people to plaster COLLATE clauses > on all their textual remote columns seems like a big compatibility > gotcha. I think so too. I reviewed the patch: /* * If the Var is from the foreign table, we consider its -* collation (if any) safe to use. If it is from another +* collation (if any) safe to use, *unless* it's +* DEFAULT_COLLATION_OID. We treat that as meaning "we don't +* know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -350,7 +352,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; One thing I noticed about this change is: explain (verbose, costs off) select * from ft3 order by f2; QUERY PLAN - Sort Output: f1, f2, f3 Sort Key: ft3.f2 -> Foreign Scan on public.ft3 Output: f1, f2, f3 Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (6 rows) where ft3 is defined as in the postgres_fdw regression test (see the section “test handling of collations”). For this query, the sort is done locally, but I think it should be done remotely, or an error should be raised, as we don’t know the collation assigned to the column “f2”. So I think we need to do something about this. Having said that, I think another option for this would be to left the code as-is; assume that 1) the foreign var has "COLLATE default”, not an unknown collation, when labeled with "COLLATE default”, and 2) "COLLATE default” on the local database matches "COLLATE default” on the remote database. This would be the same as before, so we could avoid the concern mentioned above. I agree with the postgresImportForeignSchema() change, except creating a local column with "COLLATE default" silently if that function can’t find a remote collation matching the database's datcollate/datctype when seeing "COLLATE default”, in which case I think an error should be raised to prompt the user to check the settings for the remote server and/or define foreign tables manually with collations that match the remote side. Maybe I’m missing something, though. Anyway, here is a patch created on top of your patch to modify async-related test cases to work as intended. I’m also attaching your patch to make the cfbot quiet. Sorry for the delay. Best regards, Etsuro Fujita 0001-fix-postgres-fdw-collation-handling-4.patch Description: Binary data 0002-modify-postgres-fdw-async-test-cases.patch Description: Binary data
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Thu, Sep 2, 2021 at 5:42 AM Tom Lane wrote: > Etsuro Fujita writes: > > On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed wrote: > >> On Wed, Jul 14, 2021 at 1:41 AM Tom Lane wrote: > >>> Seems to just need an update of the expected-file to account for test > >>> cases added recently. (I take no position on whether the new results > >>> are desirable; some of these might be breaking the intent of the case. > >>> But this should quiet the cfbot anyway.) > > >> The test case was added by commit "Add support for asynchronous execution" > >> "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment > >> whether the new results are desirable or not. > > > The new results aren't what I intended. I'll update the patch to > > avoid that by modifying the original test cases properly, if there are > > no objections. > > Please follow up on that sometime? Will do in this commitfest. > In the meantime, here is a rebase > over aa769f80e and 2dc53fe2a, to placate the cfbot. Thanks for the rebase! Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Etsuro Fujita writes: > On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed wrote: >> On Wed, Jul 14, 2021 at 1:41 AM Tom Lane wrote: >>> Seems to just need an update of the expected-file to account for test >>> cases added recently. (I take no position on whether the new results >>> are desirable; some of these might be breaking the intent of the case. >>> But this should quiet the cfbot anyway.) >> The test case was added by commit "Add support for asynchronous execution" >> "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment >> whether the new results are desirable or not. > The new results aren't what I intended. I'll update the patch to > avoid that by modifying the original test cases properly, if there are > no objections. Please follow up on that sometime? In the meantime, here is a rebase over aa769f80e and 2dc53fe2a, to placate the cfbot. The real reason that this hasn't gotten committed is that I remain pretty uncomfortable about whether it's an acceptable solution to the problem. Suddenly asking people to plaster COLLATE clauses on all their textual remote columns seems like a big compatibility gotcha. However, I lack any ideas about a less unpleasant solution. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index d98bd66681..654b09273e 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -328,7 +328,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -350,7 +352,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -941,8 +948,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) { /* Override previous parent state */ outer_cxt->collation = collation; diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index e3ee30f1aa..9e52e09a8b 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -32,29 +32,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); -- Disable autovacuum for these tables to avoid unexpected effects of that @@ -94,16 +94,18 @@ ANALYZE "S 1"."T 3"; ANALYZE "S 1"."T 4"; -- === -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- === CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz,
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Thu, Jul 15, 2021 at 2:35 PM Etsuro Fujita wrote: > On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed wrote: > > On Wed, Jul 14, 2021 at 1:41 AM Tom Lane wrote: > >> Seems to just need an update of the expected-file to account for test > >> cases added recently. (I take no position on whether the new results > >> are desirable; some of these might be breaking the intent of the case. > >> But this should quiet the cfbot anyway.) > > > The test case was added by commit "Add support for asynchronous > execution" > > "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can > comment > > whether the new results are desirable or not. > > The new results aren't what I intended. I'll update the patch to > avoid that by modifying the original test cases properly, if there are > no objections. > > Best regards, > Etsuro Fujita > Thanks Etsuro, I have changed the status to "Waiting On Author", because patch need changes. Etsuro, can you make yourself a reviewer/co-author to keep track of that? -- Ibrar Ahmed
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed wrote: > On Wed, Jul 14, 2021 at 1:41 AM Tom Lane wrote: >> Seems to just need an update of the expected-file to account for test >> cases added recently. (I take no position on whether the new results >> are desirable; some of these might be breaking the intent of the case. >> But this should quiet the cfbot anyway.) > The test case was added by commit "Add support for asynchronous execution" > "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment > whether the new results are desirable or not. The new results aren't what I intended. I'll update the patch to avoid that by modifying the original test cases properly, if there are no objections. Best regards, Etsuro Fujita
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Wed, Jul 14, 2021 at 1:41 AM Tom Lane wrote: > Ibrar Ahmed writes: > > The patch is failing the regression, @Tom Lane can > you > > please take a look at that. > > Seems to just need an update of the expected-file to account for test > cases added recently. (I take no position on whether the new results > are desirable; some of these might be breaking the intent of the case. > But this should quiet the cfbot anyway.) > > regards, tom lane > > Thanks for the update. The test case was added by commit "Add support for asynchronous execution" "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment whether the new results are desirable or not. -- Ibrar Ahmed
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Ibrar Ahmed writes: > The patch is failing the regression, @Tom Lane can you > please take a look at that. Seems to just need an update of the expected-file to account for test cases added recently. (I take no position on whether the new results are desirable; some of these might be breaking the intent of the case. But this should quiet the cfbot anyway.) regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 31919fda8c..d68720423e 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -321,7 +321,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -343,7 +345,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -814,8 +821,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) { /* Override previous parent state */ outer_cxt->collation = collation; diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index e5bbf3b0af..4408543f55 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -32,29 +32,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); -- Disable autovacuum for these tables to avoid unexpected effects of that @@ -94,16 +94,18 @@ ANALYZE "S 1"."T 3"; ANALYZE "S 1"."T 4"; -- === -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- === CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -111,28 +113,28 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On Wed, Mar 3, 2021 at 1:42 PM Neil Chen wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: not tested > Documentation:tested, passed > > Greetings, > I learned about the patch and read your discussions. I'm not sure why this > patch has not been discussed now. In short, I think it's beneficial to > submit it as a temporary solution. > Another thing I want to know is whether these codes can be simplified: > - if (state > outer_cxt->state) > + if (collation == outer_cxt->collation && > + ((state == FDW_COLLATE_UNSAFE && > + outer_cxt->state == FDW_COLLATE_SAFE) || > +(state == FDW_COLLATE_SAFE && > + outer_cxt->state == FDW_COLLATE_UNSAFE))) > + { > + outer_cxt->state = FDW_COLLATE_SAFE; > + } > + else if (state > outer_cxt->state) > > If the state is determined by the collation, when the collations are > equal, do we just need to judge the state not equal to FDW_COLLATE_NONE? The patch is failing the regression, @Tom Lane can you please take a look at that. https://cirrus-ci.com/task/4593497492684800 == running regression test queries == test postgres_fdw ... FAILED 2782 ms == shutting down postmaster == == 1 of 1 tests failed. == The differences that caused some tests to fail can be viewed in the file "/tmp/cirrus-ci-build/contrib/postgres_fdw/regression.diffs". A copy of the test summary that you see above is saved in the file "/tmp/cirrus-ci-build/contrib/postgres_fdw/regression.out". -- Ibrar Ahmed
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation:tested, passed Greetings, I learned about the patch and read your discussions. I'm not sure why this patch has not been discussed now. In short, I think it's beneficial to submit it as a temporary solution. Another thing I want to know is whether these codes can be simplified: - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || +(state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) If the state is determined by the collation, when the collations are equal, do we just need to judge the state not equal to FDW_COLLATE_NONE?
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Rebased over b663a4136 --- no substantive changes, just keeping the cfbot happy. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 6faf499f9a..c38e2419d5 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -320,7 +320,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -342,7 +344,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -813,8 +820,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) { /* Override previous parent state */ outer_cxt->collation = collation; diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 60c7e115d6..05628d8aa7 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -32,29 +32,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); -- Disable autovacuum for these tables to avoid unexpected effects of that @@ -94,16 +94,18 @@ ANALYZE "S 1"."T 3"; ANALYZE "S 1"."T 4"; -- === -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- === CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -111,28 +113,28 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft6 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft7 ( c1 int NOT NULL, @@ -288,7 +290,7 @@ EXPLAIN
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Peter Eisentraut writes: > I have studied this patch and this functionality. I don't think > collation differences between remote and local instances are handled > sufficiently. This bug report and patch addresses one particular case, > where the database-wide collation of the remote and local instance are > different. But it doesn't handle cases like the same collation name > doing different things, having different versions, or different > attributes. Yeah, agreed. I don't think it's practical to have a 100% solution. I'd make a couple of points: * The design philosophy of postgres_fdw, to the extent it has one, is that it's the user's responsibility to make sure that the local declaration of a foreign table is a faithful model of the actual remote object. There are certain variances you can get away with, but in general, if it breaks it's your fault. (Admittedly, if the local declaration was created via IMPORT FOREIGN SCHEMA, we would like to be sure that it's right without help. But there's only so much we can do there. There are already plenty of ways to fool IMPORT FOREIGN SCHEMA anyway, for example if the same type name refers to something different on the two systems.) * Not being able to ship any qual conditions involving collatable datatypes seems like an absolutely unacceptable outcome. Thus, I don't buy your alternative of not letting the planner make any assumptions at all about compatibility of remote collations. I think that what this patch is basically doing is increasing the visibility of collation compatibility as something that postgres_fdw users need to take into account. Sure, it's not a 100% solution, but it improves the situation, and it seems like we'd have to do this anyway along the road to any better solution. If you've got ideas about how to improve things further, by all means let's discuss that ... but let's not make the perfect be the enemy of the good. regards, tom lane
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
On 2020-08-18 22:09, Tom Lane wrote: Here's a full patch addressing this issue. I decided that the best way to address the test-instability problem is to explicitly give collations to all the foreign-table columns for which it matters in the postgres_fdw test. (For portability's sake, that has to be "C" or "POSIX"; I mostly used "C".) Aside from ensuring that the test still passes with some other prevailing locale, this seems like a good idea since we'll then be testing the case we are encouraging users to use. I have studied this patch and this functionality. I don't think collation differences between remote and local instances are handled sufficiently. This bug report and patch addresses one particular case, where the database-wide collation of the remote and local instance are different. But it doesn't handle cases like the same collation name doing different things, having different versions, or different attributes. This probably works currently because the libc collations don't have much functionality like that, but there is a variety of work conceived (or, in the case of version tracking, already done since the bug was first discussed) that would break that. Taking a step back, I think there are only two ways this could really work: Either, the admin makes a promise that all the collations match on all the instances; then the planner can take advantage of that. Or, there is no such promise, and then the planner can't. I don't understand what the currently implemented approach is. It appears to be something in the middle, where certain representations are made that certain things might match, and then there is some nontrivial code that analyzes expressions whether they conform to those rules. As you said, the description of the import_collate option is kind of hand-wavy about all this. -- Peter Eisentraut 2ndQuadrant, an EDB company https://www.2ndquadrant.com/
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
I wrote: >> So I think what is happening here is that postgres_fdw's version of >> IMPORT FOREIGN SCHEMA translates "COLLATE default" on the remote >> server to "COLLATE default" on the local one, which of course is >> a big fail if the defaults don't match. That allows the local >> planner to believe that remote ORDER BYs on the two foreign tables >> will give compatible results, causing the merge join to not work >> very well at all. Here's a full patch addressing this issue. I decided that the best way to address the test-instability problem is to explicitly give collations to all the foreign-table columns for which it matters in the postgres_fdw test. (For portability's sake, that has to be "C" or "POSIX"; I mostly used "C".) Aside from ensuring that the test still passes with some other prevailing locale, this seems like a good idea since we'll then be testing the case we are encouraging users to use. And indeed, it immediately turned up a new problem: if we explicitly assign a collation to a foreign-table column c, the system won't ship WHERE clauses as simple as "c = 'foo'" to the remote. This surprised me, but the reason turned out to be that what postgres_fdw is actually seeing is something like {OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 950 :args ( {VAR :varno 6 :varattno 4 :vartype 25 :vartypmod -1 :varcollid 950 :varlevelsup 0 :varnosyn 6 :varattnosyn 4 :location 171 } {RELABELTYPE :arg {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 341 :constvalue 9 [ 36 0 0 0 48 48 48 48 49 ] } :resulttype 25 :resulttypmod -1 :resultcollid 950 :relabelformat 2 :location -1 } ) :location -1 } that is, the constant is being explicitly relabeled with the correct collation, and thus is_foreign_expr() thinks the collation shown by the RelabelType node is an unsafely-introduced collation. What I did about this was to change the recursion rule in foreign_expr_walker() so that merging a safely-derived collation with the same collation unsafely derived is considered safe. I think this is all right, and it allows us to accept some cases that previously were rejected as unsafe. But I might be missing something. (BTW, there's an independent bug here, which is that we're getting a tree of the above shape rather than a simple Const with the appropriate collation; that is, this tree isn't fully const-folded. This is a bug in canonicalize_ec_expression, which I'll go fix separately. But it won't affect the problem at hand.) This seems like a sufficiently large change in postgres_fdw's behavior to require review, so I'll go add this to the next CF. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..f0693a061c 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -320,7 +320,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -342,7 +344,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -808,8 +815,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state