roseduan opened a new issue, #1341:
URL: https://github.com/apache/cloudberry/issues/1341
### Apache Cloudberry version
any version
### What happened
There are two foreign tables named `ft1` and `ft2`:
```
contrib_regression=# \d ft1
Foreign table "public.ft1"
Column | Type | Collation | Nullable | Default
| FDW options
--------+-----------------------------+-----------+----------+---------------+---------------------
c1 | integer | | not null |
| (column_name 'C 1')
c2 | integer | | not null |
|
c3 | text | | |
|
c4 | timestamp with time zone | | |
|
c5 | timestamp without time zone | | |
|
c6 | character varying(10) | | |
|
c7 | character(10) | | | 'ft1'::bpchar
|
c8 | user_enum | | |
|
Server: pgserver
FDW options: (schema_name 'S 1', table_name 'T 1', mpp_execute 'all
segments')
contrib_regression=# \d ft2
Foreign table "public.ft2"
Column | Type | Collation | Nullable | Default
| FDW options
--------+-----------------------------+-----------+----------+---------------+---------------------
c1 | integer | | not null |
| (column_name 'C 1')
c2 | integer | | not null |
|
c3 | text | | |
|
c4 | timestamp with time zone | | |
|
c5 | timestamp without time zone | | |
|
c6 | character varying(10) | | |
|
c7 | character(10) | | | 'ft2'::bpchar
|
c8 | user_enum | | |
|
Server: pgserver
FDW options: (schema_name 'S 1', table_name 'T 1', mpp_execute 'all
segments', use_remote_estimate 'true')
```
And a local table named `s1t1`:
```
contrib_regression=# \d s1t1
Table "public.s1t1"
Column | Type | Collation | Nullable | Default
--------+-----------------------------+-----------+----------+---------
C 1 | integer | | not null |
c2 | integer | | not null |
c3 | text | | |
c4 | timestamp with time zone | | |
c5 | timestamp without time zone | | |
c6 | character varying(10) | | |
c7 | character(10) | | |
c8 | character varying(255) | | |
Indexes:
"s1t1_pkey" PRIMARY KEY, btree ("C 1")
Distributed by: ("C 1")
```
Then the query plan is:
```
contrib_regression=# explain (verbose, costs off) SELECT ref_0.c2, subq_1.*
FROM
s1t1 AS ref_0,
LATERAL (
SELECT ref_0."C 1" c1, subq_0.*
FROM (SELECT ref_0.c2, ref_1.c3
FROM ft1 AS ref_1) AS subq_0
RIGHT JOIN ft2 AS ref_3 ON (subq_0.c3 = ref_3.c3)
) AS subq_1
WHERE ref_0."C 1" < 10 AND subq_1.c3 = '00001';
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3)
Output: ref_0.c2, ref_0."C 1", (ref_0.c2), ref_1.c3
-> Nested Loop
Output: ref_0.c2, ref_0."C 1", (ref_0.c2), ref_1.c3
-> Broadcast Motion 3:3 (slice2; segments: 3)
Output: ref_3.c3
-> Foreign Scan on public.ft2 ref_3
Output: ref_3.c3
Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE ((c3 =
'00001'::text))
-> Materialize
Output: ref_0.c2, ref_0."C 1", ref_1.c3, (ref_0.c2)
-> Nested Loop
Output: ref_0.c2, ref_0."C 1", ref_1.c3, (ref_0.c2)
-> Broadcast Motion 3:3 (slice3; segments: 3)
Output: ref_0.c2, ref_0."C 1"
-> Seq Scan on public.s1t1 ref_0
Output: ref_0.c2, ref_0."C 1"
Filter: (ref_0."C 1" < 10)
-> Materialize
Output: ref_1.c3, (ref_0.c2)
-> Foreign Scan on public.ft1 ref_1
Output: ref_1.c3, ref_0.c2
Remote SQL: SELECT c3 FROM "S 1"."T 1"
WHERE ((c3 = '00001'::text))
Optimizer: Postgres query optimizer
(24 rows)
```
And focus on the foreign scan node on `ft1`:
```
-> Materialize
Output: ref_1.c3, (ref_0.c2)
-> Foreign Scan on public.ft1 ref_1
Output: ref_1.c3, ref_0.c2
Remote SQL: SELECT c3 FROM "S 1"."T 1"
WHERE ((c3 = '00001'::text))
```
We can see that the remote sql does not contain the `ref_0.c2` field, but it
appears in the target list anyway, and if the foreign table is not rescannable,
but the Materialize node will do rescan because it assumes there is a param in
subplan, see code details:
```
/*
* If subnode is to be rescanned then we forget previous stored
* results; we have to re-read the subplan and re-store. Also,
if we
* told tuplestore it needn't support rescan, we lose and must
* re-read. (This last should not happen in common cases; else
our
* caller lied by not passing EXEC_FLAG_REWIND to us.)
*
* Otherwise we can just rewind and rescan the stored output.
The
* state of the subnode does not change.
*/
if (outerPlan->chgParam != NULL ||
(node->eflags & EXEC_FLAG_REWIND) == 0)
{
tuplestore_end(node->tuplestorestate);
node->tuplestorestate = NULL;
node->ts_destroyed = true;
if (outerPlan->chgParam == NULL)
ExecReScan(outerPlan);
node->eof_underlying = false;
}
else
tuplestore_rescan(node->tuplestorestate);
```
This is not expected, beacause our foreign table with `mpp_execute all
segments` option is not rescanable.
### What you think should happen instead
the expected plan should be:
```
-> Result
Output: ref_1.c3, (ref_0.c2)
-> Materialize
Output: ref_1.c3
-> Foreign Scan on public.ft1 ref_1
Output: ref_1.c3
Remote SQL: SELECT c3 FROM "S 1"."T 1"
WHERE ((c3 = '00001'::text))
```
### How to reproduce
As mentioned above
### Operating System
CentOS 7
### Anything else
_No response_
### Are you willing to submit PR?
- [x] Yes, I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]