I’m looking at using RelFieldTrimmer, and I’m noticing that if a side of a join has unnecessary fields after a filter, there is no trim-fields project on that side to reduce the width of the row. Is this expected, or is there a configuration or pre-processing step that I am missing?
For example, starting with this tree (these all look better in monospace; hopefully the formatting comes through):

4:Project(C5633_14509=[$4], C5633_486=[$8])
└── 3:Join(condition=[=($1, $6)], joinType=[inner])
....├── 1:Filter(condition=[<($2, 10)])
....│...└── 0:TableScan(table=[T902], Schema=[...6 fields...])
....└── 2:TableScan(table=[T895], Schema=[...64 fields...])

The result of RelFieldTrimmer is this:

9:Project(C5633_14509=[$2], C5633_486=[$4])
└── 8:Join(condition=[=($0, $3)], joinType=[inner])
....├── 6:Filter(condition=[<($1, 10)])
....│...└── 5:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
....└── 7:Project(ID=[$0], C5633_486=[$2])
........└── 2:TableScan(table=[T895], Schema=[...64 fields...])

Notice: $1 on the LHS of the join is not used *after* the filter, so a projection of only the $0 and $2 fields would reduce the width of the row before the join.

However, I can force the insertion of a projection which is simply the identity (i.e., projecting all fields of the input row with no additions or subtractions):

5:Project(C5633_14509=[$4], C5633_486=[$8])
└── 4:Join(condition=[=($1, $6)], joinType=[inner])
....├── 2:Project(...Identity mapping, 6 fields...)
....│...└── 1:Filter(condition=[<($2, 10)])
....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
....└── 3:TableScan(table=[T895], Schema=[...64 fields...])

And the result is a projection which has only the 2 fields necessary after the filter:

11:Project(C5633_14509=[$1], C5633_486=[$3])
└── 10:Join(condition=[=($0, $2)], joinType=[inner])
....├── 8:Project(C5633_14505=[$0], C5633_14509=[$2]) <- trimmed
....│...└── 7:Filter(condition=[<($1, 10)])
....│.......└── 6:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
....│...........└── 0:TableScan(table=[T902], Schema=[...6 fields...])
....└── 9:Project(ID=[$0], C5633_486=[$2])
........└── 3:TableScan(table=[T895], Schema=[...64 fields...])

Thanks!
-Ian
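
P.S. To make the expectation concrete, here is a toy Python model of the trimming I would expect around a filter. It is not Calcite code and does not claim to reflect how RelFieldTrimmer works internally; the function name `trim_around_filter` and the representation of fields as plain integer ordinals are made up for illustration. It only shows that the fields needed below the filter and the fields needed above it can differ, which is where the second projection would come from.

```python
def trim_around_filter(filter_fields, fields_used_above):
    """Toy model (hypothetical, not Calcite's algorithm).

    filter_fields:      ordinals of scan fields read by the filter condition
    fields_used_above:  ordinals of scan fields read by operators above the filter
    Returns (below, above):
      below: scan ordinals to keep in a projection under the filter
      above: ordinals into `below` to keep in a projection over the filter
    """
    # Under the filter we need the filter's own inputs plus everything used later.
    below = sorted(set(filter_fields) | set(fields_used_above))
    # Over the filter only the later-used fields matter, renumbered against `below`.
    above = [below.index(f) for f in sorted(set(fields_used_above))]
    return below, above


# LHS of the join in my example: the filter reads scan field $2,
# the join key is scan field $1, and the top project reads scan field $4.
below, above = trim_around_filter(filter_fields={2}, fields_used_above={1, 4})
print(below)  # [1, 2, 4]
print(above)  # [0, 2]
```

The first list matches the projection RelFieldTrimmer already inserts under the filter ($1, $2, $4), and the second matches the projection I only get when I force the identity project ($0, $2).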
