mihailoale-db commented on PR #55667: URL: https://github.com/apache/spark/pull/55667#issuecomment-4379156416
> Nice cleanup — the prior coupling looked accidental. I traced the only setter call site (`NameScope.updateHiddenOutputProperties`) and the only predicate call site (`NameScope.getHiddenOutputCandidates`); neither depends on `aggregatedAccessOnly` attributes also being metadata columns, so decoupling is sound and existing behavior is preserved. > > Two small, optional follow-ups: > > 1. The doc on `AGGREGATED_ACCESS_ONLY` (lines 179–185) still says "If set, this metadata column can only be accessed under `AggregateExpression`." After this PR the marker is not restricted to metadata columns — could you tweak the wording (e.g. "this attribute" / "this column") so the comment matches the new contract? > 2. There is a symmetric unit test for the analogous metadata key — `AnalysisSuite.scala:1761` "[SPARK-43293](https://issues.apache.org/jira/browse/SPARK-43293): `__qualified_access_only` should be ignored in normal columns". An equivalent targeted test for `__aggregated_access_only` (e.g. that a non-metadata attribute marked via `markAsAggregatedAccessOnly()` does not satisfy `isMetadataCol`) would lock the new contract in directly, on top of the existing end-to-end SQL coverage. Addressed your comments. PTAL again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
