karenfeng opened a new pull request #31654:
URL: https://github.com/apache/spark/pull/31654


   ### What changes were proposed in this pull request?
   
   Today, child expressions may be resolved against either "real" output attributes or metadata output attributes. This PR makes resolution prefer the real attribute when one exists.
   
   ### Why are the changes needed?
   
   Today, resolving an expression fails when a "real" output attribute and a metadata attribute share the same name. This is likely unexpected, as the user may not know that the metadata attribute exists.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Previously, resolving a column whose name matched both a "real" output attribute and a metadata attribute failed with an error like the one below:
   ```
   org.apache.spark.sql.AnalysisException: Reference 'index' is ambiguous, could be: testcat.ns1.ns2.tableTwo.index, testcat.ns1.ns2.tableOne.index.; line 1 pos 71
   at org.apache.spark.sql.catalyst.expressions.package$AttributeSeq.resolve(package.scala:363)
   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveChildren(LogicalPlan.scala:107)
   ```
   
   Now, resolution succeeds and provides the "real" output attribute.
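
The preference rule can be illustrated with a small standalone sketch. This is a toy model, not the Catalyst code touched by this PR: `Attr`, `PreferRealAttribute`, and `resolve` are hypothetical names, and name matching is assumed to be exact and case-sensitive for simplicity. Candidates are taken from the real output first, and the metadata output is consulted only when no real attribute matches.

```scala
// Toy model only: these types are hypothetical and are not Spark's Catalyst classes.
final case class Attr(qualifier: String, name: String, isMetadata: Boolean = false)

object PreferRealAttribute {
  /**
   * Resolve `name`, preferring real output attributes over metadata attributes.
   * Returns Right(Some(attr)) on a unique match, Right(None) when nothing matches,
   * and Left(message) when the chosen candidate set is still ambiguous.
   */
  def resolve(
      name: String,
      output: Seq[Attr],
      metadataOutput: Seq[Attr]): Either[String, Option[Attr]] = {
    def matching(attrs: Seq[Attr]): Seq[Attr] = attrs.filter(_.name == name)

    // Fall back to metadata attributes only if no real attribute has this name.
    val real = matching(output)
    val candidates = if (real.nonEmpty) real else matching(metadataOutput)

    candidates match {
      case Seq() => Right(None)
      case Seq(single) => Right(Some(single))
      case many =>
        val options = many.map(a => s"${a.qualifier}.${a.name}").mkString(", ")
        Left(s"Reference '$name' is ambiguous, could be: $options")
    }
  }
}
```

With `output = Seq(Attr("t", "index"))` and `metadataOutput = Seq(Attr("t", "index", isMetadata = true))`, `PreferRealAttribute.resolve("index", output, metadataOutput)` picks the real attribute instead of reporting an ambiguity. A name that exists only in `metadataOutput` still resolves to the metadata attribute.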
   
   ### How was this patch tested?
   
   Added a unit test.


