Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22318

@srowen the change seems fine to me: I think it improves attribute deduplication (I think not rewriting the join condition with the new references was a bug). The point is that, to get correct and expected behavior in all conditions, we need to keep a reference to the dataset each attribute comes from. That is why I created #21449, and I know @cloud-fan mentioned he had a similar PR for the same reason. But that is another story. In general, I think both this change and something like #21449 are needed to handle all the cases properly.