[ https://issues.apache.org/jira/browse/SPARK-24865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550047#comment-16550047 ]
Apache Spark commented on SPARK-24865:
--------------------------------------

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/21822

> Remove AnalysisBarrier
> ----------------------
>
>                 Key: SPARK-24865
>                 URL: https://issues.apache.org/jira/browse/SPARK-24865
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Reynold Xin
>            Priority: Major
>
> AnalysisBarrier was introduced in SPARK-20392 to improve analysis speed
> (don't re-analyze nodes that have already been analyzed).
>
> Before AnalysisBarrier, we already had some infrastructure in place, with
> analysis-specific functions (resolveOperators and resolveExpressions). These
> functions do not recursively traverse down subplans that are already analyzed
> (tracked by a mutable boolean flag, _analyzed). The issue with the old system
> was that developers started using transformDown, which does a top-down
> traversal of the plan tree, because there was no top-down resolution
> function, and as a result analyzer performance became pretty bad.
>
> To fix that issue, SPARK-20392 introduced AnalysisBarrier as a special node;
> for this node, transform/transformUp/transformDown don't traverse down.
> However, the introduction of this special node caused a lot more trouble
> than it solves. This implicit node breaks assumptions and code in a few
> places, and it's hard to know when an analysis barrier will exist and when
> it won't. A simple search for AnalysisBarrier in PR discussions shows that
> it is a source of bugs and additional complexity.
>
> Instead, I think a much simpler fix to the original issue is to introduce
> resolveOperatorsDown, and change all places that call transformDown in the
> analyzer to use that. We can also ban accidental uses of the various
> transform* methods by using a linter (which can only lint specific
> packages), or in test mode inspect the stack trace and fail explicitly if
> transform* are called in the analyzer.