[ 
https://issues.apache.org/jira/browse/SPARK-24865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550047#comment-16550047
 ] 

Apache Spark commented on SPARK-24865:
--------------------------------------

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/21822

> Remove AnalysisBarrier
> ----------------------
>
>                 Key: SPARK-24865
>                 URL: https://issues.apache.org/jira/browse/SPARK-24865
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Reynold Xin
>            Priority: Major
>
> AnalysisBarrier was introduced in SPARK-20392 to improve analysis speed 
> (don't re-analyze nodes that have already been analyzed).
> Before AnalysisBarrier, we already had some infrastructure in place, with
> analysis-specific functions (resolveOperators and resolveExpressions). These
> functions do not recursively traverse down subplans that are already analyzed
> (tracked with a mutable boolean flag, _analyzed). The issue with the old
> system was that developers started using transformDown, which does a top-down
> traversal of the whole plan tree, because there was no top-down resolution
> function, and as a result analyzer performance became pretty bad.
> In order to fix the issue in SPARK-20392, AnalysisBarrier was introduced as a
> special node, and for this special node, transform/transformUp/transformDown
> don't traverse down. However, the introduction of this special node caused
> far more trouble than it solved. This implicit node breaks assumptions and
> code in a few places, and it's hard to know when an AnalysisBarrier would
> exist and when it wouldn't. A simple search for AnalysisBarrier in PR
> discussions shows that it is a source of bugs and additional complexity.
> Instead, I think a much simpler fix to the original issue is to introduce
> resolveOperatorsDown, and change all places that call transformDown in the
> analyzer to use that. We can also ban accidental uses of the various
> transform* methods by using a linter (which can only lint specific packages),
> or, in test mode, by inspecting the stack trace and failing explicitly if
> transform* methods are called in the analyzer.
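
The difference between a plain top-down transform and the proposed
resolveOperatorsDown can be sketched with a toy model. This is Python, not
Spark's Scala code: Plan, transform_down, resolve_operators_down and identity
are hypothetical names chosen only to mirror the identifiers in the
description above, and the `analyzed` field stands in for the _analyzed flag.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Plan:
    """Toy logical plan node (hypothetical, not Spark's LogicalPlan)."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    analyzed: bool = False  # stands in for the mutable _analyzed flag


Rule = Callable[[Plan], Plan]


def identity(plan: Plan) -> Plan:
    """A rule that changes nothing; enough to observe which nodes are visited."""
    return plan


def transform_down(plan: Plan, rule: Rule, visited: List[str]) -> Plan:
    """Plain top-down traversal: applies the rule to every node."""
    visited.append(plan.name)
    new_plan = rule(plan)
    new_plan.children = [transform_down(c, rule, visited) for c in new_plan.children]
    return new_plan


def resolve_operators_down(plan: Plan, rule: Rule, visited: List[str]) -> Plan:
    """Top-down traversal that stops at subplans already marked as analyzed."""
    if plan.analyzed:
        return plan  # skip the whole subtree
    visited.append(plan.name)
    new_plan = rule(plan)
    new_plan.children = [
        resolve_operators_down(c, rule, visited) for c in new_plan.children
    ]
    return new_plan


def build() -> Plan:
    """A join whose left side was analyzed earlier and whose right side is new."""
    analyzed_side = Plan("Project", [Plan("Relation_a", analyzed=True)], analyzed=True)
    fresh_side = Plan("Filter", [Plan("Relation_b")])
    return Plan("Join", [analyzed_side, fresh_side])


seen_by_transform: List[str] = []
transform_down(build(), identity, seen_by_transform)

seen_by_resolve: List[str] = []
resolve_operators_down(build(), identity, seen_by_resolve)

print(seen_by_transform)  # all five nodes
print(seen_by_resolve)    # only Join, Filter, Relation_b
```

In this sketch the analyzed Project subtree is skipped without any extra
barrier node in the tree; the skip comes entirely from the flag check at the
top of resolve_operators_down.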



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
