We have had multiple bugs introduced by AnalysisBarrier. In hindsight I
think the original design before AnalysisBarrier was much simpler and
required less developer knowledge of the infrastructure.

As long as AnalysisBarrier is there, developers writing code in the
analyzer will have to be aware of this special node, and we are bound to
have more bugs in the future from people not considering it.


Filed this JIRA ticket: https://issues.apache.org/jira/browse/SPARK-24865



AnalysisBarrier was introduced in SPARK-20392
<https://issues.apache.org/jira/browse/SPARK-20392> to improve analysis
speed (don't re-analyze nodes that have already been analyzed).

Before AnalysisBarrier, we already had some infrastructure in place, with
analysis-specific functions (resolveOperators and resolveExpressions).
These functions do not recursively traverse down subplans that are already
analyzed (tracked with a mutable boolean flag _analyzed). The issue with
the old system was that developers started using transformDown, which does
a top-down traversal of the whole plan tree regardless of that flag,
because there was no top-down resolution function, and as a result
analyzer performance became pretty bad.
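
To make the difference concrete, here is a toy sketch of the two traversal
styles. It is not Spark's code: Plan, Leaf, Node and TraversalDemo are
hypothetical stand-ins, and the analyzed field only imitates the _analyzed
flag described above.

```scala
// Toy model only: Plan, Leaf and Node are hypothetical, not Spark classes.
sealed trait Plan {
  // Set once this subtree has been fully analyzed (imitates _analyzed).
  var analyzed: Boolean = false
  def children: Seq[Plan]
  def withChildren(newChildren: Seq[Plan]): Plan

  // Generic top-down transform: offers the rule to every node in the tree.
  def transformDown(rule: PartialFunction[Plan, Plan]): Plan = {
    val applied = if (rule.isDefinedAt(this)) rule(this) else this
    applied.withChildren(applied.children.map(_.transformDown(rule)))
  }

  // Analysis-aware bottom-up transform: analyzed subtrees are returned as-is.
  def resolveOperators(rule: PartialFunction[Plan, Plan]): Plan = {
    if (analyzed) this
    else {
      val rebuilt = withChildren(children.map(_.resolveOperators(rule)))
      if (rule.isDefinedAt(rebuilt)) rule(rebuilt) else rebuilt
    }
  }
}

final case class Leaf(name: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

final case class Node(name: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withChildren(newChildren: Seq[Plan]): Plan = {
    val rebuilt = Node(name, newChildren.head)
    rebuilt.analyzed = analyzed // keep the flag when rebuilding the node
    rebuilt
  }
}

object TraversalDemo {
  // Returns how many nodes a rule is applied to under each traversal.
  def visitCounts(): (Int, Int) = {
    val analyzedSubtree = Node("Filter", Leaf("Table"))
    analyzedSubtree.analyzed = true
    val plan = Node("Project", analyzedSubtree)

    var viaTransformDown = 0
    plan.transformDown { case p => viaTransformDown += 1; p }

    var viaResolveOperators = 0
    plan.resolveOperators { case p => viaResolveOperators += 1; p }

    (viaTransformDown, viaResolveOperators)
  }
}
```

In this toy, the rule reaches every node through transformDown but only the
unanalyzed root through resolveOperators, which is the extra work the
paragraph above is complaining about.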

In order to fix the issue in SPARK-20392
<https://issues.apache.org/jira/browse/SPARK-20392>, AnalysisBarrier was
introduced as a special node, and for this special node
transform/transformUp/transformDown don't traverse down. However, the
introduction of this special node caused more trouble than it solved. This
implicit node breaks assumptions and code in a few places, and it's hard
to know when an analysis barrier will exist and when it won't. A simple
search for AnalysisBarrier in PR discussions shows that it is a source of
bugs and additional complexity.
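
As one hypothetical illustration of the kind of assumption that can break
(a toy model, not one of the actual Spark bugs): Op, Relation, Filter,
Barrier and BarrierDemo below are made-up names, and Barrier only imitates
a node that transforms do not descend into.

```scala
// Toy model only: these types are hypothetical, not Spark's plan classes.
sealed trait Op {
  // Top-down transform that deliberately does not look inside a Barrier.
  def transformDown(rule: PartialFunction[Op, Op]): Op = {
    val applied = if (rule.isDefinedAt(this)) rule(this) else this
    applied match {
      case Filter(cond, child) => Filter(cond, child.transformDown(rule))
      case other => other // Relation has no children; Barrier is opaque
    }
  }
}
final case class Relation(name: String) extends Op
final case class Filter(condition: String, child: Op) extends Op
// Imitates a barrier node wrapping an already-analyzed subplan.
final case class Barrier(child: Op) extends Op

object BarrierDemo {
  // Written with the pre-barrier assumption that the child is visible.
  def isFilterOverRelation(plan: Op): Boolean = plan match {
    case Filter(_, Relation(_)) => true
    case _ => false
  }

  // Every such match needs an extra case once barriers may appear.
  def isFilterOverRelationBarrierAware(plan: Op): Boolean = plan match {
    case Filter(_, Relation(_)) => true
    case Filter(_, Barrier(Relation(_))) => true
    case _ => false
  }
}
```

The barrier does its job of stopping traversal, but code that
pattern-matches on the shape of a plan now has to know whether a barrier
might be sitting between a node and its child.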

Instead, I think a much simpler fix to the original issue is to introduce
resolveOperatorsDown, and change all places that call transformDown in the
analyzer to use that. We can also ban accidental uses of the various
transform* methods by using a linter (which can only lint specific
packages), or in test mode inspect the stack trace and fail explicitly if
transform* are called in the analyzer.
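
A rough sketch of what I have in mind, under stated assumptions: TreePlan
and TransformGuard are hypothetical names, the analyzed field stands in
for the existing flag, and the package prefix passed to the guard is
whatever identifies analyzer code. This is not a proposed implementation,
just the shape of the idea.

```scala
// Toy model only: TreePlan is a hypothetical stand-in for a plan node.
final case class TreePlan(
    name: String,
    children: Seq[TreePlan] = Nil,
    analyzed: Boolean = false) {

  // Top-down and analysis-aware: apply the rule to this node first, then
  // recurse into the children, returning analyzed subtrees untouched.
  def resolveOperatorsDown(rule: PartialFunction[TreePlan, TreePlan]): TreePlan = {
    if (analyzed) this
    else {
      val applied = if (rule.isDefinedAt(this)) rule(this) else this
      applied.copy(children = applied.children.map(_.resolveOperatorsDown(rule)))
    }
  }
}

object TransformGuard {
  // True if any frame in the given stack belongs to the analyzer package.
  def calledFromAnalyzer(
      stack: Seq[StackTraceElement],
      analyzerPrefix: String): Boolean =
    stack.exists(_.getClassName.startsWith(analyzerPrefix))

  // Meant for test mode only: a transform* method would call this and fail
  // loudly when it is reached from analyzer code.
  def assertNotInAnalyzer(analyzerPrefix: String): Unit = {
    val stack = Thread.currentThread().getStackTrace.toSeq
    if (calledFromAnalyzer(stack, analyzerPrefix)) {
      throw new IllegalStateException(
        "transform* called from analyzer code; use resolveOperators* instead")
    }
  }
}
```

The stack check is separated from the throwing wrapper so the check itself
can be unit tested with hand-built frames.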





On Thu, Jul 19, 2018 at 11:41 AM Xiao Li <gatorsm...@gmail.com> wrote:

> dfWithUDF.cache()
> dfWithUDF.write.saveAsTable("t")
> dfWithUDF.write.saveAsTable("t1")
>
>
> Cached data is not being used. It causes a big performance regression.
>
>
>
>
> 2018-07-19 11:32 GMT-07:00 Sean Owen <sro...@gmail.com>:
>
>> What regression are you referring to here? A -1 vote really needs a
>> rationale.
>>
>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>
>>> I would first vote -1.
>>>
>>> I might find another regression caused by the analysis barrier. Will
>>> keep you posted.
>>>
>>>
>
