Liang-Chi Hsieh created SPARK-19665:
---------------------------------------

             Summary: Improve constraint propagation
                 Key: SPARK-19665
                 URL: https://issues.apache.org/jira/browse/SPARK-19665
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Liang-Chi Hsieh


If the projection contains aliased expressions, we propagate constraints by 
fully expanding the original constraints with every alias.

This expansion becomes computationally expensive as the number of aliases grows.
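
To give a rough sense of the growth, here is a toy Python model (not Spark's 
actual code). It assumes a constraint is rewritten once for every combination 
of an attribute and its aliases; the names expanded_variants and count_for are 
hypothetical and exist only for this illustration.

```python
from itertools import product

def expanded_variants(attrs, aliases):
    """Enumerate every rewrite of a constraint that references `attrs`.

    `aliases` maps an attribute name to the list of names it is aliased to.
    Each attribute may appear as itself or as any one of its aliases.
    """
    choices = [[a] + aliases.get(a, []) for a in attrs]
    return list(product(*choices))

def count_for(num_attrs, aliases_per_attr):
    """Count the variants for a constraint over `num_attrs` attributes,
    each of which has `aliases_per_attr` aliases."""
    attrs = [f"x{i}" for i in range(num_attrs)]
    aliases = {a: [f"{a}_alias{j}" for j in range(aliases_per_attr)]
               for a in attrs}
    return len(expanded_variants(attrs, aliases))

# A constraint over 3 attributes, with 0 to 3 aliases per attribute.
counts = [count_for(3, k) for k in range(4)]
print(counts)  # [1, 8, 27, 64]
```

Under this assumption a single constraint over n attributes with k aliases each 
yields (k + 1)^n variants, so the constraint set grows quickly even for modest 
numbers of aliases.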

Another issue is that we don't actually need the additional constraints most of 
the time. For example, suppose there is a constraint "a > b", and "a" is 
aliased to "c" and "d". When we use this constraint in filtering, we don't need 
all of the constraints "a > b", "c > b", "d > b". We only need "a > b", because 
if it is false, all the other constraints are guaranteed to be false too.
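
The example above can be sketched in a few lines of Python. This is only an 
illustration of the reasoning, not Spark's implementation; the function names 
expand and holds and the sample row are assumptions made for the sketch.

```python
def expand(constraint, aliases):
    """Rewrite the left-hand attribute of `constraint` with each alias.

    `constraint` is a (left, right) pair meaning "left > right"; `aliases`
    maps an attribute name to the list of names it is aliased to.
    """
    left, right = constraint
    return [(name, right) for name in [left] + aliases.get(left, [])]

def holds(constraint, row):
    """Evaluate "left > right" against a row given as a dict."""
    left, right = constraint
    return row[left] > row[right]

# "a" is aliased to "c" and "d", so the full expansion has three constraints.
expanded = expand(("a", "b"), {"a": ["c", "d"]})
print(expanded)  # [('a', 'b'), ('c', 'b'), ('d', 'b')]

# In any row, the aliases carry the same value as "a".
row = {"a": 1, "b": 2}
row["c"] = row["d"] = row["a"]

# Every expanded constraint agrees with the original "a > b" on this row,
# so checking "a > b" alone is enough for filtering.
results = [holds(c, row) for c in expanded]
print(results)  # [False, False, False]
```

Because "c" and "d" always equal "a", the three checks can never disagree, which 
is why the extra constraints add cost without adding filtering power.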

Fully expanding all constraints all the time makes iterative ML algorithms, 
such as an ML pipeline with many stages, run very slowly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

