Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/22047#discussion_r225700828

--- Diff: python/pyspark/sql/functions.py ---
@@ -403,6 +403,28 @@ def countDistinct(col, *cols):
     return Column(jc)
+def every(col):
--- End diff --

@gatorsmile Hi Sean, I have prepared two branches. In one, the new aggregate functions extend the base Max and Min classes, basically reusing their code. In the other, these aggregate expressions are replaced in the optimizer. Below are the links.

1. [branch-extend](https://github.com/dilipbiswal/spark/tree/SPARK-19851-extend)
2. [branch-rewrite](https://github.com/dilipbiswal/spark/tree/SPARK-19851-rewrite)

I would prefer option 1 for the following reasons:

1. The code changes are simpler.
2. It supports these aggregates as window expressions naturally; in the other option I have to block that.
3. For such a simple mapping, we probably don't need a rewrite framework. We could add one in the future if we need a more complex transformation.

Please let me know how we want to move forward with this. Thanks!
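For readers following along: the reason option 1 (extending Max and Min) works at all is that, for boolean inputs, `EVERY(col)` is equivalent to `MIN(col)` and `ANY(col)` is equivalent to `MAX(col)`, since `False < True` in SQL's boolean ordering. A minimal plain-Python sketch of that mapping (not actual Spark code; the function names here are illustrative only):

```python
# Illustrative sketch of the EVERY/ANY <-> MIN/MAX equivalence over booleans.
# In Python, as in SQL, False orders before True, so min() and max() over
# a collection of booleans compute logical AND and OR respectively.

def every(values):
    # True iff all values are True; same result as min() over booleans.
    return min(values)

def any_agg(values):
    # True iff at least one value is True; same result as max() over booleans.
    return max(values)

print(every([True, True, False]))   # False
print(any_agg([False, False, True]))  # True
```

This one-to-one mapping is why the extend approach needs no rewrite framework: each new aggregate simply inherits the existing Min/Max evaluation logic.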