Andrew Ash created SPARK-22470:
----------------------------------

             Summary: Doc that functions.hash is also used internally for shuffle and bucketing
                 Key: SPARK-22470
                 URL: https://issues.apache.org/jira/browse/SPARK-22470
             Project: Spark
          Issue Type: Documentation
          Components: Documentation, SQL
    Affects Versions: 2.2.0
            Reporter: Andrew Ash


SPARK-12480 (https://issues.apache.org/jira/browse/SPARK-12480) added a hash 
function, functions.hash, that appears to be the same hash function Spark uses 
internally for shuffle and bucketing.

One of my users would like to bake this assumption into code, but is unsure 
whether it's a guarantee or a coincidence that they're the same function. Would 
it be considered an API break if at some point the two functions diverged, or 
if the implementation of both changed together?

We should add a line to the scaladoc of functions.hash stating whether this 
equivalence is guaranteed.
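
For illustration, here is roughly what user code relying on this assumption 
might look like, sketched in plain Python. It is a sketch under stated 
assumptions, not a description of guaranteed behavior: it assumes that both 
functions.hash and the internal hash partitioner use 32-bit Murmur3 with seed 
42, and that the partition ID is the non-negative remainder of that hash modulo 
the number of partitions. The names murmur3_hash_int and assumed_partition_id 
are hypothetical, and only a single non-null 32-bit integer column is handled.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit unsigned value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def _mix_k1(k1):
    k1 = (k1 * 0xCC9E2D51) & MASK32
    k1 = _rotl32(k1, 15)
    return (k1 * 0x1B873593) & MASK32


def _mix_h1(h1, k1):
    h1 ^= k1
    h1 = _rotl32(h1, 13)
    return (h1 * 5 + 0xE6546B64) & MASK32


def _fmix(h1, length):
    """Murmur3 finalization; length is the input size in bytes."""
    h1 ^= length
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    h1 ^= h1 >> 16
    return h1


def _to_signed32(x):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def murmur3_hash_int(value, seed=42):
    """Murmur3 (x86, 32-bit) of one 4-byte integer.

    ASSUMPTION: seed 42 is taken to match what Spark uses; this is exactly
    the kind of detail the scaladoc would need to confirm.
    """
    k1 = _mix_k1(value & MASK32)
    h1 = _mix_h1(seed & MASK32, k1)
    return _to_signed32(_fmix(h1, 4))


def assumed_partition_id(value, num_partitions, seed=42):
    """Partition a row is ASSUMED to land in after a hash repartition.

    Python's % already yields a non-negative result for a positive divisor,
    so it plays the role of a positive modulo here.
    """
    return murmur3_hash_int(value, seed) % num_partitions


if __name__ == "__main__":
    for key in (1, 2, 3, -1):
        print(key, murmur3_hash_int(key), assumed_partition_id(key, 200))
```

If the internal function ever changed independently of functions.hash, code 
like this would silently compute the wrong partition or bucket, which is why 
the scaladoc should say whether the equivalence is intended.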
