Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23054#discussion_r234401319
  
    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -17,6 +17,8 @@ displayTitle: Spark SQL Upgrading Guide
     
       - The `ADD JAR` command previously returned a result set with the single value 0. It now returns an empty result set.
     
    +  - In Spark version 2.4 and earlier, the key attribute is wrongly named as "value" for primitive key type when doing typed aggregation on Dataset. This attribute is now named as "key" since Spark 3.0 like complex key type.
    --- End diff --
    
    ```
    In Spark version 2.4 and earlier, `Dataset.groupByKey` results in a grouped dataset
    with the key attribute wrongly named "value" if the key is of an atomic type, e.g.
    int, string, etc. This is counterintuitive and makes the schema of aggregation
    queries confusing. For example, the schema of `ds.groupByKey(...).count()` is
    `(value, count)`. Since Spark 3.0, we name the grouping attribute "key".
    ```
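
    To make the behavior easy to check, here is a minimal Scala sketch, assuming a
    local `SparkSession`. The object name, app name, and sample data are hypothetical
    and not part of this PR. It groups a `Dataset[String]` by an `Int` key and prints
    the schema of the count. According to the note above, the grouping column should
    appear as "value" on Spark 2.4 and earlier and as "key" on Spark 3.0.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical driver object, for illustration only.
    object GroupByKeySchemaCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")               // assumption: run locally
          .appName("groupByKey-schema-check")
          .getOrCreate()
        import spark.implicits._

        // Sample data is made up; any Dataset works here.
        val ds = Seq("a", "bb", "cc").toDS()

        // The grouping key is an Int, i.e. an atomic (non-struct) type.
        val counts = ds.groupByKey(_.length).count()

        // Inspect the name of the grouping column in the result schema.
        counts.printSchema()

        spark.stop()
      }
    }
    ```

    Running the same program against both versions is a quick way to spot queries
    that select the grouping column by its old name.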


---
