GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22316

    [SPARK-25048][SQL] Pivoting by multiple columns in Scala/Java

    ## What changes were proposed in this pull request?
    
    This PR extends the implementation of the existing method:
    ```
    def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset
    ```
    to support values of the struct type. This allows pivoting by multiple columns combined by `struct`:
    ```
    trainingSales
          .groupBy($"sales.year")
          .pivot(
            pivotColumn = struct(lower($"sales.course"), $"training"),
            values = Seq(
              struct(lit("dotnet"), lit("Experts")),
              struct(lit("java"), lit("Dummies")))
          ).agg(sum($"sales.earnings"))
    ```
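    
    To make the intended semantics concrete, here is a minimal plain-Scala sketch (no Spark dependency) of what pivoting by a pair of columns computes: rows are grouped by year, and for each requested `(course, training)` pair the earnings are summed. The names `Sale`, `PivotSketch`, `pivotByPair` and the sample rows are hypothetical and chosen only for illustration; this is not the code added by the PR.
    ```scala
    // Plain-Scala sketch; every name and data row here is a hypothetical illustration.
    final case class Sale(year: Int, course: String, training: String, earnings: Double)

    object PivotSketch {
      // Hypothetical sample rows standing in for a sales table.
      val sample: Seq[Sale] = Seq(
        Sale(2012, "dotNET", "Experts", 10000.0),
        Sale(2012, "dotNET", "Experts", 5000.0),
        Sale(2012, "Java", "Dummies", 20000.0),
        Sale(2013, "dotNET", "Experts", 48000.0))

      // Group rows by year, then sum earnings for each requested (course, training) pair.
      // The course name is lower-cased before comparison, as in the example above.
      // A pair with no matching rows maps to None, playing the role of an empty pivot cell.
      def pivotByPair(
          rows: Seq[Sale],
          values: Seq[(String, String)]): Map[Int, Map[(String, String), Option[Double]]] =
        rows.groupBy(_.year).map { case (year, group) =>
          val cells = values.map { pair =>
            val matched = group.filter(r => (r.course.toLowerCase, r.training) == pair)
            pair -> (if (matched.isEmpty) None else Some(matched.map(_.earnings).sum))
          }.toMap
          year -> cells
        }
    }
    ```
    Calling `PivotSketch.pivotByPair(PivotSketch.sample, Seq(("dotnet", "Experts"), ("java", "Dummies")))` yields one entry per year, with one cell per requested pair.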
    
    ## How was this patch tested?
    
    Added a test for values specified via `struct` in Java and Scala.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 pivoting-by-multiple-columns2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22316.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22316
    
----
commit 058072544fdd606392a57615119bb55dff5345c0
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-02T12:24:20Z

    Support columns as values

commit 1221db39b75a9b9bd4fbc6144150283d9c24e9d5
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-02T13:14:55Z

    Added a test for the case when values are not specified

commit a097b294854f99ec58ca307d85c19e54cd76d6b8
Author: Maxim Gekk <max.gekk@...>
Date:   2018-09-02T13:19:14Z

    Added a test for Java

----


