GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/20214

    [SPARK-23023][SQL] Cast field data to strings in showString

    ## What changes were proposed in this pull request?
    The current `Dataset.showString` prints rows through `RowEncoder` deserializers, for example:
    ```
    scala> Seq(Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6))).toDF("a").show(false)
    +------------------------------------------------------------+
    |a                                                           |
    +------------------------------------------------------------+
    |[WrappedArray(1, 2), WrappedArray(3), WrappedArray(4, 5, 6)]|
    +------------------------------------------------------------+
    ```
    This output is incorrect; the expected output is:
    ```
    scala> Seq(Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6))).toDF("a").show(false)
    +------------------------+
    |a                       |
    +------------------------+
    |[[1, 2], [3], [4, 5, 6]]|
    +------------------------+
    ```
    This PR changes `showString` to cast field data to strings before printing.
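
    As a rough illustration of the idea (not the code in this patch), the sketch below uses plain Scala without Spark to show the difference between relying on a collection's own `toString` and explicitly converting a cell to a string before printing. The object `ShowStringSketch` and the helper `render` are hypothetical names introduced only for this example; the actual change relies on Spark's cast to string inside `showString`.

    ```scala
    object ShowStringSketch {
      // Hypothetical helper: convert a cell value to a display string.
      // Nested sequences are rendered with square brackets instead of
      // whatever the concrete collection class prints in its toString.
      def render(value: Any): String = value match {
        case null        => "null"
        case seq: Seq[_] => seq.map(render).mkString("[", ", ", "]")
        case other       => other.toString
      }

      def main(args: Array[String]): Unit = {
        val cell = Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6))
        // Explicit conversion gives a collection-independent rendering.
        println(render(cell)) // prints [[1, 2], [3], [4, 5, 6]]
      }
    }
    ```

    The point of the sketch is only that formatting is decided by an explicit conversion step rather than by the runtime class of the deserialized value, which is why names such as `WrappedArray` no longer leak into the output.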
    
    ## How was this patch tested?
    Added tests in `DataFrameSuite`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-23023

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20214
    
----
commit eb56aff74352a360d1d4b1273be23b670f3c958a
Author: Takeshi Yamamuro <yamamuro@...>
Date:   2018-01-06T11:05:54Z

    Cast data to strings in showString

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
