[jira] [Commented] (SPARK-26335) Add a property for Dataset#show not to care about wide characters when padding them

ASF GitHub Bot (JIRA) Thu, 13 Dec 2018 00:05:06 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-26335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719870#comment-16719870
 ]


ASF GitHub Bot commented on SPARK-26335:
----------------------------------------

kjmrknsn opened a new pull request #23307: [SPARK-26335][SQL] Add a property 
for Dataset#show not to care about wide characters when padding them
URL: https://github.com/apache/spark/pull/23307
 
 
   ## What changes were proposed in this pull request?
   
   ### Issue
   [SPARK-25108](https://issues.apache.org/jira/browse/SPARK-25108) made 
`Dataset#show` account for wide (full-width) characters when padding cells. 
That makes the output of `Dataset#show` easier for humans to read. On the other 
hand, it makes the output difficult for programs to parse, because the number 
of characters in a cell can differ from the number of characters in its header. 
My company develops and manages a Jupyter/Apache Zeppelin-like visualization 
tool named 
[OASIS](https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark).
 In this application, the output of `Dataset#show` from a Scala or Python 
process is parsed and rendered as an HTML table, as follows: 
   
   <img width="1092" alt="screen shot 2018-12-13 at 16 38 58" 
src="https://user-images.githubusercontent.com/31149688/49923017-9e3c6180-fef5-11e8-970b-077bed46cdee.png">
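
The length mismatch described in the Issue section can be reproduced outside Spark. The following is a minimal Python sketch, not Spark's implementation: `display_width` and `pad_left` are hypothetical helpers, and the sketch assumes that characters whose East Asian width is "F" or "W" occupy two terminal columns.

```python
import unicodedata


def display_width(text):
    """Terminal columns occupied by text; full-width/wide characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in text
    )


def pad_left(text, col_width):
    """Left-pad text with spaces so that it spans col_width terminal columns."""
    return " " * (col_width - display_width(text)) + text


header = pad_left("name", 6)  # ASCII only: 2 spaces + 4 characters
cell = pad_left("日本", 6)     # 2 spaces + 2 wide characters

# Both strings span 6 terminal columns, but their character counts differ.
print(len(header), len(cell))  # prints "6 4"
```

A parser that slices each row at the header's character offsets would therefore cut such a cell in the wrong place.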
   
   ### Solution
   Add the `spark.sql.dataset.show.handleFullWidthCharacters` property, which 
controls whether `Dataset#show` accounts for wide characters when padding cells.
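
As a rough illustration of what such a switch would change, here is a small Python sketch of a table formatter with an on/off flag. It is an assumption-laden stand-in, not the code in this pull request: `show_string`, `cell_width`, and the `handle_full_width` parameter are hypothetical names that only mirror the intent of the proposed property.

```python
import unicodedata


def cell_width(text, handle_full_width):
    """Width used for padding: display columns if the flag is on, else character count."""
    if not handle_full_width:
        return len(text)
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in text
    )


def show_string(header, rows, handle_full_width=True):
    """Render a right-aligned text table (hypothetical stand-in for Dataset#show)."""
    table = [header] + rows
    widths = [
        max(cell_width(row[i], handle_full_width) for row in table)
        for i in range(len(header))
    ]

    def fmt(row):
        cells = [
            " " * (widths[i] - cell_width(c, handle_full_width)) + c
            for i, c in enumerate(row)
        ]
        return "|" + "|".join(cells) + "|"

    sep = "+" + "+".join("-" * w for w in widths) + "+"
    return "\n".join([sep, fmt(header), sep] + [fmt(r) for r in rows] + [sep])


rows = [["1", "日本語"], ["2", "ab"]]
# Flag off: every line has the same number of characters, which suits parsers.
print(show_string(["id", "name"], rows, handle_full_width=False))
# Flag on: lines align visually in a terminal, but character counts differ.
print(show_string(["id", "name"], rows, handle_full_width=True))
```

With the flag off, a consumer can split every line at the same character offsets; with the flag on, the table is aligned for a human reading it in a monospaced terminal.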
   
   ## How was this patch tested?
   This patch was tested via unit tests.
   
   ## Jira Issue
   https://issues.apache.org/jira/browse/SPARK-26335

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add a property for Dataset#show not to care about wide characters when 
> padding them
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-26335
>                 URL: https://issues.apache.org/jira/browse/SPARK-26335
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Keiji Yoshida
>            Priority: Major
>         Attachments: Screen Shot 2018-12-11 at 17.53.54.png
>
>
> h2. Issue
> https://issues.apache.org/jira/browse/SPARK-25108 makes Dataset#show account 
> for wide (full-width) characters when padding cells. That makes the output of 
> Dataset#show easier for humans to read. On the other hand, it makes the 
> output difficult for programs to parse, because the number of characters in a 
> cell can differ from the number of characters in its header. My company 
> develops and manages a Jupyter/Apache Zeppelin-like visualization tool named 
> "OASIS" 
> ([https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark]).
>  In this application, the output of Dataset#show from a Scala or Python 
> process is parsed and rendered as an HTML table. (A screenshot of OASIS has 
> been attached to this ticket as a file named "Screen Shot 2018-12-11 at 
> 17.53.54.png".)
> h2. Solution
> Add a property that lets Dataset#show ignore wide characters when padding 
> cells.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

