[GitHub] kjmrknsn opened a new pull request #23307: [SPARK-26335][SQL] Add a property for Dataset#show not to care about wide characters when padding them

GitBox Wed, 12 Dec 2018 23:45:48 -0800

kjmrknsn opened a new pull request #23307: [SPARK-26335][SQL] Add a property 
for Dataset#show not to care about wide characters when padding them
URL: https://github.com/apache/spark/pull/23307
 
 
   ## What changes were proposed in this pull request?
   
   ### Issue
   [SPARK-25108](https://issues.apache.org/jira/browse/SPARK-25108) made 
`Dataset#show` account for wide (full-width) characters when padding cells. 
That makes the output of `Dataset#show` easier for humans to read. On the other 
hand, it breaks programs that parse that output, because a cell's length in 
characters can now differ from its header's length. My company develops and 
manages a Jupyter/Apache Zeppelin-like visualization tool named 
[OASIS](https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark).
 In this application, the output of `Dataset#show` from a Scala or Python 
process is parsed and rendered as an HTML table, as follows: 
   
   <img width="1092" alt="screen shot 2018-12-13 at 16 38 58" 
src="https://user-images.githubusercontent.com/31149688/49923017-9e3c6180-fef5-11e8-970b-077bed46cdee.png">
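
The length mismatch described in this issue can be illustrated with a small, self-contained Python sketch. This is not Spark's implementation: the helper names (`display_width`, `pad_by_display_width`, `pad_by_char_count`) are hypothetical, and the use of `unicodedata.east_asian_width` to detect wide characters is an assumption made for illustration only.

```python
import unicodedata


def display_width(s: str) -> int:
    """Count terminal columns, treating wide/full-width characters as two columns."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in s)


def pad_by_display_width(s: str, width: int) -> str:
    """Right-align `s` so that it occupies `width` terminal columns (width-aware padding)."""
    return " " * max(width - display_width(s), 0) + s


def pad_by_char_count(s: str, width: int) -> str:
    """Right-align `s` so that it contains `width` characters (width-unaware padding)."""
    return s.rjust(width)


header = "name"
value = "日本語"  # three wide characters, six terminal columns

# Width-aware padding: the column is as wide as its widest cell in terminal columns.
aware_col = max(display_width(header), display_width(value))
aware_header = pad_by_display_width(header, aware_col)
aware_value = pad_by_display_width(value, aware_col)

# Width-unaware padding: the column is as wide as its longest cell in characters.
plain_col = max(len(header), len(value))
plain_header = pad_by_char_count(header, plain_col)
plain_value = pad_by_char_count(value, plain_col)

# Width-aware cells line up on screen but differ in character count.
print(len(aware_header), len(aware_value))  # 6 3
# Width-unaware cells have equal character counts, so fixed-offset slicing works.
print(len(plain_header), len(plain_value))  # 4 4
```

With width-aware padding, a parser that slices each row at character offsets taken from the header line would cut the data row in the wrong place. With width-unaware padding, the offsets match again, at the cost of visual misalignment in a terminal.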
   
   ### Solution
   Add the `spark.sql.dataset.show.handleFullWidthCharacters` property to 
control whether `Dataset#show` accounts for wide characters when padding cells.
   
   ## How was this patch tested?
   This patch was tested via unit tests.
   
   ## Jira Issue
   https://issues.apache.org/jira/browse/SPARK-26335



