Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21370

@xuanyuanking Just for your reference, the description of this PR could be improved to something like:

> This PR adds eager execution to the `__repr__` and `_repr_html_` methods of the DataFrame API in PySpark. When eager evaluation is enabled, `_repr_html_` returns a rich HTML version of the top-K rows of the DataFrame output. If `_repr_html_` is not called by the REPL, `__repr__` returns the plain text of the top-K rows.

> This PR adds three new external SQL confs for controlling the behavior of eager evaluation:

> - spark.sql.repl.eagerEval.enabled: Whether to enable eager evaluation. When true, the top K rows of the Dataset are displayed if and only if the REPL supports eager evaluation. Currently, eager evaluation is only supported in PySpark. For notebooks like Jupyter, the HTML table (generated by `_repr_html_`) is returned. For the plain Python REPL, the returned output is formatted like `dataframe.show()`.
> - spark.sql.repl.eagerEval.maxNumRows: The max number of rows that are returned by eager evaluation. This only takes effect when `spark.sql.repl.eagerEval.enabled` is set to true.
> - spark.sql.repl.eagerEval.truncate: The max number of characters of each row that is returned by eager evaluation. This only takes effect when `spark.sql.repl.eagerEval.enabled` is set to true.
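To make the proposed behavior concrete, here is a minimal sketch in plain Python of how the three confs could drive `__repr__` and `_repr_html_`. It is not the code in the PR: `ToyFrame` is a hypothetical stand-in for a DataFrame, the conf store is an ordinary dict, the fallback values of 20 are assumptions, and applying the truncate limit to each cell value is one possible reading of the description.

```python
from html import escape


class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not PySpark's implementation."""

    def __init__(self, columns, rows, conf):
        self.columns = list(columns)
        self.rows = list(rows)
        self.conf = conf  # plain dict standing in for the SQL confs

    def _eager_enabled(self):
        return self.conf.get("spark.sql.repl.eagerEval.enabled", "false") == "true"

    def _top_rows(self):
        # Fallback values of 20 are assumptions made for this sketch.
        max_rows = int(self.conf.get("spark.sql.repl.eagerEval.maxNumRows", 20))
        width = int(self.conf.get("spark.sql.repl.eagerEval.truncate", 20))
        # Keep only the first max_rows rows and cut each cell to `width` characters.
        return [[str(v)[:width] for v in row] for row in self.rows[:max_rows]]

    def __repr__(self):
        # Plain REPL path: a schema-like summary unless eager evaluation is on.
        if not self._eager_enabled():
            return "ToyFrame[%s]" % ", ".join(self.columns)
        lines = [" | ".join(self.columns)]
        lines.extend(" | ".join(cells) for cells in self._top_rows())
        return "\n".join(lines)

    def _repr_html_(self):
        # Notebook path: returning None signals that no HTML form is available.
        if not self._eager_enabled():
            return None
        head = "".join("<th>%s</th>" % escape(c) for c in self.columns)
        body = "".join(
            "<tr>%s</tr>" % "".join("<td>%s</td>" % escape(c) for c in cells)
            for cells in self._top_rows()
        )
        return "<table><tr>%s</tr>%s</table>" % (head, body)
```

A notebook front end that knows about `_repr_html_` would call that method, while a plain Python prompt would fall through to `__repr__`. In an actual PySpark session the confs would presumably be set on the session instead, for example with `spark.conf.set("spark.sql.repl.eagerEval.enabled", "true")`.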