timsaucer opened a new pull request, #1036:
URL: https://github.com/apache/datafusion-python/pull/1036
# Which issue does this PR close?
None.
# Rationale for this change
The notebook rendering of DataFrames is very useful, but it can be enhanced.
This PR adds quality-of-life improvements such as:
- The table is now scrollable both vertically and horizontally.
- Instead of collecting an arbitrary 10 rows, we collect up to 2 MB worth of
data.
- For scalars that render to long strings (more than 25 characters), we
truncate them and add a `...` button that expands the cell so you can view it
in its entirety.
- When more data are available than are displayed, we indicate to the user
that the data are truncated.
- When no data are returned, we tell the user so.
# What changes are included in this PR?
This PR adds a feature to collect record batches and uses their size
estimate to collect up to 2 MB worth of data. This is enough for most use
cases to review the data, but it is a constant we can update. We then determine
how many rows to show to the user: as many as fit in 2 MB (a single record
batch can easily hold more than this), but at least 20 rows (also up for
changing). We then render this as an HTML table.
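The collection logic described above can be sketched roughly as follows. This is an illustrative Python sketch and not the code in this PR: the `Batch` class, `collect_for_display`, and the proportional scaling of the row count are assumptions made for the example, and only the 2 MB budget and the 20-row floor come from the description.

```python
from dataclasses import dataclass

MAX_BYTES = 2 * 1024 * 1024  # 2 MB display budget (from the description)
MIN_ROWS = 20                # row floor (from the description)


@dataclass
class Batch:
    """Hypothetical stand-in for a record batch and its size estimate."""
    num_rows: int
    nbytes: int


def collect_for_display(batches, max_bytes=MAX_BYTES, min_rows=MIN_ROWS):
    """Pull batches until the byte budget and the row floor are both met.

    Returns (collected batches, rows to show, whether data were truncated).
    """
    collected = []
    total_rows = 0
    total_bytes = 0
    stream = iter(batches)
    for batch in stream:
        if batch.num_rows == 0:
            continue  # empty batches add nothing to the display
        collected.append(batch)
        total_rows += batch.num_rows
        total_bytes += batch.nbytes
        if total_bytes >= max_bytes and total_rows >= min_rows:
            break

    if total_bytes > max_bytes:
        # Assumption: scale the row count in proportion to the budget,
        # but never go below the row floor or above what was collected.
        scaled = int(total_rows * max_bytes / total_bytes)
        rows_to_show = min(total_rows, max(min_rows, scaled))
    else:
        rows_to_show = total_rows

    # Truncated if we cut rows or if the stream still has non-empty batches.
    has_more = rows_to_show < total_rows or any(b.num_rows > 0 for b in stream)
    return collected, rows_to_show, has_more
```

In this sketch, a single 4 MB batch of 100 rows would be cut to 50 rows and flagged as truncated, while a small result set is shown in full.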
In the rendering, we check whether an individual cell contains more than 25
characters. If so, we show a 25-character snippet of the string representation
of the data and a `...` button with a JavaScript call that updates which data
are displayed in the cell.
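A minimal sketch of the cell truncation, again in Python for illustration only. The function `render_cell`, the element ids, and the `toggleCell` JavaScript helper are hypothetical names chosen for this example; the markup generated by the PR may differ.

```python
import html

MAX_CELL_CHARS = 25  # snippet length (from the description)

# Hypothetical JavaScript helper, emitted once per table, that swaps the
# short and full spans of a cell when its `...` button is clicked.
TOGGLE_SCRIPT = """
<script>
function toggleCell(id) {
  const shortEl = document.getElementById('short-' + id);
  const fullEl = document.getElementById('full-' + id);
  const showFull = fullEl.style.display === 'none';
  fullEl.style.display = showFull ? 'inline' : 'none';
  shortEl.style.display = showFull ? 'none' : 'inline';
}
</script>
"""


def render_cell(value, cell_id, max_chars=MAX_CELL_CHARS):
    """Render one table cell, truncating long values behind a `...` button.

    `cell_id` is assumed to be generated by the renderer (for example from
    the row and column index), so it is safe to embed in the markup.
    """
    text = str(value)
    if len(text) <= max_chars:
        return f"<td>{html.escape(text)}</td>"
    short = html.escape(text[:max_chars])
    full = html.escape(text)
    return (
        "<td>"
        f"<span id='short-{cell_id}'>{short}</span>"
        f"<span id='full-{cell_id}' style='display:none'>{full}</span>"
        f"<button onclick=\"toggleCell('{cell_id}')\">...</button>"
        "</td>"
    )
```

Keeping both the snippet and the full value in the markup means expanding a cell needs no round trip to the kernel, at the cost of a larger HTML payload.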
# Are there any user-facing changes?
Yes, but not to the API. Any user who uses Jupyter notebooks will see
these enhanced tables.
See the screenshots below for examples:

<img width="1022" alt="table-views-2"
src="https://github.com/user-attachments/assets/3098f9a4-f5a5-4658-a3f5-dd6ba7706e4b"
/>
<img width="1127" alt="table-views-3"
src="https://github.com/user-attachments/assets/c73a6118-75ea-4a40-9e50-2aa5718be03c"
/>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]