[ https://issues.apache.org/jira/browse/BEAM-12533?focusedWorklogId=615114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615114 ]
ASF GitHub Bot logged work on BEAM-12533:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Jun/21 17:19
            Start Date: 25/Jun/21 17:19
    Worklog Time Spent: 10m
      Work Description:
rohdesamuel commented on a change in pull request #15072:
URL: https://github.com/apache/beam/pull/15072#discussion_r658921835

##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -1843,9 +1873,69 @@ def repeat(self, repeats, axis):
           f"DeferredSeries (encountered {type(repeats)}).")
 
 
+def _justify_str_column(objs, rjust=True):
+  strs = [str(o) for o in objs]
+  maxlen = max(len(s) for s in strs)
+  return [s.rjust(maxlen) if rjust else s.ljust(maxlen) for s in strs]
+
+
+def _ljustify_str_column(objs):
+  strs = [str(o) for o in objs]
+  maxlen = max(len(s) for s in strs)
+  return [s.ljust(maxlen) for s in strs]
+
+
+def _justify_columns_and_transpose(columns, rjust=True):
+  for row in zip(*[_justify_str_column(objs, rjust) for objs in columns]):
+    yield ' '.join(row)
+
+
 @populate_not_implemented(pd.DataFrame)
 @frame_base.DeferredFrame._register_for(pd.DataFrame)
 class DeferredDataFrame(DeferredDataFrameOrSeries):

Review comment:
Yeah, the first ten or so lines are the same for the series and frames, maybe that can be pulled out?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 615114)
    Time Spent: 2h 20m  (was: 2h 10m)

> DeferedSeries and DeferredDataFrame should have a useful repr
> -------------------------------------------------------------
>
>                 Key: BEAM-12533
>                 URL: https://issues.apache.org/jira/browse/BEAM-12533
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-dataframe
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: P2
>             Fix For: 2.32.0
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> DeferredSeries and DeferredDataFrame just use the default __repr__
> implementation right now, which means outputting them in a notebook is not
> useful at all. Users will need to inspect columns, dtypes, index, name, etc..
> manually. We should include basic information about the frames in a simple
> __repr__ implementation.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)