Brian Hulette created BEAM-12531:
------------------------------------
Summary: ib.show does not handle deferred dataframe instances
Key: BEAM-12531
URL: https://issues.apache.org/jira/browse/BEAM-12531
Project: Beam
Issue Type: Bug
Components: dsl-dataframe
Affects Versions: 2.31.0
Reporter: Brian Hulette
Assignee: Sam Rohde
Fix For: 2.32.0
When passed a deferred dataframe instance (e.g. {{ib.show(counts.nlargest(20, keep='all'))}}), {{ib.show}} tries to iterate over it, which ends up calling {{len()}} on the dataframe and raising a {{WontImplementError}}:
{code}
---------------------------------------------------------------------------
WontImplementError Traceback (most recent call last)
<ipython-input-9-56c2dd81898d> in <module>
----> 1 ib.show(counts.nlargest(20, keep='all'))
2 frames
/usr/local/lib/python3.7/dist-packages/apache_beam/runners/interactive/utils.py in run_within_progress_indicator(*args, **kwargs)
245 def run_within_progress_indicator(*args, **kwargs):
246 with ProgressIndicator('Processing...', 'Done.'):
--> 247 return func(*args, **kwargs)
248
249 return run_within_progress_indicator
/usr/local/lib/python3.7/dist-packages/apache_beam/runners/interactive/interactive_beam.py in show(include_window_info, visualize_data, n, duration, *pcolls)
441 else:
442 try:
--> 443 flatten_pcolls.extend(iter(pcoll_container))
444 except TypeError:
445 raise ValueError(
/usr/local/lib/python3.7/dist-packages/apache_beam/dataframe/frames.py in __len__(self)
695 "len(df) is not currently supported because it produces a non-deferred "
696 "result. Consider using df.length() instead.",
--> 697 reason="non-deferred-result")
698
699 @property # type: ignore
WontImplementError: len(df) is not currently supported because it produces a non-deferred result. Consider using df.length() instead.
For more information see https://s.apache.org/dataframe-non-deferred-result.
{code}
We should support this case, or at least fail gracefully.
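One possible shape for the fix, sketched with stand-ins rather than the real Beam classes: check for a deferred frame before attempting {{iter()}}, and keep it as a single item instead of trying to flatten it. {{FakeDeferredFrame}}, the local {{WontImplementError}}, and {{flatten_inputs}} are hypothetical names used only for illustration; the actual type check and the handling inside {{ib.show}} may need to differ.

```python
class WontImplementError(NotImplementedError):
    """Hypothetical local stand-in for the error seen in the traceback."""


class FakeDeferredFrame:
    """Hypothetical stand-in for a deferred dataframe: len() is unsupported."""

    def __len__(self):
        raise WontImplementError("len(df) is not currently supported")


def flatten_inputs(*containers, deferred_types=(FakeDeferredFrame,)):
    """Flatten iterable containers, keeping deferred frames as single items.

    This mirrors the flattening loop shown in the traceback, but guards
    against deferred frames before iter() is ever called on them.
    """
    flat = []
    for container in containers:
        if isinstance(container, deferred_types):
            # Assumption: a deferred frame is something to show as a whole,
            # not a container of things to show.
            flat.append(container)
            continue
        try:
            flat.extend(iter(container))
        except TypeError:
            raise ValueError(
                "Expected an iterable container or a deferred dataframe, "
                "got %r" % type(container).__name__) from None
    return flat


df = FakeDeferredFrame()
flattened = flatten_inputs([1, 2], df, (3,))
```

If supporting deferred frames turns out to be out of scope, the same {{isinstance}} check could instead raise a {{ValueError}} with a clear message, which would cover the "fail gracefully" option.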