[ 
https://issues.apache.org/jira/browse/BEAM-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Rohde reassigned BEAM-12388:
--------------------------------

    Assignee: Sam Rohde

> Improve caching experience on InteractiveRunner with dataframes
> ---------------------------------------------------------------
>
>                 Key: BEAM-12388
>                 URL: https://issues.apache.org/jira/browse/BEAM-12388
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-py-interactive
>            Reporter: Sam Rohde
>            Assignee: Sam Rohde
>            Priority: P2
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Reusing the default label for to_pcollection when using the interactive 
> runner results in caching errors when used with multiple pipelines:
>  
>  
> {{Traceback (most recent call last):
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_runner_test.py",
>  line 389, in test_dataframes_with_multi_index_get_result
>     pd.testing.assert_series_equal(df_expected, ib.collect(deferred_df, n=10))
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/utils.py",
>  line 247, in run_within_progress_indicator
>     return func(*args, **kwargs)
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/interactive_beam.py",
>  line 579, in collect
>     recording = recording_manager.record([pcoll], max_n=n, 
> max_duration=duration)
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py",
>  line 433, in record
>     self._watch(pcolls)
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/runners/interactive/recording_manager.py",
>  line 306, in _watch
>     for pcoll in to_pcollection(*watched_dataframes, 
> always_return_tuple=True):
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/dataframe/convert.py", 
> line 196, in to_pcollection
>     new_results = {p: extract_input(p)
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", 
> line 1086, in __ror__
>     return self.transform.__ror__(pvalueish, self.label)
>   File 
> "/home/srohde/Workdir/beam/sdks/python/apache_beam/transforms/ptransform.py", 
> line 587, in __ror__
>     raise ValueError(
> ValueError: Mixing value from different pipelines not allowed.}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to