[ https://issues.apache.org/jira/browse/BEAM-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Hoyer closed BEAM-4591.
-------------------------------
    Resolution: Fixed
    Fix Version/s: Not applicable

> beam.Create should be splittable
> --------------------------------
>
>                 Key: BEAM-4591
>                 URL: https://issues.apache.org/jira/browse/BEAM-4591
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core, sdk-py-core
>            Reporter: Stephan Hoyer
>            Priority: Major
>             Fix For: Not applicable
>
>
> beam.Create() should be splittable. This would allow the unintuitive
> "Reshuffle" step below to be safely omitted:
>
> {{pipeline = (}}
> {{ beam.Create(range(large_number))}}
> {{ | beam.Reshuffle() # prevent task fusion}}
> {{ | beam.Map(very_expensive_function)}}
> {{ ...}}
> {{)}}
>
> These sorts of pipelines, with small inputs to expensive CPU-bound tasks,
> arise frequently in scientific computing use cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)