[ 
https://issues.apache.org/jira/browse/BEAM-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9748:
-------------------------------
    Description: Some DoFn-based IOs, such as JdbcIO and RedisIO, rely on a 
different approach to reparallelize their output: an empty PCollectionView 
forces materialization, and Reshuffle.viaRandomKey then reparallelizes the 
PCollection. This issue extracts that transform and exposes it as part of 
Reshuffle, to avoid repeating the code in transforms (notably IOs) that 
produce large amounts of sequentially generated data and benefit from this 
alternative approach to reparallelizing their output.  (was: 
Some DoFn-based IOs like JdbcIO and RedisIO rely on the Reparallelize transform,
a combination of an empty PCollectionView and Reshuffle, to force
materialization and reparallelize a PCollection. The idea of this issue is to
extract this transform and expose it as part of the internal Reshuffle
transform to avoid repeating the code in transforms (notably IOs) that need
to reparallelize their output.)

> Add Reshuffle.ForSequentiallyGeneratedInput transform
> -----------------------------------------------------
>
>                 Key: BEAM-9748
>                 URL: https://issues.apache.org/jira/browse/BEAM-9748
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Ismaël Mejía
>            Assignee: Ismaël Mejía
>            Priority: Minor
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Some DoFn-based IOs, such as JdbcIO and RedisIO, rely on a different 
> approach to reparallelize their output: an empty PCollectionView forces 
> materialization, and Reshuffle.viaRandomKey then reparallelizes the 
> PCollection. This issue extracts that transform and exposes it as part of 
> Reshuffle, to avoid repeating the code in transforms (notably IOs) that 
> produce large amounts of sequentially generated data and benefit from this 
> alternative approach to reparallelizing their output.
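
For readers unfamiliar with the pattern, the approach described above can be sketched roughly as follows. This is a minimal illustration and not the code attached to this issue: the class name ReparallelizeSketch is hypothetical, and the structure is only assumed from the description (an empty PCollectionView consumed as a side input to force materialization, followed by Reshuffle.viaRandomKey). It needs the Beam Java SDK on the classpath.

```java
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.Filter;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.Reshuffle;
import org.apache.beam.sdk.transforms.SerializableFunctions;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;

// Hypothetical name; a sketch of the pattern, not the transform added by this issue.
class ReparallelizeSketch<T> extends PTransform<PCollection<T>, PCollection<T>> {
  @Override
  public PCollection<T> expand(PCollection<T> input) {
    // Step 1: derive an always-empty view from the input. Because the view
    // depends on the input, a consumer of the view must wait for the input
    // to be fully produced, which forces materialization.
    PCollectionView<Iterable<T>> empty =
        input
            .apply("Consume", Filter.by(SerializableFunctions.constant(false)))
            .apply(View.asIterable());

    // Step 2: pass every element through unchanged, declaring the empty view
    // as a side input so this step cannot be fused with the producer.
    PCollection<T> materialized =
        input.apply(
            "Identity",
            ParDo.of(
                    new DoFn<T, T>() {
                      @ProcessElement
                      public void process(ProcessContext c) {
                        c.output(c.element());
                      }
                    })
                .withSideInputs(empty));

    // Step 3: redistribute the materialized elements under random keys.
    return materialized.apply(Reshuffle.viaRandomKey());
  }
}
```

An IO would apply such a transform directly after the DoFn that emits its sequentially generated records, for example records.apply(new ReparallelizeSketch<>()), so that downstream steps are not limited to the parallelism of the reading step.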



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
