Hello,

Right now, if someone needs sharded files via FileIO, there is only one
option: random (round-robin) shard assignment per element, which always
uses ShardedKey<Integer> as the key for the GBK that follows.

I would like to generalize this and make it possible to provide a
ShardingFn[UserT, DestinationT, ShardKeyT] via FileIO.
What I am mainly after is an optimisation for the Flink runtime: passing in
a special function which generates shard keys in such a way that they are
evenly spread among workers (BEAM-5865).
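
To make the idea a bit more concrete, here is a rough sketch in plain Java.
It is only an illustration under assumptions: ShardingFn, assignShardKey,
RoundRobinShardingFn and HashShardingFn are hypothetical names, not existing
Beam API, and the round-robin variant only approximates what FileIO does
today. A Flink-aware implementation would plug in at the same place but
choose its keys based on how the runner distributes keys to workers, which
is not shown here.

```java
// Hypothetical sketch: none of these types exist in Beam under these names.
final class ShardingSketch {

  /** Proposed hook: maps an element and its destination to a shard key. */
  interface ShardingFn<UserT, DestinationT, ShardKeyT> {
    ShardKeyT assignShardKey(DestinationT destination, UserT element, int shardCount);
  }

  /** Approximates the current behaviour: cycle through shard numbers per element. */
  static final class RoundRobinShardingFn<UserT, DestinationT>
      implements ShardingFn<UserT, DestinationT, Integer> {
    private int next = 0;

    @Override
    public Integer assignShardKey(DestinationT destination, UserT element, int shardCount) {
      int shard = next;
      next = (next + 1) % shardCount; // advance to the next shard, wrapping around
      return shard;
    }
  }

  /** Deterministic alternative: derive the shard from destination and element hashes. */
  static final class HashShardingFn<UserT, DestinationT>
      implements ShardingFn<UserT, DestinationT, Integer> {
    @Override
    public Integer assignShardKey(DestinationT destination, UserT element, int shardCount) {
      int hash = 31 * java.util.Objects.hashCode(destination)
          + java.util.Objects.hashCode(element);
      return Math.floorMod(hash, shardCount); // always in [0, shardCount)
    }
  }
}
```

The third type parameter is the point of the proposal: the key used for the
GBK no longer has to be an Integer shard number, so a runner-specific
function is free to return whatever key type spreads best.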

Would such an extension to FileIO make sense? If yes, I would create a
ticket for it and try to draft a PR.

Best,
Jozef