[ https://issues.apache.org/jira/browse/BEAM-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106772#comment-17106772 ]
Darshan Jani commented on BEAM-9946: ------------------------------------ PR: https://github.com/apache/beam/pull/11682 > Enhance Partition transform to provide partitionfn with SideInputs > ------------------------------------------------------------------ > > Key: BEAM-9946 > URL: https://issues.apache.org/jira/browse/BEAM-9946 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core > Reporter: Darshan Jani > Assignee: Darshan Jani > Priority: Major > Original Estimate: 96h > Remaining Estimate: 96h > > Currently _Partition_ transform can partition a collection into n collections > based on only _element_ value in _PartitionFn_ to decide on which partition a > particular element belongs to. > {code:java} > public interface PartitionFn<T> extends Serializable { > int partitionFor(T elem, int numPartitions); > } > public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> > partitionFn) { > return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn)); > } > {code} > It will be useful to introduce new API with additional _sideInputs_ provided > to partition function. User will be able to write logic to use both _element_ > value and _sideInputs_ to decide on which partition a particular element > belongs to. > Option-1: Proposed new API: > {code:java} > public interface PartitionWithSideInputsFn<T> extends Serializable { > int partitionFor(T elem, int numPartitions, Context c); > } > public static <T> Partition<T> of(int numPartitions, > PartitionWithSideInputsFn<? super T> partitionFn, Requirements requirements) { > ... > } > {code} > User can use any of the two APIs as per there partitioning function logic. > Option-2: Redesign old API with Builder Pattern which can provide optionally > a _Requirements_ with _sideInputs._ Deprecate old API. > {code:java} > // using sideviews > Partition.into(numberOfPartitions).via( > fn( > (input,c) -> { > // use c.sideInput(view) > // use input > // return partitionnumber > },requiresSideInputs(view)) > ) > // without using sideviews > Partition.into(numberOfPartitions).via( > fn((input,c) -> { > // use input > // return partitionnumber > }) > ) > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)