[
https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249428#comment-15249428
]
ASF GitHub Bot commented on FLINK-2998:
---------------------------------------
Github user gallenvara commented on the pull request:
https://github.com/apache/flink/pull/1838#issuecomment-212300584
@fhueske PR updated.
I am a little confused when i wrote the tests. The original dataset handled
by a `map` operator to ensure that the type of partition key is same with the
boundary in the supplied distribution. In the `DataDistribution` interface, the
type of `getBucketBoundary` method returned is `Object[]`. My doubt is whether
this can be changed to type of `Tuple`. I mean that when range partition by one
field, it return `Tuple1` and two fields return `Tuple2`. Also in the
`OutputEmmiter`, change the type of keys from `Object[]` to `Tuple` and
comparing the key with boundary using `Tuple` comparator. If this is possible,
the boundaries in the distribution for rangePartition test will be:
`Tuple2<Integer, Long>[] boundaries = new Tuple2[]{
new Tuple2(1, 1L),
new Tuple2(3, 2L),
....
}`
This can make the test more succinct and direct.
Another confusing is that why partitionByHash and partitionByRange do not
support some KeySelectors returned Tuple type such as:
```
public static class KeySelector3 implements
KeySelector<Tuple3<Integer,Long,String>, Tuple2<Integer,Long>> {
private static final long serialVersionUID = 1L;
@Override
public Tuple2<Integer,Long> getKey(Tuple3<Integer,Long,String>
in) {
return new Tuple2<>(in.f0,in.f1);
}
}
```
and can not run the following codes:
```
DataSet<Tuple3<Integer,Long,String>> dataSet = ...;
dataSet.partitionByRange(new KeySelector3());
```
Can you explain it to me?Thanks!
> Support range partition comparison for multi input nodes.
> ---------------------------------------------------------
>
> Key: FLINK-2998
> URL: https://issues.apache.org/jira/browse/FLINK-2998
> Project: Flink
> Issue Type: New Feature
> Components: Optimizer
> Reporter: Chengxiang Li
> Priority: Minor
>
> The optimizer may have potential opportunity to optimize the DAG while it
> found two input range partition are equivalent, we does not support the
> comparison yet.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)