[ https://issues.apache.org/jira/browse/FLINK-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681198#comment-16681198 ]
zhijiang edited comment on FLINK-10662 at 11/9/18 10:07 AM:
------------------------------------------------------------

[~pnowojski], I am currently working on this. If we change {{ChannelSelector#selectChannels}} to return a single {{int}}, there are a few options for handling the special {{BroadcastPartitioner}} implementation.

# Return an arbitrary int such as -1 from {{BroadcastPartitioner}}, because this value would never actually be used on the {{RecordWriter}} side. We can add a shortcut branch for {{BroadcastPartitioner}} in {{RecordWriter}}. This is easy to implement, but returning a dummy value from the {{selectChannels}} method may seem a little hacky.
# Return a `tuple<boolean, int>` from {{ChannelSelector#selectChannels}}, where the boolean indicates whether the record is broadcast. If it is, we ignore the selected channel index in the second field of the tuple. I am wondering whether this would add overhead compared with the first option.
# Define a high-level interface called {{ChannelSelectorBase}} with no methods of its own. {{ChannelSelector}} then extends {{ChannelSelectorBase}}, and {{BroadcastPartitioner}} implements {{ChannelSelectorBase}} directly, because it does not need any method that returns selected channels. This changes the current class structure, and we would also need to adjust the API in {{DataStream}} to reference {{ChannelSelectorBase}} instead of {{StreamPartitioner}}. Handling all the related references would be more complex.

Currently, I would prefer the first option although it seems a little hacky. Or do you have any better suggestions? :)

> Refactor the ChannelSelector interface for single selected channel
> ------------------------------------------------------------------
>
>                 Key: FLINK-10662
>                 URL: https://issues.apache.org/jira/browse/FLINK-10662
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Network
>    Affects Versions: 1.5.4, 1.6.1
>            Reporter: zhijiang
>            Assignee: zhijiang
>            Priority: Minor
>
> In the discussion of the broadcast improvement, [~pnowojski] pointed out the
> issue of improving the current channel selector.
>
> {{ChannelSelector#selectChannels}} returns an array of selected
> channels. But looking at the specific implementations, only
> {{BroadcastPartitioner}} selects all the channels; the other
> implementations select a single channel. So we can simplify this interface to
> return a single channel index, which benefits performance, and special-case
> {{BroadcastPartitioner}} in a more efficient way.
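
For illustration, the first option (a single returned index plus a broadcast shortcut in the writer) could look roughly like the sketch below. This is not Flink code: {{SingleChannelSelector}}, {{HashSelector}}, {{BroadcastSelector}}, {{SketchWriter}} and the {{isBroadcast()}} flag are hypothetical names invented for this example, and the comment does not say how {{RecordWriter}} would actually detect the broadcast case. The channels are modelled as in-memory lists so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simplified selector: one channel index per record. */
interface SingleChannelSelector<T> {
    int selectChannel(T record, int numberOfChannels);

    /** Assumed flag that lets the writer take the broadcast shortcut. */
    default boolean isBroadcast() {
        return false;
    }
}

/** Stand-in for a non-broadcast partitioner: picks exactly one channel. */
final class HashSelector<T> implements SingleChannelSelector<T> {
    @Override
    public int selectChannel(T record, int numberOfChannels) {
        return Math.floorMod(record.hashCode(), numberOfChannels);
    }
}

/** Stand-in for the broadcast case: the returned index is a dummy value. */
final class BroadcastSelector<T> implements SingleChannelSelector<T> {
    static final int DUMMY_CHANNEL = -1;

    @Override
    public int selectChannel(T record, int numberOfChannels) {
        return DUMMY_CHANNEL; // never read by the writer below
    }

    @Override
    public boolean isBroadcast() {
        return true;
    }
}

/** Toy writer; each "channel" is just a list of records. */
final class SketchWriter<T> {
    private final SingleChannelSelector<T> selector;
    private final List<List<T>> channels = new ArrayList<>();

    SketchWriter(SingleChannelSelector<T> selector, int numberOfChannels) {
        this.selector = selector;
        for (int i = 0; i < numberOfChannels; i++) {
            channels.add(new ArrayList<>());
        }
    }

    void emit(T record) {
        if (selector.isBroadcast()) {
            // Shortcut branch: write to every channel, skip selection entirely.
            for (List<T> channel : channels) {
                channel.add(record);
            }
        } else {
            // Common case: a single index, no array allocated per record.
            int index = selector.selectChannel(record, channels.size());
            channels.get(index).add(record);
        }
    }

    List<T> channel(int index) {
        return channels.get(index);
    }
}
```

In this sketch the dummy -1 is never used as an index, which matches the "a little hacky" trade-off described in the first option: the method still has to return something even though the broadcast path ignores it.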