Hi everyone,
      In some scenarios we met a requirement that some operators want to send 
records to theirs downstream operators with an multicast communication pattern. 
In detail, for some records, the operators want to send them according to the 
partitioner (for example, Rebalance), and for some other records, the operators 
want to send them to all the connected operators and tasks. Such a 
communication pattern could be viewed as a kind of multicast: it does not 
broadcast every record, but some record will indeed be sent to multiple 
downstream operators.

However, we found that this kind of communication pattern seems could not be 
implemented rightly if the operators have multiple consumers with different 
parallelism, using the customized partitioner. To solve the above problem, we 
propose to enhance the support for such kind of irregular communication 
pattern. We think there may be two options:

     1. Support a kind of customized operator events, which share much 
similarity with Watermark, and these events can be broadcasted to the 
downstream operators separately.
     2. Let the channel selector supports multicast, and also add the separate 
RecordWriter implementation to avoid impacting the performance of the channel 
selector that does not need multicast.

The problem and options are detailed in 
https://docs.google.com/document/d/1npi5c_SeP68KuT2lNdKd8G7toGR_lxQCGOnZm_hVMks/edit?usp=sharing

We are also wondering if there are other methods to implement this requirement 
with or without changing Runtime. Very thanks for any feedbacks !


Best,
Yun

Reply via email to