bkietz commented on pull request #10204:
URL: https://github.com/apache/arrow/pull/10204#issuecomment-834542335


   I would work around this with an API for producing conjoined split nodes:
   
   ```c++
   /// Split the output of `to_be_split` for consumption by multiple downstream nodes.
   ///
   /// Each split node will list `to_be_split` as its only input, though `to_be_split`
   /// will only consider the first split node as its output. Whenever `to_be_split`
   /// pushes to the first split node, that data will be replicated to the outputs of
   /// each split node. Back pressure on any split node will also be felt by
   /// `to_be_split`.
   std::vector<std::unique_ptr<ExecNode>> MakeSplit(ExecNode* to_be_split, int n_splits);
   ```
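   
   A minimal sketch of how such conjoined split nodes could behave, assuming toy
   stand-ins rather than the real `ExecNode` interface: `Batch`, `ToyNode`,
   `ToySplitNode`, `ToySource`, `ToySink` and `MakeToySplit` are all hypothetical
   names, a batch is just an `int`, and back pressure is not modeled.
   
   ```cpp
   #include <memory>
   #include <vector>
   
   // Hypothetical stand-in for a record batch; NOT Arrow's real type.
   using Batch = int;
   
   // Hypothetical node with exactly one output, mirroring the single-consumer API.
   struct ToyNode {
     ToyNode* output = nullptr;
     virtual ~ToyNode() = default;
     virtual void InputReceived(Batch batch) = 0;
   };
   
   // One of the conjoined split nodes produced by a single MakeToySplit call.
   struct ToySplitNode : ToyNode {
     ToyNode* input = nullptr;
     // All split nodes from the same call share this list of siblings.
     std::shared_ptr<std::vector<ToySplitNode*>> siblings;
   
     // Called on the first split node only; replicates the batch to the
     // output of every sibling (including its own).
     void InputReceived(Batch batch) override {
       for (ToySplitNode* sibling : *siblings) {
         if (sibling->output != nullptr) sibling->output->InputReceived(batch);
       }
     }
   };
   
   // Assumed shape of the proposed MakeSplit, on the toy types above.
   std::vector<std::unique_ptr<ToySplitNode>> MakeToySplit(ToyNode* to_be_split,
                                                           int n_splits) {
     auto siblings = std::make_shared<std::vector<ToySplitNode*>>();
     std::vector<std::unique_ptr<ToySplitNode>> nodes;
     for (int i = 0; i < n_splits; ++i) {
       std::unique_ptr<ToySplitNode> node(new ToySplitNode);
       node->input = to_be_split;  // every split node lists the same input
       node->siblings = siblings;
       siblings->push_back(node.get());
       nodes.push_back(std::move(node));
     }
     // The upstream node only knows about the first split node.
     if (!nodes.empty()) to_be_split->output = nodes.front().get();
     return nodes;
   }
   
   // Toy producer: forwards whatever it emits to its single output.
   struct ToySource : ToyNode {
     void InputReceived(Batch) override {}
     void Emit(Batch batch) {
       if (output != nullptr) output->InputReceived(batch);
     }
   };
   
   // Toy consumer: records every batch it receives.
   struct ToySink : ToyNode {
     std::vector<Batch> seen;
     void InputReceived(Batch batch) override { seen.push_back(batch); }
   };
   ```
   
   With two sinks attached as the outputs of the two split nodes, each call to
   `Emit` on the source reaches both sinks, while the source itself still holds a
   single output pointer.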
   
   I think the multiple-consumer case will be sufficiently uncommon that making it a
   first-class citizen of the API would be more confusing than dedicated split nodes.
   Doubly so since multiple inputs and multiple outputs are semantically different: a
   FilterNode would have one input for values-to-be-filtered and a second for
   masks/selection vectors, whereas any node with multiple outputs would be pushing
   identical batches to each.
   
   @michalursa



