paleolimbot commented on PR #43632:
URL: https://github.com/apache/arrow/pull/43632#issuecomment-2286856527

   Echoing all the thanks to Weston for the detailed response!
   
   I wonder if it is worth clarifying the goals and non-goals of this proposal. 
In my mind, this is about reconciling two very different ways engines/APIs 
operate (push vs. pull). I don't have much experience on the performance side, 
but on the development time/lines-of-code side, trying to make a producer that 
expects to push its output interact with a consumer that wants to pull is 
expensive (the reverse is also true). This gets more and more complicated the 
more times this mismatch is encountered in a pipeline.
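   
   To make the mismatch concrete, here is a minimal sketch in Python (not the proposed ABI, and not any Arrow API). Every name in it (`push_producer`, `pull_from_push`, `max_buffered`) is hypothetical. It assumes a producer that pushes batches into a callback and a consumer that wants an iterator, and bridges them with a thread and a bounded queue:

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def push_producer(on_batch):
    """Hypothetical push-style producer: calls on_batch once per batch."""
    for i in range(3):
        on_batch([i, i + 1])


def pull_from_push(producer, max_buffered=2):
    """Adapt a push-style producer into a pull-style iterator.

    The producer runs on its own thread; the bounded queue provides
    backpressure by blocking the producer when the consumer falls behind.
    Error propagation is deliberately omitted: if the producer raises,
    the stream simply ends early.
    """
    buf = queue.Queue(maxsize=max_buffered)

    def run():
        try:
            producer(buf.put)
        finally:
            buf.put(_DONE)

    threading.Thread(target=run, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


batches = list(pull_from_push(push_producer))
```

   Even this toy adapter needs a thread, a buffer, and an end-of-stream convention, and a real one would also have to handle errors and cancellation. That cost is paid again at every push/pull boundary in a pipeline.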
   
   I worry that in the quest for the best possible performance we lose 
any development time/lines-of-code advantage that a simpler approach might have 
enabled! I also worry that an ABI that becomes too opinionated about how a 
scanner should be implemented will still not be able to express other ("non 
optimal"?) scanners that, for historical reasons (or because we were wrong 
about what an optimal scanner looks like), don't work that way. I still think 
that something like the original proposal (with clear, if imperfect, 
expectations about what can or should happen in the callbacks) is *a* missing 
piece (if not *the* missing piece).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
