westonpace commented on code in PR #43632:
URL: https://github.com/apache/arrow/pull/43632#discussion_r1712647337


##########
cpp/src/arrow/c/abi.h:
##########
@@ -228,6 +228,65 @@ struct ArrowDeviceArrayStream {
 
 #endif  // ARROW_C_DEVICE_STREAM_INTERFACE
 
+#ifndef ARROW_C_ASYNC_STREAM_INTERFACE
+#define ARROW_C_ASYNC_STREAM_INTERFACE
+
+// Similar to ArrowDeviceArrayStream, except designed for an asynchronous
+// style of interaction. While ArrowDeviceArrayStream provides producer
+// defined callbacks, this is intended to be created by the consumer instead.
+// The consumer passes this handler to the producer, which in turn uses the
+// callbacks to inform the consumer of events in the stream.
+struct ArrowAsyncDeviceStreamHandler {
+  // Handler for receiving a schema. The passed in stream_schema should be
+  // released or moved by the handler (producer is giving ownership of it to
+  // the handler).
+  //
+  // The `extension_param` argument can be null or can be used by a producer
+  // to pass arbitrary extra information to the consumer (such as total number
+  // of rows, context info, or otherwise).
+  //
+  // Return value: 0 if successful, `errno`-compatible error otherwise
+  int (*on_schema)(struct ArrowAsyncDeviceStreamHandler* self,
+                   struct ArrowSchema* stream_schema, void* extension_param);
+
+  // Handler for receiving an array/record batch. Always called at least once
+  // unless an error is encountered (which would result in calling on_error).
+  // An empty/released array is passed to indicate the end of the stream if no
+  // errors have been encountered.
+  //
+  // The `extension_param` argument can be null or can be used by a producer
+  // to pass arbitrary extra information to the consumer.
+  //
+  // Return value: 0 if successful, `errno`-compatible error otherwise.
+  int (*on_next)(struct ArrowAsyncDeviceStreamHandler* self,
+                 struct ArrowDeviceArray* next, void* extension_param);

Review Comment:
   What should happen if the consumer needs to do some lengthy processing of 
`next` (e.g. maybe the producer is a simple scanner and the consumer is running 
a complex analysis query).  Should it force the creation of a new thread task 
(which is annoying because the array is probably in the CPU cache already and 
we don't want to have to process it on a new thread) or should it do the 
analysis as part of the call to `on_next` (which is weird because now we are 
tying up a producer's thread and would that interfere with the production of 
new data?).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to