paleolimbot opened a new pull request, #336: URL: https://github.com/apache/arrow-nanoarrow/pull/336
An early commit implemented `nanoarrow::EmptyArrayStream` and `nanoarrow::VectorArrayStream` in the nanoarrow.hpp C++ helpers. The intention at the time was to make it "easy"/idiomatic to use a C++ object to back an `ArrowArrayStream`; however, it was in practice difficult to actually make work (I tried briefly and gave up when writing a [dummy ADBC driver](https://github.com/voltrondata-labs/blog-posts-code/blob/main/2023-08-nanoarrow-adbc/simple_csv_reader.cc#L52-L177) for [this blog post](https://voltrondata.com/resources/nanoarrow-lightweight-embeddable-arrow-implementation-data-pipelines). This PR deprecates the original syntax and migrates it to the following. Implementation: ```cpp class StreamImpl { public: // Public methods (e.g., constructor) used from C++ to initialize relevant data // Idiomatic exporter to move data + lifecycle responsibility to an instance // managed by the ArrowArrayStream callbacks void ToArrayStream(struct ArrowArrayStream* out) { ArrayStreamFactory<StreamImpl>::InitArrayStream(new StreamImpl(...), out); } private: // Make relevant methods available to the ArrayStreamFactory friend class ArrayStreamFactory<StreamImpl>; // Method implementations (called from C, not normally interacted with from C++) int GetSchema(struct ArrowSchema* schema) { return ENOTSUP; } int GetNext(struct ArrowArray* array) { return ENOTSUP; } const char* GetLastError() { nullptr; } }; ``` Usage: ```cpp // Call constructor and/or public methods to initialize relevant data StreamImpl impl; // Export to ArrowArrayStream after data are finalized UniqueArrayStream stream; impl.ToArrayStream(stream.get()); ``` I'm open to suggestions on how to make that better! It might be that the `ToArrayStream()` bit is confusing (i.e., just use `ArrayStreamFactory<>::InitArrayStream(new StreamImpl())` directory) but it also seemed better to keep the lines where a raw pointer was floating around to be entirely contained within the `StreamImpl` class. It also fixes an issue with the `XXX_pointer()` functions, which because of the way they were declared, required that `nanoarrow.hpp` be confusingly included *after* `nanoarrow_ipc.hpp`. The correct way to do this (I think) was to declare the template and add implementations (rather than use overloads). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
