What are you trying to achieve by converting these structs to arrays partitioned by column? Are you transferring batches of them to or from somewhere? Arrow is a columnar format, so it is a poor fit if you intend to process records one at a time.
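
To make the row-wise versus column-wise distinction concrete, here is a minimal sketch in plain Python (standard library only, no Arrow library involved). The record layout, the field names, and the helpers `parse_records` and `to_columns` are all hypothetical and are not taken from smf84fmt.h. It parses a buffer of fixed-width records one at a time and then transposes the batch into one list per field, which is roughly the shape a columnar format expects.

```python
import struct
from collections import namedtuple

# Hypothetical fixed-width record layout (NOT taken from smf84fmt.h):
# big-endian uint16 record type, uint32 timestamp, float64 value.
RECORD = struct.Struct(">HId")
FIELDS = ("rec_type", "timestamp", "value")
Record = namedtuple("Record", FIELDS)


def parse_records(buf: bytes) -> list:
    """Row-wise view: one Record tuple per fixed-width slice of buf."""
    return [Record._make(values) for values in RECORD.iter_unpack(buf)]


def to_columns(records) -> dict:
    """Column-wise view: one list per field, holding that field for every row."""
    return {name: [getattr(r, name) for r in records] for name in FIELDS}


# Build three sample records, parse them row by row, then transpose the batch.
raw = b"".join(RECORD.pack(84, 1709700000 + i, i * 0.5) for i in range(3))
rows = parse_records(raw)
batch = to_columns(rows)

print(rows[1])             # a single record, convenient for one-at-a-time work
print(batch["timestamp"])  # a single column, convenient for batch work
```

The transposition only pays off when many records are handled together, for example when writing a batch out to a file. For strictly one-record-at-a-time processing, the row-wise tuples are already the simpler representation.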

On Wed, Mar 6, 2024 at 12:33 PM kekronbekron <kekronbek...@protonmail.com> wrote:
>
> Also considering derive crates for Arrow, but it seems to be very early days
> for it.
> If I can go from Rust structures to Arrow through derive macros, that would
> be the least amount of work one has to do as a *user*.
> Code for such derive macros is certainly a lot of work...
> There's arrow2_convert, serde_arrow, and narrow. narrow seems to be more
> promising.
>
> Although I conceptually like the example you've shown (python cffi + header
> file to generate schema, then running the C program),
> I wonder if I'm better off with python/rust (than C/C++), despite needing to
> type out the structures manually for python/rust.
>
>
> On Wednesday, March 6th, 2024 at 19:07, Dewey Dunnington via user
> <user@arrow.apache.org> wrote:
>
> > Hi KB,
> >
> > I imagine you will need a mix of generated and manually typed code to
> > generate the ArrowSchema from the definition and recipe to build the
> > ArrowArray from an instance, perhaps starting with well-tested
> > manually typed code that you replace with generated code as patterns
> > appear.
> >
> > I think nanoarrow is appropriate for what you are trying to do...it
> > provides a "straightforward" (in terms of packaging complexity) path
> > to wrapping your generator functions in Rust and Python. We haven't
> > done a great job of documenting how to do that with examples but feel
> > free to ask here or open an issue in apache/arrow-nanoarrow asking for
> > help until we do.
> >
> > Cheers!
> >
> > -dewey
> >
> > On Tue, Mar 5, 2024 at 11:14 PM kekronbekron
> > kekronbek...@protonmail.com wrote:
> >
> > > Hi Dewey,
> > >
> > > Thank you for taking the time.
> > > My goal is to convert from a variety of big C data structures like this
> > > to equivalent Arrow spec/schema.
> > > Then, I would like to store them (RecordBatches) to parquet or any other
> > > relevant type.
> > > The CSV or JSON output from the example C program (smf84fmt) doesn't
> > > matter; just wanted to point to the sample data format as in the header
> > > file.
> > >
> > > I had tried bindgen to create Rust definitions from the header files, but
> > > it gets complicated real fast... more than I can comprehend at least.
> > >
> > > The types get crazier too, with singly linked lists (not there in the
> > > linked example, but in other types), etc.
> > >
> > > Would really like to solve this in a systematic way, without needing to
> > > hand code the Arrow schema...
> > > Because the C header files are maintained (by a provider), it would work
> > > out best if it's possible to create a conversion script, and then use the
> > > Arrow schema in Python/Rust/etc.
> > >
> > > -KB
> > >
> > > On Wednesday, March 6th, 2024 at 07:59, Dewey Dunnington via user
> > > user@arrow.apache.org wrote:
> > >
> > > > Hi KB,
> > > >
> > > > There might be some other approaches I'm not aware of; however, I had
> > > > some fun with Python's cffi package to generate some (untested)
> > > > nanoarrow code based on the struct definitions [1]. If all you need
> > > > are the types in Python or some other higher-level language (e.g., to
> > > > read one of the CSV or JSON files generated by the tool you linked),
> > > > you could generate Python code instead.
> > > >
> > > > I hope that's helpful!
> > > >
> > > > -dewey
> > > >
> > > > [1] https://gist.github.com/paleolimbot/e1667a57f837e4db7e973b9677e33ddb
> > > >
> > > > On Sun, Mar 3, 2024 at 10:08 PM kekronbekron
> > > > kekronbek...@protonmail.com wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Say I have a whole bunch of fully typed (with unions and all) data
> > > > > structures like the one here -
> > > > > https://github.com/IBM/IBM-Z-zOS/blob/main/SMF-Tools/SMF84Formatter/smf84fmt.h.
> > > > > Say I'm parsing bytes with such a header...is it possible to then use
> > > > > Arrow's C data interface (or maybe nanoarrow) to painlessly convert
> > > > > such a struct to Arrow type(s)?
> > > > >
> > > > > - KB