chmp commented on pull request #1383:
URL: https://github.com/apache/arrow-rs/pull/1383#issuecomment-1059548659
Sure, happy to sum up `serde_arrow`. First: at the moment it's only an
experiment. I found it useful for some private data processing and thought
it might also be helpful to others. It is usable, but it only handles the
cases I have needed so far.
The basic idea is to allow record batches to be used as a "file format" for
serde. A typical way of using serde is to take some Rust structs, implement
(or derive) `Serialize` for them, and then generate JSON from these structs.
`serde_arrow` allows you to use the same structs, but to generate record
batches instead. So you don't have to use the `arrow` builder API, but can
simply derive `serde::Serialize` and then call `serde_arrow::to_record_batch`.
The reverse process, reading structs from a record batch, is also supported.
One complication is that the data models of `serde` and `arrow` do not match.
For that reason, `serde_arrow` offers additional logic that lets the user
specify how to translate the `serde` data model into the `arrow` one and back
again.
For example:
```rust
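use serde::Serialize;

#[derive(Serialize)]
struct Example {
    a: f32,
    b: i32,
}

let examples = vec![
    Example { a: 1.0, b: 1 },
    Example { a: 2.0, b: 2 },
];

// `?` assumes an enclosing function that returns a compatible `Result`.
// Derive the schema from the records, then build the record batch.
let schema = serde_arrow::Schema::from_records(&examples)?;
let batch = serde_arrow::to_record_batch(&examples, schema)?;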
```
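To make the row-to-column idea concrete without any dependencies, here is a rough std-only sketch of the transposition that producing a record batch involves. This is not how `serde_arrow` is implemented. `ExampleColumns`, `to_columns` and `from_columns` are hypothetical names used only for illustration, and plain `Vec`s stand in for arrow arrays.

```rust
// Hypothetical, std-only illustration; none of these helper names come from
// `serde_arrow` or `arrow`.

/// Row-oriented record, mirroring the struct in the example above.
#[derive(Debug, Clone, PartialEq)]
struct Example {
    a: f32,
    b: i32,
}

/// Column-oriented counterpart of `Vec<Example>`: one vector per field.
/// A real record batch would hold arrow arrays plus a schema instead.
#[derive(Debug, Default, PartialEq)]
struct ExampleColumns {
    a: Vec<f32>,
    b: Vec<i32>,
}

/// Transpose rows into columns (the "serialize" direction).
fn to_columns(rows: &[Example]) -> ExampleColumns {
    let mut columns = ExampleColumns::default();
    for row in rows {
        columns.a.push(row.a);
        columns.b.push(row.b);
    }
    columns
}

/// Transpose columns back into rows (the "deserialize" direction).
fn from_columns(columns: &ExampleColumns) -> Vec<Example> {
    columns
        .a
        .iter()
        .zip(columns.b.iter())
        .map(|(&a, &b)| Example { a, b })
        .collect()
}
```

Writing this by hand for every struct is the boilerplate that deriving `Serialize` is meant to avoid.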