[GitHub] [arrow-rs] chmp commented on pull request #1383: Add write method to Json Writer

GitBox Fri, 04 Mar 2022 13:35:44 -0800


chmp commented on pull request #1383:
URL: https://github.com/apache/arrow-rs/pull/1383#issuecomment-1059548659



   Sure, happy to sum up `serde_arrow`. First: at the moment it's only an 
experiment. I found it useful for some private data processing and thought 
it might also be helpful to others. It is currently usable, but it only 
handles the cases I have needed so far.
   
   The basic idea is to allow record batches to be used as a "file format" for 
serde. A typical way of using serde is to take some Rust structs, implement 
`Serialize` for them, and then generate JSON from these structs. `serde_arrow` 
allows you to use the same structs but generate record batches instead. So you 
don't have to use the `arrow` builder API; you can simply derive 
`serde::Serialize` and then call `serde_arrow::to_record_batch`. The reverse 
process, reading structs from a record batch, is also supported. One 
complication is that the data models of `serde` and `arrow` do not match. For 
that reason, `serde_arrow` offers additional logic to let the user specify how 
to translate the `serde` data model into the `arrow` one and back again.
   
   For example:
   
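   ```rust
   use serde::Serialize;

   #[derive(Serialize)]
   struct Example {
       a: f32,
       b: i32,
   }
   
   let examples = vec![
       Example { a: 1.0, b: 1 },
       Example { a: 2.0, b: 2 },
   ];
   
   // Build the schema from the records, then convert them into a record batch.
   // The `?` operator assumes an enclosing function that returns a `Result`.
   let schema = serde_arrow::Schema::from_records(&examples)?;
   let batch = serde_arrow::to_record_batch(&examples, schema)?;
   ```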
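
To make the row-versus-column difference concrete, here is a minimal, standard-library-only sketch of the transposition that turning structs into a record batch implies. It is only an illustration under stated assumptions: `ExampleColumns`, `to_columns`, and `from_columns` are hypothetical names, not part of `serde_arrow` or `arrow`, and real record batches also carry a schema, null handling, and Arrow's memory layout, none of which is modeled here.

```rust
// Hypothetical illustration only: none of these names exist in
// `serde_arrow` or `arrow`.

/// Row-oriented record, mirroring the `Example` struct above.
#[derive(Debug, Clone, PartialEq)]
pub struct Example {
    pub a: f32,
    pub b: i32,
}

/// Column-oriented counterpart: one vector per struct field.
#[derive(Debug, PartialEq)]
pub struct ExampleColumns {
    pub a: Vec<f32>,
    pub b: Vec<i32>,
}

/// Transpose a slice of rows into per-field columns.
pub fn to_columns(rows: &[Example]) -> ExampleColumns {
    let mut columns = ExampleColumns {
        a: Vec::with_capacity(rows.len()),
        b: Vec::with_capacity(rows.len()),
    };
    for row in rows {
        columns.a.push(row.a);
        columns.b.push(row.b);
    }
    columns
}

/// Rebuild rows from columns (the reverse direction).
pub fn from_columns(columns: &ExampleColumns) -> Vec<Example> {
    columns
        .a
        .iter()
        .zip(columns.b.iter())
        .map(|(&a, &b)| Example { a, b })
        .collect()
}
```

Calling `to_columns` on the two `Example` values above would yield one `f32` column and one `i32` column of length two each, and `from_columns` would give back the original rows. Writing this by hand for every struct is the boilerplate that deriving `Serialize` is meant to avoid.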

