martin-g commented on code in PR #293:
URL: https://github.com/apache/avro-rs/pull/293#discussion_r2357582753


##########
avro/README.md:
##########
@@ -748,6 +748,50 @@ registered and used!
 
 <!-- cargo-rdme end -->
 
+
+### Deserializing Avro Byte Arrays

Review Comment:
   Thanks!
   
   Two minor suggestions:
   * Update the sample to serialize to `Vec<u8>` first and then deserialize it, as I suggested for the integration test.
   * The README is auto-generated from the Rustdoc in lib.rs. Please move this new documentation to lib.rs (or leave it here and we will move it before merging the PR).



##########
avro/tests/avro-rs-285-bytes_deserialization.rs:
##########
@@ -0,0 +1,76 @@
+use apache_avro::{from_value, Reader};
+use serde::{Serialize,Deserialize};
+use std::fs::File;
+use std::io::BufReader;
+
+
+//UPDATE: For byte deserialization to work, you need to add the serde attribute #[serde(with = "apache_avro::serde_avro_bytes_opt")] in this case. There are a lot of other options as well documented in bytes.rs
+
+
+// This is the schema that was used to write
+// schema = {
+//     "type": "record",
+//     "name": "SimpleRecord",
+//     "fields": [
+//         {"name": "data_bytes", "type": ["null", "bytes"], "default": None},
+//         {"name": "description", "type": ["null", "string"], "default": None}
+//     ]
+// }
+
+
+// Here is an example struct that matches the schema, and another with filtered out byte array field
+// The reason this is very useful is that in extremely large deeply nested avro files, structs mapped to grab fields of interest in deserialization
+// is really effecient and effective. The issue is that when I'm trying to deserialize a byte array field I get the error below no matter how I approach.
+// Bytes enum under value doesn't implement Deserialize in that way so I can't just make it a Value::Bytes
+
+#[derive(Debug, Deserialize, Serialize, Clone)]
+

Review Comment:
   You will need to run `cargo fmt --all`



##########
avro/tests/avro-rs-285-bytes_deserialization.rs:
##########
@@ -0,0 +1,76 @@
+use apache_avro::{from_value, Reader};
+use serde::{Serialize,Deserialize};
+use std::fs::File;
+use std::io::BufReader;
+
+
+//UPDATE: For byte deserialization to work, you need to add the serde attribute #[serde(with = "apache_avro::serde_avro_bytes_opt")] in this case. There are a lot of other options as well documented in bytes.rs
+
+
+// This is the schema that was used to write
+// schema = {
+//     "type": "record",
+//     "name": "SimpleRecord",
+//     "fields": [
+//         {"name": "data_bytes", "type": ["null", "bytes"], "default": None},
+//         {"name": "description", "type": ["null", "string"], "default": None}
+//     ]
+// }
+
+
+// Here is an example struct that matches the schema, and another with filtered out byte array field
+// The reason this is very useful is that in extremely large deeply nested avro files, structs mapped to grab fields of interest in deserialization
+// is really effecient and effective. The issue is that when I'm trying to deserialize a byte array field I get the error below no matter how I approach.
+// Bytes enum under value doesn't implement Deserialize in that way so I can't just make it a Value::Bytes
+
+#[derive(Debug, Deserialize, Serialize, Clone)]
+
+struct ExampleByteArray{
+    
+    
+    //update I have discovered that this is the fix
+    #[serde(with = "apache_avro::serde_avro_bytes_opt")]
+    data_bytes: Option<Vec<u8>>,
+    description: Option<String>
+}
+
+
+#[derive(Debug, Deserialize, Serialize, Clone)]
+struct ExampleByteArrayFiltered{
+    description: Option<String>
+}
+
+#[test]
+fn avro_rs_285_bytes_deserialization_failure(){
+
+    // Load the example file into reader
+    let file = File::open("./tests/avro-rs-285-bytes_deserialization.avro".to_string()).unwrap();
+    let reader = BufReader::new(file);
+    let avro_reader = Reader::new(reader).unwrap();
+
+
+    // attempt to deserialize into struct with byte array field
+    for value in avro_reader{
+        let value = value.unwrap();
+        let deserialized = from_value::<ExampleByteArray>(&value).unwrap();
+        println!("{:?}", deserialized);
+    }
+
+}
+
+#[test]
+fn avro_rs_285_bytes_deserialization_pass_when_filtered(){
+
+    // Load the example file into reader
+    let file = File::open("./tests/avro-rs-285-bytes_deserialization.avro".to_string()).unwrap();

Review Comment:
   How about extending the test to serialize first (into a `Vec`) and then deserialize it?
   That way the test shows a full serde round trip for bytes, and there is no need to commit a binary file just so it can be deserialized.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
