jecsand838 commented on code in PR #8006:
URL: https://github.com/apache/arrow-rs/pull/8006#discussion_r2253530422
##########
arrow-avro/src/reader/mod.rs:
##########
@@ -124,23 +132,26 @@ fn read_header<R: BufRead>(mut reader: R) -> Result<Header, ArrowError> {
/// A low-level interface for decoding Avro-encoded bytes into Arrow `RecordBatch`.
#[derive(Debug)]
pub struct Decoder {
- record_decoder: RecordDecoder,
+ active_decoder: RecordDecoder,
+ active_fingerprint: Option<Fingerprint>,
batch_size: usize,
- decoded_rows: usize,
+ remaining_capacity: usize,
+ #[cfg(feature = "lru")]
+ cache: LruCache<Fingerprint, RecordDecoder>,
+ #[cfg(not(feature = "lru"))]
+ cache: IndexMap<Fingerprint, RecordDecoder>,
+ max_cache_size: usize,
+ reader_schema: Option<AvroSchema<'static>>,
+ writer_schema_store: Option<SchemaStore<'static>>,
Review Comment:
@scovich I just pushed an attempt at handling the schemas internally, and I
managed to get rid of all of the `'static` usage. Let me know what you think.
It's built around this new `AvroSchema` type in `schema.rs`:
```rust
/// A wrapper for an Avro schema in its JSON string representation.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct AvroSchema {
    /// The Avro schema as a JSON string.
    pub json_string: String,
}

impl AvroSchema {
    /// Creates a new `AvroSchema` from a JSON string.
    pub fn new(json_string: String) -> Self {
        Self { json_string }
    }

    /// Parses the JSON string and returns the Avro `Schema`.
    ///
    /// The returned schema borrows from `self`.
    pub fn schema(&self) -> Result<Schema<'_>, ArrowError> {
        serde_json::from_str(self.json_string.as_str())
            .map_err(|e| ArrowError::ParseError(format!("Invalid Avro schema JSON: {e}")))
    }

    /// Returns the Rabin fingerprint of the schema.
    pub fn fingerprint(&self) -> Result<Fingerprint, ArrowError> {
        generate_fingerprint_rabin(&self.schema()?)
    }
}
```
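
To illustrate why owning the JSON string removes the need for `'static`, here is a minimal std-only sketch of the same pattern. Everything in it is a hypothetical stand-in rather than arrow-avro code: `ParsedSchema`, `OwnedSchema`, and `SchemaHolder` are invented names, and the toy `schema()` method only extracts the `"name"` value (assuming no whitespace around the colon) in place of the real `serde_json::from_str` call.

```rust
/// Hypothetical stand-in for a parsed schema; it borrows from the source text.
#[derive(Debug, PartialEq)]
pub struct ParsedSchema<'a> {
    pub name: &'a str,
}

/// Owned wrapper with the same shape as `AvroSchema`: it stores only the JSON text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OwnedSchema {
    pub json_string: String,
}

impl OwnedSchema {
    pub fn new(json_string: String) -> Self {
        Self { json_string }
    }

    /// Toy parser standing in for `serde_json::from_str` (an assumption made
    /// for this sketch). The returned value borrows from `self`.
    pub fn schema(&self) -> Result<ParsedSchema<'_>, String> {
        const KEY: &str = "\"name\":\"";
        // Locate the first byte after the opening quote of the name value.
        let start = self
            .json_string
            .find(KEY)
            .ok_or_else(|| "missing \"name\" field".to_string())?
            + KEY.len();
        // Locate the closing quote relative to `start`.
        let len = self.json_string[start..]
            .find('"')
            .ok_or_else(|| "unterminated \"name\" value".to_string())?;
        Ok(ParsedSchema {
            name: &self.json_string[start..start + len],
        })
    }
}

/// Hypothetical holder, loosely modeled on a struct with an optional
/// `reader_schema` field. Note that it has no lifetime parameter.
pub struct SchemaHolder {
    pub reader_schema: Option<OwnedSchema>,
}

impl SchemaHolder {
    /// Parses on demand; the result borrows from `self`, not from `'static` data.
    pub fn reader_name(&self) -> Result<Option<&str>, String> {
        match &self.reader_schema {
            Some(owned) => Ok(Some(owned.schema()?.name)),
            None => Ok(None),
        }
    }
}
```

The trade-off in this sketch is that the text is re-parsed on every `schema()` call; in exchange, the holder owns plain `String` data and carries no lifetime parameter, which is what lets the `'static` bounds go away.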