jecsand838 commented on code in PR #8006:
URL: https://github.com/apache/arrow-rs/pull/8006#discussion_r2252735697
##########
arrow-avro/src/reader/mod.rs:
##########

```diff
@@ -124,23 +132,26 @@ fn read_header<R: BufRead>(mut reader: R) -> Result<Header, ArrowError> {
 /// A low-level interface for decoding Avro-encoded bytes into Arrow `RecordBatch`.
 #[derive(Debug)]
 pub struct Decoder {
-    record_decoder: RecordDecoder,
+    active_decoder: RecordDecoder,
+    active_fingerprint: Option<Fingerprint>,
     batch_size: usize,
-    decoded_rows: usize,
+    remaining_capacity: usize,
+    #[cfg(feature = "lru")]
+    cache: LruCache<Fingerprint, RecordDecoder>,
+    #[cfg(not(feature = "lru"))]
+    cache: IndexMap<Fingerprint, RecordDecoder>,
+    max_cache_size: usize,
+    reader_schema: Option<AvroSchema<'static>>,
+    writer_schema_store: Option<SchemaStore<'static>>,
```

Review Comment:

> The main concern for this PR is:
>
> > Does that [static lifetimes here] mean memory they [the passed-in schemas] reference must leak for all practical purposes?

I haven't had to resort to any memory leaks in `arrow-avro`. The `AvroField` logic is bound by the same lifetime, and the `Schema` is only used to create a root `AvroField` with a `Codec`, which in turn is used to create a `RecordDecoder`. I haven't had to resort to any `Box::leak` calls or reference cycles, and I was careful to bound the cache. Also, inside the `Decoder`, when making a new `RecordDecoder`, I don't `clone` either of the schemas, and no new `Schema`s are created either. Each `RecordDecoder` can only use the same `reader_schema` and set of `writer_schema`s. If I'm missing something, however, I apologize in advance.
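To illustrate the lifetime point being debated: a `'static` lifetime parameter does not force leaked memory if the type can own its data, e.g. via `Cow<'static, str>`, where `Cow::Owned` satisfies `'static` with an ordinary heap allocation that is freed on drop. The sketch below uses a hypothetical `Schema<'a>` stand-in (not the actual `arrow-avro` `AvroSchema` type) to show a borrowed schema being converted into a fully owned `Schema<'static>` without `Box::leak`:

```rust
use std::borrow::Cow;

// Hypothetical stand-in for a schema type generic over a lifetime,
// analogous in shape to `AvroSchema<'a>` in the discussion.
#[derive(Debug, Clone)]
struct Schema<'a> {
    json: Cow<'a, str>,
}

impl<'a> Schema<'a> {
    // Borrow an existing string: no allocation, lifetime tied to 'a.
    fn borrowed(json: &'a str) -> Self {
        Schema { json: Cow::Borrowed(json) }
    }

    // Produce a fully owned `Schema<'static>` without leaking:
    // `Cow::into_owned` copies borrowed data into an owned `String`,
    // which is dropped normally when the schema is dropped.
    fn into_owned(self) -> Schema<'static> {
        Schema { json: Cow::Owned(self.json.into_owned()) }
    }
}

fn main() {
    let local = String::from(r#"{"type":"record","name":"E","fields":[]}"#);
    let borrowed = Schema::borrowed(&local);
    let owned: Schema<'static> = borrowed.into_owned();
    drop(local); // the 'static schema outlives the original buffer
    assert_eq!(owned.json, r#"{"type":"record","name":"E","fields":[]}"#);
    println!("ok");
}
```

Under this pattern, `Option<AvroSchema<'static>>` fields in `Decoder` would simply hold owned schema data for the decoder's lifetime, consistent with the comment's claim that no `Box::leak` is needed.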