jecsand838 commented on code in PR #8349:
URL: https://github.com/apache/arrow-rs/pull/8349#discussion_r2370094415
##########
arrow-avro/src/reader/record.rs:
##########
@@ -1518,19 +1104,340 @@ impl Decoder {
.map_err(|e| ArrowError::ParseError(e.to_string()))?;
Arc::new(vals)
}
- Self::Union(fields, type_ids, offsets, encodings, _, None) => {
- flush_union!(fields, type_ids, offsets, encodings)
- }
- Self::Union(fields, type_ids, offsets, encodings, _, Some(union_resolution)) => {
- match &mut union_resolution.kind {
- UnionResolvedKind::Both { .. } | UnionResolvedKind::FromSingle { .. } => {
- flush_union!(fields, type_ids, offsets, encodings)
- }
- UnionResolvedKind::ToSingle { target } => target.flush(nulls)?,
+ Self::Union(u) => u.flush(nulls)?,
+ })
+ }
+}
+
+#[derive(Debug)]
+struct DispatchLut {
+ to_reader: Box<[i16]>,
+ promotion: Box<[Promotion]>,
+}
+
+impl DispatchLut {
fn from_writer_to_reader(promotion_map: &[Option<(usize, Promotion)>]) -> Self {
+ let mut to_reader = Vec::with_capacity(promotion_map.len());
+ let mut promotion = Vec::with_capacity(promotion_map.len());
+ for map in promotion_map {
+ match *map {
+ Some((idx, promo)) => {
+ debug_assert!(idx <= i16::MAX as usize);
+ to_reader.push(idx as i16);
Review Comment:
Oh, the only significance is that `UnionFields` has `i8` type ids (a max of 127
types). I figured `i16` was a good choice to compactly represent the Lookup
Table while still allowing the `-1` sentinel (avoiding the need for an
`Option` per entry): it gives some headroom and is still 4× smaller than
`usize` on 64‑bit targets.
There is no 32k limit in Avro that we are relying on.
If you'd prefer to squeeze the table further, we could switch `to_reader` to
`i8` (still using `-1` as the sentinel), or I can go with another approach if
you think there's an upside to it.
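
For illustration, here is a minimal standalone sketch of the sentinel idea
(the helper names `build_lut` and `resolve` are hypothetical, not the PR's
actual code): the writer-to-reader table stores `i16` indices, with `-1`
meaning "no matching reader branch", so each entry stays 2 bytes instead of
an `Option<usize>`.

```rust
/// Build a compact dispatch table: writer branch index -> reader branch
/// index, using -1 as the "no match" sentinel instead of Option<usize>.
fn build_lut(promotion_map: &[Option<usize>]) -> Box<[i16]> {
    promotion_map
        .iter()
        .map(|m| match m {
            Some(idx) => {
                // i16 gives headroom well past UnionFields' i8 type ids.
                debug_assert!(*idx <= i16::MAX as usize);
                *idx as i16
            }
            None => -1, // sentinel: writer branch has no reader counterpart
        })
        .collect()
}

/// Look up the reader branch for a writer branch, mapping the -1 sentinel
/// (and out-of-range indices) back to None at the call site.
fn resolve(lut: &[i16], writer_branch: usize) -> Option<usize> {
    match lut.get(writer_branch).copied() {
        Some(i) if i >= 0 => Some(i as usize),
        _ => None,
    }
}

fn main() {
    let lut = build_lut(&[Some(0), None, Some(2)]);
    assert_eq!(resolve(&lut, 0), Some(0));
    assert_eq!(resolve(&lut, 1), None); // -1 sentinel hit
    assert_eq!(resolve(&lut, 2), Some(2));
    assert_eq!(resolve(&lut, 99), None); // out-of-range writer branch
    println!("ok");
}
```

Switching the table to `i8` would halve it again, at the cost of capping the
index range at 127, which happens to match `UnionFields`' `i8` type ids.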
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]