jecsand838 commented on code in PR #7954: URL: https://github.com/apache/arrow-rs/pull/7954#discussion_r2214304249
########## arrow-avro/src/reader/record.rs: ########## @@ -301,9 +301,23 @@ impl Decoder { } Codec::Uuid => Self::Uuid(Vec::with_capacity(DEFAULT_CAPACITY)), }; - Ok(match data_type.nullability() { - Some(nullability) => Self::Nullable( - nullability, + let union_order = match data_type.nullability() { + None => None, + Some(Nullability::NullFirst) => Some(Nullability::NullFirst), + Some(Nullability::NullSecond) => { + if strict_mode { + return Err(ArrowError::ParseError( + "Found Avro union of the form ['T','null'], which is disallowed in strict_mode" + .to_string(), + )); + } + Some(Nullability::NullSecond) + } + }; Review Comment: @veronica-m-ef After looking through this again, I'm beginning to think it may make more sense to move this check to the `Schema::Union(f)` branch inside the `make_data_type` method in codec.rs. That way we can determine whether the Avro Schema is valid when `strict_mode` is `true` before attempting to construct the `RecordDecoder`. By doing this we'd get an error earlier on line 224 in mod.rs, i.e. `let root_field = AvroField::try_from(schema)?;` You'd need to find a way to pass down `strict_mode` into `AvroField` though. Maybe we could implement an `AvroFieldBuilder` in codec.rs containing `with_utf8view` and `with_strict_mode` methods to achieve this? Then in the `make_data_type` method we could do: ```rust Schema::Union(f) => { // Special case the common case of nullable primitives let null = f .iter() .position(|x| x == &Schema::TypeName(TypeName::Primitive(PrimitiveType::Null))); match (f.len() == 2, null) { (true, Some(0)) => { let mut field = make_data_type(&f[1], namespace, resolver, use_utf8view, strict_mode)?; field.nullability = Some(Nullability::NullFirst); Ok(field) } (true, Some(1)) => { if strict_mode { return Err(ArrowError::SchemaError( "Found Avro union of the form ['T','null'], which is disallowed in strict_mode" .to_string(), )); } let mut field = make_data_type(&f[0], namespace, resolver, use_utf8view, strict_mode)?; field.nullability = Some(Nullability::NullSecond); Ok(field) } _ => Err(ArrowError::NotYetImplemented(format!( "Union of {f:?} not currently supported" ))), } } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org