alamb commented on code in PR #7934:
URL: https://github.com/apache/arrow-rs/pull/7934#discussion_r2213423122


##########
parquet-variant/src/variant/object.rs:
##########
@@ -244,16 +252,22 @@ impl<'m, 'v> VariantObject<'m, 'v> {
                 // to check lexicographical order
                 //
                 // Since we are probing the metadata dictionary by field id, this also verifies field ids are in-bounds
-                let are_field_names_sorted = field_ids
-                    .iter()
-                    .map(|&i| self.metadata.get(i))
-                    .collect::<Result<Vec<_>, _>>()?
-                    .is_sorted();
-
-                if !are_field_names_sorted {
-                    return Err(ArrowError::InvalidArgumentError(
-                        "field names not sorted".to_string(),
-                    ));
+                let mut current_field_name = match field_ids_iter.next() {

Review Comment:
   Got it -- I re-read the spec. I was confused, and I agree this check is 
doing the right thing.
   
   > The field ids and field offsets must be in lexicographical order of the 
corresponding field names in the metadata dictionary. However, the actual value 
entries do not need to be in any particular order. This implies that the 
field_offset values may not be monotonically increasing. For example, for the 
following object:
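
   To make the ordering rule concrete, here is a minimal standalone sketch 
(not the spec's own example and not the code in this PR). It assumes the 
metadata dictionary can be modelled as a plain slice of names indexed by field 
id; `get_name` and `check_field_names_sorted` are hypothetical helpers made up 
for illustration. Like the `is_sorted()` call it replaces, it accepts equal 
adjacent names; whether the real check should be strict is not covered here.

```rust
/// Hypothetical stand-in for the metadata dictionary lookup: returns the
/// field name for a field id, or an error if the id is out of bounds.
fn get_name<'a>(dictionary: &[&'a str], id: usize) -> Result<&'a str, String> {
    dictionary
        .get(id)
        .copied()
        .ok_or_else(|| format!("field id {id} out of bounds"))
}

/// Checks that the names referenced by `field_ids` are in lexicographical
/// order, comparing adjacent names as it goes instead of collecting them
/// into a Vec first. Looking up each id also verifies it is in bounds.
fn check_field_names_sorted(dictionary: &[&str], field_ids: &[usize]) -> Result<(), String> {
    let mut ids = field_ids.iter().copied();

    // Seed the comparison with the first field name, if there is one.
    let mut current = match ids.next() {
        Some(id) => get_name(dictionary, id)?,
        None => return Ok(()), // an empty object is trivially sorted
    };

    for id in ids {
        let next = get_name(dictionary, id)?;
        if next < current {
            return Err("field names not sorted".to_string());
        }
        current = next;
    }
    Ok(())
}
```

   With a dictionary of `["c", "a", "b"]`, the field ids `[1, 2, 0]` pass 
(they name "a", "b", "c"), while `[0, 1, 2]` fail even though the ids 
themselves are numerically sorted. The check looks only at the names the ids 
point to, which is why the ids and the field_offset values need not be 
monotonically increasing.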



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org