scovich commented on code in PR #7833:
URL: https://github.com/apache/arrow-rs/pull/7833#discussion_r2187208239


##########
parquet-variant/src/builder.rs:
##########
@@ -237,18 +237,37 @@ impl ValueBuffer {
 struct MetadataBuilder {
     // Field names -- field_ids are assigned in insert order
     field_names: IndexSet<String>,
+
+    // flag that checks if field names by insertion order are also lexicographically sorted
+    is_sorted: bool,
 }
 
 impl MetadataBuilder {
     /// Upsert field name to dictionary, return its ID
     fn upsert_field_name(&mut self, field_name: &str) -> u32 {
-        let (id, _) = self.field_names.insert_full(field_name.to_string());
+        let (id, new_entry) = self.field_names.insert_full(field_name.to_string());
+
+        if new_entry {
+            let n = self.num_field_names();
+
+            self.is_sorted =
+                n == 1 || self.is_sorted & (self.field_names[n - 2] < self.field_names[n - 1]);

Review Comment:
   ```suggestion
               self.is_sorted =
                   n == 1 || self.is_sorted && (self.field_names[n - 2] < self.field_names[n - 1]);
   ```

   (`&&` has short-circuit behavior and avoids the string comparison when we already broke sorting)



##########
parquet-variant/src/builder.rs:
##########
@@ -237,18 +237,37 @@ impl ValueBuffer {
 struct MetadataBuilder {
     // Field names -- field_ids are assigned in insert order
     field_names: IndexSet<String>,
+
+    // flag that checks if field names by insertion order are also lexicographically sorted
+    is_sorted: bool,
 }
 
 impl MetadataBuilder {
     /// Upsert field name to dictionary, return its ID
     fn upsert_field_name(&mut self, field_name: &str) -> u32 {
-        let (id, _) = self.field_names.insert_full(field_name.to_string());
+        let (id, new_entry) = self.field_names.insert_full(field_name.to_string());
+
+        if new_entry {
+            let n = self.num_field_names();
+
+            self.is_sorted =
+                n == 1 || self.is_sorted & (self.field_names[n - 2] < self.field_names[n - 1]);

Review Comment:
   Also, a code comment might help here, there's a lot to unpack:
   * An empty dictionary is unsorted (ambiguous in spec but required by interop tests)
   * A single-entry dictionary is trivially sorted
   * Otherwise, an already-sorted dictionary
     becomes unsorted if the new entry breaks order
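
To make the three cases above concrete, here is a minimal standalone sketch. It is not the PR's code: `SortTracker` and `sorted_after` are hypothetical names, and a plain `Vec<String>` with a linear lookup stands in for `IndexSet<String>` so the example needs only the standard library.

```rust
/// Hypothetical stand-in for `MetadataBuilder` (assumption: a `Vec` with a
/// linear search replaces `IndexSet` to avoid an external dependency).
#[derive(Default)]
struct SortTracker {
    // Field names in insertion order; the index doubles as the field ID.
    field_names: Vec<String>,
    // Defaults to `false`, so an empty dictionary is reported as unsorted.
    is_sorted: bool,
}

impl SortTracker {
    /// Insert `field_name` if absent and return its ID.
    fn upsert_field_name(&mut self, field_name: &str) -> u32 {
        if let Some(id) = self.field_names.iter().position(|f| f == field_name) {
            // Existing entry: the dictionary is unchanged, and so is `is_sorted`.
            return id as u32;
        }
        self.field_names.push(field_name.to_string());
        let n = self.field_names.len();

        // * n == 1: a single-entry dictionary is trivially sorted.
        // * Otherwise: the dictionary stays sorted only if it was sorted before
        //   and the new name is strictly greater than the previous last name.
        //   `&&` short-circuits, so the string comparison is skipped once
        //   sorting has already been broken.
        self.is_sorted =
            n == 1 || self.is_sorted && (self.field_names[n - 2] < self.field_names[n - 1]);

        (n - 1) as u32
    }
}

/// Hypothetical helper: upsert every name in order and report the final flag.
fn sorted_after(names: &[&str]) -> bool {
    let mut tracker = SortTracker::default();
    for name in names {
        tracker.upsert_field_name(name);
    }
    tracker.is_sorted
}
```

With this sketch, `sorted_after(&[])` is `false`, `sorted_after(&["a"])` and `sorted_after(&["a", "b", "b"])` are `true`, and `sorted_after(&["b", "a", "c"])` is `false`: once an out-of-order name has been inserted, later in-order names do not restore the flag.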