scovich commented on code in PR #7833:
URL: https://github.com/apache/arrow-rs/pull/7833#discussion_r2187208239


##########
parquet-variant/src/builder.rs:
##########
@@ -237,18 +237,37 @@ impl ValueBuffer {
 struct MetadataBuilder {
     // Field names -- field_ids are assigned in insert order
     field_names: IndexSet<String>,
+
+    // flag that checks if field names by insertion order are also lexicographically sorted
+    is_sorted: bool,
 }
 
 impl MetadataBuilder {
     /// Upsert field name to dictionary, return its ID
     fn upsert_field_name(&mut self, field_name: &str) -> u32 {
-        let (id, _) = self.field_names.insert_full(field_name.to_string());
+        let (id, new_entry) = self.field_names.insert_full(field_name.to_string());
+
+        if new_entry {
+            let n = self.num_field_names();
+
+            self.is_sorted =
+                n == 1 || self.is_sorted & (self.field_names[n - 2] < self.field_names[n - 1]);

Review Comment:
   ```suggestion
               self.is_sorted =
                   n == 1 || self.is_sorted && (self.field_names[n - 2] < self.field_names[n - 1]);
   ```
   (`&&` short-circuits, so it skips the string comparison once sorting is already broken)
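
   To make the difference concrete, here is a minimal sketch (not code from the PR). The helper `expensive_less_than` and the two `still_sorted_*` functions are hypothetical names invented for illustration; the counter only exists to show how often the comparison runs.

   ```rust
   use std::cell::Cell;

   /// Hypothetical stand-in for the string comparison in the PR.
   /// It bumps `calls` every time it is actually evaluated.
   fn expensive_less_than(a: &str, b: &str, calls: &Cell<u32>) -> bool {
       calls.set(calls.get() + 1);
       a < b
   }

   /// Eager variant: `&` on `bool` always evaluates both operands.
   fn still_sorted_eager(is_sorted: bool, prev: &str, next: &str, calls: &Cell<u32>) -> bool {
       is_sorted & expensive_less_than(prev, next, calls)
   }

   /// Lazy variant: `&&` skips the right-hand side when `is_sorted` is already false.
   fn still_sorted_lazy(is_sorted: bool, prev: &str, next: &str, calls: &Cell<u32>) -> bool {
       is_sorted && expensive_less_than(prev, next, calls)
   }
   ```

   Both variants return the same result; the lazy one just does no comparison work once the flag has flipped to `false`.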



##########
parquet-variant/src/builder.rs:
##########
@@ -237,18 +237,37 @@ impl ValueBuffer {
 struct MetadataBuilder {
     // Field names -- field_ids are assigned in insert order
     field_names: IndexSet<String>,
+
+    // flag that checks if field names by insertion order are also lexicographically sorted
+    is_sorted: bool,
 }
 
 impl MetadataBuilder {
     /// Upsert field name to dictionary, return its ID
     fn upsert_field_name(&mut self, field_name: &str) -> u32 {
-        let (id, _) = self.field_names.insert_full(field_name.to_string());
+        let (id, new_entry) = self.field_names.insert_full(field_name.to_string());
+
+        if new_entry {
+            let n = self.num_field_names();
+
+            self.is_sorted =
+                n == 1 || self.is_sorted & (self.field_names[n - 2] < self.field_names[n - 1]);

Review Comment:
   Also, a code comment might help here, because there's a lot to unpack:
   * An empty dictionary is unsorted (ambiguous in the spec, but required by interop tests)
   * A single-entry dictionary is trivially sorted
   * Otherwise, an already-sorted dictionary becomes unsorted if the new entry breaks the order
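
   The three cases above could be captured roughly like this. This is a simplified, std-only sketch rather than the PR's code: `SketchMetadataBuilder` is a hypothetical stand-in that uses a `Vec` plus a `HashMap` instead of `IndexSet`, and it assumes `is_sorted` starts out `false`.

   ```rust
   use std::collections::HashMap;

   /// Hypothetical, simplified stand-in for `MetadataBuilder`.
   /// Field ids are assigned in insertion order.
   #[derive(Default)]
   struct SketchMetadataBuilder {
       field_names: Vec<String>,
       ids: HashMap<String, u32>,
       // Assumption: defaults to `false`, so an empty dictionary reports unsorted.
       is_sorted: bool,
   }

   impl SketchMetadataBuilder {
       /// Upsert a field name and return its id.
       fn upsert_field_name(&mut self, field_name: &str) -> u32 {
           // Existing names keep their id and cannot change sortedness.
           if let Some(&id) = self.ids.get(field_name) {
               return id;
           }
           let id = self.field_names.len() as u32;
           self.field_names.push(field_name.to_string());
           self.ids.insert(field_name.to_string(), id);

           let n = self.field_names.len();
           // * An empty dictionary is unsorted (the flag's initial value).
           // * A single-entry dictionary is trivially sorted.
           // * Otherwise, the dictionary stays sorted only if it was already
           //   sorted and the new entry is strictly greater than the previous one.
           self.is_sorted =
               n == 1 || (self.is_sorted && self.field_names[n - 2] < self.field_names[n - 1]);
           id
       }
   }
   ```

   Note that `n == 1 ||` also guards the `n - 2` index: with a single entry, the right-hand side is never evaluated.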



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
