Re: [PR] GH-44758: [GLib] Add garrow_array_validate_full() [arrow]

via GitHub Fri, 24 Jan 2025 04:10:29 -0800


kou commented on code in PR #45342:
URL: https://github.com/apache/arrow/pull/45342#discussion_r1928583743



##########
c_glib/test/test-array.rb:
##########
@@ -202,4 +202,30 @@ def test_invalid
       end
     end
   end
+
+  sub_test_case("#validate_full") do
+    def test_valid
+      array = build_int32_array([1, 2, 3, 4, 5])
+      assert do
+        array.validate_full
+      end
+    end
+
+    def test_invalid
+      message = "[array][validate_full]: Invalid: Invalid UTF8 sequence at string index 0"
+
+      # UTF-8 string missing one byte.
+      data = Arrow::Buffer.new("\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa")

Review Comment:
   Could you use `"\u65E5\u...".b[0..-8]` plus a `U+XXXX ...` comment so that this is easier to maintain?
   
   See also:  
https://github.com/apache/arrow/blob/f4a63d41ebbc57566f215c1d1e87fc1647071dae/c_glib/test/test-buffer-input-stream.rb#L75-L76
   
   BTW, can we use a simpler multibyte character than "日本語", something like "あ"?
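
A minimal sketch of the suggestion above in plain Ruby, without any Arrow dependency. It assumes the single character "あ" (U+3042) and drops its last byte to produce an incomplete UTF-8 sequence; the variable names `valid` and `truncated` are hypothetical and do not come from the PR.

```ruby
# U+3042 HIRAGANA LETTER A; its UTF-8 encoding is the three bytes E3 81 82.
valid = "\u3042"

# Hypothetical name: take a binary copy and drop the final byte,
# which leaves an incomplete (invalid) UTF-8 sequence.
truncated = valid.b[0..-2]

puts valid.bytes.map { |b| format("%02X", b) }.join(" ")
puts truncated.bytesize
puts truncated.dup.force_encoding("UTF-8").valid_encoding?
```

Writing the character as a `\u` escape with its code point in a comment keeps the intent readable, whereas raw `\x` bytes have to be decoded by hand to see which byte is missing.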



##########
c_glib/test/test-array.rb:
##########
@@ -202,4 +202,30 @@ def test_invalid
       end
     end
   end
+
+  sub_test_case("#validate_full") do
+    def test_valid
+      array = build_int32_array([1, 2, 3, 4, 5])
+      assert do
+        array.validate_full
+      end
+    end
+
+    def test_invalid
+      message = "[array][validate_full]: Invalid: Invalid UTF8 sequence at string index 0"
+
+      # UTF-8 string missing one byte.
+      data = Arrow::Buffer.new("\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa")
+      value_offsets = Arrow::Buffer.new([0, 8].pack("l*"))

Review Comment:
   Could you use `data.size` instead of `8`?
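
A minimal sketch of deriving the end offset from the data instead of hard-coding it. This is plain Ruby, so `String#bytesize` stands in for the `data.size` of the `Arrow::Buffer` in the PR; that substitution and the names `bytes`, `offsets`, and `decoded` are assumptions made for illustration.

```ruby
# Hypothetical stand-in for the test's string data: "あ" (U+3042) minus its last byte.
bytes = "\u3042".b[0..-2]

# Pack the start and end offsets as native-endian 32-bit signed integers.
# The end offset follows the data length, so it stays correct if the data changes.
offsets = [0, bytes.bytesize].pack("l*")

# Unpack again to check the round trip.
decoded = offsets.unpack("l*")
puts decoded.inspect
```

With the length computed from the data, switching the test string to a different character does not require updating a separate magic number.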



##########
c_glib/test/test-array.rb:
##########
@@ -202,4 +202,30 @@ def test_invalid
       end
     end
   end
+
+  sub_test_case("#validate_full") do
+    def test_valid
+      array = build_int32_array([1, 2, 3, 4, 5])
+      assert do
+        array.validate_full
+      end
+    end
+
+    def test_invalid
+      message = "[array][validate_full]: Invalid: Invalid UTF8 sequence at string index 0"
+
+      # UTF-8 string missing one byte.
+      data = Arrow::Buffer.new("\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa")
+      value_offsets = Arrow::Buffer.new([0, 8].pack("l*"))
+      array = Arrow::StringArray.new(1,
+                            value_offsets,
+                            data,
+                            Arrow::Buffer.new([0b01].pack("C*")),
+                            -1)

Review Comment:
   Could you fix the indentation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

