houqp commented on a change in pull request #9412:
URL: https://github.com/apache/arrow/pull/9412#discussion_r571374068
##########
File path: rust/arrow/test/data/mixed_arrays.json
##########
@@ -1,4 +1,4 @@
-{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":4.1}
+{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":["4.1"]}
{"a":-10, "b":[2.0, 1.3, -6.1], "c":null, "d":null}
-{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":"text"}
-{"a":3, "b":4, "c": true, "d":[1, false, "array", 2.4]}
+{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":["text"]}
+{"a":3, "b":[], "c": [], "d":["array"]}
Review comment:
@nevi-me I just tested this with Spark, and it does the conversion the
other way around: when a column contains both scalar and list values, the
whole column is converted to string type. As a result, boolean lists are
parsed into the string `"[true, false]"`.
I don't have a strong opinion on this, but if we want to match Spark, then
we should probably fall back to string when incompatible types are
detected. What do you think? @nevi-me @jorgecarleitao @andygrove @alamb
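To make the proposal concrete, here is a rough sketch of what "fall back to string on incompatible types" could look like during schema inference. This is not the arrow JSON reader's code: `Inferred`, `Value`, `merge`, `infer_value`, and `infer_column` are hypothetical names made up for illustration, and the type lattice is deliberately simplified (a single numeric type, no structs).

```rust
/// Simplified inferred column type (hypothetical; not arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum Inferred {
    Null,
    Boolean,
    Float64,
    Utf8,
    List(Box<Inferred>),
}

/// Minimal stand-in for one parsed JSON field value.
#[derive(Debug, Clone)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Value>),
}

/// Combine the types seen so far with the type of the next value.
/// Nulls are neutral, lists merge element-wise, equal types are kept,
/// and any other combination (e.g. scalar vs. list) falls back to Utf8.
fn merge(a: Inferred, b: Inferred) -> Inferred {
    use Inferred::*;
    match (a, b) {
        (Null, t) | (t, Null) => t,
        (List(x), List(y)) => List(Box::new(merge(*x, *y))),
        (x, y) if x == y => x,
        _ => Utf8,
    }
}

/// Infer the type of a single value; arrays merge their element types.
fn infer_value(v: &Value) -> Inferred {
    match v {
        Value::Null => Inferred::Null,
        Value::Bool(_) => Inferred::Boolean,
        Value::Number(_) => Inferred::Float64,
        Value::Str(_) => Inferred::Utf8,
        Value::Array(items) => Inferred::List(Box::new(
            items.iter().map(infer_value).fold(Inferred::Null, merge),
        )),
    }
}

/// Infer the type of a column from all of its row values.
fn infer_column(values: &[Value]) -> Inferred {
    values.iter().map(infer_value).fold(Inferred::Null, merge)
}
```

Under this sketch, the old column `"c"` from `mixed_arrays.json` (`[false, true]`, `null`, `[false, null]`, `true`) would be inferred as `Utf8` because it mixes a list with a scalar, while the new column `"b"` (float lists plus an empty list) would stay a list of floats. Rendering the original values as strings such as `"[true, false]"` would be a separate step in the decoder and is not shown here.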