chenhao-db commented on code in PR #47920:
URL: https://github.com/apache/spark/pull/47920#discussion_r1736844899


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/variant/VariantExpressionEvalUtilsSuite.scala:
##########
@@ -89,6 +89,12 @@ class VariantExpressionEvalUtilsSuite extends SparkFunSuite {
       /* offset list */ 0, 2, 4, 6,
       /* field data */ primitiveHeader(INT1), 1, primitiveHeader(INT1), 2, shortStrHeader(1), '3'),
       Array(VERSION, 3, 0, 1, 2, 3, 'a', 'b', 'c'))
+    check("""{"a": 1, "b": 2, "c": "3", "a": 4}""", Array(objectHeader(false, 1, 1),

Review Comment:
   I agree that a JSON object with duplicate keys is invalid. However, our implementation is not required to throw an error for such input. As the RFC states:
   
   > Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.
   
   It seems fair to follow the "many implementations".
   
   As a side note, `from_json` also follows a last-wins policy rather than throwing an error.
   
   ```
   spark-sql (default)> select from_json('{"a": 1, "a": 2, "a": 3}', 'a int');
   {"a":3}
   Time taken: 1.164 seconds, Fetched 1 row(s)
   ```
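
   To make the three behaviours from the RFC quote concrete, here is a minimal Scala sketch. It is only an illustration: `DuplicateKeyPolicies` and its methods are hypothetical names, not part of Spark or of this PR, and it works on already-tokenized name/value pairs rather than on a real JSON parser.

   ```scala
   // Hypothetical illustration of the three duplicate-key behaviours the RFC
   // describes. Input is a sequence of name/value pairs in document order.
   object DuplicateKeyPolicies {
     type Pairs = Seq[(String, Int)]

     // Last-wins: a later pair overwrites an earlier pair with the same name.
     // Seq#toMap keeps the last value seen for each key.
     def lastWins(pairs: Pairs): Map[String, Int] = pairs.toMap

     // Strict: reject the input as soon as any name is repeated.
     def strict(pairs: Pairs): Either[String, Map[String, Int]] = {
       val names = pairs.map(_._1)
       // diff is a multiset difference, so only the repeated names remain.
       names.diff(names.distinct).headOption match {
         case Some(name) => Left(s"duplicate key: $name")
         case None       => Right(pairs.toMap)
       }
     }

     // Keep-all: report every pair, duplicates included, in input order.
     def keepAll(pairs: Pairs): Pairs = pairs
   }
   ```

   For the pairs `Seq("a" -> 1, "a" -> 2, "a" -> 3)`, `lastWins` yields `Map("a" -> 3)`, which mirrors the `from_json` result above, while `strict` returns a `Left` and `keepAll` returns all three pairs.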



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
