dzcxzl created SPARK-23603:
------------------------------

             Summary: When the length of the json is in a range, get_json_object will result in missing tail data
                 Key: SPARK-23603
                 URL: https://issues.apache.org/jira/browse/SPARK-23603
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0, 2.2.0, 2.0.0
            Reporter: dzcxzl

Jackson (>= 2.7.7) fixes a bug that could drop the tail of a value when the length of the value falls in a certain range:
[https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.7.7]
[https://github.com/FasterXML/jackson-core/issues/307]

To reproduce in spark-shell:
{code:java}
val value = "x" * 3000
val json = s"""{"big": "$value"}"""
spark.sql("select length(get_json_object('" + json + "', '$.big'))").collect

res0: Array[org.apache.spark.sql.Row] = Array([2991])
{code}
The correct result is 3000.

There are two possible solutions. One is to bump the Jackson version to 2.7.7. The other is to replace writeRaw(char[] text, int offset, int len) with writeRaw(String text).
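
The second solution can be illustrated with a small sketch that calls the two writeRaw overloads directly on a Jackson generator. It assumes jackson-core is on the classpath (as it is in spark-shell) and that the affected path is a UTF-8 generator writing to a byte stream. The helper name rawLength is hypothetical and is not part of Spark or Jackson. Whether the char[] overload actually loses data depends on the Jackson version in use, so no output is claimed here.

```scala
import java.io.ByteArrayOutputStream
import com.fasterxml.jackson.core.{JsonEncoding, JsonFactory}

val factory = new JsonFactory()

// Hypothetical helper: writes `text` as raw content through a UTF-8
// generator and returns how many characters reached the output.
def rawLength(text: String, useCharArray: Boolean): Int = {
  val out = new ByteArrayOutputStream()
  val gen = factory.createGenerator(out, JsonEncoding.UTF8)
  if (useCharArray) {
    // Overload named in this report as the one affected before Jackson 2.7.7.
    val chars = text.toCharArray
    gen.writeRaw(chars, 0, chars.length)
  } else {
    // Overload proposed in this report as the replacement.
    gen.writeRaw(text)
  }
  gen.close() // flushes buffered content into `out`
  out.toString("UTF-8").length
}
```

Calling rawLength("x" * 3000, useCharArray = true) and rawLength("x" * 3000, useCharArray = false) in the same shell gives a quick way to check whether the bundled Jackson version is affected.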