Cazen Lee created SPARK-12537:
---------------------------------

             Summary: Add option to accept quoting of all character backslash quoting mechanism
                 Key: SPARK-12537
                 URL: https://issues.apache.org/jira/browse/SPARK-12537
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.5.2
            Reporter: Cazen Lee

We can provide an option that controls whether the JSON parser accepts backslash quoting of any character.

For example, if a JSON file contains a backslash escape that is not listed in the JSON backslash quoting specification, the affected line is returned as a _corrupt_record.

JSON file:
<code>
{"name": "Cazen Lee", "price": "$10"}
{"name": "John Doe", "price": "\$20"}
{"name": "Tracy", "price": "$10"}
<code>

<code>
scala> df.show
+--------------------+---------+-----+
|     _corrupt_record|     name|price|
+--------------------+---------+-----+
|                null|Cazen Lee|  $10|
|{"name": "John Do...|     null| null|
|                null|    Tracy|  $10|
+--------------------+---------+-----+
<code>

After applying this patch, we can enable the allowBackslashEscapingAnyCharacter option as shown below:

<code>
scala> val df = sqlContext.read.option("allowBackslashEscapingAnyCharacter", "true").json("/user/Cazen/test/test2.txt")
df: org.apache.spark.sql.DataFrame = [name: string, price: string]

scala> df.show
+---------+-----+
|     name|price|
+---------+-----+
|Cazen Lee|  $10|
| John Doe|  $20|
|    Tracy|  $10|
+---------+-----+
<code>

This issue is similar to HIVE-11825 and HIVE-12717.
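
To make "not listed in the JSON backslash quoting specification" concrete: the JSON grammar (RFC 8259) only allows a backslash to be followed by `"`, `\`, `/`, `b`, `f`, `n`, `r`, `t`, or `u` plus four hex digits, so `\$` is rejected by a strict parser. The following is a minimal diagnostic sketch, not part of the patch, for spotting such escapes in input lines. The names `EscapeCheck` and `firstNonStandardEscape` are hypothetical, and the check is purely textual (it scans the whole line rather than parsing JSON).

```scala
object EscapeCheck {
  // Characters that may directly follow a backslash in the JSON grammar (RFC 8259).
  // 'u' is handled separately because it must be followed by four hex digits.
  private val simpleEscapes: Set[Char] = Set('"', '\\', '/', 'b', 'f', 'n', 'r', 't')

  private def isHex(c: Char): Boolean = Character.digit(c, 16) >= 0

  /** Index of the first backslash escape not listed by the JSON grammar, if any. */
  def firstNonStandardEscape(line: String): Option[Int] = {
    var i = 0
    while (i < line.length) {
      if (line.charAt(i) == '\\') {
        // A backslash at the very end of the line has nothing to escape.
        if (i + 1 >= line.length) return Some(i)
        val next = line.charAt(i + 1)
        if (simpleEscapes.contains(next)) {
          i += 2 // standard two-character escape
        } else if (next == 'u' && i + 5 < line.length &&
                   (2 to 5).forall(k => isHex(line.charAt(i + k)))) {
          i += 6 // standard \uXXXX escape
        } else {
          return Some(i) // e.g. \$ from the example file
        }
      } else {
        i += 1
      }
    }
    None
  }
}

// Usage: keep only the lines that a strict parser would likely reject for this reason.
val lines = Seq(
  """{"name": "Cazen Lee", "price": "$10"}""",
  """{"name": "John Doe", "price": "\$20"}""",
  """{"name": "Tracy", "price": "$10"}"""
)
val suspicious = lines.filter(l => EscapeCheck.firstNonStandardEscape(l).isDefined)
```

Lines flagged this way are the ones that would need allowBackslashEscapingAnyCharacter enabled (or cleaning upstream) to load without landing in _corrupt_record.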