Yousof Hosny created SPARK-47371:
------------------------------------

             Summary: XML: Ignore row tags in CDATA
                 Key: SPARK-47371
                 URL: https://issues.apache.org/jira/browse/SPARK-47371
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Yousof Hosny

The current XML parser does not recognize CDATA sections, so it reads row tags that are enclosed within a CDATA section as if they were real elements. The expected behavior is that none of the <elem> rows below are read, because they are all character data inside the CDATA section. Instead, all of them are read.

{code:java}
// BUG: rowTag in CDATA section
val xmlString = """<?xml version="1.0" encoding="UTF-8" ?>
<test><![CDATA[
    <elem id="1" />
    <elem id="2" > </elem>
    <elem>
        <id>3</id>
    </elem>
]]>
</test>"""
{code}
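
For reference, the expected behavior can be checked against a standards-compliant XML reader. The sketch below is not Spark code: the class `CdataRowTagCheck` and the helper `countRowTags` are hypothetical names introduced only for illustration, and it assumes the row tag in the report is `elem`. It uses the JDK's StAX API (`javax.xml.stream`) to count start-element events for the row tag. Because CDATA content is delivered as character data, the input from this report should yield no `elem` start elements, while the same markup outside CDATA yields three.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not part of Spark.
class CdataRowTagCheck {

    // Counts START_ELEMENT events whose local name equals rowTag.
    // Text inside <![CDATA[ ... ]]> is reported as character data,
    // so tags written there never produce START_ELEMENT events.
    static int countRowTags(String xml, String rowTag) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && rowTag.equals(reader.getLocalName())) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
    }

    public static void main(String[] args) {
        String header = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n";
        String rows = "<elem id=\"1\" />\n"
                + "<elem id=\"2\" > </elem>\n"
                + "<elem>\n  <id>3</id>\n</elem>\n";

        // Input from the report: every row tag sits inside a CDATA section.
        String inCdata = header + "<test><![CDATA[\n" + rows + "]]>\n</test>";
        // Same markup without the CDATA wrapper, for comparison.
        String plain = header + "<test>\n" + rows + "</test>";

        System.out.println(countRowTags(inCdata, "elem")); // prints 0
        System.out.println(countRowTags(plain, "elem"));   // prints 3
    }
}
```

A fix along these lines would presumably need the row-tag scan to skip everything between `<![CDATA[` and `]]>`; how that maps onto the actual Spark parser is not covered here.", "elem") != 0) {
    throw new AssertionError("row tags inside CDATA must not be counted");
}
if (CdataRowTagCheck.countRowTags("<test><elem/><elem></elem></test>", "elem") != 2) {
    throw new AssertionError("row tags outside CDATA must be counted");
}
if (CdataRowTagCheck.countRowTags("<test><elem/><![CDATA[<elem/>]]></test>", "elem") != 1) {
    throw new AssertionError("only the real element should be counted");
}
</test>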