Shujing Yang created SPARK-46382:
------------------------------------

             Summary: XML: Capture values interspersed between elements
                 Key: SPARK-46382
                 URL: https://issues.apache.org/jira/browse/SPARK-46382
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Shujing Yang

In XML, an element typically consists of a name and a value, with the value enclosed between the opening and closing tags. However, XML also allows arbitrary values to be interspersed between elements. To handle this, we provide an option named `valueTags`, enabled by default, that captures these values. Consider the following example:

```
<ROW>
    <a>1</a>
    value1
    <b>
        value2
        <c>2</c>
        value3
    </b>
</ROW>
```

In this example, `<a>`, `<b>`, and `<c>` are named elements whose values are enclosed within their tags. The arbitrary values `value1`, `value2`, and `value3` are interspersed between the elements. Note that a single element can contain multiple such values (here, `<b>` contains both `value2` and `value3`).

We should parse the values between tags into the valueTags field. If an element contains multiple interspersed values, its value tag field is converted to an array type.
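
To make the expected result concrete, here is a minimal sketch in plain Python of the capture rule described above. It is not Spark's implementation and does not use any Spark API. The field name `_VALUE` and the helper `element_to_record` are assumptions chosen for illustration, and repeated child names and attributes are deliberately not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical name for the field that receives interspersed values.
VALUE_KEY = "_VALUE"

SAMPLE_XML = """
<ROW>
    <a>1</a>
    value1
    <b>
        value2
        <c>2</c>
        value3
    </b>
</ROW>
"""


def element_to_record(elem):
    """Convert an element with children into a dict, collecting stray text."""
    record = {}
    values = []

    # Text that appears before the first child element.
    if elem.text and elem.text.strip():
        values.append(elem.text.strip())

    for child in elem:
        if len(child) == 0:
            # Leaf element: its value is simply the enclosed text.
            record[child.tag] = (child.text or "").strip()
        else:
            # Nested element: recurse so it gets its own value field.
            record[child.tag] = element_to_record(child)

        # Text that follows this child's closing tag belongs to the parent.
        if child.tail and child.tail.strip():
            values.append(child.tail.strip())

    # One value stays a scalar; several values become an array.
    if len(values) == 1:
        record[VALUE_KEY] = values[0]
    elif len(values) > 1:
        record[VALUE_KEY] = values
    return record


record = element_to_record(ET.fromstring(SAMPLE_XML))
print(record)
```

Under these assumptions, the `ROW` record holds the single value `value1` as a scalar, while the nested `b` record holds `value2` and `value3` as an array, which mirrors the scalar-versus-array behavior proposed in this issue.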