[ 
https://issues.apache.org/jira/browse/SPARK-46108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46108:
-----------------------------------
    Labels: pull-request-available  (was: )

> XML: keepInnerXmlAsRaw option
> -----------------------------
>
>                 Key: SPARK-46108
>                 URL: https://issues.apache.org/jira/browse/SPARK-46108
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Ufuk Süngü
>            Priority: Major
>              Labels: pull-request-available
>
> The built-in XML data source returns the values and schema of inner 
> (nested) elements. However, developers must still perform additional 
> operations manually to convert this unstructured data into a structured, 
> tabular format. If nested elements were kept in an XML-compatible format 
> (at each level), they could easily be converted to a structured, tabular 
> format with the existing methods (the infer method of XmlInferSchema and 
> the parseColumn method of StaxXmlParser). Therefore, there should be an 
> option that affects the StaxXmlParser and XmlInferSchema classes to keep 
> inner XML elements in their original, raw format.
> https://github.com/apache/spark/pull/44022
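
The idea in the description can be illustrated outside Spark. The following is only a sketch in plain Python using the standard library, not the code from the pull request and not Spark's API: the helper name row_with_raw_children and the sample document are hypothetical. It keeps each direct child of a row element as a raw XML string, so that the same parsing step can be applied again, one nesting level at a time.

```python
import xml.etree.ElementTree as ET


def row_with_raw_children(row_xml: str) -> dict:
    """Map each direct child tag of a row element to its raw XML string.

    Hypothetical helper: it only mimics the behaviour the issue proposes
    for a keepInnerXmlAsRaw-style option; it is not Spark code.
    """
    row = ET.fromstring(row_xml)
    result = {}
    for child in row:
        child.tail = None  # drop text (e.g. whitespace) that follows the child
        result[child.tag] = ET.tostring(child, encoding="unicode")
    return result


# Hypothetical sample row with one nested element.
SAMPLE = (
    "<book><title>Spark</title>"
    "<author><name>Ann</name><city>Oslo</city></author></book>"
)

# First level: every child stays as a raw XML string.
row = row_with_raw_children(SAMPLE)

# Second level: the raw string is still valid XML, so it can be parsed again.
author = row_with_raw_children(row["author"])
```

Because every level stays valid XML, a nested column can be expanded later with the same routine instead of a hand-written conversion, which is the workflow the description has in mind for infer and parseColumn.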



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
