Re: Spark XML ignore namespaces

Hyukjin Kwon Thu, 03 Nov 2016 22:01:48 -0700

Oh, that PR was actually about not handling namespaces at all (meaning the
data is left as it is, including prefixes).



The problem was that each partition needs to know the namespaces in order
to produce each record.

It is fine to deal with them if they are declared within each XML document
(represented as a row in the dataframe), but it becomes problematic if they
are declared in a parent of each XML document, because that parent may not
be visible to the partition producing the row.
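
To illustrate the parent-declared case, here is a minimal sketch using only
Python's standard xml.etree.ElementTree, not spark-xml itself. The document,
the "ns" prefix and the http://example.com/ns URI are made up for
illustration. It shows that a record cut out of its parent cannot be parsed
with namespace awareness, while a record that declares the prefix itself can.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the "ns" prefix is declared only on the root element.
DOC = (
    '<ns:books xmlns:ns="http://example.com/ns">'
    '<ns:book><ns:title>A</ns:title></ns:book>'
    '<ns:book><ns:title>B</ns:title></ns:book>'
    '</ns:books>'
)

# With the whole document in hand, the prefix resolves without trouble.
root = ET.fromstring(DOC)
titles = [t.text for t in root.iter('{http://example.com/ns}title')]


def parses_alone(xml_text):
    """Return True if xml_text is parseable on its own, namespaces included."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# What a partition might see: one record, cut out of its parent element.
fragment = '<ns:book><ns:title>A</ns:title></ns:book>'
fragment_ok = parses_alone(fragment)  # the "ns" prefix is unbound here

# The same record with the declaration on the record element itself.
self_contained = (
    '<ns:book xmlns:ns="http://example.com/ns">'
    '<ns:title>A</ns:title></ns:book>'
)
self_contained_ok = parses_alone(self_contained)
```

Here fragment_ok ends up False and self_contained_ok ends up True, which is
the difference between the two cases above.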


There is an issue open for this,
https://github.com/databricks/spark-xml/issues/74

It would be nicer to have an option to enable/disable this, if we can
properly support namespace handling.


We might be able to talk more there.



2016-11-04 6:37 GMT+09:00 Arun Patel <arunp.bigd...@gmail.com>:

> I see that 'ignoring namespaces' issue is resolved.
>
> https://github.com/databricks/spark-xml/pull/75
>
> How do we enable this option and ignore namespace prefixes?
>
> - Arun
>
