Oh, that PR was actually about not handling namespaces at all (meaning the data is left as it is, including prefixes).
The problem was that each partition needs to produce its records knowing the namespaces. They are fine to deal with if they are declared within each XML document (represented as a row in the DataFrame), but it becomes problematic if they are declared in the parent of each XML document.

There is an open issue for this: https://github.com/databricks/spark-xml/issues/74

It would be nicer to have an option to enable/disable this, if we can properly support namespace handling. We might be able to talk more there.

2016-11-04 6:37 GMT+09:00 Arun Patel <arunp.bigd...@gmail.com>:

> I see that 'ignoring namespaces' issue is resolved.
>
> https://github.com/databricks/spark-xml/pull/75
>
> How do we enable this option and ignore namespace prefixes?
>
> - Arun
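
For what it's worth, here is a minimal sketch of the parent-namespace problem described above. It uses only Python's standard library and is not spark-xml code; the element names, the `ns` prefix, and the example URI are made up for illustration. It shows that a row element parses fine as part of the whole document, fails when parsed alone because the prefix is declared only on the parent, and parses again once the declaration is copied onto the row.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the prefix "ns" is declared only on the parent element.
DOCUMENT = (
    '<root xmlns:ns="http://example.com/ns">'
    '<ns:row><ns:id>1</ns:id></ns:row>'
    '<ns:row><ns:id>2</ns:id></ns:row>'
    '</root>'
)

# Roughly what a partition sees if it is handed a single row element.
ROW_FRAGMENT = '<ns:row><ns:id>1</ns:id></ns:row>'


def parse_fragment(fragment):
    """Return the root tag of the fragment, or None if it cannot be parsed."""
    try:
        return ET.fromstring(fragment).tag
    except ET.ParseError:
        # The prefix used in the fragment is not declared inside the fragment.
        return None


# Parsing the whole document resolves the prefix to its URI.
whole = ET.fromstring(DOCUMENT)
row_tags = [child.tag for child in whole]

# Parsing the row alone fails: the declaration lives in the parent.
fragment_tag = parse_fragment(ROW_FRAGMENT)

# Copying the declaration onto the row makes it self-contained again.
self_contained = ROW_FRAGMENT.replace(
    '<ns:row>', '<ns:row xmlns:ns="http://example.com/ns">', 1
)
self_contained_tag = parse_fragment(self_contained)
```

A namespace-aware reader would need some way to carry the parent's declarations to every partition, which is the part that is hard to do properly; leaving prefixes untouched avoids that step.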