Hi,

See https://www.mail-archive.com/dev@spark.apache.org/msg03520.html for one
solution.

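One issue with those XML files is that they cannot be processed line by
line in parallel: an element can span several lines, so a single line is
generally not a complete document on its own. On top of that, I think you
inherently need shared/global state to parse XML or to check it for
well-formedness. (Multi-line JSON has the same issue, by the way.)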
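
To illustrate the problem, here is a minimal sketch in plain Python (not
Spark). The sample document and the helper name parses_alone are made up
for illustration. It treats each line as an independent record, the way a
line-based split would, and then parses the document as a whole:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the first record spans several lines.
DOC = """<records>
  <record id="1">
    <name>alpha</name>
  </record>
  <record id="2"><name>beta</name></record>
</records>"""


def parses_alone(fragment):
    """Return True if the fragment is well-formed XML on its own."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Treat every line as an independent record, as a line-based split would.
per_line = [parses_alone(line.strip()) for line in DOC.splitlines()]

# Parse the whole document at once, so the parser sees every open tag.
names = [r.findtext("name") for r in ET.fromstring(DOC).iter("record")]

print(per_line)
print(names)
```

Most lines fail to parse alone because their opening or closing tag lives
on another line. The lines that do parse alone have lost their context,
for example which record a <name> element belongs to.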

Tobias
