Inline.

On Sun, Oct 18, 2015 at 11:37 AM, Julian Hyde <jh...@apache.org> wrote:
> ...
> My proposed “solution” (and I suspect you’re not going to like it) is to
> ignore, for now, harder XML problems and focus on the easier ones.

Hmm.... I think that this may or may not be easy. But it is really important.

> A lot of XML documents do not have repeating scalar values. They are
> collections of records, perhaps with nested records or nested collections
> of records.

The scalar-ness of my example was just a simplification. The same problem occurs every time there is a list that sometimes contains only one element.

> Whitespace can be safely thrown away. Namespaces are not used.

Fine.

> A lot of data is in XML format because XML was the only option considered,
> not because the data structure pushed the limits of what XML’s rich model
> can express.

True.

> I think 90% of cases can be handled using a simple XML-to-JSON mapper that
> takes hints such as that the “employee” tag is to become a list of JSON
> maps and the “salary” and “name” tags are to be treated as attributes.

Great. The real question is whether or not the XML community already has such a hinting mechanism. Or is Drill about to reinvent that?

> I really think that if we focus on the harder cases we’ll end up with the
> wrong solution.

No doubt. This isn't one of those.
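
To make the one-element-list problem concrete, here is a minimal Python sketch of a hint-driven mapper. It is only an illustration, not anything Drill does today: the function name xml_to_json and the list_tags hint are hypothetical, and it assumes the easy case described above (no namespaces, whitespace discarded, XML attributes ignored, leaf tags such as "name" and "salary" become fields of the enclosing map).

```python
import json
import xml.etree.ElementTree as ET


def xml_to_json(elem, list_tags=frozenset()):
    """Convert an Element into plain dicts, lists, and strings.

    list_tags is a hypothetical hint: child tags named in it always
    become lists, even when only one such child is present.
    """
    children = list(elem)
    if not children:
        # Leaf element: keep its text and drop surrounding whitespace.
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_json(child, list_tags)
        if child.tag in list_tags:
            # Hinted tag: always collect into a list.
            result.setdefault(child.tag, []).append(value)
        elif child.tag in result:
            # No hint: a repeated tag only turns into a list on its
            # second occurrence, so the output shape depends on the data.
            existing = result[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                result[child.tag] = [existing, value]
        else:
            result[child.tag] = value
    return result


one = ET.fromstring(
    "<dept><employee><name>Ann</name><salary>100</salary></employee></dept>"
)
two = ET.fromstring(
    "<dept>"
    "<employee><name>Ann</name><salary>100</salary></employee>"
    "<employee><name>Bob</name><salary>90</salary></employee>"
    "</dept>"
)

# Without a hint, a single employee comes out as a map, not a list.
print(json.dumps(xml_to_json(one)))
# {"employee": {"name": "Ann", "salary": "100"}}

# With the hint, the shape is a list regardless of the element count.
print(json.dumps(xml_to_json(one, list_tags={"employee"})))
# {"employee": [{"name": "Ann", "salary": "100"}]}
```

Without the hint, the same "employee" field is a map in one document and a list in another, which is exactly the schema instability I am worried about. The hint fixes the shape up front, but somebody has to supply it.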