Well, it's very few lines of code, IMHO, and simple. I've been able to parse pretty deep structures with no issues so far. Performance? It gets through 10-15 XML files of 5 MB each in less than a second on my laptop, but I run it in Storm with some parallelism in place, so I don't know whether that's good or bad. I'll share the code next time I'm at a computer. You don't need to use it, but at least it works.
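Until the actual code is posted, the general shape of a SAX-based XML-to-JSON converter can be sketched roughly as follows. This is not the implementation described above, only a minimal Python illustration. The class name XmlToJsonHandler, the function xml_to_json, and the "@name" / "#text" key conventions are all assumptions made for the example.

```python
import json
import xml.sax


class XmlToJsonHandler(xml.sax.ContentHandler):
    """Builds a nested dict from SAX events.

    Assumed conventions (not from the original converter):
    attributes become "@name" keys, element text becomes "#text",
    and repeated sibling tags are collected into a list.
    """

    def __init__(self):
        super().__init__()
        self.root = {}
        self._stack = [self.root]  # dicts for the currently open elements
        self._text = [[]]          # text chunks collected per open element

    def startElement(self, name, attrs):
        # Start the node with its attributes, prefixed with "@".
        node = {"@" + key: value for key, value in attrs.items()}
        parent = self._stack[-1]
        if name in parent:
            # Second or later sibling with the same tag: promote to a list.
            if not isinstance(parent[name], list):
                parent[name] = [parent[name]]
            parent[name].append(node)
        else:
            parent[name] = node
        self._stack.append(node)
        self._text.append([])

    def characters(self, content):
        # SAX may deliver text in several chunks, so collect them all.
        self._text[-1].append(content)

    def endElement(self, name):
        node = self._stack.pop()
        text = "".join(self._text.pop()).strip()
        if text:
            node["#text"] = text


def xml_to_json(xml_string):
    """Parse an XML string and return the equivalent JSON text."""
    handler = XmlToJsonHandler()
    xml.sax.parseString(xml_string.encode("utf-8"), handler)
    return json.dumps(handler.root)
```

Calling xml_to_json('<a id="1"><b>x</b><b>y</b></a>') returns JSON in which "a" carries an "@id" key and "b" is a list of two objects, each holding its text under "#text". Because the handler only keeps the stack of open elements, nesting depth adds little overhead, although the whole output tree is still held in memory.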
/M

On 17 Oct 2015 at 10:43 PM, "Matt Burgess" <[email protected]> wrote:

> If the converter is clean and performant then I'm sure the community
> (including me) is interested :)
>
> However I wonder if Drill can afford to add a translation layer between
> data formats, could we be better served with similar parsing in Drill for
> XML as we do for JSON, or can it be pushed down far enough (to the parser)
> to not make a noticeable difference (which is what I think Julian is
> implying)?
>
> Sent from my iPhone
>
> > On Oct 17, 2015, at 1:41 PM, Magnus Pierre <[email protected]> wrote:
> >
> > Hello,
> >
> > Just wrote a simple sax implementation that converts xml to json and that
> > is able to deal with decently complex xml's, that I currently use in Storm.
> > Takes attributes, and everything.
> >
> > I can share it with the community if interesting.
> >
> > /Magnus
> >
> > On 17 Oct 2015 at 7:02 PM, "Julian Hyde" <[email protected]> wrote:
> >
> >> Seems to me the biggest problem is to make drill understand the nested
> >> structure of an xml document. That work has been done for json, so let's
> >> build on it. Suppose there was a translator that converted xml to json
> >> (adding attributes for things that json lacks, such as namespaces, text,
> >> element tags). Drill knows how to handle json, even if it is a bit verbose.
> >> The translator could be applied on the fly.
> >>
> >> Julian
> >>
> >> Sent from my iPad
> >>
> >>> On Oct 16, 2015, at 2:31 PM, Stefán Baxter <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> It's not possible but there has been some talk here about supporting it.
> >>> If I remember correctly it's rather complicated and not really feasible.
> >>> (I'm just a newbie so don't take my words for it)
> >>>
> >>> Regards,
> >>> -Stefan
> >>>
> >>> On Fri, Oct 16, 2015 at 8:54 PM, Daniel Ajo <[email protected]> wrote:
> >>>
> >>>> Hey there,
> >>>>
> >>>> I was wondering if it is possible to query XML files using Apache Drill?
> >>>>
> >>>> I see there are several formats, and maybe it would work using an xpath
> >>>> query of some sorts, but just wondering if it would work to directly
> >>>> query it using some sort of plug-in.
> >>>>
> >>>> Well, let me know,
> >>>>
> >>>> Daniel Ajo
> >>>> *********************************************************
> >>>> CONFIDENTIALITY NOTE: This electronic transmission contains information
> >>>> belonging to Abarca Health LLC, which is confidential or legally
> >>>> privileged. If you are not the intended recipient, please immediately
> >>>> advise the sender by reply e-mail or telephone that this message has
> >>>> been inadvertently transmitted to you and delete this e-mail from your
> >>>> system. If you have received this transmission in error, you are hereby
> >>>> notified that any disclosure, copying, distribution or the taking of any
> >>>> action in reliance on the contents of the information is strictly
> >>>> prohibited.
