Very fair, thanks! Should/can this approach be applied to other structured formats in the 1.x timeframe? YAML, for example, seems to have a manageable size in practice, but nothing prevents its use in other domains, where files could be much larger.
Sent from my iPhone

> On Oct 17, 2015, at 4:51 PM, Julian Hyde <[email protected]> wrote:
>
> Yes, frankly, performance is a concern. But there are also many
> concerns about how to fit a deep XML document into Drill's very
> json-centric model. Building a good XML adapter is a very big task. My
> hunch is that we should not let the perfect be the enemy of the good.
> Build a version 1 XML adapter based on an XML-to-JSON converter and it
> will give us plenty of ideas for what the "perfect" adapter in version
> 2 should look like.
>
>> On Sat, Oct 17, 2015 at 1:43 PM, Matt Burgess <[email protected]> wrote:
>> If the converter is clean and performant then I'm sure the community
>> (including me) is interested :)
>>
>> However I wonder if Drill can afford to add a translation layer between data
>> formats, could we be better served with similar parsing in Drill for XML as
>> we do for JSON, or can it be pushed down far enough (to the parser) to not
>> make a noticeable difference (which is what I think Julian is implying)?
>>
>> Sent from my iPhone
>>
>>> On Oct 17, 2015, at 1:41 PM, Magnus Pierre <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> Just wrote a simple sax implementation that converts xml to json and that
>>> is able to deal with decently complex xml's, that I currently use in Storm.
>>> Takes attributes, and everything.
>>>
>>> I can share it with the community if interesting.
>>>
>>> /Magnus
>>> On 17 Oct 2015 at 7:02 PM, "Julian Hyde" <[email protected]> wrote:
>>>
>>>> Seems to me the biggest problem is to make drill understand the nested
>>>> structure of an xml document. That work has been done for json, so let's
>>>> build on it. Suppose there was a translator that converted xml to json
>>>> (adding attributes for things that json lacks, such as namespaces, text,
>>>> element tags). Drill knows how to handle json, even if it is a bit verbose.
>>>> The translator could be applied on the fly.
>>>>
>>>> Julian
>>>>
>>>> Sent from my iPad
>>>>
>>>>> On Oct 16, 2015, at 2:31 PM, Stefán Baxter <[email protected]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> It's not possible but there has been some talk here about supporting it.
>>>>> If I remember correctly it's rather complicated and not really feasible.
>>>>> (I'm just a newbie so don't take my words for it)
>>>>>
>>>>> Regards,
>>>>> -Stefan
>>>>>
>>>>> On Fri, Oct 16, 2015 at 8:54 PM, Daniel Ajo <[email protected]> wrote:
>>>>>
>>>>>> Hey there,
>>>>>>
>>>>>> I was wondering if it is possible to query XML files using Apache Drill?
>>>>>>
>>>>>> I see there are several formats, and maybe it would work using an xpath
>>>>>> query of some sorts, but just wondering if it would work to directly query
>>>>>> it using some sort of plug-in.
>>>>>>
>>>>>> Well, let me know,
>>>>>>
>>>>>> Daniel Ajo
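For anyone following the thread, here is a rough sketch of the kind of SAX-based XML-to-JSON translation being discussed. It is not Magnus's converter and not anything in Drill; the names `XmlToJsonHandler` and `xml_to_json`, and the key conventions (`#tag` for the element name, `#text` for character data, `#children` for nested elements, `@name` for attributes), are assumptions made up for illustration. Namespaces are not handled.

```python
import json
import xml.sax


class XmlToJsonHandler(xml.sax.ContentHandler):
    """Builds a nested dict from SAX events (hypothetical key conventions)."""

    def __init__(self):
        super().__init__()
        self.root = None   # dict for the document element, set on first start tag
        self._stack = []   # currently open elements, innermost last

    def startElement(self, name, attrs):
        # Record the element tag, since JSON has no notion of one.
        node = {"#tag": name}
        # Attributes get an "@" prefix so they cannot clash with "#" keys.
        for key in attrs.getNames():
            node["@" + key] = attrs.getValue(key)
        if self._stack:
            self._stack[-1].setdefault("#children", []).append(node)
        else:
            self.root = node
        self._stack.append(node)

    def characters(self, content):
        # Skip whitespace-only chunks; append the rest, because the parser
        # may deliver one text run in several pieces.
        if self._stack and content.strip():
            node = self._stack[-1]
            node["#text"] = node.get("#text", "") + content

    def endElement(self, name):
        self._stack.pop()


def xml_to_json(xml_text):
    """Parse an XML string and return it as a JSON string."""
    handler = XmlToJsonHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return json.dumps(handler.root)
```

Calling `xml_to_json('<book id="1"><title>Drill</title></book>')` yields one JSON object per element, with the `title` element nested under the `#children` list of `book`. The output is verbose, as Julian notes, but it is plain JSON that a JSON reader can consume; this sketch builds the whole tree in memory, so it says nothing about the performance question raised above.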
