Mike,

Just to echo Charles, thanks for the work; sounds like you are making good
progress.

The question you asked is tricky. Charles is right: the type of the data
structure is a map. The output you showed appears to be from the sqlline
tool. If so, it helps to understand that sqlline "cheats" by
converting maps to strings for display, making it look like you have a
string column.

Also, remember that Drill uses the standard JSON structure internally, just
as you described. However, referencing any column projects it to the top
level. Clients don't understand complex JSON types (maps, arrays, etc.).
Sqlline compensates by converting the data to strings for display.
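
Here is a rough sketch of that idea in Python rather than Drill code. Plain
dicts stand in for Drill maps, and the helper names project and
render_for_display are made up for illustration. Treat it as an analogy for
what sqlline appears to do, not as its actual implementation.

```python
import json

# Stand-in for one Drill row; Python dicts play the role of Drill maps.
row = {
    "parent": {
        "sub1": {"a1": 1, "a2": 2},
        "sub2": {"b1": 3, "b2": 4, "b3": 5},
    }
}


def project(record, path):
    """Follow a dotted path such as 'parent.sub1' to the nested value."""
    value = record
    for key in path.split("."):
        value = value[key]
    return value


def render_for_display(value):
    """Hypothetical display step: complex values become JSON text."""
    if isinstance(value, (dict, list)):
        return json.dumps(value)
    return str(value)


# The projected column is still a map with children a1 and a2.
sub1 = project(row, "parent.sub1")

# The string exists only on the display side.
shown = render_for_display(sub1)

print(type(sub1).__name__, sub1["a1"])
print(type(shown).__name__, shown)
```

The point is that sub1 keeps its children after projection; only the
rendered copy is text.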

- Paul

On Tue, Oct 10, 2023 at 12:55 PM Charles Givre <cgi...@gmail.com> wrote:

> Hi Mike,
> Thanks for all the work you are doing on Drill.
>
> To answer your question, sub1 should be treated as a map in Drill.  You
> can verify this with the following query:
>
> SELECT drillTypeOf(sub1) FROM...
>
> In general, I'm pretty sure that Drill doesn't output strings that look
> like JSON objects unless they actually are complex objects.
>
> Take a look here for data type functions:
> https://drill.apache.org/docs/data-type-functions/
> Best,
> -- C
>
>
> > On Oct 10, 2023, at 7:56 AM, Mike Beckerle <mbecke...@apache.org> wrote:
> >
> > I am trying to understand the options for populating Drill data from a
> > Daffodil data parse.
> >
> > Suppose you have this JSON
> >
> > {"parent": { "sub1": { "a1":1, "a2":2}, "sub2":{"b1":3, "b2":4, "b3":5}}}
> >
> > or this equivalent XML:
> >
> > <parent>
> >  <sub1><a1>1</a1><a2>2</a2></sub1>
> >  <sub2><b1>3</b1><b2>4</b2><b3>5</b3></sub2>
> > </parent>
> >
> > Unlike those texts, Daffodil is going to have a tree data structure
> > where a parent node contains two child nodes sub1 and sub2, and each of
> > those has children a1, a2, and b1, b2, b3 respectively.
> > It's analogous roughly to the DOM tree of the XML, or the tree of nested
> > JSON map nodes you'd get back from a JSON parse of that text.
> >
> > In Drill, a query against that JSON like:
> >
> > select parent.sub1 from myStructure
> >
> > gives you back a single column containing what seems to be a string like
> >
> > |        sub1        |
> > ----------------------
> > | { "a1":1, "a2":2}  |
> >
> > So, my question is this. Is this actually a string in Drill, (what is the
> > type of sub1?) or is sub1 actually a Drill data row/map node value with
> > two node children, that just happens to print out looking like a JSON
> > string?
> >
> > Thanks for any insight here.
> >
> > Mike Beckerle
> > Apache Daffodil PMC | daffodil.apache.org
> > OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> > Owl Cyber Defense | www.owlcyberdefense.com
>
>
