One thing I'll add is that if you have nested data, Drill will allow you to query the data using both dotted notation and bracket notation. IE:
SELECT foo.bar FROM.... AND SELECT foo['bar'] FROM Both will work, however, I STRONGLY recommend using the bracket notation. I've found that the dotted notation will cause some strange and inconsistent errors. The bracket notation is much more reliable. Best, -- C > On Aug 18, 2023, at 5:00 PM, Mike Beckerle <mbecke...@apache.org> wrote: > > I'm using Apache Daffodil in the mode where it outputs JSON data. (For the > moment, until we build a tighter integration. This is my conceptual test > framework for that integration.) > > I have parsed data to create this JSON which represents 2-level nested > repeating subrecords. > > All the simple fields are int. > > [{"a":1, "b":2, "c":[{"d":3, "e":4, "f":[{"g":5, "h":6 }, > {"g":7, "h":8 }]}, > {"d":9, "e":10, "f":[{"g":11, "h":12}, > {"g":13, "h":14}]}]}, > {"a":21, "b":22, "c":[{"d":23, "e":24, "f":[{"g":25, "h":26 }, > {"g":27, "h":28 }]}, > {"d":29, "e":30, "f":[{"g":31, "h":32}, > {"g":33, "h":34}]}]}] > > So, the top level is a vector of maps, > within that, field "c" is a vector of maps, > and within "c" is a field f which is a vector of maps. > > The reason I created this is I'm trying to understand the arrays and how > they work with Drill SQL. > > I'm trying to figure out how to get this rowset of 3 rows from a query, and > I'm stumped. > > a b d e g h > ( 1, 2, 3, 4, 5, 6) > ( 1, 2, 9, 10, 13, 14) > (21, 22, 29, 30, 33, 34) > > This is the SQL that is my conceptual framework, but I'm sure it won't work. > > SELECT a, b, c.d AS d, c.e AS e, c.f.g AS g, c.f.h AS h > FROM ... the json file... > WHERE g mod 10 == 3 OR g == 5 > > But I know it's not going to be that easy to get the query to traverse the > vector inside the vector. > > From the doc, the FLATTEN operator seems to be needed, but I can't really > figure it out. > > This is what all my data is like. Trees of nested vectors of sub-records. > > Can anyone advise on what the SQL might look like, or where there's an > example doing something like this I can learn from? > > Thanks for any help > > Mike Beckerle > Apache Daffodil PMC | daffodil.apache.org > OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl > Owl Cyber Defense | www.owlcyberdefense.com
signature.asc
Description: Message signed with OpenPGP