This depends how do you see the "batter" field. If you see it as an array, the query should be run as "select id, t.batters.batter[0].id"
But you can also view "batter" as a repeated map, which would mean "id" is a child field of this map, repeated a number of time and projecting " batters.batter.id" should return this field for each repeated instance. This is what MongoDB is doing. On Thu, Mar 10, 2016 at 7:54 PM, Jiang Wu <jiang...@numerxdata.com> wrote: > For the those who are curious, here is what Mongo returns for the same > conceptual query: > > > db.t1.find({},{"batters":1, "_id":0}); > { "batters" : { "batter" : [ { "id" : "1001", "type" : "Regular" }, { "id" > : "1002", "type" : "Chocolate" }, { "id" : "1003", "type" : "Blueberry" }, > { "id" : "1004", "type" : "Devil's Food" } ] } } > { "batters" : { "batter" : [ { "id" : "2001", "type" : "Regular" } ] } } > { "batters" : { "batter" : [ { "id" : "3001", "type" : "Regular" }, { "id" > : "3002", "type" : "Chocolate" } ] } } > > > db.t1.find({},{"batters.batter":1, "_id":0}); > { "batters" : { "batter" : [ { "id" : "1001", "type" : "Regular" }, { "id" > : "1002", "type" : "Chocolate" }, { "id" : "1003", "type" : "Blueberry" }, > { "id" : "1004", "type" : "Devil's Food" } ] } } > { "batters" : { "batter" : [ { "id" : "2001", "type" : "Regular" } ] } } > { "batters" : { "batter" : [ { "id" : "3001", "type" : "Regular" }, { "id" > : "3002", "type" : "Chocolate" } ] } } > > > db.t1.find({},{"batters.batter.id":1, "_id":0}); > { "batters" : { "batter" : [ { "id" : "1001" }, { "id" : "1002" }, { "id" > : "1003" }, { "id" : "1004" } ] } } > { "batters" : { "batter" : [ { "id" : "2001" } ] } } > { "batters" : { "batter" : [ { "id" : "3001" }, { "id" : "3002" } ] } } > > Each returned document is always a JSON document from the root. Hence, no > ambiguity in the last case. It is returning the inner most array with each > array element having a single map key. > > To produce the same conceptual results, all values in the inner most array > will need to returned for each row. > > -- Jiang > > -----Original Message----- > From: Neeraja Rentachintala [mailto:nrentachint...@maprtech.com] > Sent: Thursday, March 10, 2016 4:43 PM > To: user@drill.apache.org > Subject: Re: Question on nested JSON behavior > > Actually I agree with Jiang. The result does seem unintuitive. If it is a > file with just a list, it does still make sense to return the ids in that > list as an array unless the user has configured Drill to automatically > flatten the first level. > Does anyone know how does the other systems behave for this use case? (for > ex: Mongo) > > > > On Thu, Mar 10, 2016 at 4:21 PM, Nathan Griffith <ngriff...@dremio.com> > wrote: > > > Hi Jiang, > > > > Think of it this way: If you had a file that was just the list: > > > > {"id":"1001","type":"Regular"} > > {"id":"1002","type":"Chocolate"} > > {"id":"1003","type":"Blueberry"} > > {"id":"1004","type":"Devil's Food"} > > > > What would you like it to return when you query: > > > > select id from dfs.`/path/to/sample_file.json`; > > > > ? > > > > When you enter the query that you're asking about, you're indicating > > exactly that structure of data. Does this explanation make sense? > > > > Best, > > Nathan > > > > On Thu, Mar 10, 2016 at 4:07 PM, Jiang Wu <jiang...@numerxdata.com> > wrote: > > > Drill version: 1.4.0. Assuming 3 JSON objects with the following > > structure: > > > > > > { ... > > > "batters": > > > { > > > "batter": > > > [ > > > { "id": "1001", "type": "Regular" }, > > > { "id": "1002", "type": "Chocolate" }, > > > { "id": "1003", "type": "Blueberry" }, > > > { "id": "1004", "type": "Devil's Food" } > > > ] > > > }, > > > ... > > > } > > > > > > Now running a few sample queries against the above data: > > > > > > > > > A) select "batters" returns expected results, which are the values > > of "batters" from each row. > > > > > > 0: jdbc:drill:zk=local> select batters from > > > dfs.`c:\tmp\sample.json`; > > > +---------+ > > > | batters | > > > +---------+ > > > | > > {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocola > > te"},{"id":"1003","type":"Blueberry"},{"id":"1004","type":"Devil's > > Food"}]} | > > > | {"batter":[{"id":"1001","type":"Regular"}]} | > > > | > > {"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocola > > te"}]} > > | > > > +---------+ > > > 3 rows selected (0.243 seconds) > > > > > > > > > B) select "batters.batter" also returns the expected results, > which > > are the array values for "batters.batter" from each row. > > > > > > > > > 0: jdbc:drill:zk=local> select t.batters.batter from > > dfs.`c:\tmp\sample.json` t; > > > +--------+ > > > | EXPR$0 | > > > +--------+ > > > | > > [{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"},{"id" > > :"1003","type":"Blueberry"},{"id":"1004","type":"Devil's > > Food"}] | > > > | [{"id":"1001","type":"Regular"}] | > > > | [{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}] > > > | | > > > +--------+ > > > 3 rows selected (0.198 seconds) > > > > > > > > > C) select "batters.batter.id" returns something unexpected: > > > > > > 0: jdbc:drill:zk=local> select t.batters.batter.id from > > dfs.`c:\tmp\sample.json` t; > > > +---------+ > > > | EXPR$0 | > > > +---------+ > > > | 1001 | > > > | 1002 | > > > | 1003 | > > > +---------+ > > > > > > The above result doesn't make sense. The result looks like the 3 > > > values > > from row 1. Should the result be the following instead? > > > > > > +---------+ > > > | EXPR$0 | > > > +---------+ > > > | [1001, 1002, 1003, 1004] | > > > | [1001] | > > > | [1001, 1002] | > > > +---------+ > > > > > > Any hints on what is happening here? Thanks. > > > > > > -- Jiang > > > > > >