This is a poor error messages that is produced when you try to flatten a field that is not an array, for these fields you can just use the dot notation to access their inner members (i.e. flattened_array_of_maps.member_field_in_map). If you have a field where the keys in a map are "unknown" or you want to do analysis on the keys, please refer to the KVGEN docs [1]. I have assigned the JIRA that reported this issue a while ago to myself and will work to improve the message [2].
[1] - https://drill.apache.org/docs/kvgen/ [2] - https://issues.apache.org/jira/browse/DRILL-2182 On Tue, Jan 12, 2016 at 1:20 AM, Paolo Spanevello <[email protected]> wrote: > Hi All, > > Jason,I used your suggests and it works, thanks a lot! > > As u wrote i used a subquery to have the all list of INTERVALS as I show > below. > > *select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as > crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select > flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t* > > [image: Immagine incorporata 1] > > The attribute "flat_intervals" is full of data that I would like to have > separate them in several attributes as it is showed in the link: > https://drill.apache.org/docs/flatten/ > I used a new subquery to have them: > > > *select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from > (select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as > crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select > flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt* > > but I have this error: > > ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: > select tt.Athlete, tt.crc, flatten(tt.flat_intervals) as newflat from > (select t.flat_rides.tags.Athlete as Athlete,t.flat_rides.crc as > crc,flatten(t.flat_rides.INTERVALS) as flat_intervals from (select > flatten(rides) as flat_rides from `dfs`.`tmp`.`provaRider`) as t) as tt > [30027]Query execution error. Details:[ > SYSTEM ERROR: ClassCastException: Cannot cast > org.apache.drill.exec.vector.complex.MapVector to > org.apache.drill.exec.vector.complex.RepeatedValueVector > > Fragment 0:0 > > [Error Id: a22fe80f-43e4-43cb-bb98-5541ecb92d4c on 192.168.1.101:31010] > ] > > [image: Immagine incorporata 2] > > Thanks in advance! > > Paolo > > > 2016-01-11 17:39 GMT+01:00 Jason Altekruse <[email protected]>: > >> Paolo, >> >> Drill currently reads single JSON objects as single records. If you look >> at >> the top of your file you can see that the root of your document is a >> single >> JSON object. >> >> Drill accepts two formats for individual records: >> >> The Mongo import format, a series of JSON object one after the other in a >> file, whitespace is irrelevant, each one need not be followed by a newline >> >> {"a" : 1, "b" : "hello" } >> {"a": 5 : "b" : "guten tag" } >> >> A JSON array of objects >> >> [ >> {"a" : 1, "b" : "hello" }, >> {"a" : 2, "b", "guten tag"} >> ] >> >> When you have a file like this you can read it using the FLATTEN >> functionality of Drill to turn an array into a series of records. >> https://drill.apache.org/docs/flatten/ >> >> select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`; >> >> To work with the data further, you can put the flatten call in a subquery. >> Here is how you can select the first element from each records list of >> INTERVALS and select one of the nested fields inside of METRICS once the >> data has been flattened. >> To analyze the array, you could flatten again to get an exploded dataset >> with one record per interval across all records >> >> select t.flat_rides.INTERVALS[0], t.flat_rides.METRICS.skiba_wprime_low >> from (select flatten(rides) as flat_rides from dfs.tmp.`rideDB.json`) as >> t; >> >> Here you can see that individual columns can be selected next to the >> flatten call, this will copy the data into each new record: >> >> select flatten(t.flat_rides.INTERVALS) as flat_intervals, >> t.flat_rides.METRICS.skiba_wprime_low from (select flatten(rides) as >> flat_rides from dfs.tmp.`rideDB.json`) as t; >> >> Happy Drilling! >> >> On Sun, Jan 10, 2016 at 4:23 AM, Paolo Spanevello <[email protected]> >> wrote: >> >> > Hi all, >> > >> > i'm trying to query the file that you can find in attach with drill >> apache >> > 1.4 . The result of this qurey is always 1 record. >> > >> > The query that i'm running is : >> > >> > SELECT t.rides.INTERVALS.METRICS FROM rideDB.json t >> > >> > If i run the similar query with the file donuts.json found on >> > https://drill.apache.org/docs/sample-data-donuts/ the query runs >> properly. >> > >> > SELECT t.topping FROM donuts.json t >> > >> > Thanks in advance. >> > >> > Paolo >> > >> > >
