[ https://issues.apache.org/jira/browse/DRILL-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Challapalli closed DRILL-1818.
------------------------------------

Verified and added the below testcase
Functional/Passing/parquet_storage/parquet_generic/parquet_DRILL-1818.q

> Parquet files generated by Drill ignore field names when nested elements are queried
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-1818
>                 URL: https://issues.apache.org/jira/browse/DRILL-1818
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Writer
>            Reporter: Neeraja
>            Assignee: Steven Phillips
>            Priority: Blocker
>             Fix For: 0.7.0
>
>         Attachments: 0_0_0.parquet, DRILL-1818.patch
>
>
> I observed this with this parquet file, and more comprehensive testing might
> be needed here. The issue is that Drill seems to simply ignore field names at
> the leaf level and accesses data in a positional fashion.
> Below is the repro.
>
> 1. Generate a parquet file using Drill. Input is the JSON doc below.
>
> create table dfs.tmp.sampleparquet as (select trans_id, cast(`date` as date) transdate,
> cast(`time` as time) transtime, cast(amount as double) amount,
> `user_info`, `marketing_info`, `trans_info`
> from dfs.`/Users/nrentachintala/Downloads/sample.json` )
>
> 2. Now do queries.
>
> Note in the query below, there is no field called 'keywords' in trans_info,
> but data is just positionally returned (the data returned is from the prod_id
> column).
>
> 0: jdbc:drill:zk=local> select t.`trans_info`.keywords from dfs.tmp.sampleparquet t where t.`trans_info`.keywords is not null;
> +------------+
> |   EXPR$0   |
> +------------+
> | [16]       |
> | []         |
> | [293,90]   |
> | [173,18,121,84,115,226,464,525,35,11,94,45] |
> | [311,29,5,41] |
>
> 0: jdbc:drill:zk=local> select t.`marketing_info`.keywords from dfs.tmp.sampleparquet t;
>
> Note in the query below, it is trying to return the first element in
> marketing_info, which is camp_id (of int type), for the keywords column.
> But the keywords schema is string, so it throws an error with a type mismatch.
>
> Query failed: Query failed: Failure while running fragment., You tried to
> write a VarChar type when you are using a ValueWriter of type
> NullableBigIntWriterImpl. [ c3761403-b8c5-43c1-8e90-2c4918d1f85c on
> 10.0.0.20:31010 ]
> [ c3761403-b8c5-43c1-8e90-2c4918d1f85c on 10.0.0.20:31010 ]
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=local> select t.`marketing_info`.`camp_id`,t.`marketing_info`.keywords from dfs.tmp.sampleparquet t;
> +------------+------------+
> |   EXPR$0   |   EXPR$1   |
> +------------+------------+
> | 4          | ["go","to","thing","watch","made","laughing","might","pay","in","your","hold"] |
> | 6          | ["pronounce","tree","instead","games","sigh"] |
> | 17         | []         |
> | 17         | ["it's"]   |
> | 8          | ["fallout"] |
> +------------+------------+
>
> Sample.json is below
>
> {"trans_id":0,"date":"2013-07-26","time":"04:56:59","amount":80.5,"user_info":{"cust_id":28,"device":"IOS5","state":"mt"},"marketing_info":{"camp_id":4,"keywords":["go","to","thing","watch","made","laughing","might","pay","in","your","hold"]},"trans_info":{"prod_id":[16],"purch_flag":"false"}}
> {"trans_id":1,"date":"2013-05-16","time":"07:31:54","amount":100.40,"user_info":{"cust_id":86623,"device":"AOS4.2","state":"mi"},"marketing_info":{"camp_id":6,"keywords":["pronounce","tree","instead","games","sigh"]},"trans_info":{"prod_id":[],"purch_flag":"false"}}
> {"trans_id":2,"date":"2013-06-09","time":"15:31:45","amount":20.25,"user_info":{"cust_id":11,"device":"IOS5","state":"la"},"marketing_info":{"camp_id":17,"keywords":[]},"trans_info":{"prod_id":[293,90],"purch_flag":"true"}}
> {"trans_id":3,"date":"2013-07-19","time":"11:24:22","amount":500.75,"user_info":{"cust_id":666,"device":"IOS5","state":"nj"},"marketing_info":{"camp_id":17,"keywords":["it's"]},"trans_info":{"prod_id":[173,18,121,84,115,226,464,525,35,11,94,45],"purch_flag":"false"}}
> {"trans_id":4,"date":"2013-07-21","time":"08:01:13","amount":34.20,"user_info":{"cust_id":999,"device":"IOS7","state":"ct"},"marketing_info":{"camp_id":8,"keywords":["fallout"]},"trans_info":{"prod_id":[311,29,5,41],"purch_flag":"false"}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)