hi James, I re-tried the steps once again after looking at the output you shared with me. But still I am not able to get the expected output on drill 1.20.1. Then I downloaded drill 1.21.1 and tried the same steps. This time I got the expected output. So can you please try the same on drill 1.20.1 once if possible? thx
REgards Prabhakar On Wed, Aug 2, 2023 at 3:11 PM James Turton <[email protected]> wrote: > Hi! > > I just got back from travelling. I ran a test and Drill did do what I > believe you're after. > > Here's the test JSON file > > ➜ ~ cat /tmp/foo/bar.json > {"id":"T06125309","cust_id":"A20","num":"VAB6169028"} > > > And here's my Drill session. > > apache drill> use dfs.tmp; > ok true > summary Default schema changed to [dfs.tmp] > > apache drill (dfs.tmp)> select * from foo; > id T06125309 > cust_id A20 > num VAB6169028 > > apache drill (dfs.tmp)> create schema (id varchar, cust_id varchar, num > varchar, tranamt double not null default '1.11' > ) for table foo; > ok true > summary Created schema for [foo] > > 1 row selected (0.257 seconds) > apache drill (dfs.tmp)> select * from foo; > id T06125309 > cust_id A20 > num VAB6169028 > tranamt 1.11 > > > On 2023/07/27 07:38, Prabhakar Bhosale wrote: > > Hi James, > Any advice on the problem on schema for json file as mentioned in > my previous mail? > > Regards > Prabhakar > > On Mon, Jul 24, 2023 at 3:41 PM Prabhakar Bhosale <[email protected]> > <[email protected]> > wrote: > > > Hi James, > > I tried the same on the JSON file and it is still not working. > > Below is JSON file content > > *{"id":"T06125309","cust_id":"A20","num":"VAB6169028"}* > > The contents of the ".query.schema" file are as follows > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *{ "table" : "mystore.`TEST_MOD`", "schema" : { "type" : > "tuple_schema", "columns" : [ { "name" : "ID", "type" > : "VARCHAR", "mode" : "OPTIONAL" }, { "name" : > "CUST_ID", "type" : "VARCHAR", "mode" : "OPTIONAL" }, > { "name" : "NUM", "type" : "VARCHAR", "mode" : > "OPTIONAL" }, { "name" : "TRANAMT", "type" : > "DOUBLE", "mode" : "REQUIRED", "properties" : { > "drill.default" : "1.11" } } ] }, "version" : 1}* > > When I fire the query below, I expect the value of TRANAMT column to be > *1.11* but it gives out value as *NULL* > > *select A.ID <http://A.ID> <http://A.ID>, A.CUST_ID, A.NUM, A.TRANAMT from > table(mystore.`TEST_MOD`(schema => > 'path=`/archived_files_nw/TEST_MOD/.drill.schema`')) A* > > So essentially it is not considering the schema given at the query > execution time. Please let me know if I am doing anything incorrectly. > > > Thanks and Regards > Prabhakar > > > On Wed, Jul 12, 2023 at 4:33 PM James Turton <[email protected]> > <[email protected]> wrote: > > > Hi Prabhakar > > From what I recall, Drill won't consider a provided schema when > querying Parquet because Parquet files bundle their own schema. You > might need to use a SQL function like COALESCE(TRAN_AMOUNT, 1.11) and > possibly put that in a SQL view for reuse. > > Regards > James > > On 2023/07/11 18:40, Prabhakar Bhosale wrote: > > Hi Team, > I am using drill 1.20.1 with parquet files. > > I have two parquet files in a directory with one column missing in one > file. When I query the directory it gives me NULL values for all those > > rows > > which are from the file where that column is missing. > > But I want a specific value for that column instead of NULL. So I > have created the schema as given below. But even after creating it is > > still > > returning the NULL value. Please let me know what is going wrong. > > I have also ensured that storage.table.user_schema_file=true at system > level. > > The files are stored on linux mount point. > The name of the missing column is "TRAN_AMOUNT". > > > > The schema is as below > > { > "table" : "archive.default.`executions`", > "schema" : { > "type" : "tuple_schema", > "columns" : [ > { > "name" : "EXEC_ID", > "type" : "VARCHAR", > "mode" : "OPTIONAL" > }, > { > "name" : "CUST_ID", > "type" : "VARCHAR", > "mode" : "OPTIONAL" > }, > { > "name" : "CELL_ID", > "type" : "VARCHAR", > "mode" : "OPTIONAL" > }, > { > "name" : "TRAN_AMOUNT", > "type" : "FLOAT", > "mode" : "REQUIRED", > "properties" : { > "drill.default" : "1.11" > } > } > ] > }, > "version" : 1 > } > > > >
