Charles Givre created DRILL-7308:
------------------------------------

             Summary: Incorrect Metadata from text file queries
                 Key: DRILL-7308
                 URL: https://issues.apache.org/jira/browse/DRILL-7308
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.17.0
            Reporter: Charles Givre
         Attachments: domains.csvh

I'm noticing some strange behavior with the newest version of Drill. If you query a CSV file with a header row (domains.csvh, attached), you get the following metadata:

SELECT * FROM dfs.test.`domains.csvh` LIMIT 1

{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": [
    "domain"
  ],
  "rows": [
    {
      "domain": "thedataist.com"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}

There are two issues here:
1. VARCHAR now has precision: the type is reported as VARCHAR(0, 0).
2. The metadata array has twice as many entries as there should be: two entries for the single column "domain".

Additionally, if you query a regular CSV, without the columns extracted, you get the following:

  "rows": [
    {
      "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ],

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
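
For anyone trying to reproduce this, the two symptoms in the first response can be checked mechanically. The following is only a minimal sketch: the response body is copied from the report, and check_metadata is a hypothetical helper name, not part of Drill. It assumes each column should have exactly one metadata entry and that a VARCHAR entry should not carry a parenthesized precision.

```python
import json

# Response body copied from the report (first query, against domains.csvh).
RESPONSE = """
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": ["domain"],
  "rows": [{"domain": "thedataist.com"}],
  "metadata": ["VARCHAR(0, 0)", "VARCHAR(0, 0)"],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
"""


def check_metadata(result):
    """Return a list of problems found in a query result.

    Hypothetical helper for illustration only; not part of Drill.
    """
    problems = []
    columns = result.get("columns", [])
    metadata = result.get("metadata", [])

    # Assumption: one metadata entry is expected per returned column.
    if len(metadata) != len(columns):
        problems.append(
            f"{len(metadata)} metadata entries for {len(columns)} column(s)"
        )

    # Assumption: VARCHAR should be reported without a precision.
    for entry in metadata:
        if entry.startswith("VARCHAR("):
            problems.append(f"VARCHAR reported with precision: {entry}")

    return problems


problems = check_metadata(json.loads(RESPONSE))
for problem in problems:
    print(problem)
```

Run against the response above, this flags the entry-count mismatch once and the VARCHAR precision once per metadata entry.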