I’d like to weigh in here, but this would be EXTREMELY useful.  When I was 
trying to write connectors to enable various BI tools to connect to Drill, such 
as SQLPad and Metabase, the inability to get information about how drill 
interprets the data was really difficult to get around.  Just me .02. 

> On Apr 28, 2018, at 18:05, Paul Rogers <par0...@yahoo.com.INVALID> wrote:
> 
> Hi Rob,
> 
> Thanks for the suggestion. While this works for Hive (as you showed), it does 
> not work for CSV files:
> 
> DESCRIBE `csvh/cust.csvh`;
> +--------------+------------+--------------+
> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +--------------+------------+--------------+
> +--------------+------------+--------------+
> 
> The typeof() function is handy, but does not report the "is nullable" (or 
> repeated) "mode" of a column, and it loses the data type if a value is null. 
> The following CSV file (with headers) uses non-nullable VARCHAR columns:
> 
> SELECT typeof(custId) FROM `csvh/cust.csvh`;
> +----------+
> |  EXPR$0  |
> +----------+
> | VARCHAR  |
> +----------+
> 
> Now, do something similar with JSON which uses a (nullable) VARCHAR:
> 
> SELECT typeof(a) FROM `json/str-null.json`;
> +----------+
> |  EXPR$0  |
> +----------+
> | VARCHAR  |
> | NULL     |
> +----------+
> 
> Finally, use a CSV file without headers, so that all columns are returned in 
> the columns[] array:
> 
> SELECT typeof(columns) FROM `csv/cust.csv`;
> +----------+
> |  EXPR$0  |
> +----------+
> | VARCHAR  |
> +----------+
> 
> We know that the three "VARCHAR" are different because we know how Drill 
> works internally. But, the output of sqlline does not express that knowledge.
> 
> Sqlline presents all data as strings, which often hides the data type and 
> other details, making lit look like things work better than they actually do. 
> You can see this by running a query against two JSON where a VarChar column 
> is missing from one of the files. Drill guesses "nullable Int",  Sqlline 
> shows the value as null, and typeof() shows the type as NULL, hiding the fact 
> that there is actually a schema conflict (schema change) lurking in the data 
> that manifests only if, say, you sort the data.
> 
> Bottom line: it seems that, at present, there isn't a good way at present 
> (short of writing some Java code that uses the native Drill API) to get the 
> actually, detailed type of a column with both data type and cardinality 
> ("mode").
> 
> 
> So, would be great when explaining Drill concepts, if there was a clean 
> non-code way to show people the actual structure of the data. (Yep, I know 
> Drill is open source and welcomes contributions, so I'll try to offer a 
> solution when I get time...)
> 
> Thanks,
> - Paul
> 
> 
> 
>    On Thursday, April 26, 2018, 10:08:04 AM PDT, Rob Wu <robw...@gmail.com> 
> wrote:  
> 
> Hi Paul,
> 
> You could also use DESCRIBE (https://drill.apache.org/docs/describe/).
> 
> 0: jdbc:drill:drillbit=localhost:31010> describe
> `hive.default`.`integer_table`
> . . . . . . . . . . . . . . . . . . . > ;
> +--------------+--------------------+--------------+
> | COLUMN_NAME  |    DATA_TYPE      | IS_NULLABLE  |
> +--------------+--------------------+--------------+
> | keycolumn    | CHARACTER VARYING  | YES          |
> | column1      | INTEGER            | YES          |
> +--------------+--------------------+--------
> 
> Best regards,
> 
> Rob
> 
> On Wed, Apr 25, 2018 at 10:12 PM, Abhishek Girish <agir...@apache.org>
> wrote:
> 
>> Hey Paul,
>> 
>> You could use the typeof() function for this purpose. It takes a single
>> parameter - the column name.
>> 
>> For example:
>>> select typeof(c_current_cdemo_sk) from customer limit 1;
>> +---------+
>> | EXPR$0  |
>> +---------+
>> | BIGINT  |
>> +---------+
>> 1 row selected (0.472 seconds)
>> 
>> 
>> On Wed, Apr 25, 2018 at 9:23 PM Paul Rogers <par0...@yahoo.com.invalid>
>> wrote:
>> 
>>> Hi All,
>>> Anyone know if there is a non-code way to display the data types of
>>> columns returned from a Drill query? Sqlline appears to only show the
>>> column names and values. The same is true of the Drill web console.
>>> The EXPLAIN PLAN FOR ... command shows the query plan, but not type
>> (which
>>> are only known at run time.) Is there a statement, system table or some
>>> other trick to display column types in, say, Sqlline?
>>> In the past, I've gotten the types by using unit test style code. But,
>>> that is not to handy for use as an example for non-developers...
>>> Thanks,
>>> - Paul
>>> 
>>> 
>> 

Reply via email to