Krystal created DRILL-1524:
------------------------------

             Summary: Data from hive parquet table is displayed as "null" when select all columns
                 Key: DRILL-1524
                 URL: https://issues.apache.org/jira/browse/DRILL-1524
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 0.6.0
            Reporter: Krystal


git.commit.id.abbrev=42f0a7e

From hive-13, I created a parquet table:
hive> create table voter_parquet(voter_id int, name string, age tinyint, registration string, contributions float, voterzone smallint, create_time string) stored as parquet;
hive> insert overwrite table voter_parquet select * from voter;

I can select against this table from hive:
hive> select * from voter_parquet limit 5;
OK
1       nick miller     68      green   717.12  13809   2014-05-25 03:41:54
2       ulysses white   48      green   840.06  19451   2014-07-30 08:03:11
3       holly garcia    18      democrat        128.2   8750    2014-09-15 02:33:11
4       victor thompson 61      independent     721.6   20462   2014-06-17 13:04:09
5       luke allen      39      socialist       800.22  25151   2015-02-01 02:02:37

I ran the same select from sqlline and got all nulls:

0: jdbc:drill:schema=hive> select * from voter_parquet limit 5;
+------------+------------+------------+--------------+---------------+------------+-------------+
|  voter_id  |    name    |    age     | registration | contributions | voterzone  | create_time |
+------------+------------+------------+--------------+---------------+------------+-------------+
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
+------------+------------+------------+--------------+---------------+------------+-------------+

Same if I explicitly specify all the columns:
0: jdbc:drill:schema=hive> select voter_id, name, age, registration, contributions, voterzone, create_time from voter_parquet limit 2;
+------------+------------+------------+--------------+---------------+------------+-------------+
|  voter_id  |    name    |    age     | registration | contributions | voterzone  | create_time |
+------------+------------+------------+--------------+---------------+------------+-------------+
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
+------------+------------+------------+--------------+---------------+------------+-------------+

However, if I select a few columns, then the data displays correctly:
0: jdbc:drill:schema=hive> select voter_id, name, age, registration from voter_parquet limit 5;
+------------+-----------------+------------+--------------+
|  voter_id  |      name       |    age     | registration |
+------------+-----------------+------------+--------------+
| 1          | nick miller     | 68         | green        |
| 2          | ulysses white   | 48         | green        |
| 3          | holly garcia    | 18         | democrat     |
| 4          | victor thompson | 61         | independent  |
| 5          | luke allen      | 39         | socialist    |
+------------+-----------------+------------+--------------+

0: jdbc:drill:schema=hive> describe voter_parquet;
+---------------+------------+-------------+
|  COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE |
+---------------+------------+-------------+
| voter_id      | INTEGER    | YES         |
| name          | VARCHAR    | YES         |
| age           | TINYINT    | YES         |
| registration  | VARCHAR    | YES         |
| contributions | FLOAT      | YES         |
| voterzone     | SMALLINT   | YES         |
| create_time   | VARCHAR    | YES         |
+---------------+------------+-------------+




