[
https://issues.apache.org/jira/browse/DRILL-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219890#comment-14219890
]
Zhiyong Liu commented on DRILL-1738:
------------------------------------
The reader ignores case when you select multiple columns, for example:
select cOlUmN1, CoLuMn2 from some_table_with_mixed_cases_in_column_names;
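
For illustration only, here is a minimal sketch of case-insensitive column resolution, the behavior this issue is about. It is not Drill's reader code: the function name `resolve_columns` and the sample schema are hypothetical. It matches each requested name against the file schema ignoring case, and reports unmatched names instead of falling back to all columns.

```python
def resolve_columns(requested, schema):
    """Map each requested name to the schema's own spelling, ignoring case.

    Hypothetical helper for illustration; not taken from Drill.
    Returns (resolved, missing) so the caller can decide what to do
    about names that match nothing in the schema.
    """
    # Index the schema by lower-cased name for case-insensitive lookup.
    by_lower = {name.lower(): name for name in schema}
    resolved, missing = [], []
    for name in requested:
        match = by_lower.get(name.lower())
        if match is None:
            missing.append(name)
        else:
            resolved.append(match)
    return resolved, missing


# Sample schema (assumed subset of the TPC-DS customer columns).
schema = ["c_customer_sk", "c_customer_id", "c_first_name"]

print(resolve_columns(["c_customer_SK"], schema))
print(resolve_columns(["C_First_Name", "no_such_column"], schema))
```

Returning the unmatched names separately keeps "column not found" distinct from "select everything", which is the distinction the quoted description below asks for.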
> Parquet complex reader case sensitive
> -------------------------------------
>
> Key: DRILL-1738
> URL: https://issues.apache.org/jira/browse/DRILL-1738
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Affects Versions: 0.7.0
> Reporter: Ramana Inukonda Nagaraj
> Assignee: Parth Chandra
> Priority: Blocker
> Fix For: 0.7.0
>
>
> On TPC-DS Parquet data, Drill with the default reader is case insensitive
> for column names:
> 0: jdbc:drill:> select c_customer_sk from `0_0_0.parquet` limit 1;
> +---------------+
> | c_customer_sk |
> +---------------+
> | 1 |
> +---------------+
> 1 row selected (0.15 seconds)
> 0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
> +---------------+
> | c_customer_sk |
> +---------------+
> | 1 |
> +---------------+
> With the new reader, however, the same mixed-case query returns all columns:
> 0: jdbc:drill:> alter session set `store.parquet.use_new_reader`=true;
> 0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> | c_customer_sk | c_customer_id | c_current_cdemo_sk | c_current_hdemo_sk | c_current_addr_sk | c_first_shipto_date_sk | c_first_sales_date_sk | c_salutation | c_first_name | c_last_name | c_preferred_cust_flag | c_birth_day | c_birth_month | c_birth_year | c_birth_country | c_login | c_email_address | c_last_review_date |
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> | 1 | AAAAAAAABAAAAAAA | 980124 | 7135 | 32946 | 2452238 | 2452208 | Mr. | Javier | Lewis | Y | 9 | 12 | 1936 | CHILE | null | [email protected] | 2452508 |
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> 1 row selected (0.368 seconds)
> I will file a separate bug for the underlying behavior: when the new Parquet
> reader cannot find a column, it runs a * query instead and returns all columns.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)