Alan Jackoway created IMPALA-6954:
-------------------------------------

             Summary: Kudu CTAS Loses Partitioning
                 Key: IMPALA-6954
                 URL: https://issues.apache.org/jira/browse/IMPALA-6954
             Project: IMPALA
          Issue Type: Bug
            Reporter: Alan Jackoway

For certain types of queries, a CTAS stored as Kudu will lose the partitioning.

To reproduce:

Create the transactions table:
{code:sql}
create table alanj_transactions(
  account_id string,
  transaction_id string,
  total double,
  close_date string)
{code}
You don't need to put any data into it.

Create a Kudu table from it, trying to get the longest-lived record (close date to now):
{code:sql}
create table alanj_kudu
primary key (account_id)
partition by hash(account_id) partitions 5
stored as kudu
as select
  account_id,
  datediff(now(), min(cast(close_date AS TIMESTAMP))) AS tenure_days
from alanj_transactions
group by 1
{code}
You receive a warning like "Unpartitioned Kudu tables are inefficient for large data sizes." Both {{show create table}} and the Kudu UIs confirm that the partitions were not created.

If you replace the datediff line with something like {{sum(total) as account_total}}, it works fine. Something about datediff is causing it to lose the partitioning.
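
For comparison, here is a sketch of the variant that keeps its partitioning, with the datediff expression swapped for {{sum(total) as account_total}} as described above. The table name {{alanj_kudu_sum}} is made up for illustration so it does not collide with {{alanj_kudu}}; everything else matches the failing statement. The trailing {{show create table}} is one way to check whether the hash partitioning survived.
{code:sql}
-- Same CTAS as the failing case, but with a plain aggregate instead of datediff().
-- alanj_kudu_sum is a hypothetical name used only for this comparison.
create table alanj_kudu_sum
primary key (account_id)
partition by hash(account_id) partitions 5
stored as kudu
as select
  account_id,
  sum(total) as account_total
from alanj_transactions
group by 1;

-- The output should include the "partition by hash" clause if partitioning was kept.
show create table alanj_kudu_sum;
{code}
Running the same {{show create table}} against {{alanj_kudu}} should make the difference between the two tables visible side by side.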