[ https://issues.apache.org/jira/browse/IMPALA-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495177#comment-16495177 ]
Attila Jeges commented on IMPALA-3316:
--------------------------------------

[~boristyukin] The change is under review: https://gerrit.cloudera.org/#/c/9986/

> convert_legacy_hive_parquet_utc_timestamps=true makes reading parquet tables
> 30x slower
> ---------------------------------------------------------------------------------------
>
>                 Key: IMPALA-3316
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3316
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: impala 2.3
>         Environment: CDH 5.5.2/ Impala 2.3
> Parquet table with a timestamp column
> Secure cluster
> convert_legacy_hive_parquet_utc_timestamps=true
> Timestamp column is not being filtered on
>            Reporter: Ruslan Dautkhanov
>            Assignee: Attila Jeges
>            Priority: Minor
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
> Enabling convert_legacy_hive_parquet_utc_timestamps=true
> makes simple queries that don't even filter on a timestamp attribute perform
> really poorly.
> Parquet table.
> Impala 2.3 / CDH 5.5.2.
> convert_legacy_hive_parquet_utc_timestamps=true makes following simple query
> 30x slower (1.1minutes -> over 30 minutes).
> {quote} select * from parquet_table_with_a_timestamp_attribute where
> bigint_attribute=1000771658169 {quote}
> Notice I did not even filter on a timestamp attribute.
> Made multiple tests with and without
> convert_legacy_hive_parquet_utc_timestamps=true impalad present.
> Also, from https://issues.cloudera.org/browse/IMPALA-1658
> {quote} Casey Ching added a comment - 15/Jun/15 5:12 PM
> Btw, a perf test showed enabling this flag was 10x slower. {quote}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org