sathishkumar paramasivam created IMPALA-6829: ------------------------------------------------
     Summary: how to get compressed hdfs file using impala or hive
         Key: IMPALA-6829
         URL: https://issues.apache.org/jira/browse/IMPALA-6829
     Project: IMPALA
  Issue Type: Question
    Reporter: sathishkumar paramasivam

Hi,

I am teaching myself Impala and trying to enable compression for a table, but I do not see the HDFS file getting a compression extension. I am following [https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_txtfile.html], but it is not clear to me how the final compressed files are created. When I use Sqoop, I do get a compressed file. Please guide me.

This is the example I am working from:

create table csv_compressed (a string, b string, c string)
  row format delimited fields terminated by ",";

insert into csv_compressed values
  ('one - uncompressed', 'two - uncompressed', 'three - uncompressed'),
  ('abc - uncompressed', 'xyz - uncompressed', '123 - uncompressed');

...make equivalent .gz, .bz2, and .snappy files and load them into same table directory...

select * from csv_compressed;
+--------------------+--------------------+----------------------+
| a                  | b                  | c                    |
+--------------------+--------------------+----------------------+
| one - snappy       | two - snappy       | three - snappy       |
| one - uncompressed | two - uncompressed | three - uncompressed |
| abc - uncompressed | xyz - uncompressed | 123 - uncompressed   |
| one - bz2          | two - bz2          | three - bz2          |
| abc - bz2          | xyz - bz2          | 123 - bz2            |
| one - gzip         | two - gzip         | three - gzip         |
| abc - gzip         | xyz - gzip         | 123 - gzip           |
+--------------------+--------------------+----------------------+

$ hdfs dfs -ls 'hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/';
...truncated for readability...
75 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed.snappy
79 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_bz2.csv.bz2
80 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_gzip.csv.gz
116 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/dd414df64d67d49b_data.0.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
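
For the "make equivalent .gz, .bz2 ... files" step in the example, the compressed files appear to be produced outside Impala and then copied into the table directory. The following is a minimal sketch of that step using only the Python standard library. The sample rows, the helper names `to_csv_text` and `write_compressed`, and the output file names are assumptions made for illustration, and Snappy is left out because it needs a third-party library.

```python
import bz2
import gzip
import tempfile
from pathlib import Path

# Hypothetical sample rows for a three-column (a, b, c) text table.
ROWS = [
    ("one", "two", "three"),
    ("abc", "xyz", "123"),
]


def to_csv_text(rows):
    """Join fields with commas, one row per line, to match
    'fields terminated by ","' in the table definition."""
    return "".join(",".join(row) + "\n" for row in rows)


def write_compressed(rows, out_dir, stem="csv_compressed"):
    """Write the same delimited text as a .gz and a .bz2 file.

    Returns the two paths. The file names are illustrative only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    text = to_csv_text(rows)

    gz_path = out_dir / f"{stem}_gzip.csv.gz"
    bz2_path = out_dir / f"{stem}_bz2.csv.bz2"

    # newline="" keeps the "\n" line endings unchanged on every platform.
    with gzip.open(gz_path, "wt", encoding="utf-8", newline="") as f:
        f.write(text)
    with bz2.open(bz2_path, "wt", encoding="utf-8", newline="") as f:
        f.write(text)

    return gz_path, bz2_path


if __name__ == "__main__":
    # Write into a scratch directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as scratch:
        for path in write_compressed(ROWS, scratch):
            print(path.name, path.stat().st_size, "bytes")
```

The resulting files could then be copied into the table directory with `hdfs dfs -put`, followed by `refresh csv_compressed;` in Impala so that the new files are picked up. This assumes the table is a plain text table like the one defined above.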