[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944416#comment-16944416 ]
Yongzhi Chen commented on IMPALA-7087:
--------------------------------------

Comparing the two Parquet files shows that the issue still exists for the fixed-length type:

ychen-MBP15:target ychen$ java -jar parquet-tools-1.12.0-SNAPSHOT.jar meta ~/Downloads/binary_decimal_precision_and_scale_widening.parquet
file:        file:/Users/ychen/Downloads/binary_decimal_precision_and_scale_widening.parquet
creator:     impala version 3.1.0-SNAPSHOT (build 5e1023f243fd7312ea976b357bbfaa984f4cbd4e)

file schema: schema
--------------------------------------------------------------------------------
small_dec:   OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(9,2) R:0 D:1
med_dec:     OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(18,2) R:0 D:1
large_dec:   OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(38,2) R:0 D:1

row group 1: RC:9 TS:295 OFFSET:4
--------------------------------------------------------------------------------
small_dec:   FIXED_LEN_BYTE_ARRAY SNAPPY DO:4 FPO:45 SZ:74/72/0.97 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -99999.99, max: 99999.99, num_nulls: 0]
med_dec:     FIXED_LEN_BYTE_ARRAY SNAPPY DO:148 FPO:213 SZ:100/119/1.19 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -99999999999999.99, max: 99999999999999.99, num_nulls: 0]
large_dec:   FIXED_LEN_BYTE_ARRAY SNAPPY DO:326 FPO:412 SZ:121/192/1.59 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -9999999999999999999999999999999999.99, max: 9999999999999999999999999999999999.99, num_nulls: 0]

ychen-MBP15:target ychen$ java -jar parquet-tools-1.12.0-SNAPSHOT.jar meta ~/Downloads/table20.parquet
file:        file:/Users/ychen/Downloads/table20.parquet
creator:     parquet-mr version 1.5.0-cdh5.7.0-SNAPSHOT (build 40fb4dc5f83fa79d4834276449a8115f3a85eebb)
extra:       parquet.avro.schema = {"type":"record","name":"nested","namespace":"com.cloudera.impala","fields":[{"name":"col1","type":"long"},{"name":"col2","type":{"type":"bytes","logicalType":"decimal","precision":9,"scale":2}}]}

file schema: com.cloudera.impala.nested
--------------------------------------------------------------------------------
col1:        REQUIRED INT64 R:0 D:0
col2:        REQUIRED BINARY L:DECIMAL(9,2) R:0 D:0

row group 1: RC:5 TS:131 OFFSET:4
--------------------------------------------------------------------------------
col1:        INT64 UNCOMPRESSED DO:0 FPO:4 SZ:81/81/1.00 VC:5 ENC:BIT_PACKED,PLAIN ST:[min: 3, max: 63, num_nulls: 0]
col2:        BINARY UNCOMPRESSED DO:0 FPO:85 SZ:50/50/1.00 VC:5 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 123.00, max: 123.00, num_nulls: 0]

> Impala is unable to read Parquet decimal columns with lower precision/scale
> than table metadata
> -----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-7087
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Yongzhi Chen
>            Priority: Major
>              Labels: decimal, parquet
>
> This is similar to IMPALA-2515, except that it relates to a different
> precision/scale in the file metadata rather than just a mismatch in the
> bytes used to store the data. In many cases we should be able to convert
> the decimal type on the fly to the higher-precision type.
> {noformat}
> ERROR: File '/hdfs/path/000000_0_x_2' column 'alterd_decimal' has an invalid
> type length. Expecting: 11 len in file: 8
> {noformat}
> It would be convenient to allow reading Parquet files where the
> precision/scale in the file can be converted to the precision/scale in the
> table metadata without loss of precision.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
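
The lossless widening requested in the issue description can be sketched in a few lines of Python. This is only an illustration, not Impala's implementation: the function names are hypothetical, and the byte-length rule is an assumption (the smallest signed two's-complement width that holds every value of a given precision). It relies on the Parquet convention that a DECIMAL's unscaled value is stored as a big-endian two's-complement integer.

```python
def can_widen(file_precision, file_scale, table_precision, table_scale):
    """Return True if every DECIMAL(file_precision, file_scale) value fits
    exactly in DECIMAL(table_precision, table_scale)."""
    # No fractional digits may be dropped.
    scale_ok = table_scale >= file_scale
    # No integer digits may be dropped.
    int_digits_ok = (table_precision - table_scale) >= (file_precision - file_scale)
    return scale_ok and int_digits_ok


def min_fixed_len_bytes(precision):
    """Smallest byte count whose signed range holds 10**precision - 1.
    Assumed sizing rule, used here only to illustrate the length mismatch."""
    max_unscaled = 10 ** precision - 1
    bits = max_unscaled.bit_length() + 1  # magnitude bits plus one sign bit
    return (bits + 7) // 8


def widen_unscaled(raw, file_scale, table_scale):
    """Decode a big-endian two's-complement unscaled value and rescale it
    to the (larger or equal) table scale."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return unscaled * 10 ** (table_scale - file_scale)


if __name__ == "__main__":
    # 123.00 stored as DECIMAL(9,2) has the unscaled value 12300.
    raw = (12300).to_bytes(min_fixed_len_bytes(9), byteorder="big", signed=True)
    if can_widen(9, 2, 18, 4):
        print(widen_unscaled(raw, 2, 4))  # unscaled value at scale 4
```

Under the assumed sizing rule, a precision of 18 needs 8 bytes and a precision of 24 needs 11 bytes, which is the kind of mismatch the "Expecting: 11 len in file: 8" error reports.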