[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944416#comment-16944416 ]

Yongzhi Chen commented on IMPALA-7087:
--------------------------------------

Comparing the metadata of the two Parquet files shows that the issue still exists for the fixed-length type (FIXED_LEN_BYTE_ARRAY):

ychen-MBP15:target ychen$ java -jar parquet-tools-1.12.0-SNAPSHOT.jar meta ~/Downloads/binary_decimal_precision_and_scale_widening.parquet
file:        file:/Users/ychen/Downloads/binary_decimal_precision_and_scale_widening.parquet
creator:     impala version 3.1.0-SNAPSHOT (build 5e1023f243fd7312ea976b357bbfaa984f4cbd4e)

file schema: schema 
--------------------------------------------------------------------------------
small_dec:   OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(9,2) R:0 D:1
med_dec:     OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(18,2) R:0 D:1
large_dec:   OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(38,2) R:0 D:1

row group 1: RC:9 TS:295 OFFSET:4 
--------------------------------------------------------------------------------
small_dec:    FIXED_LEN_BYTE_ARRAY SNAPPY DO:4 FPO:45 SZ:74/72/0.97 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -99999.99, max: 99999.99, num_nulls: 0]
med_dec:      FIXED_LEN_BYTE_ARRAY SNAPPY DO:148 FPO:213 SZ:100/119/1.19 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -99999999999999.99, max: 99999999999999.99, num_nulls: 0]
large_dec:    FIXED_LEN_BYTE_ARRAY SNAPPY DO:326 FPO:412 SZ:121/192/1.59 VC:9 ENC:RLE,PLAIN_DICTIONARY ST:[min: -9999999999999999999999999999999999.99, max: 9999999999999999999999999999999999.99, num_nulls: 0]

ychen-MBP15:target ychen$ java -jar parquet-tools-1.12.0-SNAPSHOT.jar meta ~/Downloads/table20.parquet
file:        file:/Users/ychen/Downloads/table20.parquet
creator:     parquet-mr version 1.5.0-cdh5.7.0-SNAPSHOT (build 40fb4dc5f83fa79d4834276449a8115f3a85eebb)
extra:       parquet.avro.schema = {"type":"record","name":"nested","namespace":"com.cloudera.impala","fields":[{"name":"col1","type":"long"},{"name":"col2","type":{"type":"bytes","logicalType":"decimal","precision":9,"scale":2}}]}

file schema: com.cloudera.impala.nested 
--------------------------------------------------------------------------------
col1:        REQUIRED INT64 R:0 D:0
col2:        REQUIRED BINARY L:DECIMAL(9,2) R:0 D:0

row group 1: RC:5 TS:131 OFFSET:4 
--------------------------------------------------------------------------------
col1:         INT64 UNCOMPRESSED DO:0 FPO:4 SZ:81/81/1.00 VC:5 ENC:BIT_PACKED,PLAIN ST:[min: 3, max: 63, num_nulls: 0]
col2:         BINARY UNCOMPRESSED DO:0 FPO:85 SZ:50/50/1.00 VC:5 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 123.00, max: 123.00, num_nulls: 0]
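
For context on why the fixed-length type is the sensitive case: with FIXED_LEN_BYTE_ARRAY the byte width is part of the column's type, so a file written for one precision can carry a different width than the reader expects for the table's precision. The following is a minimal sketch, not Impala's code. It assumes the writer stores the unscaled value as a big-endian two's-complement integer in the smallest width that fits the declared precision, and the helper name min_decimal_bytes is hypothetical.

```python
def min_decimal_bytes(precision: int) -> int:
    """Smallest byte count whose signed two's-complement range
    can hold every unscaled value of the given decimal precision.

    Assumption: the writer picks the minimum width; real writers may differ.
    """
    if precision < 1:
        raise ValueError("precision must be positive")
    max_unscaled = 10 ** precision - 1
    # bit_length() counts magnitude bits; one extra bit is needed for the sign.
    bits = max_unscaled.bit_length() + 1
    # Round up to whole bytes.
    return (bits + 7) // 8


if __name__ == "__main__":
    # Precisions taken from the schema listings above.
    for p in (9, 18, 38):
        print(f"DECIMAL({p},2) -> {min_decimal_bytes(p)} bytes")
```

Under that assumption, small_dec, med_dec and large_dec above would occupy 4, 8 and 16 bytes respectively, while the BINARY column in table20.parquet carries its length per value and so has no fixed width to mismatch.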

> Impala is unable to read Parquet decimal columns with lower precision/scale 
> than table metadata
> -----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-7087
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Yongzhi Chen
>            Priority: Major
>              Labels: decimal, parquet
>
> This is similar to IMPALA-2515, except that it relates to a different 
> precision/scale in the file metadata rather than just a mismatch in the 
> number of bytes used to store the data. In a lot of cases we should be able 
> to convert the decimal type on the fly to the higher-precision type.
> {noformat}
> ERROR: File '/hdfs/path/000000_0_x_2' column 'alterd_decimal' has an invalid 
> type length. Expecting: 11 len in file: 8
> {noformat}
> It would be convenient to allow reading parquet files where the 
> precision/scale in the file can be converted to the precision/scale in the 
> table metadata without loss of precision.
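
The "without loss of precision" condition in the description can be sketched as follows. This is an illustration only, not the Impala backend's implementation; the function names can_widen_losslessly and widen_unscaled are hypothetical. It assumes decimal values are stored as big-endian two's-complement unscaled integers, and treats a conversion as lossless when neither the number of fractional digits nor the number of integer digits shrinks.

```python
def can_widen_losslessly(file_precision: int, file_scale: int,
                         table_precision: int, table_scale: int) -> bool:
    """True if every DECIMAL(file_precision, file_scale) value is exactly
    representable as DECIMAL(table_precision, table_scale)."""
    # Fractional digits must not be dropped.
    scale_ok = table_scale >= file_scale
    # Digits left of the decimal point must not be dropped.
    int_digits_ok = (table_precision - table_scale) >= (file_precision - file_scale)
    return scale_ok and int_digits_ok


def widen_unscaled(raw: bytes, file_scale: int, table_scale: int) -> int:
    """Decode a stored unscaled value and rescale it to the table's scale.

    Assumes the caller has already checked can_widen_losslessly().
    """
    if table_scale < file_scale:
        raise ValueError("narrowing the scale would lose digits")
    # Signed big-endian decoding handles sign extension regardless of width.
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return unscaled * 10 ** (table_scale - file_scale)


if __name__ == "__main__":
    # 123.00 stored at scale 2 has the unscaled value 12300.
    stored = (12300).to_bytes(4, byteorder="big", signed=True)
    print(can_widen_losslessly(9, 2, 18, 4))
    print(widen_unscaled(stored, file_scale=2, table_scale=4))
```

Because the decoding step works on the integer value rather than on a fixed byte count, the width stored in the file (8 in the error above) does not need to match the width implied by the table's type (11).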


