lz1984sh opened a new issue, #6209:
URL: https://github.com/apache/hudi/issues/6209

   **Describe the problem you faced**
   It seems that Hudi does not support increasing the precision of a decimal field when the size of the underlying fixed-length byte array increases. For example, I insert one row
   
   id1  name1   111.11
   
   with the schema below
   
   id - string, name - string, amount - decimal(**10,2**)
   
   and then update the row with the same data but a decimal schema of increased precision
   
   id - string, name - string, amount - decimal(**20,4**)
   
   A java.lang.ArrayIndexOutOfBoundsException is then thrown; see the stack trace below for details.
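   The failure is consistent with how Parquet and Avro store decimals as fixed-length byte arrays: a decimal of precision p is stored in the smallest n bytes such that 10^p - 1 fits in a signed 8n-bit integer, so decimal(10,2) needs a 5-byte fixed while decimal(20,4) needs 9 bytes. A minimal sketch of that sizing rule (the class and method names here are hypothetical, not Hudi or Parquet code):
   
   ```java
   import java.math.BigInteger;
   
   public class DecimalBytes {
       // Smallest n such that 10^precision - 1 <= 2^(8n - 1) - 1,
       // i.e. the fixed-length byte array size used for a decimal column.
       static int minBytesForPrecision(int precision) {
           BigInteger max = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
           int n = 1;
           while (BigInteger.ONE.shiftLeft(8 * n - 1).subtract(BigInteger.ONE).compareTo(max) < 0) {
               n++;
           }
           return n;
       }
   
       public static void main(String[] args) {
           System.out.println(minBytesForPrecision(10)); // 5
           System.out.println(minBytesForPrecision(20)); // 9
       }
   }
   ```
   
   So the precision change from 10 to 4 decimal digits wider forces the fixed size from 5 bytes to 9, which is exactly the "size of underlying fixed byte length array increases" case described above.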
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Download the attached [pom.xml.txt](https://github.com/apache/hudi/files/9180429/pom.xml.txt), rename it to pom.xml, and create a project from it in your IDE workspace.
   
   2. Download the attached Java program [HudiTest.java.txt](https://github.com/apache/hudi/files/9180407/HudiTest.java.txt), save it as HudiTest.java, and run it in the project created in step 1.
   
   **Environment Description**
   
   * Hudi version : 0.11.0
   
   * Spark version : 3.1.1
   
   * Hive version : N/A
   
   * Hadoop version : 3.3.1
   
   * Storage (HDFS/S3/GCS..) : Local file system
   
   * Running on Docker? (yes/no) : no
   
   **Stacktrace**
   
   ```
   Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/xxx/e6043078-cadc-43cc-8412-50ecf3dc9a5b-0_0-34-1617_20220725183004104.parquet
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
        at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
        at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
        at org.apache.hudi.common.util.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:48)
        ... 10 more
   Caused by: java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at org.apache.avro.generic.GenericData.createFixed(GenericData.java:1168)
        at org.apache.parquet.avro.AvroConverters$FieldFixedConverter.convert(AvroConverters.java:310)
        at org.apache.parquet.avro.AvroConverters$BinaryConverter.addBinary(AvroConverters.java:62)
        at org.apache.parquet.column.impl.ColumnReaderImpl$2$6.writeValue(ColumnReaderImpl.java:317)
        at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:367)
        at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226)
        ... 13 more
   ```
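   The top frames suggest what goes wrong mechanically: GenericData.createFixed copies the incoming bytes into a buffer sized from the Avro schema's fixed size, and when the reader applies the new, wider schema (assumption: a 9-byte fixed for decimal(20,4)) to a value written under the old schema (a 5-byte fixed for decimal(10,2)), the copy length no longer matches the source array. A hypothetical illustration of that mismatch, not Avro's actual code:
   
   ```java
   public class FixedCopyDemo {
       public static void main(String[] args) {
           byte[] fromOldFile = new byte[5]; // decimal(10,2) -> 5-byte fixed in the old file
           byte[] target = new byte[9];      // decimal(20,4) -> 9-byte fixed per the new schema
           try {
               // Copy length taken from the new schema's fixed size: reads past
               // the end of the 5-byte source array.
               System.arraycopy(fromOldFile, 0, target, 0, target.length);
           } catch (ArrayIndexOutOfBoundsException e) {
               System.out.println("ArrayIndexOutOfBoundsException, as in the stack trace above");
           }
       }
   }
   ```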
   
   

