[jira] [Updated] (HUDI-7303) Date field type unexpectedly convert to Long when using date comparison operator

2024-06-06 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7303:

Fix Version/s: 0.15.0

> Date field type unexpectedly convert to Long when using date comparison 
> operator
> 
>
> Key: HUDI-7303
> URL: https://issues.apache.org/jira/browse/HUDI-7303
> Project: Apache Hudi
> Issue Type: Bug
> Components: flink
> Affects Versions: 0.14.0, 0.14.1
> Environment: Flink 1.15.4 Hudi 0.14.0
>              Flink 1.17.1 Hudi 0.14.0
>              Flink 1.17.1 Hudi 0.14.1rc1
> Reporter: Yao Zhang
> Assignee: Yao Zhang
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Given the table date_dim from TPCDS as an example:
> {code:java}
> CREATE TABLE date_dim (
>   d_date_sk int,
>   d_date_id varchar(16) NOT NULL,
>   d_date date,
>   d_month_seq int,
>   d_week_seq int,
>   d_quarter_seq int,
>   d_year int,
>   d_dow int,
>   d_moy int,
>   d_dom int,
>   d_qoy int,
>   d_fy_year int, 
>   d_fy_quarter_seq int,
>   d_fy_week_seq int,
>   d_day_name varchar(9),
>   d_quarter_name varchar(6),
>   d_holiday char(1),
>   d_weekend char(1),
>   d_following_holiday char(1),
>   d_first_dom int,
>   d_last_dom int,
>   d_same_day_ly int,
>   d_same_day_lq int,
>   d_current_day char(1),
>   d_current_week char(1),
>   d_current_month char(1),
>   d_current_quarter char(1),
>   d_current_year char(1)) with (
>   'connector' = 'hudi',
>   'path' = 'hdfs:///table_path/date_dim',
>   'table.type' = 'COPY_ON_WRITE'); {code}
> When you execute the following select statement, an exception will be thrown:
> {code:java}
> select * from date_dim where d_date between cast('1999-02-22' as date) and 
> (cast('1999-02-22' as date) + INTERVAL '30' day);
> {code}
> The exception is:
> {code:java}
> java.lang.IllegalArgumentException: FilterPredicate column: d_date's declared type (java.lang.Long) does not match the schema found in file metadata. Column d_date is of type: INT32
> Valid types for this column are: [class java.lang.Integer]
>   at org.apache.parquet.filter2.predicate.ValidTypeMap.assertTypeValid(ValidTypeMap.java:125) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:179) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:149) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:113) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:56) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.Operators$GtEq.accept(Operators.java:246) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:119) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:56) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.Operators$And.accept(Operators.java:306) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:61) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:95) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:45) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:149) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:67) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.hudi.table.format.cow.vector.reader.ParquetColumnarRowSplitReader.<init>(ParquetColumnarRowSplitReader.java:142) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.hudi.table.format.cow.ParquetSplitReaderUtil.genPartColumnarRowReader(ParquetSplitReaderUtil.java:153) ~[hudi-flink1.17-bundle-0.14.0.jar:0.14.0]
>   at org.apache.hudi.table.format.RecordIterators.getParquetRecordIterator(RecordIterators.java:78)
>  
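
The exception text above says that the pushed-down predicate on d_date carried a java.lang.Long value while the Parquet column is INT32, for which only java.lang.Integer is accepted. The following is a minimal sketch of that mismatch, not the Hudi or Parquet code: the class DateLiteralSketch and its methods expectedJavaType and isValid are hypothetical names made up for illustration, and the sketch assumes the date literal is represented as days since 1970-01-01.

```java
import java.time.LocalDate;

class DateLiteralSketch {
    // Hypothetical stand-in for the validator's lookup: which Java class a
    // predicate value must have for a given Parquet primitive type.
    static Class<?> expectedJavaType(String parquetPrimitive) {
        switch (parquetPrimitive) {
            case "INT32": return Integer.class;
            case "INT64": return Long.class;
            default: throw new IllegalArgumentException("unsupported: " + parquetPrimitive);
        }
    }

    // True when the (boxed) literal has the class required by the column type.
    static boolean isValid(String parquetPrimitive, Object literal) {
        return expectedJavaType(parquetPrimitive).isInstance(literal);
    }

    public static void main(String[] args) {
        // Upper bound of the BETWEEN in the failing query: 1999-02-22 + 30 days.
        LocalDate upper = LocalDate.of(1999, 2, 22).plusDays(30);

        // LocalDate.toEpochDay() returns a long, so the boxed value is a Long.
        long asLong = upper.toEpochDay();
        // Narrowing with an overflow check gives a value that boxes to Integer.
        int asInt = Math.toIntExact(asLong);

        System.out.println("Long literal accepted for INT32:    " + isValid("INT32", asLong));
        System.out.println("Integer literal accepted for INT32: " + isValid("INT32", asInt));
    }
}
```

In this sketch the Long literal is rejected and the narrowed Integer literal is accepted, which mirrors the "Valid types for this column are: [class java.lang.Integer]" line in the exception. Where the Long actually originates in the Flink/Hudi filter conversion is not shown in this thread.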

[jira] [Updated] (HUDI-7303) Date field type unexpectedly convert to Long when using date comparison operator

2024-01-22 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-7303:
Fix Version/s: 1.0.0


[jira] [Updated] (HUDI-7303) Date field type unexpectedly convert to Long when using date comparison operator

2024-01-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7303:
Labels: pull-request-available  (was: )
