[jira] [Updated] (HUDI-7315) Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is incomparable.
[ https://issues.apache.org/jira/browse/HUDI-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yao Zhang updated HUDI-7315: Description: This issue is extended from HUDI-7309, as the risk still exists for the NOT filter predicate when the predicate it wraps does not support pushing down (e.g. expression with the operand typed Decimal). It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not yet reproduced NOT filter issue in practice, the risk still exists. We should fix it. was: This issue is extended from HUDI-7309, as the risk still exists when the predicate it wraps does not support pushing down (e.g. expression with the operand typed Decimal). It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not yet reproduced NOT filter issue in practice, the risk still exists. We should fix it. > Disable constructing NOT filter predicate when pushing down its wrapped > filter unsupported, as its operand's primitive value is incomparable. > - > > Key: HUDI-7315 > URL: https://issues.apache.org/jira/browse/HUDI-7315 > Project: Apache Hudi > Issue Type: Bug > Components: flink >Affects Versions: 0.14.0, 0.14.1 > Environment: Flink 1.17.1 > Hudi 0.14.x >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Major > > This issue is extended from HUDI-7309, as the risk still exists for the NOT > filter predicate when the predicate it wraps does not support pushing down > (e.g. expression with the operand typed Decimal). > It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not > yet reproduced NOT filter issue in practice, the risk still exists. We should > fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7315) Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is incomparable.
[ https://issues.apache.org/jira/browse/HUDI-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7315: - Labels: pull-request-available (was: ) > Disable constructing NOT filter predicate when pushing down its wrapped > filter unsupported, as its operand's primitive value is incomparable. > - > > Key: HUDI-7315 > URL: https://issues.apache.org/jira/browse/HUDI-7315 > Project: Apache Hudi > Issue Type: Bug > Components: flink >Affects Versions: 0.14.0, 0.14.1 > Environment: Flink 1.17.1 > Hudi 0.14.x >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Major > Labels: pull-request-available > > This issue is extended from HUDI-7309, as the risk still exists for the NOT > filter predicate when the predicate it wraps does not support pushing down > (e.g. expression with the operand typed Decimal). > It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not > yet reproduced NOT filter issue in practice, the risk still exists. We should > fix it.
[PR] [HUDI-7315] Disable constructing NOT filter predicate when pushing do… [hudi]
paul8263 opened a new pull request, #10537: URL: https://github.com/apache/hudi/pull/10537 …wn its wrapped filter unsupported, as its operand's primitive value is incomparable. ### Change Logs This issue is extended from [HUDI-7309](https://issues.apache.org/jira/browse/HUDI-7309), as the risk still exists for the NOT filter predicate when the predicate it wraps does not support pushing down (e.g. expression with the operand typed Decimal). This PR fixed the issue for the NOT filter predicate. ### Impact Low impact ### Risk level (write none, low medium or high below) Low risk. ### Documentation Update No need to update. ### Contributor's checklist - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [x] Change Logs and Impact were stated clearly - [x] Adequate tests were added if applicable - [ ] CI passed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
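The idea described in this PR can be sketched as follows. This is a minimal, self-contained model and not Hudi's actual code: the `Expr`, `Compare`, and `Not` types and the `toPredicate` method are hypothetical names, and treating every `BigDecimal` literal as unsupported is an assumption made only for illustration. The point is that NOT is constructed only when the predicate it wraps could itself be converted.

```java
import java.math.BigDecimal;
import java.util.Optional;

// Hypothetical, simplified model of filter push-down; these are not Hudi classes.
class NotPushDownSketch {

  interface Expr {}

  // A leaf comparison such as `column = literal`.
  static final class Compare implements Expr {
    final String column;
    final Object literal;

    Compare(String column, Object literal) {
      this.column = column;
      this.literal = literal;
    }
  }

  // A NOT wrapper around another expression.
  static final class Not implements Expr {
    final Expr child;

    Not(Expr child) {
      this.child = child;
    }
  }

  // Returns the pushed-down predicate as text, or empty if it cannot be pushed down.
  static Optional<String> toPredicate(Expr expr) {
    if (expr instanceof Compare) {
      Compare c = (Compare) expr;
      // Assumption for illustration: Decimal-typed operands are not supported.
      if (c.literal instanceof BigDecimal) {
        return Optional.empty();
      }
      return Optional.of(c.column + " = " + c.literal);
    }
    if (expr instanceof Not) {
      // Build NOT only when the wrapped predicate was converted successfully.
      return toPredicate(((Not) expr).child).map(p -> "NOT(" + p + ")");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Supported operand: the NOT predicate is constructed.
    System.out.println(toPredicate(new Not(new Compare("id", 1))));
    // Unsupported operand: no predicate is constructed at all.
    System.out.println(toPredicate(new Not(new Compare("price", new BigDecimal("1.5")))));
  }
}
```

Returning an empty result for the whole NOT expression means the filter is simply not pushed down, so the engine still evaluates it after reading the rows.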
Re: [PR] [HUDI-7314] Hudi Create table support index type check [hudi]
hudi-bot commented on PR #10536: URL: https://github.com/apache/hudi/pull/10536#issuecomment-1899892735 ## CI report: * c1f04403b97c15ffea9d99358f923a593db33e3f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22070) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
[jira] [Updated] (HUDI-7315) Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is incomparable.
[ https://issues.apache.org/jira/browse/HUDI-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yao Zhang updated HUDI-7315: Summary: Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is incomparable. (was: Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is uncomparable.) > Disable constructing NOT filter predicate when pushing down its wrapped > filter unsupported, as its operand's primitive value is incomparable. > - > > Key: HUDI-7315 > URL: https://issues.apache.org/jira/browse/HUDI-7315 > Project: Apache Hudi > Issue Type: Bug > Components: flink >Affects Versions: 0.14.0, 0.14.1 > Environment: Flink 1.17.1 > Hudi 0.14.x >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Major > > This issue is extended from HUDI-7309, as the risk still exists when the > predicate it wraps does not support pushing down (e.g. expression with the > operand typed Decimal). > It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not > yet reproduced NOT filter issue in practice, the risk still exists. We should > fix it.
[jira] [Created] (HUDI-7315) Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is uncomparable.
Yao Zhang created HUDI-7315: --- Summary: Disable constructing NOT filter predicate when pushing down its wrapped filter unsupported, as its operand's primitive value is uncomparable. Key: HUDI-7315 URL: https://issues.apache.org/jira/browse/HUDI-7315 Project: Apache Hudi Issue Type: Bug Components: flink Affects Versions: 0.14.1, 0.14.0 Environment: Flink 1.17.1 Hudi 0.14.x Reporter: Yao Zhang Assignee: Yao Zhang This issue is extended from HUDI-7309, as the risk still exists when the predicate it wraps does not support pushing down (e.g. expression with the operand typed Decimal). It is similar to the issue of AND/OR filter in HUDI-7309. Though I have not yet reproduced NOT filter issue in practice, the risk still exists. We should fix it.
Re: [PR] [HUDI-7314] Hudi Create table support index type check [hudi]
hudi-bot commented on PR #10536: URL: https://github.com/apache/hudi/pull/10536#issuecomment-1899885161 ## CI report: * c1f04403b97c15ffea9d99358f923a593db33e3f UNKNOWN
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899876688 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 7462f45fd73f66c5ec848c930f5e9b86a72b25e5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22067)
Re: [PR] [HUDI-7302] Consistent hashing row writer support sorting [hudi]
hudi-bot commented on PR #10515: URL: https://github.com/apache/hudi/pull/10515#issuecomment-1899877102 ## CI report: * 7e2d36fc73a2143afdda0f6d1d088fed0bee5367 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22068)
[jira] [Updated] (HUDI-7314) Hudi Create table support index type check
[ https://issues.apache.org/jira/browse/HUDI-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7314: - Labels: pull-request-available (was: ) > Hudi Create table support index type check > -- > > Key: HUDI-7314 > URL: https://issues.apache.org/jira/browse/HUDI-7314 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Priority: Major > Labels: pull-request-available > > Currently Hudi does not check the index type when creating a table, so an > inaccurate index name passed by the user is accepted without error. This needs to be fixed.
[PR] [HUDI-7314] Hudi Create table support index type check [hudi]
xuzifu666 opened a new pull request, #10536:
URL: https://github.com/apache/hudi/pull/10536

### Change Logs

Currently Hudi does not check the index type when creating a table, so an inaccurate index name passed by the user is accepted without error. For example:

    create table ${targetTable} (
      `id` string,
      `name` string,
      `dt` bigint,
      `day` STRING,
      `hour` INT
    ) using hudi
    OPTIONS ('hoodie.datasource.write.hive_style_partitioning' 'false',
             'hoodie.datasource.meta.sync.enable' 'false',
             'hoodie.datasource.hive_sync.enable' 'false')
    tblproperties (
      'primaryKey' = 'id',
      'type' = 'mor',
      'preCombineField'='dt',
      'hoodie.index.type' = 'BUCKET_XXX',
      'hoodie.bucket.index.hash.field' = 'id',
      'hoodie.bucket.index.num.buckets'=512
    )
    partitioned by (`day`,`hour`)

This needs to be fixed by adding a check.

### Impact

low

### Risk level (write none, low medium or high below)

low

### Documentation Update

_Describe any necessary documentation update if there is any new feature, config, or user-facing change_

- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
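A check of the kind this PR asks for could look like the sketch below. It is an assumption-laden illustration, not the PR's implementation: the `IndexType` enum here is a local, illustrative subset of index names, and `parseIndexType` is a hypothetical helper. It rejects a value such as `BUCKET_XXX` at table-creation time instead of accepting it silently.

```java
import java.util.Locale;

// Hypothetical validation sketch; the enum is an illustrative subset, not Hudi's own type.
class IndexTypeCheckSketch {

  enum IndexType { BLOOM, GLOBAL_BLOOM, SIMPLE, GLOBAL_SIMPLE, BUCKET }

  // Parses the configured value and fails fast with a descriptive message if it is unknown.
  static IndexType parseIndexType(String value) {
    try {
      return IndexType.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Invalid value for 'hoodie.index.type': " + value, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(parseIndexType("bucket"));
    try {
      parseIndexType("BUCKET_XXX");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Whether matching should be case-insensitive, as assumed here, is a design choice the source does not settle.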
[jira] [Created] (HUDI-7314) Hudi Create table support index type check
xy created HUDI-7314: Summary: Hudi Create table support index type check Key: HUDI-7314 URL: https://issues.apache.org/jira/browse/HUDI-7314 Project: Apache Hudi Issue Type: Improvement Components: spark-sql Reporter: xy Currently Hudi does not check the index type when creating a table, so an inaccurate index name passed by the user is accepted without error. This needs to be fixed.
Re: [PR] [MINOR] Added descriptive exception if column present in required avro schema does not exist in hudi table [hudi]
hudi-bot commented on PR #10527: URL: https://github.com/apache/hudi/pull/10527#issuecomment-1899833215 ## CI report: * 75af48bad9c182e664bc156d8c430e76b9f6afc5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22069)
Re: [PR] Update docker_demo.md [hudi]
yanghua commented on code in PR #10522:
URL: https://github.com/apache/hudi/pull/10522#discussion_r1458400710

## website/versioned_docs/version-0.14.1/docker_demo.md:
##
@@ -134,9 +135,10 @@ $ docker ps
 :::note Please note the following for Mac AArch64 users
- The demo must be built and run using the master branch. We currently plan to include support starting with the
-0.13.0 release.
+ The demo must be built and run using the release-0.14.1 tag.
 Presto and Trino are not currently supported in the demo.
+ You will see warningss that there is no historyserver for your archittecture. You can ignore this
+ You wil see warnings "Unable to load native-hadoop library for your platform... using builtin-java classes where applicable." You can ignore this

Review Comment:
You `will` see `the warning` "Unable to load native-hadoop library for your platform... using builtin-java classes where applicable." You can ignore this `.`

## website/versioned_docs/version-0.14.1/docker_demo.md:
##
@@ -134,9 +135,10 @@ $ docker ps
 :::note Please note the following for Mac AArch64 users
- The demo must be built and run using the master branch. We currently plan to include support starting with the
-0.13.0 release.
+ The demo must be built and run using the release-0.14.1 tag.
 Presto and Trino are not currently supported in the demo.
+ You will see warningss that there is no historyserver for your archittecture. You can ignore this

Review Comment:
You will see `warnings` that there is no `history server` for your `architecture`. You can ignore this `.`
[PR] initial commit and added applied intuition and penn entertainment to … [hudi]
nfarah86 opened a new pull request, #10535: URL: https://github.com/apache/hudi/pull/10535 Updated the logo wall with Applied Intuition and Penn Entertainment. I also updated an older blog that referenced the powered-by page, and removed the old powered-by logo. cc @bhasudha https://github.com/apache/hudi/assets/5392555/3e3af9b0-da45-48df-956e-c524e780e54c - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
Re: [PR] [MINOR] Added descriptive exception if column present in required avro schema does not exist in hudi table [hudi]
hudi-bot commented on PR #10527: URL: https://github.com/apache/hudi/pull/10527#issuecomment-1899783731 ## CI report: * b943b1d18880206e5f3cdec32fd789c9a22afef0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22031) * 75af48bad9c182e664bc156d8c430e76b9f6afc5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22069)
Re: [PR] [MINOR] Added descriptive exception if column present in required avro schema does not exist in hudi table [hudi]
hudi-bot commented on PR #10527: URL: https://github.com/apache/hudi/pull/10527#issuecomment-1899778371 ## CI report: * b943b1d18880206e5f3cdec32fd789c9a22afef0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22031) * 75af48bad9c182e664bc156d8c430e76b9f6afc5 UNKNOWN
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899772496 ## CI report: * bd895568aabd022dc155b810a56791400702c2a6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22066)
Re: [PR] [MINOR] Added descriptive exception if column present in required avro schema is not present in hudi table [hudi]
prathit06 commented on code in PR #10527:
URL: https://github.com/apache/hudi/pull/10527#discussion_r1458332575

## hudi-flink-datasource/hudi-flink1.14.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java:
##
@@ -119,6 +119,11 @@ public static ParquetColumnarRowSplitReader genPartColumnarRowReader(
       long splitLength,
       FilterPredicate filterPredicate,
       UnboundRecordFilter recordFilter) throws IOException {
+
+    if (Arrays.stream(selectedFields).anyMatch(x -> x == -1)) {
+      throw new AssertionError("One or more specified columns does not exist in the hudi table");

Review Comment:
have updated it to `ValidationUtils.checkState`

## hudi-flink-datasource/hudi-flink1.14.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java:
##
@@ -119,6 +119,11 @@ public static ParquetColumnarRowSplitReader genPartColumnarRowReader(
       long splitLength,
       FilterPredicate filterPredicate,
       UnboundRecordFilter recordFilter) throws IOException {
+
+    if (Arrays.stream(selectedFields).anyMatch(x -> x == -1)) {
+      throw new AssertionError("One or more specified columns does not exist in the hudi table");

Review Comment:
have updated to `ValidationUtils.checkState`
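The change mentioned in the review replies, replacing the thrown `AssertionError` with a state check, might look roughly like this. The sketch is self-contained, so `checkState` below is a local stand-in written for illustration (assumed to throw `IllegalStateException` when the condition is false); it is not the Hudi utility itself, and `validateSelectedFields` is a hypothetical method name.

```java
import java.util.Arrays;

// Illustrative sketch; checkState is a local stand-in, not Hudi's ValidationUtils.
class SelectedFieldsCheckSketch {

  // Throws IllegalStateException with the given message when the condition does not hold.
  static void checkState(boolean expression, String message) {
    if (!expression) {
      throw new IllegalStateException(message);
    }
  }

  // An index of -1 marks a requested column that was not found in the table schema.
  static void validateSelectedFields(int[] selectedFields) {
    checkState(
        Arrays.stream(selectedFields).noneMatch(x -> x == -1),
        "One or more specified columns do not exist in the hudi table");
  }

  public static void main(String[] args) {
    validateSelectedFields(new int[] {0, 2, 5}); // passes silently
    try {
      validateSelectedFields(new int[] {0, -1});
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Throwing an exception rather than an `Error` lets callers catch and report the problem through the normal exception-handling path.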
Re: [PR] [HUDI-7144] Build storage partition stats index and use it for data skipping [hudi]
hudi-bot commented on PR #10352: URL: https://github.com/apache/hudi/pull/10352#issuecomment-1899748308 ## CI report: * e72c465b0212e834039ced6bb76fdf09dbc66d98 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22065)
Re: [PR] [HUDI-7302] Consistent hashing row writer support sorting [hudi]
hudi-bot commented on PR #10515: URL: https://github.com/apache/hudi/pull/10515#issuecomment-1899729127 ## CI report: * 21509bc638de40df8ddaebbb4544c002aabe0bd2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22000) * 7e2d36fc73a2143afdda0f6d1d088fed0bee5367 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22068)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899727604 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * f0338bae4263c94e7afc59cc6e177028d9721d01 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22064) * 7462f45fd73f66c5ec848c930f5e9b86a72b25e5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22067)
Re: [PR] [HUDI-7302] Consistent hashing row writer support sorting [hudi]
hudi-bot commented on PR #10515: URL: https://github.com/apache/hudi/pull/10515#issuecomment-1899696732 ## CI report: * 21509bc638de40df8ddaebbb4544c002aabe0bd2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22000) * 7e2d36fc73a2143afdda0f6d1d088fed0bee5367 UNKNOWN
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899694639 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * f0338bae4263c94e7afc59cc6e177028d9721d01 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22064) * 7462f45fd73f66c5ec848c930f5e9b86a72b25e5 UNKNOWN
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899667745 ## CI report: * 7e9f9a7371dabcd14c4b644c9bb399a28d99b77f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22061)
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899667365 ## CI report: * 8b6316b18855ec9d4c2ea891abff52ad1629cd2a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22059) * bd895568aabd022dc155b810a56791400702c2a6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22066)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899665246 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * f0338bae4263c94e7afc59cc6e177028d9721d01 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22064)
Re: [PR] [HUDI-7310] Optimize Column Stats Partition Pruning for Non-Partition Pruning Queries [hudi]
stream2000 commented on code in PR #10528:
URL: https://github.com/apache/hudi/pull/10528#discussion_r1458286332

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:
##
@@ -361,9 +364,16 @@ case class HoodieFileIndex(spark: SparkSession,
       // For that we use a simple-heuristic to determine whether we should read and process CSI in-memory or
       // on-cluster: total number of rows of the expected projected portion of the index has to be below the
       // threshold (of 100k records)
-      val prunedFileNames = getPrunedFileNames(prunedPartitionsAndFileSlices)
       val shouldReadInMemory = columnStatsIndex.shouldReadInMemory(this, queryReferencedColumns)
-      columnStatsIndex.loadTransposed(queryReferencedColumns, shouldReadInMemory, prunedFileNames) { transposedColStatsDF =>
+      val prunedFileNames = getPrunedFileNames(prunedPartitionsAndFileSlices)
+      // NOTE: This judgment has two purposes:

Review Comment:
nit: We can simplify the comment to:
// If partition pruning doesn't prune any files, then there's no need to apply file filters when loading the Column Statistics Index

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:
##
@@ -233,8 +233,9 @@ case class HoodieFileIndex(spark: SparkSession,
     //- Col-Stats Index is present
     //- Record-level Index is present
     //- List of predicates (filters) is present
+    val shouldPushDownFilesFilter = !partitionFilters.isEmpty
     val candidateFilesNamesOpt: Option[Set[String]] =
-      lookupCandidateFilesInMetadataTable(dataFilters, prunedPartitionsAndFileSlices) match {
+      lookupCandidateFilesInMetadataTable(dataFilters, shouldPushDownFilesFilter, prunedPartitionsAndFileSlices) match {

Review Comment:
We can move the `shouldPushDownFilesFilter` to the end of the parameter list.
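The rule suggested in the first review comment (no file filter is needed when partition pruning did not remove any files) can be sketched in isolation. This is a plain-Java illustration with hypothetical names (`fileFilter`, `hasPartitionFilters`), not the Scala code under review; it assumes that the absence of partition filters means every file was kept.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of the "skip the file filter when nothing was pruned" decision.
class FileFilterSketch {

  // Returns the file names to filter the index by, or empty to load it unfiltered.
  static Optional<Set<String>> fileFilter(boolean hasPartitionFilters, Set<String> prunedFileNames) {
    // Assumption: without partition filters, pruning kept every file, so filtering
    // the index by file name would cost work without removing anything.
    return hasPartitionFilters ? Optional.of(prunedFileNames) : Optional.empty();
  }

  public static void main(String[] args) {
    Set<String> files = new LinkedHashSet<>(Arrays.asList("f1.parquet", "f2.parquet"));
    System.out.println(fileFilter(true, files).isPresent());
    System.out.println(fileFilter(false, files).isPresent());
  }
}
```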
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
kbuci commented on PR #10523: URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899648628 Thanks for the comments @danny0405, I updated the PR and replied to the comments. The CI job seemed to pass at first, before the recent commit where I added volatile; I have now retried it twice and it is failing: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22034&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0 . The logs don't fully appear yet, so I can't yet confirm whether it's due to flaky tests.
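The behavior named in the PR title (unlock should not update the lock-held metric if the duration timer was never started) can be illustrated with a small stand-alone class. This is only a sketch under assumptions: `LockTimerSketch` and its methods are hypothetical and do not mirror the real `LockManager`, and the `volatile` fields are used here simply so that a start recorded on one thread is visible to an unlock on another.

```java
// Hypothetical sketch; names and structure are not taken from Hudi's LockManager.
class LockTimerSketch {

  // null means the timer was never started.
  private volatile Long lockStartNanos = null;
  private volatile long lastHeldNanos = 0L;

  // Called once the lock has actually been acquired.
  void onLockAcquired() {
    lockStartNanos = System.nanoTime();
  }

  // Records the held duration only if the timer was started; returns whether it did.
  boolean onUnlock() {
    Long start = lockStartNanos;
    if (start == null) {
      return false; // nothing to measure, so skip the metric update
    }
    lastHeldNanos = System.nanoTime() - start;
    lockStartNanos = null;
    return true;
  }

  long lastHeldNanos() {
    return lastHeldNanos;
  }

  public static void main(String[] args) {
    LockTimerSketch timer = new LockTimerSketch();
    System.out.println(timer.onUnlock()); // timer not started
    timer.onLockAcquired();
    System.out.println(timer.onUnlock()); // timer started, metric recorded
  }
}
```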
Re: [PR] [HUDI-7302] Consistent hashing row writer support sorting [hudi]
stream2000 commented on code in PR #10515: URL: https://github.com/apache/hudi/pull/10515#discussion_r1458254203 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/SparkConsistentBucketClusteringExecutionStrategy.java: ## @@ -72,7 +72,7 @@ public HoodieData performClusteringWithRecordsAsRow(Dataset in HoodieWriteConfig newConfig = HoodieWriteConfig.newBuilder().withProps(props).build(); -ConsistentBucketIndexBulkInsertPartitionerWithRows partitioner = new ConsistentBucketIndexBulkInsertPartitionerWithRows(getHoodieTable(), shouldPreserveHoodieMetadata); +ConsistentBucketIndexBulkInsertPartitionerWithRows partitioner = new ConsistentBucketIndexBulkInsertPartitionerWithRows(getHoodieTable(), strategyParams, shouldPreserveHoodieMetadata); Review Comment: sure and done. ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/ConsistentBucketIndexBulkInsertPartitionerWithRows.java: ## @@ -142,10 +203,11 @@ public void addHashingChildrenNodes(String partition, List 0) +|| table.requireSortedRecords() || table.getConfig().getBulkInsertSortMode() != BulkInsertSortMode.NONE; } - private int getBucketId(Row row) { + private Integer getBucketId(Row row) { Review Comment: reverted it. 
## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/ConsistentBucketIndexBulkInsertPartitionerWithRows.java: ## @@ -105,10 +121,55 @@ public int numPartitions() { } }; -return rows.sparkSession().createDataFrame(rowJavaRDD -.mapToPair(row -> new Tuple2<>(getBucketId(row), row)) -.partitionBy(partitioner) -.values(), rows.schema()); +if (sortColumnNames != null && sortColumnNames.length > 0) { + return rows.sparkSession().createDataFrame(rowJavaRDD + .mapToPair(row -> new Tuple2<>(row, row)) + .repartitionAndSortWithinPartitions(partitioner, new CustomRowColumnsComparator()) + .values(), + rows.schema()); +} else if (table.requireSortedRecords() || table.getConfig().getBulkInsertSortMode() != BulkInsertSortMode.NONE) { Review Comment: Yes, we are actually implementing `PARTITION_SORT`. I'm just wondering, for sort modes other than `PARTITION_SORT`, should we default to a 'no sort' behavior similar to `BulkInsertSortMode=NONE`, automatically switch to `PARTITION_SORT`, or throw an exception to indicate that the sort mode is not supported? I would appreciate your opinion; otherwise we can keep the current behavior, which switches to `PARTITION_SORT` automatically. ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/ConsistentBucketIndexBulkInsertPartitionerWithRows.java: ## @@ -105,10 +121,55 @@ public int numPartitions() { } }; -return rows.sparkSession().createDataFrame(rowJavaRDD -.mapToPair(row -> new Tuple2<>(getBucketId(row), row)) -.partitionBy(partitioner) -.values(), rows.schema()); +if (sortColumnNames != null && sortColumnNames.length > 0) { + return rows.sparkSession().createDataFrame(rowJavaRDD + .mapToPair(row -> new Tuple2<>(row, row)) Review Comment: We still need the row itself as the key in order to compare and sort it, so keeping the line ` .mapToPair(row -> new Tuple2<>(row, row))` is OK. 
Also, compared with partitionBy + sortWithinPartitions, repartitionAndSortWithinPartitions is more efficient because it performs the shuffle operation only once, with both repartitioning and sorting happening in the same step. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899614325 ## CI report: * 9e644af9a9c0e4f2f38da4a4826d7923edf80f70 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22044) * 8b6316b18855ec9d4c2ea891abff52ad1629cd2a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22059) * bd895568aabd022dc155b810a56791400702c2a6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22066) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7144] Build storage partition stats index and use it for data skipping [hudi]
hudi-bot commented on PR #10352: URL: https://github.com/apache/hudi/pull/10352#issuecomment-1899613364 ## CI report: * f8c89bcfeb969a48bc4c5b56c2b3e5104dc6e940 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22028) * e72c465b0212e834039ced6bb76fdf09dbc66d98 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22065) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899613097 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 4173edaec044dfb8e0691bebcddba2c7b8ec2d8d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22056) * f0338bae4263c94e7afc59cc6e177028d9721d01 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22064) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899594945 ## CI report: * 9e644af9a9c0e4f2f38da4a4826d7923edf80f70 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22044) * 8b6316b18855ec9d4c2ea891abff52ad1629cd2a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22059) * bd895568aabd022dc155b810a56791400702c2a6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7144] Build storage partition stats index and use it for data skipping [hudi]
hudi-bot commented on PR #10352: URL: https://github.com/apache/hudi/pull/10352#issuecomment-1899593959 ## CI report: * f8c89bcfeb969a48bc4c5b56c2b3e5104dc6e940 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22028) * e72c465b0212e834039ced6bb76fdf09dbc66d98 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899593678 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 4173edaec044dfb8e0691bebcddba2c7b8ec2d8d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22056) * f0338bae4263c94e7afc59cc6e177028d9721d01 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7312] Spark3ParsePartitionUtil support inferPartitionColumnValue with all unnest type [hudi]
xuzifu666 closed pull request #10530: [HUDI-7312] Spark3ParsePartitionUtil support inferPartitionColumnValue with all unnest type URL: https://github.com/apache/hudi/pull/10530 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7277] fix `hoodie.bulkinsert.shuffle.parallelism` not activated… [hudi]
hudi-bot commented on PR #10532: URL: https://github.com/apache/hudi/pull/10532#issuecomment-1899570363 ## CI report: * f81ca8d2c6c8788c7e139a1e5d6d1a701dce182d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22048) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899570616 ## CI report: * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058) * ffe23c1e7fbeee0b371296a3caa55588ca86cc55 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22060) * 7e9f9a7371dabcd14c4b644c9bb399a28d99b77f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22061) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7303] Fix date field type unexpectedly convert to Long when usi… [hudi]
paul8263 commented on PR #10517: URL: https://github.com/apache/hudi/pull/10517#issuecomment-1899570791 > There are test failures: > > ``` > [ERROR] Errors: > [ERROR] TestHoodieTableSource.testBucketPruningSpecialKeyDataType:267 » ClassCast java... > [ERROR] TestHoodieTableSource.testBucketPruningSpecialKeyDataType:267 » ClassCast java... > ``` Err... I could not find these exceptions in the CI check reports. Could you please point me to the detailed error stack trace? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7303] Fix date field type unexpectedly convert to Long when usi… [hudi]
danny0405 commented on PR #10517: URL: https://github.com/apache/hudi/pull/10517#issuecomment-1899550901 There are test failures: ```xml [ERROR] Errors: [ERROR] TestHoodieTableSource.testBucketPruningSpecialKeyDataType:267 » ClassCast java... [ERROR] TestHoodieTableSource.testBucketPruningSpecialKeyDataType:267 » ClassCast java... ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (HUDI-7309) Disable filter pushing down when the parquet type corresponding to its field logical type is not comparable
[ https://issues.apache.org/jira/browse/HUDI-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7309. Resolution: Fixed Fixed via master branch: e85f8f43547cf107ce4dedf737eede74e2c811e7 > Disable filter pushing down when the parquet type corresponding to its field > logical type is not comparable > --- > > Key: HUDI-7309 > URL: https://issues.apache.org/jira/browse/HUDI-7309 > Project: Apache Hudi > Issue Type: Bug > Components: flink >Affects Versions: 0.14.0, 0.14.1 > Environment: Hudi 0.14.0 > Hudi 0.14.1rc1 > Flink 1.17.1 >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Given the table web_sales from TPCDS: > {code:sql} > CREATE TABLE web_sales ( >ws_sold_date_sk int, >ws_sold_time_sk int, >ws_ship_date_sk int, >ws_item_sk int, >ws_bill_customer_sk int, >ws_bill_cdemo_sk int, >ws_bill_hdemo_sk int, >ws_bill_addr_sk int, >ws_ship_customer_sk int, >ws_ship_cdemo_sk int, >ws_ship_hdemo_sk int, >ws_ship_addr_sk int, >ws_web_page_sk int, >ws_web_site_sk int, >ws_ship_mode_sk int, >ws_warehouse_sk int, >ws_promo_sk int, >ws_order_number int, >ws_quantity int, >ws_wholesale_cost decimal(7,2), >ws_list_price decimal(7,2), >ws_sales_price decimal(7,2), >ws_ext_discount_amt decimal(7,2), >ws_ext_sales_price decimal(7,2), >ws_ext_wholesale_cost decimal(7,2), >ws_ext_list_price decimal(7,2), >ws_ext_tax decimal(7,2), >ws_coupon_amt decimal(7,2), >ws_ext_ship_cost decimal(7,2), >ws_net_paid decimal(7,2), >ws_net_paid_inc_tax decimal(7,2), >ws_net_paid_inc_ship decimal(7,2), >ws_net_paid_inc_ship_tax decimal(7,2), >ws_net_profit decimal(7,2) > ) with ( > 'connector' = 'hudi', > 'path' = 'hdfs://path/to/web_sales', > 'table.type' = 'COPY_ON_WRITE', > 'hoodie.datasource.write.recordkey.field' = > 'ws_item_sk,ws_order_number' > ); > {code} > And execute: > {code:sql} > select * from web_sales where ws_sold_date_sk = 2451268 and ws_sales_price > between 100.00 and 
150.00 > {code} > An exception will occur: > {code:java} > Caused by: java.lang.NullPointerException: left cannot be null > at java.util.Objects.requireNonNull(Objects.java:228) > at > org.apache.parquet.filter2.predicate.Operators$BinaryLogicalFilterPredicate.<init>(Operators.java:257) > at > org.apache.parquet.filter2.predicate.Operators$And.<init>(Operators.java:301) > at > org.apache.parquet.filter2.predicate.FilterApi.and(FilterApi.java:249) > at > org.apache.hudi.source.ExpressionPredicates$And.filter(ExpressionPredicates.java:551) > at > org.apache.hudi.source.ExpressionPredicates$Or.filter(ExpressionPredicates.java:589) > at > org.apache.hudi.source.ExpressionPredicates$Or.filter(ExpressionPredicates.java:589) > at > org.apache.hudi.table.format.RecordIterators.getParquetRecordIterator(RecordIterators.java:68) > at > org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:130) > at > org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:66) > at > org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:84) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67) > at > org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:333) > {code} > After further investigation, the decimal type is not comparable in the form in which it is > stored in the parquet format (fixed-length byte array). The way that pushes down
(hudi) branch master updated: [HUDI-7309] Disable constructing AND & OR filter predicates when filter pushing down for any of its operand's logical type for is unsupported in ExpressionPredicates::to
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new e85f8f43547 [HUDI-7309] Disable constructing AND & OR filter predicates when filter pushing down for any of its operand's logical type for is unsupported in ExpressionPredicates::toParquetPredicate (#10524) e85f8f43547 is described below commit e85f8f43547cf107ce4dedf737eede74e2c811e7 Author: Paul Zhang AuthorDate: Fri Jan 19 10:27:36 2024 +0800 [HUDI-7309] Disable constructing AND & OR filter predicates when filter pushing down for any of its operand's logical type for is unsupported in ExpressionPredicates::toParquetPredicate (#10524) --- .../org/apache/hudi/source/ExpressionPredicates.java| 6 ++ .../apache/hudi/source/TestExpressionPredicates.java| 17 + 2 files changed, 23 insertions(+) diff --git a/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java b/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java index 046e4b739ad..34bb58f6c8e 100644 --- a/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java +++ b/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java @@ -548,6 +548,9 @@ public class ExpressionPredicates { @Override public FilterPredicate filter() { + if (null == predicates[0].filter() || null == predicates[1].filter()) { +return null; + } return and(predicates[0].filter(), predicates[1].filter()); } @@ -586,6 +589,9 @@ public class ExpressionPredicates { @Override public FilterPredicate filter() { + if (null == predicates[0].filter() || null == predicates[1].filter()) { +return null; + } return or(predicates[0].filter(), predicates[1].filter()); } diff --git 
a/hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestExpressionPredicates.java b/hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestExpressionPredicates.java index 97b06644266..b8c4b1caf2e 100644 --- a/hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestExpressionPredicates.java +++ b/hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestExpressionPredicates.java @@ -42,6 +42,7 @@ import org.apache.parquet.filter2.predicate.Operators.IntColumn; import org.apache.parquet.filter2.predicate.Operators.Lt; import org.junit.jupiter.api.Test; +import java.math.BigDecimal; import java.util.Arrays; import java.util.Collections; import java.util.List; @@ -58,6 +59,7 @@ import static org.apache.parquet.filter2.predicate.FilterApi.not; import static org.apache.parquet.filter2.predicate.FilterApi.notEq; import static org.apache.parquet.filter2.predicate.FilterApi.or; import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNull; /** * Test cases for {@link ExpressionPredicates}. 
@@ -164,4 +166,19 @@ public class TestExpressionPredicates { assertEquals(predicate19.toString(), predicate20.toString()); assertEquals(or(lt, gt), predicate20.filter()); } + + @Test + public void testDisablePredicatesPushDownForUnsupportedType() { +FieldReferenceExpression fieldReference = new FieldReferenceExpression("f_decimal", DataTypes.DECIMAL(7, 2), 0, 0); +ValueLiteralExpression valueLiteral = new ValueLiteralExpression(BigDecimal.valueOf(100.00)); +List expressions = Arrays.asList(fieldReference, valueLiteral); + +CallExpression greaterThanExpression = new CallExpression(BuiltInFunctionDefinitions.GREATER_THAN, expressions, DataTypes.DECIMAL(7, 2)); +Predicate greaterThanPredicate = fromExpression(greaterThanExpression); +CallExpression lessThanExpression = new CallExpression(BuiltInFunctionDefinitions.LESS_THAN, expressions, DataTypes.DECIMAL(7, 2)); +Predicate lessThanPredicate = fromExpression(lessThanExpression); + +assertNull(And.getInstance().bindPredicates(greaterThanPredicate, lessThanPredicate).filter(), "Decimal type push down is unsupported, so we expect null"); +assertNull(Or.getInstance().bindPredicates(greaterThanPredicate, lessThanPredicate).filter(), "Decimal type push down is unsupported, so we expect null"); + } }
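The null guard added in this commit can be modelled in isolation. The following is only a minimal sketch under stated assumptions: `NullSafeFilterDemo`, `Predicate`, `leaf`, `strictAnd` and `and` are hypothetical stand-ins that use plain strings as filters; they are not the Hudi `ExpressionPredicates` or Parquet `FilterApi` classes. It shows why a combinator that rejects null operands needs the guard when a child predicate reports "push-down unsupported" by returning null.

```java
import java.util.Objects;

// Simplified model of the null guard; every name here is a hypothetical stand-in,
// not the Hudi ExpressionPredicates or Parquet FilterApi classes.
class NullSafeFilterDemo {

  interface Predicate {
    // Returns a textual filter, or null when push-down is unsupported for this predicate.
    String filter();
  }

  // Leaf predicate that either carries a filter or reports "unsupported" via null.
  static Predicate leaf(String filterOrNull) {
    return () -> filterOrNull;
  }

  // Strict combinator that rejects null operands, similar in spirit to the one in the stack trace.
  static String strictAnd(String left, String right) {
    Objects.requireNonNull(left, "left cannot be null");
    Objects.requireNonNull(right, "right cannot be null");
    return "and(" + left + ", " + right + ")";
  }

  // Guarded AND: if either side cannot be pushed down, the whole conjunction is not pushed down.
  static Predicate and(Predicate left, Predicate right) {
    return () -> {
      String l = left.filter();
      String r = right.filter();
      if (l == null || r == null) {
        return null;
      }
      return strictAnd(l, r);
    };
  }

  public static void main(String[] args) {
    Predicate supported = leaf("eq(ws_sold_date_sk, 2451268)");
    Predicate unsupported = leaf(null); // e.g. a comparison on a decimal column
    System.out.println(and(supported, supported).filter());
    System.out.println(and(supported, unsupported).filter());
  }
}
```

One design difference worth noting: the sketch evaluates each child filter once and reuses the result, whereas the committed code calls `filter()` on each operand both in the guard and in the combinator call. Because the guarded combinator itself returns null, the "unsupported" signal also propagates through nested combinations.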
[jira] [Updated] (HUDI-7309) Disable filter pushing down when the parquet type corresponding to its field logical type is not comparable
[ https://issues.apache.org/jira/browse/HUDI-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7309: - Fix Version/s: 1.0.0 > Disable filter pushing down when the parquet type corresponding to its field > logical type is not comparable > --- > > Key: HUDI-7309 > URL: https://issues.apache.org/jira/browse/HUDI-7309 > Project: Apache Hudi > Issue Type: Bug > Components: flink >Affects Versions: 0.14.0, 0.14.1 > Environment: Hudi 0.14.0 > Hudi 0.14.1rc1 > Flink 1.17.1 >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Given the table web_sales from TPCDS: > {code:sql} > CREATE TABLE web_sales ( >ws_sold_date_sk int, >ws_sold_time_sk int, >ws_ship_date_sk int, >ws_item_sk int, >ws_bill_customer_sk int, >ws_bill_cdemo_sk int, >ws_bill_hdemo_sk int, >ws_bill_addr_sk int, >ws_ship_customer_sk int, >ws_ship_cdemo_sk int, >ws_ship_hdemo_sk int, >ws_ship_addr_sk int, >ws_web_page_sk int, >ws_web_site_sk int, >ws_ship_mode_sk int, >ws_warehouse_sk int, >ws_promo_sk int, >ws_order_number int, >ws_quantity int, >ws_wholesale_cost decimal(7,2), >ws_list_price decimal(7,2), >ws_sales_price decimal(7,2), >ws_ext_discount_amt decimal(7,2), >ws_ext_sales_price decimal(7,2), >ws_ext_wholesale_cost decimal(7,2), >ws_ext_list_price decimal(7,2), >ws_ext_tax decimal(7,2), >ws_coupon_amt decimal(7,2), >ws_ext_ship_cost decimal(7,2), >ws_net_paid decimal(7,2), >ws_net_paid_inc_tax decimal(7,2), >ws_net_paid_inc_ship decimal(7,2), >ws_net_paid_inc_ship_tax decimal(7,2), >ws_net_profit decimal(7,2) > ) with ( > 'connector' = 'hudi', > 'path' = 'hdfs://path/to/web_sales', > 'table.type' = 'COPY_ON_WRITE', > 'hoodie.datasource.write.recordkey.field' = > 'ws_item_sk,ws_order_number' > ); > {code} > And execute: > {code:sql} > select * from web_sales where ws_sold_date_sk = 2451268 and ws_sales_price > between 100.00 and 150.00 > {code} > An exception will occur: > {code:java} > 
Caused by: java.lang.NullPointerException: left cannot be null > at java.util.Objects.requireNonNull(Objects.java:228) > at > org.apache.parquet.filter2.predicate.Operators$BinaryLogicalFilterPredicate.<init>(Operators.java:257) > at > org.apache.parquet.filter2.predicate.Operators$And.<init>(Operators.java:301) > at > org.apache.parquet.filter2.predicate.FilterApi.and(FilterApi.java:249) > at > org.apache.hudi.source.ExpressionPredicates$And.filter(ExpressionPredicates.java:551) > at > org.apache.hudi.source.ExpressionPredicates$Or.filter(ExpressionPredicates.java:589) > at > org.apache.hudi.source.ExpressionPredicates$Or.filter(ExpressionPredicates.java:589) > at > org.apache.hudi.table.format.RecordIterators.getParquetRecordIterator(RecordIterators.java:68) > at > org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:130) > at > org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:66) > at > org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:84) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67) > at > org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:333) > {code} > After further investigation, the decimal type is not comparable in the form in which it is > stored in the parquet format (fixed-length byte array). The way that pushes down > this filter to parquet predicates is not > supported(Expre
Re: [PR] [HUDI-7309] Disable constructing AND & OR filter predicates when filt… [hudi]
danny0405 merged PR #10524: URL: https://github.com/apache/hudi/pull/10524 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [MINOR] Added descriptive exception if column present in required avro schema is not present in hudi table [hudi]
danny0405 commented on code in PR #10527: URL: https://github.com/apache/hudi/pull/10527#discussion_r1458202123 ## hudi-flink-datasource/hudi-flink1.14.x/src/main/java/org/apache/hudi/table/format/cow/ParquetSplitReaderUtil.java: ## @@ -119,6 +119,11 @@ public static ParquetColumnarRowSplitReader genPartColumnarRowReader( long splitLength, FilterPredicate filterPredicate, UnboundRecordFilter recordFilter) throws IOException { + +if (Arrays.stream(selectedFields).anyMatch(x -> x == -1)) { + throw new AssertionError("One or more specified columns does not exist in the hudi table"); Review Comment: Maybe you can just use `ValidationUtils.checkState`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
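The suggestion above, replacing the hand-written `AssertionError` with a state check, could look roughly like the sketch below. It is written under assumptions: `checkState` here is a local stand-in defined inside the example (it is assumed to behave like a typical check-state helper that throws `IllegalStateException`), and `SelectedFieldsCheckDemo` and `validateSelectedFields` are hypothetical names, not code from the PR.

```java
import java.util.Arrays;

// Illustrative sketch; checkState below is a local stand-in, not the Hudi ValidationUtils class.
class SelectedFieldsCheckDemo {

  // Stand-in for a checkState-style helper: throws IllegalStateException when the condition is false.
  static void checkState(boolean condition, String message) {
    if (!condition) {
      throw new IllegalStateException(message);
    }
  }

  // Rejects a projection in which any column index is -1, i.e. the column was not resolved.
  static void validateSelectedFields(int[] selectedFields) {
    checkState(
        Arrays.stream(selectedFields).noneMatch(x -> x == -1),
        "One or more specified columns do not exist in the hudi table");
  }

  public static void main(String[] args) {
    validateSelectedFields(new int[] {0, 2, 5}); // passes silently
    try {
      validateSelectedFields(new int[] {0, -1});
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Compared with throwing `AssertionError` directly, a shared helper keeps the condition and the message on one line and raises an ordinary runtime exception rather than an `Error`.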
Re: [PR] [HUDI-7303] Fix date field type unexpectedly convert to Long when usi… [hudi]
paul8263 commented on code in PR #10517: URL: https://github.com/apache/hudi/pull/10517#discussion_r1458202085 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java: ## @@ -602,10 +602,10 @@ private static FilterPredicate toParquetPredicate(FunctionDefinition functionDef case TINYINT: case SMALLINT: case INTEGER: + case DATE: case TIME_WITHOUT_TIME_ZONE: return predicateSupportsLtGt(functionDefinition, intColumn(columnName), (Integer) literal); case BIGINT: - case DATE: case TIMESTAMP_WITHOUT_TIME_ZONE: Review Comment: Hi @danny0405, please see [HUDI-7303](https://issues.apache.org/jira/browse/HUDI-7303). In parquet, the date type is stored as INT32 (epoch day). But if we add a condition on a date-typed field in the SQL WHERE clause, its type is unexpectedly converted to Long. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
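To make the INT32 point concrete, here is a small sketch of turning a date into the epoch-day integer that an INT32-backed date column would be compared against. It assumes the literal is available as a `java.time.LocalDate`, which this thread does not confirm; `DateEpochDayDemo` and `toParquetDateLiteral` are hypothetical names, not Hudi APIs.

```java
import java.time.LocalDate;

// Illustrative only: the class and method names are hypothetical, not Hudi APIs.
class DateEpochDayDemo {

  // A DATE stored as INT32 counts days since 1970-01-01, so the literal must be
  // an Integer rather than a Long. Math.toIntExact fails loudly on overflow.
  static Integer toParquetDateLiteral(LocalDate date) {
    return Math.toIntExact(date.toEpochDay());
  }

  public static void main(String[] args) {
    Integer literal = toParquetDateLiteral(LocalDate.of(2024, 1, 19));
    System.out.println(literal);
  }
}
```

`LocalDate.toEpochDay()` returns a `long`, which is one plausible way a date literal ends up typed as Long; the explicit narrowing keeps it compatible with an `(Integer) literal` cast like the one in the `INTEGER` branch of the diff.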
Re: [I] [SUPPORT] Migration partitionned table with complex key generator to 0.14.1 leads to duplicates when recordkey length =1 [hudi]
danny0405 commented on issue #10508: URL: https://github.com/apache/hudi/issues/10508#issuecomment-1899537658 Yeah, we need a compatibility solution for this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899536622 ## CI report: * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058) * ffe23c1e7fbeee0b371296a3caa55588ca86cc55 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22060) * 7e9f9a7371dabcd14c4b644c9bb399a28d99b77f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22061) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7277] fix `hoodie.bulkinsert.shuffle.parallelism` not activated… [hudi]
hudi-bot commented on PR #10532: URL: https://github.com/apache/hudi/pull/10532#issuecomment-1899536567 ## CI report: * f81ca8d2c6c8788c7e139a1e5d6d1a701dce182d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22048) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (HUDI-7297) Exception thrown when field type mismatch is ambiguous
[ https://issues.apache.org/jira/browse/HUDI-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7297. Resolution: Fixed Fixed via master branch: 0dee3b4a1a393a12dbbe71039db76c9c8cb680fa > Exception thrown when field type mismatch is ambiguous > -- > > Key: HUDI-7297 > URL: https://issues.apache.org/jira/browse/HUDI-7297 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > > If you create a table with mismatched field types in Flink SQL, for example > you define a field as bigint while the actual field type is int, an > IllegalArgumentException is thrown, as shown below: > java.lang.IllegalArgumentException: Unexpected type: INT32 > The exception is far too ambiguous. It is difficult to figure out which field > type is incorrect and what the correct type is. You have to refer to the > source code. > Currently I plan to make the exception message more informative. -- This message was sent by Atlassian Jira (v8.20.10#820010)
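As a rough illustration of what "more informative" could mean here, the sketch below builds a message that names the field, the declared type and the type found in the file. The message format and all names (`TypeMismatchMessageDemo`, `describeMismatch`, the sample field `ws_quantity` used as an example) are assumptions for illustration; this is not the text introduced by the actual fix.

```java
// Illustrative only: the message format and all names are assumptions,
// not the wording added by the Hudi fix for this ticket.
class TypeMismatchMessageDemo {

  // Builds a message that identifies the field, the type found in the parquet file,
  // and the type declared in the table schema.
  static String describeMismatch(String fieldName, String declaredType, String parquetType) {
    return String.format(
        "Unexpected type %s for field '%s': the table schema declares it as %s",
        parquetType, fieldName, declaredType);
  }

  public static void main(String[] args) {
    // Mirrors the ticket's example: declared as bigint, stored as INT32.
    System.out.println(describeMismatch("ws_quantity", "BIGINT", "INT32"));
  }
}
```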
[jira] [Updated] (HUDI-7297) Exception thrown when field type mismatch is ambiguous
[ https://issues.apache.org/jira/browse/HUDI-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7297: - Fix Version/s: 1.0.0 > Exception thrown when field type mismatch is ambiguous > -- > > Key: HUDI-7297 > URL: https://issues.apache.org/jira/browse/HUDI-7297 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Yao Zhang >Assignee: Yao Zhang >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > > If you create a table with mismatched field types in Flink SQL, for example > you define a field as bigint while the actual field type is int, an > IllegalArgumentException is thrown, as shown below: > java.lang.IllegalArgumentException: Unexpected type: INT32 > The exception is far too ambiguous. It is difficult to figure out which field > type is incorrect and what the correct type is. You have to refer to the > source code. > Currently I plan to make the exception message more informative. -- This message was sent by Atlassian Jira (v8.20.10#820010)
(hudi) branch master updated (696911ed8c4 -> 0dee3b4a1a3)
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 696911ed8c4 [HUDI-7305] Fix cast exception for byte/short/float partitioned field (#10518) add 0dee3b4a1a3 [HUDI-7297] Fix ambiguous error message when field type defined in schema mismatches that in parquet file (#10497) No new revisions were added by this update. Summary of changes: .../table/format/cow/ParquetSplitReaderUtil.java | 48 ++ .../reader/ParquetColumnarRowSplitReader.java | 16 +--- .../table/format/cow/ParquetSplitReaderUtil.java | 48 ++ .../reader/ParquetColumnarRowSplitReader.java | 16 +--- .../table/format/cow/ParquetSplitReaderUtil.java | 48 ++ .../reader/ParquetColumnarRowSplitReader.java | 16 +--- .../table/format/cow/ParquetSplitReaderUtil.java | 48 ++ .../reader/ParquetColumnarRowSplitReader.java | 16 +--- .../table/format/cow/ParquetSplitReaderUtil.java | 48 ++ .../reader/ParquetColumnarRowSplitReader.java | 16 +--- 10 files changed, 205 insertions(+), 115 deletions(-)
Re: [PR] [HUDI-7297] Fix ambiguous error message when field type defined in sc… [hudi]
danny0405 merged PR #10497: URL: https://github.com/apache/hudi/pull/10497 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1899535613 ## CI report: * cefd96781f2f87b7af3a92e5c6334724f7aeb400 UNKNOWN * 453da96a0389b91f7eefc06f7f2ab30a9147fa15 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22047) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-7277] fix `hoodie.bulkinsert.shuffle.parallelism` not activated… [hudi]
KnightChess commented on PR #10532: URL: https://github.com/apache/hudi/pull/10532#issuecomment-1899532304 @hudi-bot run azure
Re: [PR] [HUDI-7277] fix `hoodie.bulkinsert.shuffle.parallelism` not activated… [hudi]
KnightChess closed pull request #10532: [HUDI-7277] fix `hoodie.bulkinsert.shuffle.parallelism` not activated… URL: https://github.com/apache/hudi/pull/10532
[jira] [Closed] (HUDI-7305) Fix cast exception while reading byte/short/float type of partitioned field
[ https://issues.apache.org/jira/browse/HUDI-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7305. Resolution: Fixed Fixed via master branch: 696911ed8c48bd74cd1a93322a4c1d39bba11a6c > Fix cast exception while reading byte/short/float type of partitioned field > --- > > Key: HUDI-7305 > URL: https://issues.apache.org/jira/browse/HUDI-7305 > Project: Apache Hudi > Issue Type: Bug >Reporter: Qijun Fu >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Fix cast exception while reading byte/short/float type of partitioned field
[jira] [Updated] (HUDI-7305) Fix cast exception while reading byte/short/float type of partitioned field
[ https://issues.apache.org/jira/browse/HUDI-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7305: - Fix Version/s: 1.0.0 > Fix cast exception while reading byte/short/float type of partitioned field > --- > > Key: HUDI-7305 > URL: https://issues.apache.org/jira/browse/HUDI-7305 > Project: Apache Hudi > Issue Type: Bug >Reporter: Qijun Fu >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Fix cast exception while reading byte/short/float type of partitioned field
(hudi) branch master updated: [HUDI-7305] Fix cast exception for byte/short/float partitioned field (#10518)
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 696911ed8c4 [HUDI-7305] Fix cast exception for byte/short/float partitioned field (#10518) 696911ed8c4 is described below commit 696911ed8c48bd74cd1a93322a4c1d39bba11a6c Author: stream2000 <18889897...@163.com> AuthorDate: Fri Jan 19 10:12:43 2024 +0800 [HUDI-7305] Fix cast exception for byte/short/float partitioned field (#10518) --- .../apache/spark/sql/hudi/TestInsertTable.scala| 37 ++ .../datasources/Spark3ParsePartitionUtil.scala | 10 +++--- 2 files changed, 43 insertions(+), 4 deletions(-) diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala index 044b6451cdf..05a04daf417 100644 --- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala +++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala @@ -2334,6 +2334,43 @@ class TestInsertTable extends HoodieSparkSqlTestBase { }) } + test("Test various data types as partition fields") { +withRecordType()(withTempDir { tmp => + val tableName = generateTableName + spark.sql( +s""" + |CREATE TABLE $tableName ( + | id INT, + | boolean_field BOOLEAN, + | float_field FLOAT, + | byte_field BYTE, + | short_field SHORT, + | decimal_field DECIMAL(10, 5), + | date_field DATE, + | string_field STRING, + | timestamp_field TIMESTAMP + |) USING hudi + | TBLPROPERTIES (primaryKey = 'id') + | PARTITIONED BY (boolean_field, float_field, byte_field, short_field, decimal_field, date_field, string_field, timestamp_field) + |LOCATION '${tmp.getCanonicalPath}' + """.stripMargin) + + // Insert data into partitioned table + spark.sql( +s""" + |INSERT INTO 
$tableName VALUES + |(1, TRUE, CAST(1.0 as FLOAT), 1, 1, 1234.56789, DATE '2021-01-05', 'partition1', TIMESTAMP '2021-01-05 10:00:00'), + |(2, FALSE,CAST(2.0 as FLOAT), 2, 2, 6789.12345, DATE '2021-01-06', 'partition2', TIMESTAMP '2021-01-06 11:00:00') + """.stripMargin) + + checkAnswer(s"SELECT id, boolean_field FROM $tableName ORDER BY id")( +Seq(1, true), +Seq(2, false) + ) +}) + } + + def ingestAndValidateDataDupPolicy(tableType: String, tableName: String, tmp: File, expectedOperationtype: WriteOperationType = WriteOperationType.INSERT, setOptions: List[String] = List.empty, diff --git a/hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParsePartitionUtil.scala b/hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParsePartitionUtil.scala index ebe92a5a32a..fca21d202a9 100644 --- a/hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParsePartitionUtil.scala +++ b/hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/execution/datasources/Spark3ParsePartitionUtil.scala @@ -20,7 +20,6 @@ package org.apache.spark.sql.execution.datasources import org.apache.hadoop.fs.Path import org.apache.hudi.common.util.PartitionPathEncodeUtils.DEFAULT_PARTITION_PATH import org.apache.hudi.spark3.internal.ReflectUtil -import org.apache.hudi.util.JFunction import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils.unescapePathName import org.apache.spark.sql.catalyst.expressions.{Cast, Literal} @@ -29,10 +28,9 @@ import org.apache.spark.sql.execution.datasources.PartitioningUtils.timestampPar import org.apache.spark.sql.types._ import org.apache.spark.unsafe.types.UTF8String -import java.lang.{Boolean => JBoolean, Double => JDouble, Long => JLong} +import java.lang.{Double => JDouble, Long => JLong} import java.math.{BigDecimal => JBigDecimal} 
import java.time.ZoneId -import java.util import java.util.concurrent.ConcurrentHashMap import java.util.{Locale, TimeZone} import scala.collection.convert.Wrappers.JConcurrentMapWrapper @@ -259,10 +257,12 @@ object Spark3ParsePartitionUtil extends SparkParsePartitionUtil { zoneId: ZoneId): Any = desiredType match { case _ if value == DEFAULT_PARTITION_PATH => null case NullType => null -case BooleanType => JBoolean.parseBoolean(value) case StringType => UTF8String.fromString(unescapePathName(value)) +case ByteType => Integer.parseInt(value).to
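The commit above is titled as a fix for a cast exception on byte, short, and float partition fields, and the diff is cut off before the new cases are fully visible. As a rough illustration of the general idea only: a partition value arrives as a path string and has to be parsed into the column's declared type, otherwise a later cast to that type can fail. `PartitionValueParser` and `FieldType` below are hypothetical and written in Java, not the Scala of `Spark3ParsePartitionUtil`.

```java
// Hypothetical sketch: illustrates type-directed parsing of partition path values.
class PartitionValueParser {

    // A small stand-in for the column types a partition field might declare.
    enum FieldType { BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING }

    // Parses the raw path segment into the boxed Java type matching the declared column type.
    static Object parse(String raw, FieldType type) {
        switch (type) {
            case BOOLEAN: return Boolean.parseBoolean(raw);
            case BYTE:    return Byte.parseByte(raw);
            case SHORT:   return Short.parseShort(raw);
            case INT:     return Integer.parseInt(raw);
            case LONG:    return Long.parseLong(raw);
            case FLOAT:   return Float.parseFloat(raw);
            case DOUBLE:  return Double.parseDouble(raw);
            case STRING:  return raw;
            default:
                throw new IllegalArgumentException("Unsupported partition type: " + type);
        }
    }

    public static void main(String[] args) {
        // Each value comes back in the declared type, so a later cast to that type succeeds.
        System.out.println(parse("1", FieldType.BYTE).getClass().getSimpleName());
        System.out.println(parse("2", FieldType.SHORT).getClass().getSimpleName());
        System.out.println(parse("1.0", FieldType.FLOAT).getClass().getSimpleName());
    }
}
```

Parsing by declared type, rather than inferring a wider type such as Integer or Double from the string, avoids handing a value of the wrong boxed class to code that expects a Byte, Short, or Float.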
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899530528 ## CI report: * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058) * ffe23c1e7fbeee0b371296a3caa55588ca86cc55 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22060) * 7e9f9a7371dabcd14c4b644c9bb399a28d99b77f UNKNOWN
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
hudi-bot commented on PR #10523: URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899530372 ## CI report: * 19894baa43bbd130a226c64ed02b6edc96d0f4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22034)
Re: [PR] [HUDI-7305] Fix cast exception for byte/short/float partitioned field [hudi]
danny0405 merged PR #10518: URL: https://github.com/apache/hudi/pull/10518
Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]
hudi-bot commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1899529365 ## CI report: * cefd96781f2f87b7af3a92e5c6334724f7aeb400 UNKNOWN * 453da96a0389b91f7eefc06f7f2ab30a9147fa15 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22047)
Re: [PR] [HUDI-7303] Fix date field type unexpectedly convert to Long when usi… [hudi]
danny0405 commented on code in PR #10517: URL: https://github.com/apache/hudi/pull/10517#discussion_r1458194926 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ExpressionPredicates.java: ## @@ -602,10 +602,10 @@ private static FilterPredicate toParquetPredicate(FunctionDefinition functionDef case TINYINT: case SMALLINT: case INTEGER: + case DATE: case TIME_WITHOUT_TIME_ZONE: return predicateSupportsLtGt(functionDefinition, intColumn(columnName), (Integer) literal); case BIGINT: - case DATE: case TIMESTAMP_WITHOUT_TIME_ZONE: Review Comment: Then why do we need this fix?
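For context on the hunk above: Flink represents a DATE value internally as an int counting days since 1970-01-01, and Parquet's DATE annotation sits on INT32, which is presumably why the PR moves `DATE` into the `(Integer) literal` branch. The sketch below uses only the JDK; `DateLiteralSketch` and its methods are hypothetical names, not part of Hudi.

```java
import java.time.LocalDate;

// Hypothetical sketch: shows why a DATE literal belongs with the int-backed types.
class DateLiteralSketch {

    // Converts a calendar date to days since the Unix epoch, the int form used for DATE.
    static int toEpochDays(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    // Reports whether a literal can be cast to Integer, as the INTEGER branch of the hunk does.
    static boolean castsToInteger(Object literal) {
        try {
            Integer ignored = (Integer) literal;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int days = toEpochDays(LocalDate.of(2021, 1, 5));
        System.out.println(days);                                   // 18632
        System.out.println(castsToInteger(Integer.valueOf(days)));  // true
        System.out.println(castsToInteger(Long.valueOf(days)));     // false
    }
}
```

If the date literal had already been widened to a Long before reaching this switch, the Integer cast would fail, which matches the symptom named in the PR title.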
Re: [PR] [HUDI-6217] Spark reads should skip record with delete operation metadata [hudi]
beyond1920 commented on PR #10219: URL: https://github.com/apache/hudi/pull/10219#issuecomment-1899527966 Hi, @xuzifu666 , sorry, I didn't understand. What's the question?
Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]
KnightChess commented on PR #9640: URL: https://github.com/apache/hudi/pull/9640#issuecomment-1899524618 @hudi-bot run azure
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899522328 ## CI report: * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058) * ffe23c1e7fbeee0b371296a3caa55588ca86cc55 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22060)
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
hudi-bot commented on PR #10523: URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899522207 ## CI report: * 19894baa43bbd130a226c64ed02b6edc96d0f4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22034)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899521823 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 4173edaec044dfb8e0691bebcddba2c7b8ec2d8d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22056)
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
kbuci commented on PR #10523: URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899516189 @hudi-bot run azure
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899482274 ## CI report: * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058) * ffe23c1e7fbeee0b371296a3caa55588ca86cc55 UNKNOWN
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899482215 ## CI report: * 9e644af9a9c0e4f2f38da4a4826d7923edf80f70 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22044) * 8b6316b18855ec9d4c2ea891abff52ad1629cd2a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22059)
Re: [PR] [HUDI-7311] Add literal type auto conversion before filter push down [hudi]
hudi-bot commented on PR #10531: URL: https://github.com/apache/hudi/pull/10531#issuecomment-1899475103 ## CI report: * 9e644af9a9c0e4f2f38da4a4826d7923edf80f70 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22044) * 8b6316b18855ec9d4c2ea891abff52ad1629cd2a UNKNOWN
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899475156 ## CI report: * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22058)
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899468410 ## CI report: * 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054) * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057) * df5625a0b3e3e742257bae44542e628b3546c78c UNKNOWN
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899429479 ## CI report: * 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054) * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22057)
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899420896 ## CI report: * 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054) * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055) * e4a1343c89890092c5e0cd5c880691ccee6a47bc UNKNOWN
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899373796 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 9eeb97c900f39bb9e701b8c8df059d42ee880017 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22053) * 4173edaec044dfb8e0691bebcddba2c7b8ec2d8d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22056)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899366071 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 9eeb97c900f39bb9e701b8c8df059d42ee880017 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22053) * 4173edaec044dfb8e0691bebcddba2c7b8ec2d8d UNKNOWN
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899358730 ## CI report: * 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054) * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22055)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899358193 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 02a0a6c8e2ad1d6789b6be73769a85f3339f03dd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22052) * 9eeb97c900f39bb9e701b8c8df059d42ee880017 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22053)
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534: URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899317082 ## CI report: * 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054) * 8ed4533f65fd470149f70f746d7ce5983c6f7f65 UNKNOWN
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330: URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899316522 ## CI report: * 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN * 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22051) * 02a0a6c8e2ad1d6789b6be73769a85f3339f03dd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22052) * 9eeb97c900f39bb9e701b8c8df059d42ee880017 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22053)
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534:
URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899308364

## CI report:

* 19ae7f342eba514f5aaff0e01470c723e776e638 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22054)
Re: [PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
hudi-bot commented on PR #10534:
URL: https://github.com/apache/hudi/pull/10534#issuecomment-1899299663

## CI report:

* 19ae7f342eba514f5aaff0e01470c723e776e638 UNKNOWN
[PR] [HUDI-6902] Take a dump at hadoop-mr-java-client module [hudi]
linliu-code opened a new pull request, #10534:
URL: https://github.com/apache/hudi/pull/10534

### Change Logs

To discover the culprit of the orphan processes.

### Impact

Find the bad guy.

### Risk level (write none, low medium or high below)

None.

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
[jira] [Created] (HUDI-7313) ZookeeperBasedLockProvider should store application info in lock node
Krishen Bhan created HUDI-7313:

Summary: ZookeeperBasedLockProvider should store application info in lock node
Key: HUDI-7313
URL: https://issues.apache.org/jira/browse/HUDI-7313
Project: Apache Hudi
Issue Type: Improvement
Components: metrics, multi-writer
Reporter: Krishen Bhan

Currently, when ZookeeperBasedLockProvider acquires a lock, it does not provide information about the lock holder via InterProcessMutex#getLockNodeBytes (https://curator.apache.org/apidocs/org/apache/curator/framework/recipes/locks/InterProcessMutex.html#getLockNodeBytes--), which can be used to store info about the application that acquired the lock. Updating Hudi to implement this API would help users easily identify the application that acquired the ZooKeeper lock when they use ZooKeeper tooling (such as zkcli).
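As a rough illustration of the proposal above, the following sketch shows one way the lock-node payload could be built and read back. It is not taken from the Hudi code base: the class name `LockNodeInfo`, the `appId`/`host` fields, and the `key=value` encoding are all hypothetical. It assumes that a subclass of Curator's `InterProcessMutex` would return such a byte array from an overridden `getLockNodeBytes()`; the Curator wiring is left out so the example runs with only the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hudi or Curator.
class LockNodeInfo {

    // Builds the bytes that a subclass of InterProcessMutex could return from
    // getLockNodeBytes(). The field names and format are assumptions.
    public static byte[] buildLockNodeBytes(String appId, String host) {
        String payload = "appId=" + appId + ";host=" + host;
        return payload.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the payload the way an operator-facing tool might display it.
    public static String describe(byte[] lockNodeBytes) {
        if (lockNodeBytes == null) {
            return "<no lock holder info>";
        }
        return new String(lockNodeBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = buildLockNodeBytes("spark-app-123", "writer-host-1");
        System.out.println(describe(data));
    }
}
```

A plain-text payload is used here only so that the node content stays readable when inspected with ZooKeeper tooling; any other encoding would work equally well for this sketch.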
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330:
URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899218088

## CI report:

* 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN
* 421f1cc41a22c8a8ed93d4888329f2864c8c3bf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22040)
* 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22051)
* 02a0a6c8e2ad1d6789b6be73769a85f3339f03dd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22052)
* 9eeb97c900f39bb9e701b8c8df059d42ee880017 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22053)
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
hudi-bot commented on PR #10523:
URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899208141

## CI report:

* 19894baa43bbd130a226c64ed02b6edc96d0f4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22034)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330:
URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899207507

## CI report:

* 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN
* 421f1cc41a22c8a8ed93d4888329f2864c8c3bf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22040)
* 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22051)
* 02a0a6c8e2ad1d6789b6be73769a85f3339f03dd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22052)
* 9eeb97c900f39bb9e701b8c8df059d42ee880017 UNKNOWN
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
hudi-bot commented on PR #10523:
URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899197612

## CI report:

* 19894baa43bbd130a226c64ed02b6edc96d0f4f8 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22034)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330:
URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899197027

## CI report:

* 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN
* 421f1cc41a22c8a8ed93d4888329f2864c8c3bf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22040)
* 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22051)
* 02a0a6c8e2ad1d6789b6be73769a85f3339f03dd UNKNOWN
Re: [PR] [HUDI-7308] LockManager::unlock should not call updateLockHeldTimerMetrics if lockDurationTimer has not been started [hudi]
kbuci commented on PR #10523:
URL: https://github.com/apache/hudi/pull/10523#issuecomment-1899190388

@hudi-bot run azure
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330:
URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899145175

## CI report:

* 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN
* 421f1cc41a22c8a8ed93d4888329f2864c8c3bf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22040)
* 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22051)
Re: [PR] [HUDI-7218] Integrate new HFile reader with file reader factory [hudi]
hudi-bot commented on PR #10330:
URL: https://github.com/apache/hudi/pull/10330#issuecomment-1899135210

## CI report:

* 3bc5c381966197e816eba184466d337961c5d6e3 UNKNOWN
* 421f1cc41a22c8a8ed93d4888329f2864c8c3bf9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22040)
* 069e933b4bfdbdc9fba2f506c6c9ba8019ecfd4f UNKNOWN
(hudi) branch master updated (48ce342a19b -> 9111d430a1c)
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

    from 48ce342a19b [HUDI-7170] Implement HFile reader independent of HBase (#10241)
     add 9111d430a1c [HUDI-6902] Fix a unit test (#10513)

No new revisions were added by this update.

Summary of changes:
 .../hudi/utilities/sources/TestGcsEventsSource.java | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)
(hudi) branch master updated (9111d430a1c -> 407366447de)
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

    from 9111d430a1c [HUDI-6902] Fix a unit test (#10513)
     add 407366447de [HUDI-6902] Shutdown metric hooks properly (#10520)

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/hudi/DefaultSource.scala | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)
Re: [PR] [HUDI-6902] Shutdown metric hooks properly [hudi]
nsivabalan merged PR #10520:
URL: https://github.com/apache/hudi/pull/10520
Re: [PR] [HUDI-6902] Fix a unit test [hudi]
nsivabalan merged PR #10513:
URL: https://github.com/apache/hudi/pull/10513