[GitHub] [hudi] hudi-bot commented on pull request #8237: [HUDI-5958] Improve ResolvedSchema Instead of TableSchema.

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8237: URL: https://github.com/apache/hudi/pull/8237#issuecomment-1479010965 ## CI report: * d3ce6b5a9b0633f286acc9cdc4718eaac88c129f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] hudi-bot commented on pull request #7826: [HUDI-5675] fix lazy clean schedule rollback on completed instant

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7826: URL: https://github.com/apache/hudi/pull/7826#issuecomment-1479010116 ## CI report: * ed708d668d1521a10b4baa62d5c79aaefdd326f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] LiJie20190102 commented on issue #8257: [SUPPORT]HoodieDeltaStreamer (0.13.0 ),FileSystem is null,resulting in a NullPointerException

2023-03-21 Thread via GitHub
LiJie20190102 commented on issue #8257: URL: https://github.com/apache/hudi/issues/8257#issuecomment-1478986022 The most direct exception tells me that the FileSystem is null, and I don't know if this is the root cause -- This is an automated message from the Apache Git Service. To respon

[GitHub] [hudi] codope commented on issue #8240: [SUPPORT] dynamic catalog

2023-03-21 Thread via GitHub
codope commented on issue #8240: URL: https://github.com/apache/hudi/issues/8240#issuecomment-1478983884 @linfey90 Can you elaborate on your use case for the dynamic catalog? Also, what kind of class hierarchy and refactoring are you proposing? Let's discuss in this thread. -- This is an au

[GitHub] [hudi] LiJie20190102 commented on issue #8257: [SUPPORT]HoodieDeltaStreamer (0.13.0 ),FileSystem is null,resulting in a NullPointerException

2023-03-21 Thread via GitHub
LiJie20190102 commented on issue #8257: URL: https://github.com/apache/hudi/issues/8257#issuecomment-1478983619 > I recall raising this point in a PR https://github.com/apache/hudi/pull/6016/files#r931695710 - this was added in Hudi 0.12.1 and later versions. @LiJie20190102 I think we shoul

[GitHub] [hudi] mezhangremoterepository opened a new pull request, #8266: Update flink-quick-start-guide.md

2023-03-21 Thread via GitHub
mezhangremoterepository opened a new pull request, #8266: URL: https://github.com/apache/hudi/pull/8266 Add '<' ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing fea

[GitHub] [hudi] codope commented on issue #8221: [SUPPORT] Can we use "Amazon S3 Multi-Region Access Points" with Hudi ?

2023-03-21 Thread via GitHub
codope commented on issue #8221: URL: https://github.com/apache/hudi/issues/8221#issuecomment-1478981426 @nikspatel03 Hudi supports savepoint/restore for DR scenarios. However, I guess your DR use case is to backup in a separate region altogether so that all AZs failure of a region does not

[GitHub] [hudi] codope closed issue #8215: [SUPPORT] spark-shell cannot obtain the latest data

2023-03-21 Thread via GitHub
codope closed issue #8215: [SUPPORT] spark-shell cannot obtain the latest data URL: https://github.com/apache/hudi/issues/8215 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[GitHub] [hudi] codope commented on issue #8215: [SUPPORT] spark-shell cannot obtain the latest data

2023-03-21 Thread via GitHub
codope commented on issue #8215: URL: https://github.com/apache/hudi/issues/8215#issuecomment-1478976949 @LiJie20190102 Closing this issue. I have added a comment on the issue. Most likely that's a bug. -- This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [hudi] codope commented on issue #8257: [SUPPORT]HoodieDeltaStreamer (0.13.0 ),FileSystem is null,resulting in a NullPointerException

2023-03-21 Thread via GitHub
codope commented on issue #8257: URL: https://github.com/apache/hudi/issues/8257#issuecomment-1478976633 I recall raising this point in a PR https://github.com/apache/hudi/pull/6016/files#r931695710 - this was added in Hudi 0.12.1 and later versions. @LiJie20190102 I think we should fix

[GitHub] [hudi] hudi-bot commented on pull request #8237: [HUDI-5958] Improve ResolvedSchema Instead of TableSchema.

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8237: URL: https://github.com/apache/hudi/pull/8237#issuecomment-1478976159 ## CI report: * d3ce6b5a9b0633f286acc9cdc4718eaac88c129f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] hudi-bot commented on pull request #7826: [HUDI-5675] fix lazy clean schedule rollback on completed instant

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7826: URL: https://github.com/apache/hudi/pull/7826#issuecomment-1478975524 ## CI report: * ed708d668d1521a10b4baa62d5c79aaefdd326f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[jira] [Updated] (HUDI-5971) About full schema evolution, alter column type do not support nest column but can alter inside struct type

2023-03-21 Thread Alberic Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alberic Liu updated HUDI-5971: -- Description: reproduce steps: * create the test table: CREATE TABLE default.schema_evolution_test (  

[jira] [Updated] (HUDI-5971) About full schema evolution, alter column type do not support nest column but can alter inside struct type

2023-03-21 Thread Alberic Liu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alberic Liu updated HUDI-5971: -- Description: reproduce steps: * create the test table: CREATE TABLE default.schema_evolution_test (  

[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show | update hudi's table properties

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1478970842 ## CI report: * 41d7a2a249bf0c75491790a7aafc95761e9d49d4 UNKNOWN * 89868d4c16c65b4d118c4b061b1db8232e5abc34 UNKNOWN * 2d7c3e1446936cd68d33f503fa2bee18ff7cccf8 Azure: [SUCCES

[GitHub] [hudi] hudi-bot commented on pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8102: URL: https://github.com/apache/hudi/pull/8102#issuecomment-1478966063 ## CI report: * 3ad4f77d51bc9a280161865028f89f362fcdb754 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1584

[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show | update hudi's table properties

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1478965473 ## CI report: * 41d7a2a249bf0c75491790a7aafc95761e9d49d4 UNKNOWN * 89868d4c16c65b4d118c4b061b1db8232e5abc34 UNKNOWN * 0c5dea833612c53880831dccb652494f1a00c192 Azure: [SUCCES

[jira] [Created] (HUDI-5971) About full schema evolution, alter column type do not support nest column but can alter inside struct type

2023-03-21 Thread Alberic Liu (Jira)
Alberic Liu created HUDI-5971: - Summary: About full schema evolution, alter column type do not support nest column but can alter inside struct type Key: HUDI-5971 URL: https://issues.apache.org/jira/browse/HUDI-5971

[GitHub] [hudi] bvaradar commented on pull request #7680: [HUDI-5548] spark sql show | update hudi's table properties

2023-03-21 Thread via GitHub
bvaradar commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1478961607 @XuQianJin-Stars: Can you look at the failing CI test? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [hudi] hudi-bot commented on pull request #8263: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8263: URL: https://github.com/apache/hudi/pull/8263#issuecomment-1478960484 ## CI report: * 6aa7bc9a0209cef639ae47d9c7814851dc8c06cb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] codope closed issue #8213: [SUPPORT] Error while setting OCC in spark structured streaming

2023-03-21 Thread via GitHub
codope closed issue #8213: [SUPPORT] Error while setting OCC in spark structured streaming URL: https://github.com/apache/hudi/issues/8213 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [hudi] codope commented on issue #8213: [SUPPORT] Error while setting OCC in spark structured streaming

2023-03-21 Thread via GitHub
codope commented on issue #8213: URL: https://github.com/apache/hudi/issues/8213#issuecomment-1478959529 @haripriyarhp the suggestion above should work for you. Please reopen in case you're still facing some issue. -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [hudi] nfarah86 commented on a diff in pull request #8093: [HUDI-5886][DOCS] Improve File Sizing, Timeline, and Flink docs

2023-03-21 Thread via GitHub
nfarah86 commented on code in PR #8093: URL: https://github.com/apache/hudi/pull/8093#discussion_r1144235771 ## website/docs/flink_configuration.md: ## @@ -3,115 +3,177 @@ title: Flink Setup toc: true --- -## Global Configurations -When using Flink, you can set some global c

[GitHub] [hudi] 1032851561 commented on issue #8087: [SUPPORT] split_reader don't checkpoint before consuming all splits

2023-03-21 Thread via GitHub
1032851561 commented on issue #8087: URL: https://github.com/apache/hudi/issues/8087#issuecomment-1478908186 > So it's the `StreamReadMonitoringFunction` that blocks the checkpoint barrier while distributing the input splits, not the split_reader No, `StreamReadMonitoringFunction` co

[GitHub] [hudi] danny0405 commented on a diff in pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8102: URL: https://github.com/apache/hudi/pull/8102#discussion_r1144213813 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/ExpressionUtils.java: ## @@ -177,4 +183,119 @@ public static Object getValueFromLiteral(ValueLite

[GitHub] [hudi] danny0405 commented on a diff in pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8102: URL: https://github.com/apache/hudi/pull/8102#discussion_r1144209473 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java: ## @@ -318,6 +318,13 @@ private FlinkOptions() { + "1)

[GitHub] [hudi] hudi-bot commented on pull request #8264: [HUDI-5970] add check to merge into update actions

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8264: URL: https://github.com/apache/hudi/pull/8264#issuecomment-1478895334 ## CI report: * 802d182c179e183ccb06325d67c6121f447815b2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1584

[GitHub] [hudi] hudi-bot commented on pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8102: URL: https://github.com/apache/hudi/pull/8102#issuecomment-1478895092 ## CI report: * e2ceb307219b4e27b276ab986c47bf77a1ec2d25 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1567

[GitHub] [hudi] danny0405 commented on a diff in pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8102: URL: https://github.com/apache/hudi/pull/8102#discussion_r1144209473 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java: ## @@ -318,6 +318,13 @@ private FlinkOptions() { + "1)

[GitHub] [hudi] hudi-bot commented on pull request #8264: [HUDI-5970] add check to merge into update actions

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8264: URL: https://github.com/apache/hudi/pull/8264#issuecomment-1478890949 ## CI report: * 802d182c179e183ccb06325d67c6121f447815b2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8102: URL: https://github.com/apache/hudi/pull/8102#issuecomment-1478890643 ## CI report: * e2ceb307219b4e27b276ab986c47bf77a1ec2d25 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1567

[GitHub] [hudi] huyuanfeng2018 commented on issue #8265: [SUPPORT] Flink Table planner not loading problem

2023-03-21 Thread via GitHub
huyuanfeng2018 commented on issue #8265: URL: https://github.com/apache/hudi/issues/8265#issuecomment-1478882171 cc @danny0405 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [hudi] huyuanfeng2018 opened a new issue, #8265: [SUPPORT] Flink Table planner not loading problem

2023-03-21 Thread via GitHub
huyuanfeng2018 opened a new issue, #8265: URL: https://github.com/apache/hudi/issues/8265 **Describe the problem you faced** Version information: hudi-0.13.0 flink1.16 flink sql( HUDI CREATE DDL): ``` CREATE TABLE `ods_action_log_huya_hudi_nopro_test` ( `stime` VARCHAR PRIMARY

[jira] [Updated] (HUDI-5970) add check to merge into update actions to prevent data quality problems

2023-03-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5970: - Labels: pull-request-available (was: ) > add check to merge into update actions to prevent data q

[GitHub] [hudi] KnightChess opened a new pull request, #8264: [HUDI-5970] add check to merge into update actions

2023-03-21 Thread via GitHub
KnightChess opened a new pull request, #8264: URL: https://github.com/apache/hudi/pull/8264 ### Change Logs https://github.com/apache/hudi/pull/8133#pullrequestreview-1334095535 As noted in the PR comment, removing the merge-into update action check will cause data quality problems if source table wi

[GitHub] [hudi] danny0405 commented on pull request #7159: [HUDI-5173] Skip if there is only one file in clusteringGroup

2023-03-21 Thread via GitHub
danny0405 commented on PR #7159: URL: https://github.com/apache/hudi/pull/7159#issuecomment-1478872465 You need to configure the clustering strategy to avoid this issue. Maybe your clustering strategy is a little aggressive. -- This is an automated message from the Apache Git Service. To r

[GitHub] [hudi] danny0405 commented on a diff in pull request #8128: [HUDI-5782] Tweak defaults and remove unnecessary configs after config review

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8128: URL: https://github.com/apache/hudi/pull/8128#discussion_r1144191740 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -288,9 +288,6 @@ private HoodieWriteConfig createMet

[jira] [Created] (HUDI-5970) add check to merge into update actions to prevent data quality problems

2023-03-21 Thread KnightChess (Jira)
KnightChess created HUDI-5970: - Summary: add check to merge into update actions to prevent data quality problems Key: HUDI-5970 URL: https://issues.apache.org/jira/browse/HUDI-5970 Project: Apache Hudi

[GitHub] [hudi] danny0405 commented on pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
danny0405 commented on PR #8029: URL: https://github.com/apache/hudi/pull/8029#issuecomment-1478866831 > Another question, what if some one just use hudi-common-*.jar? This is a good question; unfortunately, we have no good way to solve it unless we publish our own shaded HBase jars, i

[GitHub] [hudi] danny0405 commented on a diff in pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8029: URL: https://github.com/apache/hudi/pull/8029#discussion_r1144167793 ## hudi-common/src/main/resources/hbase-site.xml: ## @@ -158,7 +158,7 @@ possible configurations would overwhelm and obscure the important. hbase.master.logcle

[GitHub] [hudi] vinothchandar commented on a diff in pull request #8093: [HUDI-5886][DOCS] Improve File Sizing, Timeline, and Flink docs

2023-03-21 Thread via GitHub
vinothchandar commented on code in PR #8093: URL: https://github.com/apache/hudi/pull/8093#discussion_r1144186146 ## website/docs/timeline.md: ## @@ -3,40 +3,386 @@ title: Timeline toc: true --- -## Timeline -At its core, Hudi maintains a `timeline` of all actions performed

[GitHub] [hudi] hudi-bot commented on pull request #8263: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8263: URL: https://github.com/apache/hudi/pull/8263#issuecomment-1478860387 ## CI report: * 6aa7bc9a0209cef639ae47d9c7814851dc8c06cb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] xicm commented on pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
xicm commented on PR #8029: URL: https://github.com/apache/hudi/pull/8029#issuecomment-1478859087 > @danny0405 Thanks for reviewing this. Let me give more context here. > > I ran into a ClassNotFound exception `org.apache.hadoop.hbase.master.ClusterStatusPublisher$Multic

[GitHub] [hudi] hudi-bot commented on pull request #8263: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8263: URL: https://github.com/apache/hudi/pull/8263#issuecomment-1478855450 ## CI report: * 6aa7bc9a0209cef639ae47d9c7814851dc8c06cb UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] voonhous commented on pull request #8263: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
voonhous commented on PR #8263: URL: https://github.com/apache/hudi/pull/8263#issuecomment-1478849811 @danny0405 Re-created the PR as https://github.com/apache/hudi/pull/7997 is unable to trigger CI. -- This is an automated message from the Apache Git Service. To respond to the messag

[GitHub] [hudi] voonhous commented on pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
voonhous commented on PR #7997: URL: https://github.com/apache/hudi/pull/7997#issuecomment-1478849285 > @voonhous The CI can not be triggered, can you fire another PR instead. https://github.com/apache/hudi/pull/8263 -- This is an automated message from the Apache Git Service. To re

[GitHub] [hudi] voonhous opened a new pull request, #8263: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
voonhous opened a new pull request, #8263: URL: https://github.com/apache/hudi/pull/8263 ...ndle Applying the fix from #5185 will fix write issues for MOR tables, but will cause write issues for COW tables. More information on how to reproduce the COW error in this jira issue:

[GitHub] [hudi] voonhous closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa...

2023-03-21 Thread via GitHub
voonhous closed pull request #7997: [HUDI-5822] Fix FileId not found exception when FileId is passed to HoodieMergeHa... URL: https://github.com/apache/hudi/pull/7997 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [hudi] danny0405 commented on a diff in pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
danny0405 commented on code in PR #8029: URL: https://github.com/apache/hudi/pull/8029#discussion_r1144167793 ## hudi-common/src/main/resources/hbase-site.xml: ## @@ -158,7 +158,7 @@ possible configurations would overwhelm and obscure the important. hbase.master.logcle

[GitHub] [hudi] vinothchandar commented on pull request #8088: [HUDI-5873] The pending compactions of dataset table should not block…

2023-03-21 Thread via GitHub
vinothchandar commented on PR #8088: URL: https://github.com/apache/hudi/pull/8088#issuecomment-1478834804 https://issues.apache.org/jira/browse/HUDI-2458 I read through this JIRA, and it seems fairly old. @prashantwason do you have thoughts here? @nsivabalan what's your take on the updated

[hudi] branch master updated (749a93ba269 -> ff921568226)

2023-03-21 Thread forwardxu
This is an automated email from the ASF dual-hosted git repository. forwardxu pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 749a93ba269 [HUDI-5781] Refactor SQL transformer configs to use HoodieConfig and ConfigProperty (#8155) add ff9

[GitHub] [hudi] XuQianJin-Stars merged pull request #8251: [HUDI-5964] fix error info not show in CreateHoodieTableCommand

2023-03-21 Thread via GitHub
XuQianJin-Stars merged PR #8251: URL: https://github.com/apache/hudi/pull/8251 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.

[GitHub] [hudi] 1032851561 commented on issue #8087: [SUPPORT] split_reader don't checkpoint before consuming all splits

2023-03-21 Thread via GitHub
1032851561 commented on issue #8087: URL: https://github.com/apache/hudi/issues/8087#issuecomment-1478790737 > Thanks, here the question is: why the checkpoint barrier can not cham in earlier? Like in a time point within the range time1 ~ time6 > Thanks, here the question is:

[GitHub] [hudi] ehurheap commented on issue #8209: [SUPPORT] auto_clean stopped running during ingest

2023-03-21 Thread via GitHub
ehurheap commented on issue #8209: URL: https://github.com/apache/hudi/issues/8209#issuecomment-1478788893 I have attempted to run the cleaner as a separate step from the ingestion. The ingestion is now configured with ``` hoodie.clean.automatic -> false hoodie.archive.automatic ->

[GitHub] [hudi] vinothchandar commented on a diff in pull request #8107: [HUDI-5514] Adding auto generation of record keys support to Hudi/Spark

2023-03-21 Thread via GitHub
vinothchandar commented on code in PR #8107: URL: https://github.com/apache/hudi/pull/8107#discussion_r1144114597 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/NonpartitionedAvroKeyGenerator.java: ## @@ -36,8 +39,9 @@ public class NonpartitionedAvroKeyGe

[GitHub] [hudi] yihua commented on pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
yihua commented on PR #8029: URL: https://github.com/apache/hudi/pull/8029#issuecomment-1478763497 @stayrascal could you check the CI failure? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] yihua commented on pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
yihua commented on PR #8029: URL: https://github.com/apache/hudi/pull/8029#issuecomment-1478762598 > @danny0405 Thanks for reviewing this. Let me give more context here. > > I ran into a ClassNotFound exception `org.apache.hadoop.hbase.master.ClusterStatusPublisher$MulticastPublishe

[GitHub] [hudi] yihua commented on a diff in pull request #8029: [HUDI-5832] add relocated prefix for hbase classes in hbase-site.xml

2023-03-21 Thread via GitHub
yihua commented on code in PR #8029: URL: https://github.com/apache/hudi/pull/8029#discussion_r1144112156 ## hudi-common/src/main/resources/hbase-site.xml: ## @@ -158,7 +158,7 @@ possible configurations would overwhelm and obscure the important. hbase.master.logcleaner

[hudi] branch master updated: [HUDI-5781] Refactor SQL transformer configs to use HoodieConfig and ConfigProperty (#8155)

2023-03-21 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 749a93ba269 [HUDI-5781] Refactor SQL transformer co

[GitHub] [hudi] yihua merged pull request #8155: [HUDI-5781] Refactor SQL transformer configs to use HoodieConfig and ConfigProperty

2023-03-21 Thread via GitHub
yihua merged PR #8155: URL: https://github.com/apache/hudi/pull/8155 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

[GitHub] [hudi] yihua commented on pull request #8155: [HUDI-5781] Refactor SQL transformer configs to use HoodieConfig and ConfigProperty

2023-03-21 Thread via GitHub
yihua commented on PR #8155: URL: https://github.com/apache/hudi/pull/8155#issuecomment-1478753240 CI passes. https://user-images.githubusercontent.com/2497195/226770006-d0b83f46-5ee9-4830-b465-abf5b0a2914b.png -- This is an automated message from the Apache Git Service. To respo

[GitHub] [hudi] bhasudha opened a new pull request, #8262: [DOCS][MINOR] Fix community sync schedule

2023-03-21 Thread via GitHub
bhasudha opened a new pull request, #8262: URL: https://github.com/apache/hudi/pull/8262 - Add link for Oracle cloud in storage configurations side nav ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _De

[GitHub] [hudi] hudi-bot commented on pull request #8128: [HUDI-5782] Tweak defaults and remove unnecessary configs after config review

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8128: URL: https://github.com/apache/hudi/pull/8128#issuecomment-1478651197 ## CI report: * 894861b03430217482771663639c9e413b0dca3b UNKNOWN * 2bfe86334532ea18748adbb7ff7481fa146b81fd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] kazdy commented on issue #5262: [SUPPORT] Deltastreamer Error upserting bucketType UPDATE for partition :0

2023-03-21 Thread via GitHub
kazdy commented on issue #5262: URL: https://github.com/apache/hudi/issues/5262#issuecomment-1478644822 @stym06 were you running clustering in deltastreamer when this was happening? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

[GitHub] [hudi] soumilshah1995 commented on issue #8260: [SUPPORT] How to implement incremental join

2023-03-21 Thread via GitHub
soumilshah1995 commented on issue #8260: URL: https://github.com/apache/hudi/issues/8260#issuecomment-1478618985 please follow my code snippets :D -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] soumilshah1995 commented on issue #8260: [SUPPORT] How to implement incremental join

2023-03-21 Thread via GitHub
soumilshah1995 commented on issue #8260: URL: https://github.com/apache/hudi/issues/8260#issuecomment-1478616922 Yes, I have done streaming ETL with Flink ![image](https://user-images.githubusercontent.com/39345855/226747262-c187110d-bf9e-40a6-a981-063ad86b6a57.png) REPO https://gi

[GitHub] [hudi] kazdy commented on issue #8260: [SUPPORT] How to implement incremental join

2023-03-21 Thread via GitHub
kazdy commented on issue #8260: URL: https://github.com/apache/hudi/issues/8260#issuecomment-1478615373 @soumilshah1995 I think you did some work on incremental joins with hudi and flink? -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [hudi] selvarajperiyasamy commented on issue #8186: upgrade from 0.5.0 to 0.13.0

2023-03-21 Thread via GitHub
selvarajperiyasamy commented on issue #8186: URL: https://github.com/apache/hudi/issues/8186#issuecomment-1478588840 Folks, could someone shed some light here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [hudi] hudi-bot commented on pull request #7826: [HUDI-5675] fix lazy clean schedule rollback on completed instant

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7826: URL: https://github.com/apache/hudi/pull/7826#issuecomment-1478563821 ## CI report: * ed708d668d1521a10b4baa62d5c79aaefdd326f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] HEPBO3AH commented on pull request #7159: [HUDI-5173] Skip if there is only one file in clusteringGroup

2023-03-21 Thread via GitHub
HEPBO3AH commented on PR #7159: URL: https://github.com/apache/hudi/pull/7159#issuecomment-1478532186 Is it possible for you to reconsider this? Even though this change is technically not a bug fix in terms of data integrity, it's adding an overhead by doing unnecessary operations and rewri

[GitHub] [hudi] hudi-bot commented on pull request #8128: [HUDI-5782] Tweak defaults and remove unnecessary configs after config review

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8128: URL: https://github.com/apache/hudi/pull/8128#issuecomment-1478522849 ## CI report: * 894861b03430217482771663639c9e413b0dca3b UNKNOWN * ae2e733408b7375dea2c9ee9d0e8f9acf0c82b44 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Updated] (HUDI-5967) Add partition ordering for full table scans

2023-03-21 Thread Alex Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Guo updated HUDI-5967: --- Description: I am running a streaming read query on an hourly partitioned COW table with the following settin

[jira] [Updated] (HUDI-5967) Add partition ordering for full table scans

2023-03-21 Thread Alex Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Guo updated HUDI-5967: --- Description: I am running a streaming read query on an hourly partitioned COW table with the following settin

[GitHub] [hudi] hudi-bot commented on pull request #8128: [HUDI-5782] Tweak defaults and remove unnecessary configs after config review

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8128: URL: https://github.com/apache/hudi/pull/8128#issuecomment-1478513128 ## CI report: * 894861b03430217482771663639c9e413b0dca3b UNKNOWN * ae2e733408b7375dea2c9ee9d0e8f9acf0c82b44 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] kazdy commented on issue #8259: [SUPPORT] Clustering created files with modified schema, corrupted table

2023-03-21 Thread via GitHub
kazdy commented on issue #8259: URL: https://github.com/apache/hudi/issues/8259#issuecomment-1478507792 Another thing I noticed is that clustering created a file that's around 900 MB. Is this expected? I'm using defaults all the way when it comes to file sizing. -- This is an automated me

[GitHub] [hudi] hudi-bot commented on pull request #8251: [HUDI-5964] fix error info not show in CreateHoodieTableCommand

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8251: URL: https://github.com/apache/hudi/pull/8251#issuecomment-1478498386 ## CI report: * 51eb9add29cae638ea2242cf2c6f5072f351 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1582

[GitHub] [hudi] soumilshah1995 commented on issue #8207: [SUPPORT] Hudi 0.13 Consistent Hashing Issue for MOR Tables

2023-03-21 Thread via GitHub
soumilshah1995 commented on issue #8207: URL: https://github.com/apache/hudi/issues/8207#issuecomment-1478470507 Here is Video and Exercise file https://soumilshah1995.blogspot.com/2023/03/topic-consistent-hashing-rfc-42.html https://www.youtube.com/watch?v=zN8JOBKXxP0 -

[GitHub] [hudi] AbhijeetSachdev1 commented on issue #8210: [SUPPORT] Hive sync failing with "Invalid default for field operationType" while migrating form Hudi 0.8 to 0.12

2023-03-21 Thread via GitHub
AbhijeetSachdev1 commented on issue #8210: URL: https://github.com/apache/hudi/issues/8210#issuecomment-1478453315 The problem lies in the archive files. When I tried to merge them, I got the same exception. Do we have any way of cleaning archive files? I cannot find any

[GitHub] [hudi] kazdy commented on issue #8259: [SUPPORT] ParquetDecodingException: Can not read value at 0 in block -1 in file after switching to async services.

2023-03-21 Thread via GitHub
kazdy commented on issue #8259: URL: https://github.com/apache/hudi/issues/8259#issuecomment-1478402727 I inspected all files that were source files for clustering and all have the same schema. The file created as result of clustering has changed schema, column named "year" was moved fro

[GitHub] [hudi] gamblewin commented on issue #8260: [SUPPORT] How to implement incremental join

2023-03-21 Thread via GitHub
gamblewin commented on issue #8260: URL: https://github.com/apache/hudi/issues/8260#issuecomment-1478377153 I find that all of the insert operations I did are one-time operations in Flink, which means that after inserting data the task ends, so the final join table `hudi_pat_disease` will not

[GitHub] [hudi] hudi-bot commented on pull request #8237: [HUDI-5958] Improve ResolvedSchema Instead of TableSchema.

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8237: URL: https://github.com/apache/hudi/pull/8237#issuecomment-1478345411 ## CI report: * d3ce6b5a9b0633f286acc9cdc4718eaac88c129f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] hudi-bot commented on pull request #8225: [DO NOT REVIEW] [DO NOT MERGE] Bootstrap Performance

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8225: URL: https://github.com/apache/hudi/pull/8225#issuecomment-1478345256 ## CI report: * 0412d5eaa4deb05c7ff0569eea5fb4af2263fb9f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] codope commented on a diff in pull request #8107: [HUDI-5514] Adding auto generation of record keys support to Hudi

2023-03-21 Thread via GitHub
codope commented on code in PR #8107: URL: https://github.com/apache/hudi/pull/8107#discussion_r1143727808 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -1116,31 +1124,47 @@ object HoodieSparkSqlWriter { So

[GitHub] [hudi] hudi-bot commented on pull request #7826: [HUDI-5675] fix lazy clean schedule rollback on completed instant

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7826: URL: https://github.com/apache/hudi/pull/7826#issuecomment-1478286948 ## CI report: * 6832b7787f8363cb09e672d8675d8bc68bad3a2c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1580

[GitHub] [hudi] AbhijeetSachdev1 commented on issue #8210: [SUPPORT] Hive sync failing with "Invalid default for field operationType" while migrating form Hudi 0.8 to 0.12

2023-03-21 Thread via GitHub
AbhijeetSachdev1 commented on issue #8210: URL: https://github.com/apache/hudi/issues/8210#issuecomment-1478279970 Hi @ad1happy2go, thanks for looking into this. For us the table upsert was also successful (0.8 -> 0.12); it was failing during hive_sync. We tried this for 3 tables and it worked

[jira] [Created] (HUDI-5969) Precombine field is not required for metadata only bootstrap

2023-03-21 Thread Jonathan Vexler (Jira)
Jonathan Vexler created HUDI-5969: - Summary: Precombine field is not required for metadata only bootstrap Key: HUDI-5969 URL: https://issues.apache.org/jira/browse/HUDI-5969 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #7826: [HUDI-5675] fix lazy clean schedule rollback on completed instant

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7826: URL: https://github.com/apache/hudi/pull/7826#issuecomment-1478268519 ## CI report: * 6832b7787f8363cb09e672d8675d8bc68bad3a2c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1580

[GitHub] [hudi] kazdy commented on issue #8261: [SUPPORT] How to reduce hoodie commit latency

2023-03-21 Thread via GitHub
kazdy commented on issue #8261: URL: https://github.com/apache/hudi/issues/8261#issuecomment-1478254161 In 0.12.1 there was a bug related to hive sync. I also observed increasing processing time, which was mostly spent on reading all files under the .hoodie/archived directory. You can disable

[GitHub] [hudi] hudi-bot commented on pull request #7999: [MINOR] Remove duplicated WriteOperationType.INSERT.value

2023-03-21 Thread via GitHub
hudi-bot commented on PR #7999: URL: https://github.com/apache/hudi/pull/7999#issuecomment-1478250804 ## CI report: * 7291097d89a1af99866d24b4b7ef943129968b41 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1583

[GitHub] [hudi] kazdy commented on pull request #7999: [MINOR] Remove duplicated WriteOperationType.INSERT.value

2023-03-21 Thread via GitHub
kazdy commented on PR #7999: URL: https://github.com/apache/hudi/pull/7999#issuecomment-1478243585 @bvaradar I rebased this branch on master and CI is green -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [hudi] Mulavar commented on issue #8253: [SUPPORT]HoodieJavaWriteClientExample can not run normally.

2023-03-21 Thread via GitHub
Mulavar commented on issue #8253: URL: https://github.com/apache/hudi/issues/8253#issuecomment-1478191156 I found some information that jol-core cannot be used on oracle jdk, refer to: 1. https://stackoverflow.com/questions/64593567/run-jol-core-exception-the-exception-message-is-process

[jira] [Created] (HUDI-5968) Global index update partition for MOR creating duplicates

2023-03-21 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-5968: Summary: Global index update partition for MOR creating duplicates Key: HUDI-5968 URL: https://issues.apache.org/jira/browse/HUDI-5968 Project: Apache Hudi Issue Typ

[GitHub] [hudi] hudi-bot commented on pull request #8225: [DO NOT REVIEW] [DO NOT MERGE] Bootstrap Performance

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8225: URL: https://github.com/apache/hudi/pull/8225#issuecomment-1478144981 ## CI report: * 90a62c81f6240ea04d772b14d2f2c16ae23ad9b9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1577

[GitHub] [hudi] hudi-bot commented on pull request #8251: [HUDI-5964] fix error info not show in CreateHoodieTableCommand

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8251: URL: https://github.com/apache/hudi/pull/8251#issuecomment-1478118274 ## CI report: * 51eb9add29cae638ea2242cf2c6f5072f351 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1582

[GitHub] [hudi] alexone95 opened a new issue, #8261: [SUPPORT] How to reduce hoodie commit latency

2023-03-21 Thread via GitHub
alexone95 opened a new issue, #8261: URL: https://github.com/apache/hudi/issues/8261 Hello, we are finding that commits are getting slower and slower as time goes by (from a delta commit of 160 s on day 1 to a delta commit of 300 s on day 4). Our deploy conditions are the

[GitHub] [hudi] hudi-bot commented on pull request #8225: [DO NOT REVIEW] [DO NOT MERGE] Bootstrap Performance

2023-03-21 Thread via GitHub
hudi-bot commented on PR #8225: URL: https://github.com/apache/hudi/pull/8225#issuecomment-1478117189 ## CI report: * 90a62c81f6240ea04d772b14d2f2c16ae23ad9b9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1577

[GitHub] [hudi] xuzifu666 commented on pull request #8251: [HUDI-5964] fix error info not show in CreateHoodieTableCommand

2023-03-21 Thread via GitHub
xuzifu666 commented on PR #8251: URL: https://github.com/apache/hudi/pull/8251#issuecomment-1478097292 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[GitHub] [hudi] jonvex commented on a diff in pull request #8225: [DO NOT REVIEW] [DO NOT MERGE] Bootstrap Performance

2023-03-21 Thread via GitHub
jonvex commented on code in PR #8225: URL: https://github.com/apache/hudi/pull/8225#discussion_r1143604434 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java: ## @@ -368,9 +368,6 @@ private Dataset

[GitHub] [hudi] jonvex commented on a diff in pull request #8225: [DO NOT REVIEW] [DO NOT MERGE] Bootstrap Performance

2023-03-21 Thread via GitHub
jonvex commented on code in PR #8225: URL: https://github.com/apache/hudi/pull/8225#discussion_r1143605051 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -480,7 +481,9 @@ abstract class HoodieBaseRelation(val sqlContext:

[GitHub] [hudi] gamblewin opened a new issue, #8260: [SUPPORT] How to implement incremental join

2023-03-21 Thread via GitHub
gamblewin opened a new issue, #8260: URL: https://github.com/apache/hudi/issues/8260 - Problem: I want to do a PoC on Hudi with Flink SQL. I have table A and table B, and I join table A and table B to get a new table C. Now when I make changes to a base table (table A or table B),
