[jira] [Updated] (HUDI-7149) Add a dbt example project with CDC capability

2023-11-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7149: - Labels: pull-request-available (was: ) > Add a dbt example project with CDC capability >

[PR] [HUDI-7149] Add a dbt example project with CDC capability [hudi]

2023-11-27 Thread via GitHub
xushiyan opened a new pull request, #10192: URL: https://github.com/apache/hudi/pull/10192 ### Change Logs Add a new example project to illustrate dbt usage with CDC. ### Impact Show usage with dbt. ### Risk level None. ### Documentation Update

Re: [PR] [minor] when metric prefix length is 0 ignore the metric prefix [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10190: URL: https://github.com/apache/hudi/pull/10190#issuecomment-1829281568 ## CI report: * d0bf0e1cc06b74fe5ce777841e2065460698ef7e UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2023-11-27 Thread via GitHub
KnightChess commented on code in PR #10191: URL: https://github.com/apache/hudi/pull/10191#discussion_r1407354272 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BucketIndexSupport.scala: ## @@ -0,0 +1,164 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7138] Fix error table writer and schema registry provider [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10173: URL: https://github.com/apache/hudi/pull/10173#issuecomment-1829281456 ## CI report: * ab14d610c80a0e579c096861c70949a1eee1fee6 Azure:

Re: [PR] [HUDI-7086] Fix the default for gcp pub sub max sync time to 1min for PR #10073 [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10171: URL: https://github.com/apache/hudi/pull/10171#issuecomment-1829281413 ## CI report: * 55ebdbeb5871a0d15a6572ac6a4d7d71fe0b471e Azure:

[jira] [Updated] (HUDI-6207) Files pruning for bucket index table pk filtering queries using Spark SQL

2023-11-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6207: - Labels: pull-request-available (was: ) > Files pruning for bucket index table pk filtering

[PR] [HUDI-6207] spark support bucket index query for table with bucket index [hudi]

2023-11-27 Thread via GitHub
KnightChess opened a new pull request, #10191: URL: https://github.com/apache/hudi/pull/10191 ### Change Logs Spark supports filtering on the bucket field when a query on a bucket-index table uses an appropriate expression (=, in, and, or) ### Impact improve table query performance when

[jira] [Created] (HUDI-7149) Add a dbt example project with CDC capability

2023-11-27 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-7149: Summary: Add a dbt example project with CDC capability Key: HUDI-7149 URL: https://issues.apache.org/jira/browse/HUDI-7149 Project: Apache Hudi Issue Type:

Re: [PR] [HUDI-7138] Fix error table writer and schema registry provider [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10173: URL: https://github.com/apache/hudi/pull/10173#issuecomment-1829272778 ## CI report: * ab14d610c80a0e579c096861c70949a1eee1fee6 Azure:

Re: [PR] [HUDI-7086] Fix the default for gcp pub sub max sync time to 1min for PR #10073 [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10171: URL: https://github.com/apache/hudi/pull/10171#issuecomment-1829272699 ## CI report: * 55ebdbeb5871a0d15a6572ac6a4d7d71fe0b471e Azure:

[PR] [minor] when metric prefix length is 0 ignore the metric prefix [hudi]

2023-11-27 Thread via GitHub
LXin96 opened a new pull request, #10190: URL: https://github.com/apache/hudi/pull/10190 ### Change Logs when metric prefix length is 0 ignore the metric prefix ### Impact _Describe any public API or user-facing feature change or any performance impact._ ### Risk

Re: [PR] [HUDI-2453] Update the flink version of site [hudi]

2023-11-27 Thread via GitHub
danny0405 closed pull request #10189: [HUDI-2453] Update the flink version of site URL: https://github.com/apache/hudi/pull/10189 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hussein-awala commented on code in PR #10184: URL: https://github.com/apache/hudi/pull/10184#discussion_r1407332396 ## hudi-aws/src/test/java/org/apache/hudi/aws/TestHoodieAWSCredentialsProviderFactory.java: ## @@ -41,11 +40,11 @@ public void testGetAWSCredentials() {

[PR] [HUDI-2453] Update the flink version of site [hudi]

2023-11-27 Thread via GitHub
danny0405 opened a new pull request, #10189: URL: https://github.com/apache/hudi/pull/10189 ### Change Logs Fix the link. ### Impact none ### Risk level (write none, low medium or high below) none ### Documentation Update _Describe any

Re: [PR] [HUDI-7147] Fix CDC write flush bug [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10186: URL: https://github.com/apache/hudi/pull/10186#issuecomment-1829207595 ## CI report: * 15cfdfb94c7f332b80cad6383b1198d57be83fb3 UNKNOWN * 87fc3f4a0b2d8a03735ee8a575cfa299b137cdf5 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1829200902 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 5450affcda38adf0edab0fd77c6b4214873261d3 Azure:

Re: [PR] [HUDI-7135] Spark reads hudi table error when flink creates the table without pre… [hudi]

2023-11-27 Thread via GitHub
empcl commented on code in PR #10157: URL: https://github.com/apache/hudi/pull/10157#discussion_r1407275760 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java: ## @@ -510,6 +511,11 @@ private void

Re: [PR] [HUDI-7135] Spark reads hudi table error when flink creates the table without pre… [hudi]

2023-11-27 Thread via GitHub
empcl commented on PR #10157: URL: https://github.com/apache/hudi/pull/10157#issuecomment-1829181702 > Thanks for the nice contribution, can you write some basic UTs in the `TestHoodieHiveCatalog` or `TestHoodieCatalog` ? Okay, I have this plan in mind -- This is an automated

Re: [PR] [HUDI-7148] Add an additional fix to the potential thread insecurity problem of heartbeat client. [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10188: URL: https://github.com/apache/hudi/pull/10188#issuecomment-1829151327 ## CI report: * ca7705fb73ecbc08d28f33113eee76c9fb0ec835 Azure:

Re: [PR] [HUDI-7148] Add an additional fix to the potential thread insecurity problem of heartbeat client. [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10188: URL: https://github.com/apache/hudi/pull/10188#issuecomment-1829143143 ## CI report: * ca7705fb73ecbc08d28f33113eee76c9fb0ec835 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7135] Spark reads hudi table error when flink creates the table without pre… [hudi]

2023-11-27 Thread via GitHub
danny0405 commented on code in PR #10157: URL: https://github.com/apache/hudi/pull/10157#discussion_r1407197997 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java: ## @@ -510,6 +511,11 @@ private void

Re: [I] [SUPPORT] The INSERT records are marked as UPDATE [hudi]

2023-11-27 Thread via GitHub
danny0405 commented on issue #10156: URL: https://github.com/apache/hudi/issues/10156#issuecomment-1829129236 yes, the mor reader merges the payloads before returning the result set. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
danny0405 commented on code in PR #10184: URL: https://github.com/apache/hudi/pull/10184#discussion_r1407195625 ## hudi-aws/src/test/java/org/apache/hudi/aws/TestHoodieAWSCredentialsProviderFactory.java: ## @@ -41,11 +40,11 @@ public void testGetAWSCredentials() {

(hudi) branch master updated: [MINOR] Schema Converter should use default identity transform if not specified (#10178)

2023-11-27 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 675abf180bd [MINOR] Schema Converter should

Re: [PR] [MINOR] Schema Converter should use default identity transform if not specified [hudi]

2023-11-27 Thread via GitHub
danny0405 merged PR #10178: URL: https://github.com/apache/hudi/pull/10178 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [I] [SUPPORT] Handling of DELETE operation using Debezium Kafka connector [hudi]

2023-11-27 Thread via GitHub
danny0405 commented on issue #10181: URL: https://github.com/apache/hudi/issues/10181#issuecomment-1829125604 cc @ad1happy2go for the clarification ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [HUDI-7136] in the dfs catalog scenario, solve the problem of Primary key definit… [hudi]

2023-11-27 Thread via GitHub
danny0405 commented on code in PR #10162: URL: https://github.com/apache/hudi/pull/10162#discussion_r1407192799 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieCatalog.java: ## @@ -325,10 +326,19 @@ public void createTable(ObjectPath

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
codope commented on code in PR #10137: URL: https://github.com/apache/hudi/pull/10137#discussion_r1407078231 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/common/model/HoodieSparkRecord.java: ## @@ -449,7 +449,8 @@ private static void validateRow(InternalRow

[jira] [Updated] (HUDI-7148) Add an additional fix to the potential thread insecurity problem of heartbeat client.

2023-11-27 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7148: --- Attachment: Snipaste_2023-11-28_11-57-06.png > Add an additional fix to the potential thread insecurity problem of

[jira] [Updated] (HUDI-7148) Add an additional fix to the potential thread insecurity problem of heartbeat client.

2023-11-27 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7148: --- Description: !Snipaste_2023-11-28_11-57-06.png! > Add an additional fix to the potential thread insecurity problem of

[PR] [HUDI-7148] Add an additional fix to the potential thread insecurity problem of heartbeat client. [hudi]

2023-11-27 Thread via GitHub
eric9204 opened a new pull request, #10188: URL: https://github.com/apache/hudi/pull/10188 ### Change Logs A potential problem: If the heartbeat client is updating the heartbeat time for the instant t1, then the write client completes the commit of the instant t1 and stops the

Re: [PR] [HUDI-7125] Fix bugs for CDC queries [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10144: URL: https://github.com/apache/hudi/pull/10144#issuecomment-1829090980 ## CI report: * 847fee8e1ce7b0e2d9af6dadbc802f4d67f06ee7 Azure:

Re: [I] [SUPPORT] null [hudi]

2023-11-27 Thread via GitHub
zlinsc closed issue #10187: [SUPPORT] null URL: https://github.com/apache/hudi/issues/10187 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1829055464 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 3ae4d30b5af0842ac2fbd4380f70f2f573c0b7d2 Azure:

[jira] [Updated] (HUDI-7148) Add an additional fix to the potential thread insecurity problem of heartbeat client.

2023-11-27 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7148: --- Summary: Add an additional fix to the potential thread insecurity problem of heartbeat client. (was: Add a handle to

[jira] [Created] (HUDI-7148) Add a handle to the potential thread insecurity problem of heartbeat client

2023-11-27 Thread eric (Jira)
eric created HUDI-7148: -- Summary: Add a handle to the potential thread insecurity problem of heartbeat client Key: HUDI-7148 URL: https://issues.apache.org/jira/browse/HUDI-7148 Project: Apache Hudi

Re: [PR] [HUDI-7147] Fix CDC write flush bug [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10186: URL: https://github.com/apache/hudi/pull/10186#issuecomment-1829027873 ## CI report: * 15cfdfb94c7f332b80cad6383b1198d57be83fb3 UNKNOWN * 87fc3f4a0b2d8a03735ee8a575cfa299b137cdf5 Azure:

Re: [PR] [HUDI-7125] Fix bugs for CDC queries [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10144: URL: https://github.com/apache/hudi/pull/10144#issuecomment-1829027764 ## CI report: * a07636ea1c3aed95781c75e34ccb43aa914541e6 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1829027734 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * a045da6d6c31c09a410449788002b75e8c335548 Azure:

[jira] [Updated] (HUDI-3204) Allow original partition column value to be retrieved when using TimestampBasedKeyGen

2023-11-27 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3204: - Fix Version/s: 0.14.1 > Allow original partition column value to be retrieved when using >

Re: [PR] [HUDI-7147] Fix CDC write flush bug [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10186: URL: https://github.com/apache/hudi/pull/10186#issuecomment-1829023174 ## CI report: * 15cfdfb94c7f332b80cad6383b1198d57be83fb3 UNKNOWN * 87fc3f4a0b2d8a03735ee8a575cfa299b137cdf5 UNKNOWN Bot commands @hudi-bot supports the

Re: [PR] [HUDI-7125] Fix bugs for CDC queries [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10144: URL: https://github.com/apache/hudi/pull/10144#issuecomment-1829023044 ## CI report: * a07636ea1c3aed95781c75e34ccb43aa914541e6 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1829022999 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * a045da6d6c31c09a410449788002b75e8c335548 Azure:

Re: [PR] [HUDI-7147] Fix CDC write flush bug [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10186: URL: https://github.com/apache/hudi/pull/10186#issuecomment-1829017884 ## CI report: * 15cfdfb94c7f332b80cad6383b1198d57be83fb3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7136] in the dfs catalog scenario, solve the problem of Primary key definit… [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10162: URL: https://github.com/apache/hudi/pull/10162#issuecomment-1829017797 ## CI report: * 64589da09eb106b1fc771ca77b64d30c81ae5970 Azure:

[jira] [Updated] (HUDI-7023) Support querying without syncing partition metadata to catalog

2023-11-27 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7023: -- Reviewers: Ethan Guo > Support querying without syncing partition metadata to catalog >

[I] [SUPPORT] Double writing to the MOR table throws an exception when using bucket OCC and bucket index [hudi]

2023-11-27 Thread via GitHub
zlinsc opened a new issue, #10187: URL: https://github.com/apache/hudi/issues/10187 ### Describe the problem you faced I ran two Flink jobs to write the same rows into a MOR table but got a NumberFormatException. The error shows the job had a wrongly formatted log file name. my flink hudi

Re: [PR] [HUDI-7136] in the dfs catalog scenario, solve the problem of Primary key definit… [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10162: URL: https://github.com/apache/hudi/pull/10162#issuecomment-1828978956 ## CI report: * 64589da09eb106b1fc771ca77b64d30c81ae5970 Azure:

[jira] [Updated] (HUDI-7147) Hudi cdc write throws Unsupported Operation Exception

2023-11-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7147: - Labels: pull-request-available (was: ) > Hudi cdc write throws Unsupported Operation Exception >

[PR] [HUDI-7147] Fix CDC write flush bug [hudi]

2023-11-27 Thread via GitHub
zhangyue19921010 opened a new pull request, #10186: URL: https://github.com/apache/hudi/pull/10186 ### Change Logs ``` 2023-11-27 19:39:20 java.io.IOException: Could not perform checkpoint 10 for operator Sink: bucket_write(table=hudi__cdc) (60/192)#7. at

Re: [PR] [HUDI-6497] WIP HoodieStorage abstraction [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10185: URL: https://github.com/apache/hudi/pull/10185#issuecomment-1828979078 ## CI report: * c80af48f4d838dc06b5d4d43bf5f12ab246231a1 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828978858 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * a045da6d6c31c09a410449788002b75e8c335548 Azure:

(hudi) branch master updated (4c3a1db146b -> fb062dfc9ae)

2023-11-27 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 4c3a1db146b [HUDI-7110][FOLLOW-UP] Improve call procedure for show column stats information (#10169) add

Re: [PR] [Minor] Fix the flaky tests in TestRemoteHoodieTableFileSystemView [hudi]

2023-11-27 Thread via GitHub
danny0405 merged PR #10179: URL: https://github.com/apache/hudi/pull/10179 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Created] (HUDI-7147) Hudi cdc write throws Unsupported Operation Exception

2023-11-27 Thread Yue Zhang (Jira)
Yue Zhang created HUDI-7147: --- Summary: Hudi cdc write throws Unsupported Operation Exception Key: HUDI-7147 URL: https://issues.apache.org/jira/browse/HUDI-7147 Project: Apache Hudi Issue Type:

Re: [PR] [HUDI-7136] in the dfs catalog scenario, solve the problem of Primary key definit… [hudi]

2023-11-27 Thread via GitHub
empcl commented on PR #10162: URL: https://github.com/apache/hudi/pull/10162#issuecomment-1828976248 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [HUDI-6497] WIP HoodieStorage abstraction [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10185: URL: https://github.com/apache/hudi/pull/10185#issuecomment-1828973207 ## CI report: * c80af48f4d838dc06b5d4d43bf5f12ab246231a1 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828973086 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * a045da6d6c31c09a410449788002b75e8c335548 Azure:

[jira] [Updated] (HUDI-6497) Introduce a new HudiFileSystem & HudiPath abstraction to remove Hadoop from hudi-common

2023-11-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6497: - Labels: pull-request-available (was: ) > Introduce a new HudiFileSystem & HudiPath abstraction

[PR] [HUDI-6497] WIP HoodieStorage abstraction [hudi]

2023-11-27 Thread via GitHub
yihua opened a new pull request, #10185: URL: https://github.com/apache/hudi/pull/10185 ### Change Logs This PR introduces `HoodieStorage` abstraction and tries to remove the usage of Hadoop File System classes (`org.apache.hadoop.fs.`[`FileSystem`, `Path`, `FileStatus`], etc.) in

Re: [I] [SUPPORT] The INSERT records are marked as UPDATE [hudi]

2023-11-27 Thread via GitHub
zdl1 commented on issue #10156: URL: https://github.com/apache/hudi/issues/10156#issuecomment-1828955749 > the compaction writer actually knows the record operation when it does the payload merging Sorry for the late reply, what if I just `select count(*) from table`? Does it

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828920258 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * a045da6d6c31c09a410449788002b75e8c335548 Azure:

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828886340 ## CI report: * f5b64ea26989bed224b6d1a23adae58cc859f55c Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828871007 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 91be81a12c0321d089c89ceeb7ed0ec8d18079ea Azure:

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828865064 # Test Passed Ran Delta Streamer in CONT Mode ``` spark-submit \ --class org.apache.hudi.utilities.streamer.HoodieStreamer \ --packages

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 closed issue #10165: [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern URL: https://github.com/apache/hudi/issues/10165 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828853204 Well, I was testing this out and it looked fine for a few runs. Why is it clustering 2023-11? ``` None

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828848290 I am looking forward to making YouTube videos on these topics -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828844187 Issue has been resolved I had to pass this in quotes ``` --hoodie-conf 'hoodie.clustering.plan.strategy.partition.regex.pattern=^2023-10-[0-9][0-9]$' \
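
A minimal sketch of what the quoted pattern selects, assuming anchored full-string matching. It uses Python's `re` module rather than Hudi's own matching code, and the partition names are hypothetical. Single quotes on the command line presumably keep the shell from interpreting the brackets and the trailing `$`.

```python
import re

# Pattern copied from the comment above; everything else here is illustrative.
PATTERN = re.compile(r"^2023-10-[0-9][0-9]$")


def select_partitions(partitions):
    """Return the partition paths that the anchored pattern matches."""
    return [p for p in partitions if PATTERN.match(p)]


# Hypothetical partition paths, not taken from the issue.
candidates = ["2023-10-01", "2023-10-31", "2023-11-01", "date=2023-10-01"]
selected = select_partitions(candidates)
print(selected)
```

Because the pattern is anchored with `^` and `$`, a path with any prefix (such as a Hive-style `date=` prefix) or a different month is not selected in this sketch.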

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 closed issue #10165: [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern URL: https://github.com/apache/hudi/issues/10165 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828824587 @noahtaite you are right I did set the flag and now I can see all partition that were clustered ```

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828818769 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * c22d1db44cd389160fe06a6f20233fb0e22df322 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828810016 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * c22d1db44cd389160fe06a6f20233fb0e22df322 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828803008 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * c22d1db44cd389160fe06a6f20233fb0e22df322 Azure:

[jira] [Updated] (HUDI-7131) The requested schema is not compatible with the file schema

2023-11-27 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-7131: - Fix Version/s: 0.14.1 > The requested schema is not compatible with the file schema >

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828759007 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 38b26030e7df02d865cfe4efd03a955124d2272a Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828749924 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 38b26030e7df02d865cfe4efd03a955124d2272a Azure:

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828740255 ## CI report: * 67d766e2571b003e629119123517f694c57cfbb3 Azure:

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
noahtaite commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828736370 @soumilshah1995 I did a bit of digging, https://hudi.apache.org/docs/procedures/ doesn't explicitly say it, but there appears to be a `show_involved_partition` boolean flag we

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828680284 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 38b26030e7df02d865cfe4efd03a955124d2272a Azure:

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828680591 ## CI report: * 67d766e2571b003e629119123517f694c57cfbb3 Azure:

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828671661 Here are my replacement commits: https://github.com/apache/hudi/assets/39345855/1d5d32ef-0e78-4b19-96e6-3e9dc450231f ``` { "partitionToWriteStats" : {

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828669465 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 38b26030e7df02d865cfe4efd03a955124d2272a Azure:

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828669742 ## CI report: * 67d766e2571b003e629119123517f694c57cfbb3 Azure:

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828669324 If I am missing any config, please let me know; happy to learn from the experts here -- This is an automated message from the Apache Git Service. To respond to

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
soumilshah1995 commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828665873 I changed my partition to look like this https://github.com/apache/hudi/assets/39345855/b645db43-f074-4880-bb26-1952752e7683 Hudi configs ```

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828656724 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 38b26030e7df02d865cfe4efd03a955124d2272a Azure:

Re: [I] [SUPPORT] Compaction & Clustering are not working [hudi]

2023-11-27 Thread via GitHub
noahtaite commented on issue #10183: URL: https://github.com/apache/hudi/issues/10183#issuecomment-1828629712 @Cpandey43 Hey! I'm another Hudi 0.13.1 MOR user, so just thought I'd come by to help lend a hand and dig a bit deeper into the problem you're reporting. **To be clear - I'm

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
subash-metica commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828598815 > The partition paths in your example, does it contain only dates in the format of -MM-dd ? @noahtaite is right! Looks like your partition path is not -MM-dd

Re: [I] [SUPPORT] Async Clustering: Seeking Help on Specific Partitioning and Regex Pattern [hudi]

2023-11-27 Thread via GitHub
noahtaite commented on issue #10165: URL: https://github.com/apache/hudi/issues/10165#issuecomment-1828595136 >

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828594235 ## CI report: * 67d766e2571b003e629119123517f694c57cfbb3 Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828593909 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 2a9a36323053c5d35ae68cc5b2c38bab707b0f1a Azure:

Re: [PR] [HUDI-7137] Implement Bootstrap for new FG reader [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10137: URL: https://github.com/apache/hudi/pull/10137#issuecomment-1828581987 ## CI report: * 77205b47c45501a0d9de1ebc74d5bb8c960cd95a UNKNOWN * 2a9a36323053c5d35ae68cc5b2c38bab707b0f1a Azure:

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hudi-bot commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828582304 ## CI report: * 67d766e2571b003e629119123517f694c57cfbb3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7114] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hussein-awala commented on PR #10184: URL: https://github.com/apache/hudi/pull/10184#issuecomment-1828576828 @yihua Could you take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[jira] [Updated] (HUDI-7114) Fix TestHoodieAWSCredentialsProviderFactory#testGetAWSCredentialsWithInvalidAssumeRole

2023-11-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7114: - Labels: pull-request-available (was: ) > Fix >

[PR] Add AWS region for testGetAWSCredentialsWithInvalidAssumeRole test [hudi]

2023-11-27 Thread via GitHub
hussein-awala opened a new pull request, #10184: URL: https://github.com/apache/hudi/pull/10184 ### Change Logs This PR enables `testGetAWSCredentialsWithInvalidAssumeRole` test and adds the system property `aws.region` to define the default AWS region for the test. ### Impact

[I] [SUPPORT] Compaction & Clustering are not working [hudi]

2023-11-27 Thread via GitHub
Cpandey43 opened a new issue, #10183: URL: https://github.com/apache/hudi/issues/10183 **Describe the problem you faced** **Issue:1** I configured the application with async compaction, async clustering, and async cleaning in the job but all are not working as per the configured

Re: [I] [SUPPORT] Additional records in dataset after clustering [hudi]

2023-11-27 Thread via GitHub
noahtaite closed issue #10172: [SUPPORT] Additional records in dataset after clustering URL: https://github.com/apache/hudi/issues/10172 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific
