Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1396826530 ## website/docs/indexing.md: ## @@ -155,4 +155,11 @@ to finally check the incoming updates against all files. The `SIMPLE` Index will `HBASE` index can be employed,

Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1396826296 ## website/docs/indexing.md: ## @@ -155,4 +155,11 @@ to finally check the incoming updates against all files. The `SIMPLE` Index will `HBASE` index can be employed,

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
beyond1920 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396823348 ## .github/workflows/bot.yml: ## @@ -377,15 +370,6 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
beyond1920 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396822953 ## .github/workflows/bot.yml: ## @@ -302,15 +301,9 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1396823127 ## website/docs/key_generation.md: ## @@ -209,3 +209,8 @@ Partition path generated from key generator: "2020040118" Input field value: "20200401" Partition path

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10002: URL: https://github.com/apache/hudi/pull/10002#issuecomment-1815889971 ## CI report: * 47200120dd37ebaee77c583628bfddac1f564b3b Azure:

[I] [SUPPORT]hudi insert is too slow & [hudi]

2023-11-16 Thread via GitHub
zyclove opened a new issue, #10131: URL: https://github.com/apache/hudi/issues/10131 **Describe the problem you faced** Spark SQL bulk insert is too slow; how do I tune performance? Following https://hudi.apache.org/docs/performance I changed many configs, but it is not

Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1396818699 ## website/docs/concurrency_control.md: ## @@ -279,4 +279,10 @@ hoodie.cleaner.policy.failed.writes=EAGER ## Caveats If you are using the `WriteClient` API, please

Re: [PR] [DOCS] Added video resources to Concepts and Services Sections [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #10080: URL: https://github.com/apache/hudi/pull/10080#discussion_r1396818335 ## website/docs/compaction.md: ## @@ -231,3 +231,5 @@ Offline compaction needs to submit the Flink task on the command line. The progr | `--seq` | `LIFO` (Optional)

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on PR #9952: URL: https://github.com/apache/hudi/pull/9952#issuecomment-1815859946 Can we also add an example for Incremental queries under Flink in sql_queries page for completeness? -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396792988 ## website/docs/sql_dml.md: ## @@ -199,7 +199,52 @@ You can control the behavior of these operations using various configuration opt ## Flink -Flink SQL also

[jira] [Assigned] (HUDI-7117) Functional index creation not working when table is created using datasource writer

2023-11-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit reassigned HUDI-7117: - Assignee: Sagar Sumit > Functional index creation not working when table is created using

Re: [PR] [HUDI-7110] Add call procedure for show column stats information [hudi]

2023-11-16 Thread via GitHub
majian1998 commented on code in PR #10120: URL: https://github.com/apache/hudi/pull/10120#discussion_r1396791841 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/ShowMetadataTableColumnStatsProcedure.scala: ## @@ -0,0 +1,105 @@ +/*

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396791058 ## website/docs/sql_dml.md: ## @@ -199,7 +199,52 @@ You can control the behavior of these operations using various configuration opt ## Flink -Flink SQL also

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396790547 ## website/docs/sql_dml.md: ## @@ -199,7 +199,52 @@ You can control the behavior of these operations using various configuration opt ## Flink -Flink SQL also

[jira] [Updated] (HUDI-7117) Functional index creation not working when table is created using datasource writer

2023-11-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-7117: -- Epic Link: HUDI-512 > Functional index creation not working when table is created using datasource >

Re: [I] [SUPPORT] RFC 63 Functional Index Hudi 0.1.0-beta [hudi]

2023-11-16 Thread via GitHub
ad1happy2go commented on issue #10110: URL: https://github.com/apache/hudi/issues/10110#issuecomment-1815852059 @soumilshah1995 @codope Create JIRA to track this issue - https://issues.apache.org/jira/browse/HUDI-7117 -- This is an automated message from the Apache Git Service. To

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396788955 ## website/docs/sql_ddl.md: ## @@ -383,18 +383,57 @@ CREATE CATALOG hoodie_catalog The following is an example of creating a Flink table. Read the [Flink Quick

[jira] [Created] (HUDI-7117) Functional index creation not working when table is created using datasource writer

2023-11-16 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-7117: --- Summary: Functional index creation not working when table is created using datasource writer Key: HUDI-7117 URL: https://issues.apache.org/jira/browse/HUDI-7117

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396786024 ## website/docs/sql_dml.md: ## @@ -199,7 +199,52 @@ You can control the behavior of these operations using various configuration opt ## Flink -Flink SQL also

Re: [PR] [HUDI-7111] Fix performance regression of tag when written into simple bucket index table [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10130: URL: https://github.com/apache/hudi/pull/10130#issuecomment-1815846308 ## CI report: * 42ed5726bd8cbfef2bed148ed6186034b11fd9eb Azure:

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815846194 ## CI report: * 3a4f9e92cd71dac2506c883c57785f07ee5bcf24 Azure:

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396784871 ## website/docs/sql_ddl.md: ## @@ -383,18 +383,57 @@ CREATE CATALOG hoodie_catalog The following is an example of creating a Flink table. Read the [Flink Quick

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396784495 ## website/docs/sql_ddl.md: ## @@ -383,18 +383,57 @@ CREATE CATALOG hoodie_catalog The following is an example of creating a Flink table. Read the [Flink Quick

Re: [PR] [HUDI-7111] Fix performance regression of tag when written into simple bucket index table [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10130: URL: https://github.com/apache/hudi/pull/10130#issuecomment-1815839243 ## CI report: * 42ed5726bd8cbfef2bed148ed6186034b11fd9eb UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815839107 ## CI report: * 3a4f9e92cd71dac2506c883c57785f07ee5bcf24 Azure:

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396777966 ## website/docs/flink-quick-start-guide.md: ## @@ -215,9 +215,9 @@ HoodiePipeline.Builder builder = HoodiePipeline.builder(targetTable) builder.sink(dataStream,

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396777280 ## website/docs/flink-quick-start-guide.md: ## @@ -322,6 +359,20 @@ The `DELETE` statement is supported since Flink 1.17, so only Hudi Flink bundle Only **batch**

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396776833 ## website/docs/flink-quick-start-guide.md: ## @@ -302,7 +331,15 @@ Only **batch** queries on Hudi table with primary key work correctly. ::: ## Delete Data

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396776055 ## website/docs/flink-quick-start-guide.md: ## @@ -287,10 +289,37 @@ Refers to [Table types and queries](/docs/concepts#table-types--queries) for mor This is similar

Re: [PR] Flink quickstart and sql website updates [hudi]

2023-11-16 Thread via GitHub
bhasudha commented on code in PR #9952: URL: https://github.com/apache/hudi/pull/9952#discussion_r1396770283 ## website/docs/flink-quick-start-guide.md: ## @@ -287,10 +289,37 @@ Refers to [Table types and queries](/docs/concepts#table-types--queries) for mor This is similar

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on PR #10126: URL: https://github.com/apache/hudi/pull/10126#issuecomment-1815827193 Thanks for the help! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10101: URL: https://github.com/apache/hudi/pull/10101#discussion_r1396767580 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java: ## @@ -111,13 +114,13 @@ public boolean

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
yihua commented on PR #10126: URL: https://github.com/apache/hudi/pull/10126#issuecomment-1815823664 The upload is done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396761804 ## .github/workflows/bot.yml: ## @@ -377,15 +370,6 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396761419 ## .github/workflows/bot.yml: ## @@ -302,15 +301,9 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
danny0405 merged PR #10126: URL: https://github.com/apache/hudi/pull/10126 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[PR] [HUDI-7111] Fix performance regression of tag when written into simple bucket index table [hudi]

2023-11-16 Thread via GitHub
beyond1920 opened a new pull request, #10130: URL: https://github.com/apache/hudi/pull/10130 ### Change Logs After upgrading to version 0.14.0, the performance of a Spark job that writes into a simple bucket index table regressed.

[jira] [Updated] (HUDI-7111) Performance regression of spark job which written into simple bucket index table

2023-11-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7111: - Labels: pull-request-available (was: ) > Performance regression of spark job which written into

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
yihua commented on PR #10126: URL: https://github.com/apache/hudi/pull/10126#issuecomment-1815805457 The docker image `apachehudi/hudi-ci-bundle-validation-base:flink1146hive239spark248` is being uploaded. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
yihua commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r1396743948 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/HoodieBigQuerySyncClient.java: ## @@ -51,34 +51,41 @@ import java.util.Map; import java.util.stream.Collectors;

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
yihua commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r1396743251 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncConfig.java: ## @@ -122,6 +121,16 @@ public class BigQuerySyncConfig extends HoodieSyncConfig

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10002: URL: https://github.com/apache/hudi/pull/10002#issuecomment-1815793435 ## CI report: * bb60d3f2fe5737fc43a700bcc6c37806fe48868a Azure:

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1815788365 ## CI report: * faf61fb4c40584fd9dbdd4aafc85e699c3d9d8ba Azure:

Re: [PR] [HUDI-7041] Optimize the mem usage of partitionToFileGroupsMap during the cleaning [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10002: URL: https://github.com/apache/hudi/pull/10002#issuecomment-1815788149 ## CI report: * bb60d3f2fe5737fc43a700bcc6c37806fe48868a Azure:

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10126: URL: https://github.com/apache/hudi/pull/10126#issuecomment-1815754278 ## CI report: * 35fa17181c858c202869a4f9f7807ccb37c83438 Azure:

[PR] DOCS-updated community sync page for new details [hudi]

2023-11-16 Thread via GitHub
nfarah86 opened a new pull request, #10129: URL: https://github.com/apache/hudi/pull/10129 @bhasudha updated the community sync page to point to the LinkedIn live events page https://github.com/apache/hudi/assets/5392555/0eb42563-38f1-4fe4-98a8-a83d0ec2d2ab -- This is

Re: [PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10126: URL: https://github.com/apache/hudi/pull/10126#issuecomment-1815749391 ## CI report: * 35fa17181c858c202869a4f9f7807ccb37c83438 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[I] [SUPPORT] Hudi Sink Connector is not working with Version 0.14.0 and 0.13.1 [hudi]

2023-11-16 Thread via GitHub
seethb opened a new issue, #10128: URL: https://github.com/apache/hudi/issues/10128 Hi I have followed Hudi kafka connect instructions from this document https://github.com/apache/hudi/blob/master/hudi-kafka-connect/README.md and trying to setup a Hudi Sink connector in my local

[I] [SUPPORT] Clean action failure triggers an exception while trying to check whether metadata is a table [hudi]

2023-11-16 Thread via GitHub
shubhamn21 opened a new issue, #10127: URL: https://github.com/apache/hudi/issues/10127 **Describe the problem you faced** The hudi job runs fine for an hour but then crashes after a Warning about `Clean Action failure` and subsequently raising an exception

[jira] [Updated] (HUDI-7116) Add docker image for flink 1.14 and spark 2.4.8

2023-11-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7116: - Labels: pull-request-available (was: ) > Add docker image for flink 1.14 and spark 2.4.8 >

[PR] [HUDI-7116] Add docker image for flink 1.14 and spark 2.4.8 [hudi]

2023-11-16 Thread via GitHub
danny0405 opened a new pull request, #10126: URL: https://github.com/apache/hudi/pull/10126 ### Change Logs Add the build image for flink 1.14 and spark 2.4.8. ### Impact none ### Risk level (write none, low medium or high below) none ###

[jira] [Created] (HUDI-7116) Add docker image for flink 1.14 and spark 2.4.8

2023-11-16 Thread Danny Chen (Jira)
Danny Chen created HUDI-7116: Summary: Add docker image for flink 1.14 and spark 2.4.8 Key: HUDI-7116 URL: https://issues.apache.org/jira/browse/HUDI-7116 Project: Apache Hudi Issue Type:

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396685712 ## .github/workflows/bot.yml: ## @@ -377,15 +370,6 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [HUDI-7072] Remove support for Flink 1.13 [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10052: URL: https://github.com/apache/hudi/pull/10052#discussion_r1396685357 ## .github/workflows/bot.yml: ## @@ -302,15 +301,9 @@ jobs: - flinkProfile: 'flink1.14' sparkProfile: 'spark3.2' sparkRuntime:

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1815718739 ## CI report: * 1877ef99bb4c939181b5341e19c44cbba742d7cd Azure:

Re: [I] [SUPPORT] Query failure due to replacecommit being archived [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on issue #10107: URL: https://github.com/apache/hudi/issues/10107#issuecomment-1815714178 We did have some fixes in recent releases: https://github.com/apache/hudi/pull/7568, https://github.com/apache/hudi/pull/8443. -- This is an automated message from the Apache

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10122: URL: https://github.com/apache/hudi/pull/10122#issuecomment-1815713303 ## CI report: * 1877ef99bb4c939181b5341e19c44cbba742d7cd Azure:

Re: [PR] [MINOR] Build failed using master [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on PR #9726: URL: https://github.com/apache/hudi/pull/9726#issuecomment-1815711729 If we are talking about the version conflicts between the explicitly introduced jackson jar and the parquet jar, it is actually a bug and should have a fix. -- This is an automated message

Re: [PR] [HUDI-7112] Reuse existing timeline server and performance improvements [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10122: URL: https://github.com/apache/hudi/pull/10122#discussion_r1396674754 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/embedded/EmbeddedTimelineServerHelper.java: ## @@ -23,66 +23,42 @@ import

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815708138 ## CI report: * 3a4f9e92cd71dac2506c883c57785f07ee5bcf24 Azure:

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
majian1998 commented on code in PR #10101: URL: https://github.com/apache/hudi/pull/10101#discussion_r1396670883 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java: ## @@ -111,13 +114,13 @@ public boolean

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10101: URL: https://github.com/apache/hudi/pull/10101#discussion_r1396663741 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java: ## @@ -111,13 +114,13 @@ public boolean

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815676361 ## CI report: * 178ef4eadac6ab6d009d86ab86d35babe952 Azure:

Re: [I] [SUPPORT] Query failure due to replacecommit being archived [hudi]

2023-11-16 Thread via GitHub
haoxie-aws commented on issue #10107: URL: https://github.com/apache/hudi/issues/10107#issuecomment-1815675385 I find this query, `select * from samplehudi as t1, samplehudi as t2, samplehudi as t3 where t1.key=t2.key and t1.key=t3.key limit 1`, is slightly easier to replicate the

Re: [PR] [MINOR] Build failed using master [hudi]

2023-11-16 Thread via GitHub
Forus0322 commented on PR #9726: URL: https://github.com/apache/hudi/pull/9726#issuecomment-1815673222 @codope Hi, I reanalyzed the problem. The essence is that parquet 1.10.1 contains the org.codehaus.jackson dependency package, and parquet 1.12.0 version does not contain the

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
stream2000 commented on code in PR #10101: URL: https://github.com/apache/hudi/pull/10101#discussion_r1396632832 ## hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiver.java: ## @@ -256,8 +256,8 @@ public void testArchiveEmptyTable() throws

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815670631 ## CI report: * 178ef4eadac6ab6d009d86ab86d35babe952 Azure:

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10125: URL: https://github.com/apache/hudi/pull/10125#issuecomment-1815666263 ## CI report: * d94d74a02df88f3ca32807c7f580900b268ca0d0 UNKNOWN * f2f380dec7f0afa5fd7fb0accbe8c17e22853f00 Azure:

Re: [I] [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes [hudi]

2023-11-16 Thread via GitHub
zyclove commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-1815661842 I also encountered the same problem with 0.14.0. How can I solve it? Disable metadata? set hoodie.metadata.table=false; Change hoodie.parquet.small.file.limit? set

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
majian1998 commented on PR #10101: URL: https://github.com/apache/hudi/pull/10101#issuecomment-1815653571 Now, the archive metrics have been moved to `BaseHoodieTableServiceClient`, and the return value of archiveIfRequired has been modified. In the previous implementation, the `success`

Re: [I] [SUPPORT] Disproportionately Slow performance during "Building workload profile" phase [hudi]

2023-11-16 Thread via GitHub
zyclove commented on issue #8189: URL: https://github.com/apache/hudi/issues/8189#issuecomment-1815649576 ![image](https://github.com/apache/hudi/assets/15028279/49829897-5dbc-4a98-bde9-2ded448681a2)

Re: [PR] [HUDI-7110] Add call procedure for show column stats information [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10120: URL: https://github.com/apache/hudi/pull/10120#discussion_r1396587932 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/ShowMetadataTableColumnStatsProcedure.scala: ## @@ -0,0 +1,105 @@ +/*

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10125: URL: https://github.com/apache/hudi/pull/10125#issuecomment-1815627236 ## CI report: * 7c1b9cc77e2e5ea2ee9d6089f41b5a9c482de9f5 Azure:

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10125: URL: https://github.com/apache/hudi/pull/10125#issuecomment-1815621023 ## CI report: * 7c1b9cc77e2e5ea2ee9d6089f41b5a9c482de9f5 Azure:

[jira] [Updated] (HUDI-7109) Fix Flink may re-use a committed instant in append mode

2023-11-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7109: - Fix Version/s: 0.14.1 1.0.0 > Fix Flink may re-use a committed instant in append mode

[jira] [Closed] (HUDI-7109) Fix Flink may re-use a committed instant in append mode

2023-11-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7109. Resolution: Fixed Fixed via master branch: 3d0c4501d6b7062f38b7755f71660818cd95c1f6 > Fix Flink may re-use

(hudi) branch master updated (f06ff5b3e0e -> 3d0c4501d6b)

2023-11-16 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from f06ff5b3e0e [HUDI-7090] Set the maxParallelism for singleton operator (#10090) add 3d0c4501d6b [HUDI-7109]

Re: [PR] [HUDI-7109] Fix Flink may re-use a committed instant in append mode [hudi]

2023-11-16 Thread via GitHub
danny0405 merged PR #10119: URL: https://github.com/apache/hudi/pull/10119 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Closed] (HUDI-7090) Set maxParallelism for singleton operator ,for example compact_plan_generate、split_monitor、compact_commit

2023-11-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7090. Resolution: Fixed Fixed via master branch: f06ff5b3e0ee8bb6e49aad04d3b6054d6c46e272 > Set maxParallelism

[jira] [Updated] (HUDI-7090) Set maxParallelism for singleton operator ,for example compact_plan_generate、split_monitor、compact_commit

2023-11-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7090: - Fix Version/s: 0.14.1 1.0.0 > Set maxParallelism for singleton operator ,for example

(hudi) branch master updated: [HUDI-7090] Set the maxParallelism for singleton operator (#10090)

2023-11-16 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f06ff5b3e0e [HUDI-7090] Set the maxParallelism

Re: [PR] [HUDI-7090]Set the maxParallelism for singleton operator [hudi]

2023-11-16 Thread via GitHub
danny0405 merged PR #10090: URL: https://github.com/apache/hudi/pull/10090 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [I] [SUPPORT]When hudi integrates hive, an error is reported when the hive external table is queried [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on issue #10084: URL: https://github.com/apache/hudi/issues/10084#issuecomment-1815611713 Flink 1.13.1 should use Parquet 1.11 right? Have you checked the project parquet version for other modules so you do not package multiple parquet jars in one shot. -- This is an

Re: [PR] [HUDI-7099] Providing metrics for archive and defining some string constants [hudi]

2023-11-16 Thread via GitHub
danny0405 commented on code in PR #10101: URL: https://github.com/apache/hudi/pull/10101#discussion_r1396571766 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/HoodieTimelineArchiver.java: ## @@ -117,6 +122,10 @@ public boolean

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
the-other-tim-brown commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r1396557170 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncTool.java: ## @@ -79,7 +78,7 @@ public BigQuerySyncTool(Properties props) {

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10125: URL: https://github.com/apache/hudi/pull/10125#issuecomment-1815576193 ## CI report: * 7c1b9cc77e2e5ea2ee9d6089f41b5a9c482de9f5 Azure:

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
the-other-tim-brown commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r139668 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncConfig.java: ## @@ -122,6 +121,16 @@ public class BigQuerySyncConfig extends

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
the-other-tim-brown commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r1396555489 ## hudi-gcp/pom.xml: ## @@ -70,7 +70,6 @@ See https://github.com/GoogleCloudPlatform/cloud-opensource-java/wiki/The-Google com.google.cloud

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
yihua commented on code in PR #10125: URL: https://github.com/apache/hudi/pull/10125#discussion_r1396550878 ## hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySyncConfig.java: ## @@ -83,7 +83,6 @@ public class BigQuerySyncConfig extends HoodieSyncConfig implements

Re: [PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
hudi-bot commented on PR #10125: URL: https://github.com/apache/hudi/pull/10125#issuecomment-1815570231 ## CI report: * 7c1b9cc77e2e5ea2ee9d6089f41b5a9c482de9f5 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Updated] (HUDI-7115) Add more options for BigQuery Sync

2023-11-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7115: - Labels: pull-request-available (was: ) > Add more options for BigQuery Sync >

[PR] [HUDI-7115] Add in new options for the bigquery sync [hudi]

2023-11-16 Thread via GitHub
the-other-tim-brown opened a new pull request, #10125: URL: https://github.com/apache/hudi/pull/10125 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or

[jira] [Created] (HUDI-7115) Add more options for BigQuery Sync

2023-11-16 Thread Timothy Brown (Jira)
Timothy Brown created HUDI-7115: --- Summary: Add more options for BigQuery Sync Key: HUDI-7115 URL: https://issues.apache.org/jira/browse/HUDI-7115 Project: Apache Hudi Issue Type: Improvement

Re: [I] [SUPPORT] RFC 63 Functional Index Hudi 0.1.0-beta [hudi]

2023-11-16 Thread via GitHub
soumilshah1995 commented on issue #10110: URL: https://github.com/apache/hudi/issues/10110#issuecomment-1815424659 # Code ``` from pyspark.sql import SparkSession from pyspark.sql.types import StructType, StructField, StringType, TimestampType, FloatType from datetime

Re: [PR] [MINOR] Fix default config values if not specified in MultipleSparkJobExecutionStrategy [hudi]

2023-11-16 Thread via GitHub
nsivabalan commented on code in PR #9625: URL: https://github.com/apache/hudi/pull/9625#discussion_r1396445022 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java: ## @@ -116,7 +116,7 @@ public

Re: [I] Unable to alter column name for a Hudi table in AWS [hudi]

2023-11-16 Thread via GitHub
soumilshah1995 commented on issue #9780: URL: https://github.com/apache/hudi/issues/9780#issuecomment-1815417347 Found the code. Here it is: https://github.com/soumilshah1995/code-snippets/blob/main/schema_evol_lab.ipynb Here is a video guide

Re: [I] Unable to alter column name for a Hudi table in AWS [hudi]

2023-11-16 Thread via GitHub
soumilshah1995 commented on issue #9780: URL: https://github.com/apache/hudi/issues/9780#issuecomment-1815415276 Let me search my code snippets; I know I have done this on AWS. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [HUDI-6958] Simplify Out Of Box Schema Evolution Functionality - DOCS [hudi]

2023-11-16 Thread via GitHub
lokesh-lingarajan-0310 commented on code in PR #9881: URL: https://github.com/apache/hudi/pull/9881#discussion_r1396339333 ## website/docs/schema_evolution.md: ## @@ -22,21 +22,36 @@ the previous schema (e.g., renaming a column). Furthermore, the evolved schema is queryable
