Re: [PR] [HUDI-7215] Delete NewHoodieParquetFileFormat [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10304: URL: https://github.com/apache/hudi/pull/10304#issuecomment-1852320625 ## CI report: * d858eaac14b3de45d4066165622738d91ff603fe Azure:

Re: [PR] [HUDI-7215] Delete NewHoodieParquetFileFormat [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10304: URL: https://github.com/apache/hudi/pull/10304#issuecomment-1852305331 ## CI report: * d858eaac14b3de45d4066165622738d91ff603fe Azure:

Re: [I] [SUPPORT] hoodie only support org.apache.spark.serializer.KryoSerializer as spark.serializer [hudi]

2023-12-12 Thread via GitHub
young138120 commented on issue #10320: URL: https://github.com/apache/hudi/issues/10320#issuecomment-1852267384 I have configured the value of this parameter spark.serializer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[I] [SUPPORT] [hudi]

2023-12-12 Thread via GitHub
young138120 opened a new issue, #10320: URL: https://github.com/apache/hudi/issues/10320 **Describe the problem you faced** I run spark job to write data to hudi, and init spark session like this:

Re: [PR] Incoming batch schema is not compatible with the table's one #9980 [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10308: URL: https://github.com/apache/hudi/pull/10308#issuecomment-1852079550 ## CI report: * 737e09fc37912e88f640393b11357cb8b27a29c5 Azure:

[I] [SUPPORT] Reuse table configuration between Spark Writes and HoodieStreamer [hudi]

2023-12-12 Thread via GitHub
baunz opened a new issue, #10319: URL: https://github.com/apache/hudi/issues/10319 **Describe the problem you faced** We are bootstrapping a MOR table with a spark job using bulkinsert, and periodically upsert data afterwards with HoodieStreamer. Currently, it is not clear to

Re: [PR] Incoming batch schema is not compatible with the table's one #9980 [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10308: URL: https://github.com/apache/hudi/pull/10308#issuecomment-1852064937 ## CI report: * 737e09fc37912e88f640393b11357cb8b27a29c5 Azure:

Re: [PR] [HUDI-7225] Correcting spelling errors or annotations with non-standa… [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10317: URL: https://github.com/apache/hudi/pull/10317#issuecomment-1852047364 ## CI report: * d17847ad9ae0724c7e93fc3a8423ba069326541a Azure:

(hudi) branch asf-site updated: added link and command (#10293)

2023-12-12 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new d283192def4 added link and command

Re: [PR] [MINOR][DOCS] Updates to Glue Catalog Sync page [hudi]

2023-12-12 Thread via GitHub
bhasudha merged PR #10293: URL: https://github.com/apache/hudi/pull/10293 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [PR] [MINOR][DOCS] Updates to Glue Catalog Sync page [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10293: URL: https://github.com/apache/hudi/pull/10293#discussion_r1423999115 ## website/docs/syncing_aws_glue_data_catalog.md: ## @@ -16,3 +16,18 @@ be passed along. ```shell --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on PR #10294: URL: https://github.com/apache/hudi/pull/10294#issuecomment-1852036173 minor nit: Please avoid intellij suggested or whitespace changes going forward. Since this can be different across individual person's settings. And gets in the way of review :) --

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10294: URL: https://github.com/apache/hudi/pull/10294#discussion_r1423990484 ## website/docs/sql_queries.md: ## @@ -362,37 +349,37 @@ Following tables show whether a given query is supported on specific query engin ### Copy-On-Write tables

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10294: URL: https://github.com/apache/hudi/pull/10294#discussion_r1423989899 ## website/docs/sql_queries.md: ## @@ -362,37 +349,37 @@ Following tables show whether a given query is supported on specific query engin ### Copy-On-Write tables

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10294: URL: https://github.com/apache/hudi/pull/10294#discussion_r1423987814 ## website/docs/sql_queries.md: ## @@ -146,15 +142,11 @@ There are 3 use cases for incremental query: the interval is a closed one: both start commit and end

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10294: URL: https://github.com/apache/hudi/pull/10294#discussion_r1423989370 ## website/docs/sql_queries.md: ## @@ -337,10 +326,8 @@ will be supported in the future. ## StarRocks -Copy on Write tables in Apache Hudi 0.10.0 and above can be

Re: [PR] [MINOR] [DOCS] changes to redshift & starrocks compat matrix [hudi]

2023-12-12 Thread via GitHub
bhasudha commented on code in PR #10294: URL: https://github.com/apache/hudi/pull/10294#discussion_r1423987370 ## website/docs/sql_queries.md: ## @@ -98,44 +98,40 @@ Once the Flink Hudi tables have been registered to the Flink catalog, they can b relying on the custom Hudi

Re: [I] [SUPPORT] Data loss in MOR table after clustering partition [hudi]

2023-12-12 Thread via GitHub
ad1happy2go commented on issue #9977: URL: https://github.com/apache/hudi/issues/9977#issuecomment-1852010551 Yes, They may be related. We missed to back port to 0.12.X minor releases. Does your original dataset also have more than 100 columns? -- This is an automated message from the

Re: [PR] Incoming batch schema is not compatible with the table's one #9980 [hudi]

2023-12-12 Thread via GitHub
njalan commented on code in PR #10308: URL: https://github.com/apache/hudi/pull/10308#discussion_r1423964727 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -1092,6 +1092,10 @@ class HoodieSparkSqlWriterInternal {

Re: [PR] [HUDI-7131] Fixing schema used to read base file in HoodieMergedReadHandle [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10318: URL: https://github.com/apache/hudi/pull/10318#issuecomment-1851946870 ## CI report: * 32e63551638725305e5b3318816aa4a469399796 Azure:

Re: [PR] [MINOR] NPE fix while adding projection field & added its test cases [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10313: URL: https://github.com/apache/hudi/pull/10313#issuecomment-1851861748 ## CI report: * 5273d8cc9ed428d2ac6896f52664618ed02c98a1 Azure:

Re: [PR] [HUDI-7132] Data may be lost for flink task failure [hudi]

2023-12-12 Thread via GitHub
voonhous commented on PR #10312: URL: https://github.com/apache/hudi/pull/10312#issuecomment-1851820623 @danny0405 @cuibo01 Read through the JIRA ticket. While I understand how the state of the TM and JM can cause the potential data loss, I am still not very sure how the TM and JM reaches

Re: [PR] [HUDI-7131] Fixing schema used to read base file in HoodieMergedReadHandle [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10318: URL: https://github.com/apache/hudi/pull/10318#issuecomment-1851799300 ## CI report: * 32e63551638725305e5b3318816aa4a469399796 Azure:

Re: [PR] [HUDI-7131] Fixing schema used to read base file in HoodieMergedReadHandle [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10318: URL: https://github.com/apache/hudi/pull/10318#issuecomment-1851786461 ## CI report: * 32e63551638725305e5b3318816aa4a469399796 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [I] [SUPPORT] How to skip some partitions in a table when readStreaming in Spark at the init stage [hudi]

2023-12-12 Thread via GitHub
danny0405 commented on issue #10315: URL: https://github.com/apache/hudi/issues/10315#issuecomment-1851775346 > but I want a config that can tell source that only reads the partition that in my configs so I do not need to use filter That does not follow the common intuition. --

Re: [PR] [HUDI-7225] Correcting spelling errors or annotations with non-standa… [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10317: URL: https://github.com/apache/hudi/pull/10317#issuecomment-1851773322 ## CI report: * d17847ad9ae0724c7e93fc3a8423ba069326541a Azure:

Re: [PR] [MINOR] NPE fix while adding projection field & added its test cases [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10313: URL: https://github.com/apache/hudi/pull/10313#issuecomment-1851773223 ## CI report: * b9ebe136bdcafc4d5bbd407691f2420ccab45adc Azure:

Re: [PR] Incoming batch schema is not compatible with the table's one #9980 [hudi]

2023-12-12 Thread via GitHub
danny0405 commented on code in PR #10308: URL: https://github.com/apache/hudi/pull/10308#discussion_r1423799754 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -1092,6 +1092,10 @@ class HoodieSparkSqlWriterInternal {

[jira] [Closed] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7132. Fix Version/s: 0.14.1 Resolution: Fixed Fixed via master branch:

Re: [PR] [HUDI-7132] Data may be lost for flink task failure [hudi]

2023-12-12 Thread via GitHub
danny0405 merged PR #10312: URL: https://github.com/apache/hudi/pull/10312 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

(hudi) branch master updated (cacbb82254c -> 17b62a2c0f4)

2023-12-12 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from cacbb82254c [HUDI-6658] Inject filters for incremental query (#10225) add 17b62a2c0f4 [HUDI-7132] Data may be

[jira] [Updated] (HUDI-7131) The requested schema is not compatible with the file schema

2023-12-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7131: - Labels: core merge pull-request-available spark (was: core merge spark) > The requested schema

[PR] [HUDI-7131] Fixing schema used to read base file in HoodieMergedReadHandle [hudi]

2023-12-12 Thread via GitHub
nsivabalan opened a new pull request, #10318: URL: https://github.com/apache/hudi/pull/10318 ### Change Logs Fixing schema used to read base file in HoodieMergedReadHandle ### Impact MIT works for global index use-cases. ### Risk level (write none, low medium or

Re: [PR] [HUDI-7225] Correcting spelling errors or annotations with non-standa… [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10317: URL: https://github.com/apache/hudi/pull/10317#issuecomment-1851697428 ## CI report: * d17847ad9ae0724c7e93fc3a8423ba069326541a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Updated] (HUDI-7225) Correcting spelling errors or annotations with non-standard spelling

2023-12-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7225: - Labels: pull-request-available (was: ) > Correcting spelling errors or annotations with

[PR] [HUDI-7225] Correcting spelling errors or annotations with non-standa… [hudi]

2023-12-12 Thread via GitHub
LeshracTheMalicious opened a new pull request, #10317: URL: https://github.com/apache/hudi/pull/10317 …rd spelling ### Change Logs Modify some spelling errors or non-standard spelling comments pointed out by Typo ### Impact Theoretically no impact ### Risk

[jira] [Updated] (HUDI-7225) Correcting spelling errors or annotations with non-standard spelling

2023-12-12 Thread mazhengxuan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mazhengxuan updated HUDI-7225: -- Description: Modify some spelling errors or non-standard spelling comments pointed out by Typo (was:

Re: [PR] [HUDI-7224] HoodieSparkSqlWriter metasync success or not show details messages log [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10314: URL: https://github.com/apache/hudi/pull/10314#issuecomment-1851635963 ## CI report: * 88b9f8d9518f5afd376479ba9c87a8dd30170ffc Azure:

Re: [PR] [HUDI-7132] Data may be lost for flink task failure [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10312: URL: https://github.com/apache/hudi/pull/10312#issuecomment-1851635788 ## CI report: * 5c971e1a0cafb635ad9cfed0f452751314bdb21c Azure:

Re: [PR] Incoming batch schema is not compatible with the table's one #9980 [hudi]

2023-12-12 Thread via GitHub
hudi-bot commented on PR #10308: URL: https://github.com/apache/hudi/pull/10308#issuecomment-1851635617 ## CI report: * 737e09fc37912e88f640393b11357cb8b27a29c5 Azure:

[jira] [Updated] (HUDI-7225) Correcting spelling errors or annotations with non-standard spelling

2023-12-12 Thread mazhengxuan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mazhengxuan updated HUDI-7225: -- Summary: Correcting spelling errors or annotations with non-standard spelling (was: Correcting

Re: [PR] [MINOR] NPE fix while adding projection field & added its test cases [hudi]

2023-12-12 Thread via GitHub
prathit06 commented on code in PR #10313: URL: https://github.com/apache/hudi/pull/10313#discussion_r1423664817 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java: ## @@ -86,7 +86,7 @@ private static Configuration

Re: [I] [SUPPORT] how to config hudi table TTL in S3? The table_meta can be separated into a directory? [hudi]

2023-12-12 Thread via GitHub
zyclove commented on issue #10316: URL: https://github.com/apache/hudi/issues/10316#issuecomment-1851604695 > @zyclove Dont think if there is a way to point the different directory outside table directory OR having any such TTL configuration. Why can't we consider storing metadata

Re: [PR] [MINOR] NPE fix while adding projection field & added its test cases [hudi]

2023-12-12 Thread via GitHub
prathit06 commented on code in PR #10313: URL: https://github.com/apache/hudi/pull/10313#discussion_r1423664817 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java: ## @@ -86,7 +86,7 @@ private static Configuration

Re: [PR] [HUDI-7132] Data may be lost for flink task failure [hudi]

2023-12-12 Thread via GitHub
cuibo01 commented on PR #10312: URL: https://github.com/apache/hudi/pull/10312#issuecomment-1851568190 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[jira] [Assigned] (HUDI-7170) Implement HFile reader independent of HBase

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui reassigned HUDI-7170: Assignee: Bo Cui (was: Ethan Guo) > Implement HFile reader independent of HBase >

Re: [PR] [MINOR] NPE fix while adding projection field & added its test cases [hudi]

2023-12-12 Thread via GitHub
prathit06 commented on code in PR #10313: URL: https://github.com/apache/hudi/pull/10313#discussion_r1423664817 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java: ## @@ -86,7 +86,7 @@ private static Configuration

[jira] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132 ] Bo Cui deleted comment on HUDI-7132: -- was (Author: bo cui): From the code, this pr ([https://github.com/apache/hudi/pull/9867/files]) fixes the logic during initialization, but it doesn't fix the

[jira] [Updated] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7132: - Attachment: (was: screenshot-1.png) > Data may be lost in Flink checkpoint >

[jira] [Commented] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795645#comment-17795645 ] Bo Cui commented on HUDI-7132: -- From the code, this pr (https://github.com/apache/hudi/pull/9867/files)

[jira] [Updated] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7132: - Attachment: screenshot-1.png > Data may be lost in Flink checkpoint > > >

[jira] [Comment Edited] (HUDI-7132) Data may be lost in Flink checkpoint

2023-12-12 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795645#comment-17795645 ] Bo Cui edited comment on HUDI-7132 at 12/12/23 8:50 AM: From the code, this pr

Re: [PR] [HUDI-6979][RFC-76] support event time based compaction strategy [hudi]

2023-12-12 Thread via GitHub
waitingF commented on code in PR #10266: URL: https://github.com/apache/hudi/pull/10266#discussion_r1423651556 ## rfc/rfc-76/rfc-76.md: ## @@ -0,0 +1,238 @@ + +# RFC-[74]: [support EventTimeBasedCompactionStrategy] + +## Proposers + +- @waitingF + +## Approvers + - @ + - @ +

[jira] [Created] (HUDI-7225) Correcting comments with incorrect spelling

2023-12-12 Thread mazhengxuan (Jira)
mazhengxuan created HUDI-7225: - Summary: Correcting comments with incorrect spelling Key: HUDI-7225 URL: https://issues.apache.org/jira/browse/HUDI-7225 Project: Apache Hudi Issue Type:

Re: [I] [SUPPORT] how to config hudi table TTL in S3? The table_meta can be separated into a directory? [hudi]

2023-12-12 Thread via GitHub
ad1happy2go commented on issue #10316: URL: https://github.com/apache/hudi/issues/10316#issuecomment-1851532435 @zyclove Dont think if there is a way to point the different directory outside table directory OR having any such TTL configuration. -- This is an automated message from the
