Re: [PR] [HUDI-7010] Build clustering group reduces redundant traversals [hudi]

2023-10-30 Thread via GitHub
stream2000 commented on code in PR #9957: URL: https://github.com/apache/hudi/pull/9957#discussion_r1377126517 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/PartitionAwareClusteringPlanStrategy.java: ## @@ -79,6 +79,11 @@ protected

Re: [PR] [HUDI-6993] Support Flink 1.18 [hudi]

2023-10-30 Thread via GitHub
PrabhuJoseph commented on code in PR #9949: URL: https://github.com/apache/hudi/pull/9949#discussion_r1377125686 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java: ## @@ -245,7 +246,7 @@ public void createDatabase( Map

Re: [PR] [HUDI-7011] a metric to indicate whether rollback has occurred in final compaction state [hudi]

2023-10-30 Thread via GitHub
LXin96 commented on code in PR #9956: URL: https://github.com/apache/hudi/pull/9956#discussion_r1377119005 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkCompactionMetrics.java: ## @@ -103,4 +110,16 @@ public void endCompaction() { this.compa

Re: [PR] [DOCS] Add tags for video blogs [hudi]

2023-10-30 Thread via GitHub
yihua commented on code in PR #9939: URL: https://github.com/apache/hudi/pull/9939#discussion_r1377117759 ## website/videoBlog/2022-12-14-Build_Slowly_Changing_Dimensions_Type_2_SCD2_with_Apache_Spark_and_Apache_Hudi_Hands_on_Labs.md: ## @@ -8,5 +8,8 @@ image: /assets/images/hud

Re: [PR] [HUDI-5210] Implement functional indexes [hudi]

2023-10-30 Thread via GitHub
yihua commented on code in PR #9872: URL: https://github.com/apache/hudi/pull/9872#discussion_r1377108821 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java: ## @@ -205,17 +216,27 @@ private void initMetadataReader() {

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory. !image-2023-10-31-14-42-57-424.png! was: the

Re: [I] [SUPPORT]Unable to write to erasure coded HDFS directory on EMR [hudi]

2023-10-30 Thread via GitHub
sathyanarayananc commented on issue #9947: URL: https://github.com/apache/hudi/issues/9947#issuecomment-1786542869 @ad1happy2go Adding the hadoop-hdfs jar worked. thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Attachment: (was: image-2023-10-31-14-37-10-118.png) > The BootstrapOperator reduces the memory. > ---

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory.   was: the stream job and hoodieTable never

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory. !image-2023-10-31-14-37-10-118.png! was: the

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Attachment: image-2023-10-31-14-37-10-118.png > The BootstrapOperator reduces the memory. > --

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory.   was: the stream job and hoodieTable never

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory. !image-2023-10-31-14-29-55-715.png! was: the

Re: [PR] [HUDI-6801] Implement merging partial updates from log files for MOR tables [hudi]

2023-10-30 Thread via GitHub
yihua merged PR #9883: URL: https://github.com/apache/hudi/pull/9883 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

(hudi) branch master updated: [HUDI-6801] Implement merging partial updates from log files for MOR tables (#9883)

2023-10-30 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 26ecb287684 [HUDI-6801] Implement merging partial u

[jira] [Assigned] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui reassigned HUDI-7012: Assignee: Bo Cui > The BootstrapOperator reduces the memory. > - >

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Summary: The BootstrapOperator reduces the memory. (was: The BootstrapOperator reduces the memory usage.) > The

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory usage.

2023-10-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7012: - Labels: pull-request-available (was: ) > The BootstrapOperator reduces the memory usage. > --

[PR] [HUDI-7012]The BootstrapOperator reduces the memory usage. [hudi]

2023-10-30 Thread via GitHub
cuibo01 opened a new pull request, #9959: URL: https://github.com/apache/hudi/pull/9959 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

Re: [PR] [HUDI-5210] Implement functional indexes [hudi]

2023-10-30 Thread via GitHub
yihua commented on code in PR #9872: URL: https://github.com/apache/hudi/pull/9872#discussion_r1377086649 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieFunctionalIndexConfig.java: ## @@ -0,0 +1,319 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] [HUDI-6989] Stop handling more data if task is aborted & clean partial files if possible in task side [hudi]

2023-10-30 Thread via GitHub
boneanxs commented on PR #9922: URL: https://github.com/apache/hudi/pull/9922#issuecomment-1786515927 > > Thanks for the fix. From a high level, I kind of think we should avoid relying on Spark mechanisms to add any rollback/cleaning improvement here; it's hacky to maintain and it is no

[jira] [Assigned] (HUDI-7009) Filter out null value records from avro kafka source

2023-10-30 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-7009: - Assignee: sivabalan narayanan > Filter out null value records from avro kafka sou

[jira] [Updated] (HUDI-7009) Filter out null value records from avro kafka source

2023-10-30 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-7009: -- Fix Version/s: 0.14.1 > Filter out null value records from avro kafka source > -

Re: [PR] [HUDI-6989] Stop handling more data if task is aborted & clean partial files if possible in task side [hudi]

2023-10-30 Thread via GitHub
boneanxs commented on code in PR #9922: URL: https://github.com/apache/hudi/pull/9922#discussion_r1377093308 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowCreateHandle.java: ## @@ -272,11 +272,12 @@ private static Path makeNewPath(FileSys

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory usage.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory.   was: the stream job and hoodieTable never

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory usage.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory. !image-2023-10-31-14-05-18-391.png|width=884,he

Re: [PR] [HUDI-6993] Support Flink 1.18 [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9949: URL: https://github.com/apache/hudi/pull/9949#discussion_r1377089763 ## .github/workflows/bot.yml: ## @@ -283,6 +284,11 @@ jobs: strategy: matrix: include: + - flinkProfile: 'flink1.18' +sparkProf

[jira] [Updated] (HUDI-7012) The BootstrapOperator reduces the memory usage.

2023-10-30 Thread Bo Cui (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HUDI-7012: - Description: the stream job and hoodieTable never release memory. !image-2023-10-31-14-00-41-402.png! was: the

[jira] [Updated] (HUDI-7005) Flink SQL Queries on Hudi Table fail when using the hudi-aws-bundle jar

2023-10-30 Thread Prabhu Joseph (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated HUDI-7005: Description: Flink SQL Queries on Hudi Table fail when using the hudi-aws-bundle jar. hudi-aws-bund

[jira] [Created] (HUDI-7012) The BootstrapOperator reduces the memory usage.

2023-10-30 Thread Bo Cui (Jira)
Bo Cui created HUDI-7012: Summary: The BootstrapOperator reduces the memory usage. Key: HUDI-7012 URL: https://issues.apache.org/jira/browse/HUDI-7012 Project: Apache Hudi Issue Type: Improvement

Re: [PR] [HUDI-6993] Support Flink 1.18 [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9949: URL: https://github.com/apache/hudi/pull/9949#discussion_r1377088315 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java: ## @@ -245,7 +246,7 @@ public void createDatabase( Map pro

Re: [I] [SUPPORT]: Facing issue while upserting multiple tables together with Hudi 0.14 [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on issue #9958: URL: https://github.com/apache/hudi/issues/9958#issuecomment-1786492605 Should be fixed in: https://github.com/apache/hudi/pull/9786

Re: [I] [SUPPORT]: Facing issue while upserting multiple tables together with Hudi 0.14 [hudi]

2023-10-30 Thread via GitHub
danny0405 closed issue #9958: [SUPPORT]: Facing issue while upserting multiple tables together with Hudi 0.14 URL: https://github.com/apache/hudi/issues/9958

Re: [PR] [HUDI-7011] a metric to indicate whether rollback has occurred in final compaction state [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9956: URL: https://github.com/apache/hudi/pull/9956#discussion_r1377082818 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkCompactionMetrics.java: ## @@ -103,4 +110,16 @@ public void endCompaction() { this.co

[I] [SUPPORT]: Facing issue while upserting multiple tables together with Hudi 0.14 [hudi]

2023-10-30 Thread via GitHub
ketkidev opened a new issue, #9958: URL: https://github.com/apache/hudi/issues/9958 **Describe the problem you faced** In **Hudi 0.14**, when running our application where more than one table is processed and upserted at a time using multithreading, we get this error once in a while

Re: [PR] [Docs] Update videos tags+thumbnails page 4 [hudi]

2023-10-30 Thread via GitHub
bhasudha commented on PR #9953: URL: https://github.com/apache/hudi/pull/9953#issuecomment-1786457273 @ckonehouse Please change the PR to merge into the `asf-site` branch of Hudi, not `master`. Then you won't see 5K-plus files changed in the PR.

Re: [PR] [HUDI-6946] Data Duplicates with range pruning while using hoodie.bloom.index.use.metadata [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9886: URL: https://github.com/apache/hudi/pull/9886#discussion_r1377054672 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java: ## @@ -212,7 +212,7 @@ protected List> loadColumnRangesFromMetaIndex(

[jira] [Updated] (HUDI-6998) Fix drop table failure when load table as spark v2 table whose path is delete

2023-10-30 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6998: - Fix Version/s: 1.0.0 0.14.1 > Fix drop table failure when load table as spark v2 table

[jira] [Closed] (HUDI-6998) Fix drop table failure when load table as spark v2 table whose path is delete

2023-10-30 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6998. Resolution: Fixed Fixed via master branch: c7320f78407a68c009954b74faade03dd4fb494c > Fix drop table failur

(hudi) branch master updated: [HUDI-6998] Fix drop table failure when load table as spark v2 table whose path is delete (#9932)

2023-10-30 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new c7320f78407 [HUDI-6998] Fix drop table failure

Re: [PR] [HUDI-6998] Fix drop table failure when load table as spark v2 table whose path is delete [hudi]

2023-10-30 Thread via GitHub
danny0405 merged PR #9932: URL: https://github.com/apache/hudi/pull/9932

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9640: URL: https://github.com/apache/hudi/pull/9640#discussion_r1377052147 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfigHolder.java: ## @@ -72,7 +72,7 @@ public class HiveSyncConfigHolder { .withDocumentat

(hudi) branch asf-site updated: [DOCS] Add tags and thumbnails to video guides (#9950)

2023-10-30 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 2a4ea93c18b [DOCS] Add tags and thumbnails

Re: [PR] [DOCS] Add tags and thumbnails to video guides [hudi]

2023-10-30 Thread via GitHub
danny0405 merged PR #9950: URL: https://github.com/apache/hudi/pull/9950

Re: [I] [SUPPORT] Flink CDC to HUDI cannot handle rowKind correctly [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on issue #9940: URL: https://github.com/apache/hudi/issues/9940#issuecomment-1786436697 > I just need to set the primary key to be the same Just sending updates with the correct RowKind `-D` would trigger the deletion.

Re: [PR] [WIP][HUDI-7001] ComplexAvroKeyGenerator should represent single record key as the value string without composing the key field name [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9936: URL: https://github.com/apache/hudi/pull/9936#discussion_r1377043304 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/keygen/ComplexAvroKeyGenerator.java: ## @@ -41,6 +41,9 @@ public ComplexAvroKeyGenerator(TypedProperties

Re: [PR] [HUDI-6991] Fix hoodie.parquet.max.file.size conf reset error [hudi]

2023-10-30 Thread via GitHub
ksmou commented on code in PR #9924: URL: https://github.com/apache/hudi/pull/9924#discussion_r1377022399 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/SparkSortAndSizeExecutionStrategy.java: ## @@ -68,7 +68,7 @@ public HoodieData

Re: [PR] [HUDI-6990] Configurable clustering task parallelism [hudi]

2023-10-30 Thread via GitHub
ksmou commented on code in PR #9925: URL: https://github.com/apache/hudi/pull/9925#discussion_r1377021854 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java: ## @@ -161,6 +161,13 @@ public class HoodieClusteringConfig extends Hoodi

[jira] [Updated] (HUDI-7010) Build clustering group reduces redundant traversals

2023-10-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7010: - Labels: pull-request-available (was: ) > Build clustering group reduces redundant traversals > --

Re: [I] [SUPPORT] Flink CDC to HUDI cannot handle rowKind correctly [hudi]

2023-10-30 Thread via GitHub
zdl1 commented on issue #9940: URL: https://github.com/apache/hudi/issues/9940#issuecomment-1786397985 > I think it is because of the sink materializer; you need to remove it first. Thanks for your reply! Now I can delete the data correctly, because I changed the datasource, before w

[PR] [HUDI-7010] Build clustering group reduces redundant traversals [hudi]

2023-10-30 Thread via GitHub
ksmou opened a new pull request, #9957: URL: https://github.com/apache/hudi/pull/9957 ### Change Logs We build clustering groups and cap the final clustering plan at `getWriteConfig().getClusteringMaxNumGroups()` groups, so there is no need to traverse all FileSlices in `buildClustering

[jira] [Updated] (HUDI-7011) a metric to indicate whether rollback has occurred in final compaction state

2023-10-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7011: - Labels: pull-request-available (was: ) > a metric to indicate whether rollback has occurred in f

[PR] [HUDI-7011] a metric to indicate whether rollback has occurred in final compaction state [hudi]

2023-10-30 Thread via GitHub
LXin96 opened a new pull request, #9956: URL: https://github.com/apache/hudi/pull/9956 Currently, when a Flink job starts async compaction on a MOR table, the metrics in org.apache.hudi.metrics.FlinkCompactionMetrics are updated, including pendingCompactionCount, compactionDelay, compactionCos

[jira] [Created] (HUDI-7011) a metric to indicate whether rollback has occurred in final compaction state

2023-10-30 Thread jack Lei (Jira)
jack Lei created HUDI-7011: -- Summary: a metric to indicate whether rollback has occurred in final compaction state Key: HUDI-7011 URL: https://issues.apache.org/jira/browse/HUDI-7011 Project: Apache Hudi

[jira] [Created] (HUDI-7010) Build clustering group reduces redundant traversals

2023-10-30 Thread kwang (Jira)
kwang created HUDI-7010: --- Summary: Build clustering group reduces redundant traversals Key: HUDI-7010 URL: https://issues.apache.org/jira/browse/HUDI-7010 Project: Apache Hudi Issue Type: Improvement

Re: [PR] [HUDI-7009] Filtering out null values from avro kafka source [hudi]

2023-10-30 Thread via GitHub
nsivabalan commented on code in PR #9955: URL: https://github.com/apache/hudi/pull/9955#discussion_r1376989168 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java: ## @@ -103,14 +103,14 @@ JavaRDD toRDD(OffsetRange[] offsetRanges) { //Do

Re: [I] [SUPPORT] RowDataToAvroConverters does not support data in flink timestamp_ltz (timestamp_with_local_time_zone) format. [hudi]

2023-10-30 Thread via GitHub
enterwhat commented on issue #4698: URL: https://github.com/apache/hudi/issues/4698#issuecomment-1786342517 > Are you interested in contributing that part? Yes, I have fixed it in my code by altering org.apache.flink.formats.avro.typeutils.AvroSchemaConverter org.apache.flink.forma

Re: [I] [SUPPORT] Flink CDC to HUDI cannot handle rowKind correctly [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on issue #9940: URL: https://github.com/apache/hudi/issues/9940#issuecomment-1786336273 I think it is because of the sink materializer; you need to remove it first.

Re: [I] [SUPPORT] RowDataToAvroConverters does not support data in flink timestamp_ltz (timestamp_with_local_time_zone) format. [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on issue #4698: URL: https://github.com/apache/hudi/issues/4698#issuecomment-1786335672 Are you interested in contributing that part?

Re: [I] [SUPPORT] RowDataToAvroConverters does not support data in flink timestamp_ltz (timestamp_with_local_time_zone) format. [hudi]

2023-10-30 Thread via GitHub
enterwhat commented on issue #4698: URL: https://github.com/apache/hudi/issues/4698#issuecomment-1786328858 > Yeah, timestamp_LTZ is not supported yet for Flink; I have filed a JIRA to support this feature: https://issues.apache.org/jira/browse/HUDI-3388 It's not solved by 0.14.0,

Re: [PR] [HUDI-6495][RFC-66] Non-blocking Concurrency Control [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on PR #7907: URL: https://github.com/apache/hudi/pull/7907#issuecomment-1786320842 > > > For this feature, how do we handle failed write commits? Will they be rolled back by other writing tasks? > > > > > > There is no failure thrown actively because the confl

Re: [PR] [HUDI-7009] Filtering out null values from avro kafka source [hudi]

2023-10-30 Thread via GitHub
rmahindra123 commented on code in PR #9955: URL: https://github.com/apache/hudi/pull/9955#discussion_r1376970845 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java: ## @@ -103,14 +103,14 @@ JavaRDD toRDD(OffsetRange[] offsetRanges) { //

Re: [PR] [HUDI-6997] A new WriteConcurrencyMode type for non-blocking concurrency control [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on PR #9933: URL: https://github.com/apache/hudi/pull/9933#issuecomment-1786273726 Tests passed: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=20566&view=results

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-30 Thread via GitHub
danny0405 commented on code in PR #9871: URL: https://github.com/apache/hudi/pull/9871#discussion_r1376941626 ## hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestJavaHoodieBackedMetadata.java: ## @@ -525,7 +523,6 @@ public void testVirtualKeysInBaseFiles() t

[jira] [Updated] (HUDI-7009) Filter out null value records from avro kafka source

2023-10-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7009: - Labels: pull-request-available (was: ) > Filter out null value records from avro kafka source > -

[PR] [HUDI-7009] Filtering out null values from avro kafka source [hudi]

2023-10-30 Thread via GitHub
nsivabalan opened a new pull request, #9955: URL: https://github.com/apache/hudi/pull/9955 ### Change Logs - Filtering out null values from avro kafka source - Tombstone records could have null values and hence filtering them out. ### Impact - Will unblock pipelines ha

[jira] [Created] (HUDI-7009) Filter out null value records from avro kafka source

2023-10-30 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-7009: - Summary: Filter out null value records from avro kafka source Key: HUDI-7009 URL: https://issues.apache.org/jira/browse/HUDI-7009 Project: Apache Hudi

Re: [PR] [DOCS] Add tags for video blogs [hudi]

2023-10-30 Thread via GitHub
bhasudha commented on code in PR #9939: URL: https://github.com/apache/hudi/pull/9939#discussion_r1376934132 ## website/videoBlog/2022-12-14-Build_Slowly_Changing_Dimensions_Type_2_SCD2_with_Apache_Spark_and_Apache_Hudi_Hands_on_Labs.md: ## @@ -8,5 +8,8 @@ image: /assets/images/

Re: [I] [SUPPORT] Facing org.apache.parquet.io.ParquetDecodingException: Failed to read N bytes on Hudi 0.14.0 with offline clustering [hudi]

2023-10-30 Thread via GitHub
loustler commented on issue #9942: URL: https://github.com/apache/hudi/issues/9942#issuecomment-1786217428 @ad1happy2go This table was loaded using HoodieClusteringJob.

Re: [PR] [MINOR] change hive/adb tool not auto create database default [hudi]

2023-10-30 Thread via GitHub
bvaradar commented on code in PR #9640: URL: https://github.com/apache/hudi/pull/9640#discussion_r1376887639 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfigHolder.java: ## @@ -72,7 +72,7 @@ public class HiveSyncConfigHolder { .withDocumentati

(hudi) branch master updated (d85b57e59d7 -> 7b649237f31)

2023-10-30 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from d85b57e59d7 [HUDI-6896] HoodieAvroHFileReader.RecordIterator iteration never terminates (#9789) add 7b649237f31 [

Re: [PR] [HUDI-7000] Fix HoodieActiveTimeline::deleteInstantFileIfExists not show the file path when occur delete not success [hudi]

2023-10-30 Thread via GitHub
bvaradar merged PR #9935: URL: https://github.com/apache/hudi/pull/9935

Re: [PR] The predicate pushdown misses the where filter condition [hudi]

2023-10-30 Thread via GitHub
bvaradar commented on PR #8201: URL: https://github.com/apache/hudi/pull/8201#issuecomment-1786157423 Closing this PR as this does not seem to be an issue. @renshangtao : Please reopen if this is not the case.

Re: [PR] The predicate pushdown misses the where filter condition [hudi]

2023-10-30 Thread via GitHub
bvaradar closed pull request #8201: The predicate pushdown misses the where filter condition URL: https://github.com/apache/hudi/pull/8201

(hudi) branch master updated: [HUDI-6896] HoodieAvroHFileReader.RecordIterator iteration never terminates (#9789)

2023-10-30 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new d85b57e59d7 [HUDI-6896] HoodieAvroHFileReader.Rec

Re: [PR] [HUDI-6896] HoodieAvroHFileReader.RecordIterator iteration never terminates [hudi]

2023-10-30 Thread via GitHub
bvaradar merged PR #9789: URL: https://github.com/apache/hudi/pull/9789

Re: [PR] [HUDI-6896] HoodieAvroHFileReader.RecordIterator iteration never terminates [hudi]

2023-10-30 Thread via GitHub
bvaradar commented on code in PR #9789: URL: https://github.com/apache/hudi/pull/9789#discussion_r1376867703 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroHFileReader.java: ## @@ -684,6 +685,10 @@ private static class RecordIterator implements ClosableIterat

[jira] [Assigned] (HUDI-7008) Fixing usage of Kafka Avro deserializer w/ debezium sources

2023-10-30 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-7008: Assignee: sivabalan narayanan > Fixing usage of Kafka Avro deserializer w/ debezium

[jira] [Created] (HUDI-7008) Fixing usage of Kafka Avro deserializer w/ debezium sources

2023-10-30 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-7008: Summary: Fixing usage of Kafka Avro deserializer w/ debezium sources Key: HUDI-7008 URL: https://issues.apache.org/jira/browse/HUDI-7008 Project: Apache Hudi

Re: [PR] [HUDI-6297] Fixed issue in consuming transactional topic [hudi]

2023-10-30 Thread via GitHub
bvaradar commented on PR #9059: URL: https://github.com/apache/hudi/pull/9059#issuecomment-1786090555 @ad1happy2go : Can you introduce the change in KafkaSource (with a config) which can wrap the RDD and handle the behavior instead of making the change in DeltaSync ?

Re: [PR] [HUDI-5194] Fix schema files cleaning by FileBasedInternalSchemaStorageManager [hudi]

2023-10-30 Thread via GitHub
bvaradar commented on PR #7183: URL: https://github.com/apache/hudi/pull/7183#issuecomment-1786062899 @xiarixiaoyao : Is this still an issue? If so, can you please update this PR based on changes from https://github.com/apache/hudi/pull/6358 and rebase? I will review and land this diff.

[PR] Fix incr errors new reader [hudi]

2023-10-30 Thread via GitHub
jonvex opened a new pull request, #9954: URL: https://github.com/apache/hudi/pull/9954 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

Re: [I] [SUPPORT] How to deal with hard deletes in one pass [hudi]

2023-10-30 Thread via GitHub
Maks3300 commented on issue #5094: URL: https://github.com/apache/hudi/issues/5094#issuecomment-1785977537 I had the same problem described by rguillome. We do not want to delete the soft-deleted records in a separate pass. @nsivabalan as explained by you, we have to separately delete thi

Re: [PR] [DOCS] Add tags for video blogs [hudi]

2023-10-30 Thread via GitHub
yihua commented on code in PR #9939: URL: https://github.com/apache/hudi/pull/9939#discussion_r1376695351 ## website/videoBlog/2022-12-14-Build_Slowly_Changing_Dimensions_Type_2_SCD2_with_Apache_Spark_and_Apache_Hudi_Hands_on_Labs.md: ## @@ -8,5 +8,8 @@ image: /assets/images/hud

Re: [PR] [HUDI-6790] Support incremental/CDC queries using HadoopFsRelation [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9888: URL: https://github.com/apache/hudi/pull/9888#issuecomment-1785820744 ## CI report: * 2501f4ca40591cd9b2d94b5c4daa360aa6454cef UNKNOWN * b25d25b4f3543bddfd4a138c1031d7f608e734ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9871: URL: https://github.com/apache/hudi/pull/9871#issuecomment-1785820498 ## CI report: * f0a1258092388ff7d2ac67b8de7180be25a2137e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2040

Re: [I] [SUPPORT] Hudi MERGE INTO on Glue fails when using functions such as (filter, zip_with) on array of structs [hudi]

2023-10-30 Thread via GitHub
rita-ihnatsyeva commented on issue #9838: URL: https://github.com/apache/hudi/issues/9838#issuecomment-1785794247 Well, I changed the calculation logic. But from what I've tried, it seems that a null pointer was the cause, so it probably failed because of the arriving array values, somewhere with null

Re: [I] [SUPPORT]Unable to write to erasure coded HDFS directory on EMR [hudi]

2023-10-30 Thread via GitHub
sathyanarayananc commented on issue #9947: URL: https://github.com/apache/hudi/issues/9947#issuecomment-1785768804 @ad1happy2go newbie on the block, so I may be wrong here. I don't think I can build my own jar since I am tied to EMR. -- This is an automated message from the Apache Git Se

Re: [I] [SUPPORT] "OutOfMemoryError: Requested array size exceeds VM limit" on data ingestion to MOR table [hudi]

2023-10-30 Thread via GitHub
mzheng-plaid commented on issue #9934: URL: https://github.com/apache/hudi/issues/9934#issuecomment-1785762225 I updated `.hoodie` directory to be sorted in the description I updated the description, but I now think the issue is in [https://github.com/apache/hudi/blob/release-0.12.2/h

Re: [PR] [HUDI-7004] Add support of snapshotLoadQuerySplitter in s3/gcs sources [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9943: URL: https://github.com/apache/hudi/pull/9943#issuecomment-1785755367 ## CI report: * cf2a16031d1e2048d3ae75cc9ecf35ae409eac17 UNKNOWN * 0d581beb02b4918d7418e00f05029b9e84a2da40 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [PR] [HUDI-2461] Support out of order commits in MDT with completion time view [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9871: URL: https://github.com/apache/hudi/pull/9871#issuecomment-1785754612 ## CI report: * f0a1258092388ff7d2ac67b8de7180be25a2137e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2040

Re: [PR] [HUDI-7005] Fix hudi-aws-bundle relocation issue with avro [hudi]

2023-10-30 Thread via GitHub
PrabhuJoseph commented on PR #9946: URL: https://github.com/apache/hudi/pull/9946#issuecomment-1785736828 @umehrot2 has reviewed the patch and found a problem in relocating avro classes in the hudi-aws-bundle. It will affect the hudi-spark bundle, which has not relocated avro. The right f

Re: [I] [SUPPORT]Loss record when complete compaction [hudi]

2023-10-30 Thread via GitHub
ad1happy2go commented on issue #9869: URL: https://github.com/apache/hudi/issues/9869#issuecomment-1785696425 @15663671003 Do you still face this issue after setting that config? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [I] [SUPPORT] Compaction error [hudi]

2023-10-30 Thread via GitHub
ad1happy2go commented on issue #9885: URL: https://github.com/apache/hudi/issues/9885#issuecomment-1785695170 @fearlsgroove Were you able to resolve it? Let us know if you still face this issue. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message

Re: [I] [SUPPORT] Archival not working for hudi & corresponding hudi metadata table [hudi]

2023-10-30 Thread via GitHub
ad1happy2go commented on issue #9478: URL: https://github.com/apache/hudi/issues/9478#issuecomment-1785685127 @PankajKaushal Do you need any other help with this? Feel free to close this issue if all is good. Thanks. -- This is an automated message from the Apache Git Service. To respond to th

Re: [I] [SUPPORT] Schema evolution wrt to datatype promotion isnt working. org.apache.avro.AvroRuntimeException: cannot support rewrite value for schema type: "long" since the old schema type is: "dou

2023-10-30 Thread via GitHub
codope closed issue #8160: [SUPPORT] Schema evolution wrt to datatype promotion isnt working. org.apache.avro.AvroRuntimeException: cannot support rewrite value for schema type: "long" since the old schema type is: "double" URL: https://github.com/apache/hudi/issues/8160 -- This is an automa

Re: [I] [SUPPORT]tmp file in timeline [hudi]

2023-10-30 Thread via GitHub
codope closed issue #8726: [SUPPORT]tmp file in timeline URL: https://github.com/apache/hudi/issues/8726 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[PR] [Docs] Update videos tags+thumbnails page 4 [hudi]

2023-10-30 Thread via GitHub
ckonehouse opened a new pull request, #9953: URL: https://github.com/apache/hudi/pull/9953 ### Change Logs Added tags and thumbnails to the videos on page 4 ### Impact Docs UI changes ### Risk level (write none, low medium or high below) None ### Docu

Re: [PR] [HUDI-7004] Add support of snapshotLoadQuerySplitter in s3/gcs sources [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9943: URL: https://github.com/apache/hudi/pull/9943#issuecomment-1785664683 ## CI report: * cf2a16031d1e2048d3ae75cc9ecf35ae409eac17 UNKNOWN * 6a54b847fdd0818b5d921569a7216cd08d51a73c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [PR] [HUDI-6790] Support incremental/CDC queries using HadoopFsRelation [hudi]

2023-10-30 Thread via GitHub
hudi-bot commented on PR #9888: URL: https://github.com/apache/hudi/pull/9888#issuecomment-1785664085 ## CI report: * 2501f4ca40591cd9b2d94b5c4daa360aa6454cef UNKNOWN * ad753318ae00d66dd7b05c3d0b021d32ae7a0808 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2
