[GitHub] [hudi] hudi-bot commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523541516 ## CI report: * fa50b514ec994cde256ae1f85778648dc94e5ef6 Azure:

[GitHub] [hudi] nsivabalan commented on issue #8572: [SUPPORT] Getting java.io.FileNotFoundException when reading MOR table.

2023-04-26 Thread via GitHub
nsivabalan commented on issue #8572: URL: https://github.com/apache/hudi/issues/8572#issuecomment-1523538698 Generally, an incremental query will work only if the cleaner has not run. For example, if you have 100 commits in your timeline and the cleaner has cleaned up the data pertaining to the first 25

[GitHub] [hudi] nsivabalan commented on issue #8567: [SUPPORT] Metadata table not cleaned / compacted, log files growing rapidly

2023-04-26 Thread via GitHub
nsivabalan commented on issue #8567: URL: https://github.com/apache/hudi/issues/8567#issuecomment-1523534602 If you have any pending/inflight operation in the data table timeline, metadata table compaction will be stalled until it completes. Maybe there is some lingering pending operation

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1177963180 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java: ## @@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception

[GitHub] [hudi] agrawalreetika commented on issue #8447: [SUPPORT] Docker Demo Issue With Current master(0.14.0-SNAPSHOT)

2023-04-26 Thread via GitHub
agrawalreetika commented on issue #8447: URL: https://github.com/apache/hudi/issues/8447#issuecomment-1523495809 Sure thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] danny0405 commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
danny0405 commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523481553 Okay, I get the idea: you are pointing out that the always-merging behavior can vary with different sequences of inputs. That's true, and I don't think it is a bug.

[GitHub] [hudi] danny0405 commented on a diff in pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8512: URL: https://github.com/apache/hudi/pull/8512#discussion_r1177916345 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -386,6 +392,28 @@ private static void

[GitHub] [hudi] danny0405 commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1177909927 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -401,4 +408,35 @@ private static void

[GitHub] [hudi] danny0405 commented on pull request #8529: [HUDI-6120]filter base file when there is only one file slice fetched

2023-04-26 Thread via GitHub
danny0405 commented on PR #8529: URL: https://github.com/apache/hudi/pull/8529#issuecomment-1523450973 > as we add some document like "this method is only used for IncrementalInputSplits"? Add a note like this: ```java CAUTION: the method requires that all the file

[GitHub] [hudi] hudi-bot commented on pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8550: URL: https://github.com/apache/hudi/pull/8550#issuecomment-1523445894 ## CI report: * 563e10e0492a8194d789772de6bb9ced9f8c0721 UNKNOWN * 7f3f4aa438927aa50346ac5dbbb38f3e5241135d Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #8568: [HUDI-6134] prevent two clean run concurrently in flink.

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8568: URL: https://github.com/apache/hudi/pull/8568#discussion_r1177902562 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringCommitSink.java: ## @@ -179,7 +179,7 @@ private void doCommit(String

[GitHub] [hudi] stream2000 commented on pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-26 Thread via GitHub
stream2000 commented on PR #8550: URL: https://github.com/apache/hudi/pull/8550#issuecomment-1523435785 @hudi-bot run azure

[GitHub] [hudi] hudi-bot commented on pull request #8514: [HUDI-6113] Support multiple transformers using the same config keys in DeltaStreamer

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8514: URL: https://github.com/apache/hudi/pull/8514#issuecomment-1523432834 ## CI report: * 66a5fa1d0e1423f564c257be983e8ceae80b973c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8514: [HUDI-6113] Support multiple transformers using the same config keys in DeltaStreamer

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8514: URL: https://github.com/apache/hudi/pull/8514#issuecomment-1523419282 ## CI report: * 79b47ea05df2b35c2f4dc2824764f7d4f728b7cd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8503: URL: https://github.com/apache/hudi/pull/8503#issuecomment-1523419046 ## CI report: * 0738d975df341763e384b9ac9bcad14b006c9c47 UNKNOWN * 56040691bc99ee34cdeb4e5bc758abe0ba9f7711 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8303: [HUDI-5998] Speed up reads from bootstrapped tables in spark

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8303: URL: https://github.com/apache/hudi/pull/8303#issuecomment-1523417964 ## CI report: * c6908a16bf2f1fb46735781f8d969177eadc23a4 Azure:

[GitHub] [hudi] SabyasachiDasTR opened a new issue, #8581: Hudi reads failing on upgrade to Hudi 0.12.2 from Hudi 0.11.1

2023-04-26 Thread via GitHub
SabyasachiDasTR opened a new issue, #8581: URL: https://github.com/apache/hudi/issues/8581 **Describe the problem you faced** We have a Spark streaming job which does only Hudi upserts to load data into the partitions. We have thousands of collections/partitions where data is upserted at

[GitHub] [hudi] SabyasachiDasTR opened a new issue, #8580: Hudi reads failing on upgrade to Hudi 0.12.2 from Hudi 0.11.1

2023-04-26 Thread via GitHub
SabyasachiDasTR opened a new issue, #8580: URL: https://github.com/apache/hudi/issues/8580 **Describe the problem you faced** We have a Spark streaming job which does only Hudi upserts to load data into the partitions. We have thousands of collections/partitions where data is upserted at

[GitHub] [hudi] lokeshj1703 commented on pull request #8514: [HUDI-6113] Support multiple transformers using the same config keys in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on PR #8514: URL: https://github.com/apache/hudi/pull/8514#issuecomment-1523356895 Some comments from @vinothchandar https://github.com/apache/hudi/pull/8574#discussion_r1176508675 https://github.com/apache/hudi/pull/8574#discussion_r1176513255

[GitHub] [hudi] hudi-bot commented on pull request #8520: [HUDI-6115] Hardening expectation of corruptRecordColumn in ChainedTransformer.

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8520: URL: https://github.com/apache/hudi/pull/8520#issuecomment-1523354987 ## CI report: * aea97f3976876ed811329b4e7892b99ae97e0bab Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8514: [HUDI-6113] Support multiple transformers using the same config keys in DeltaStreamer

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8514: URL: https://github.com/apache/hudi/pull/8514#issuecomment-1523354905 ## CI report: * 79b47ea05df2b35c2f4dc2824764f7d4f728b7cd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1523354674 ## CI report: * bcd54355f02696b50cd3998e8cc93f5e64cfc338 UNKNOWN * e14ff9c13938b28ec4e74af8572eb09a664df717 Azure:

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8574: [HUDI-6139] Add support for Transformer schema validation in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on code in PR #8574: URL: https://github.com/apache/hudi/pull/8574#discussion_r1177815595 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java: ## @@ -276,9 +276,19 @@ public static class Config implements

[GitHub] [hudi] hudi-bot commented on pull request #8520: [HUDI-6115] Hardening expectation of corruptRecordColumn in ChainedTransformer.

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8520: URL: https://github.com/apache/hudi/pull/8520#issuecomment-1523341774 ## CI report: * aea97f3976876ed811329b4e7892b99ae97e0bab Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8514: [HUDI-6113] Support multiple transformers using the same config keys in DeltaStreamer

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8514: URL: https://github.com/apache/hudi/pull/8514#issuecomment-1523341663 ## CI report: * 79b47ea05df2b35c2f4dc2824764f7d4f728b7cd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1523341354 ## CI report: * bcd54355f02696b50cd3998e8cc93f5e64cfc338 UNKNOWN * e14ff9c13938b28ec4e74af8572eb09a664df717 Azure:

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8574: [HUDI-6139] Add support for Transformer schema validation in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on code in PR #8574: URL: https://github.com/apache/hudi/pull/8574#discussion_r1177791890 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/transform/ChainedTransformer.java: ## @@ -18,25 +18,89 @@ package org.apache.hudi.utilities.transform;

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8574: [HUDI-6139] Add support for Transformer schema validation in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on code in PR #8574: URL: https://github.com/apache/hudi/pull/8574#discussion_r1177791660 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java: ## @@ -276,9 +276,19 @@ public static class Config implements

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8574: [HUDI-6139] Add support for Transformer schema validation in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on code in PR #8574: URL: https://github.com/apache/hudi/pull/8574#discussion_r1177791359 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/transform/ChainedTransformer.java: ## @@ -18,25 +18,89 @@ package org.apache.hudi.utilities.transform;

[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #8574: [HUDI-6139] Add support for Transformer schema validation in DeltaStreamer

2023-04-26 Thread via GitHub
lokeshj1703 commented on code in PR #8574: URL: https://github.com/apache/hudi/pull/8574#discussion_r1177791095 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java: ## @@ -1802,6 +1802,44 @@ private void

[GitHub] [hudi] huangxiaopingRD commented on pull request #8491: [HUDI-6095] Refactor the judgment condition of WorkloadProfile

2023-04-26 Thread via GitHub
huangxiaopingRD commented on PR #8491: URL: https://github.com/apache/hudi/pull/8491#issuecomment-1523311405 gently ping @danny0405

[GitHub] [hudi] huangxiaopingRD closed pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-26 Thread via GitHub
huangxiaopingRD closed pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database URL: https://github.com/apache/hudi/pull/8355

[GitHub] [hudi] huangxiaopingRD commented on pull request #8355: [HUDI-6016] HoodieCLIUtils supports creating HoodieClient with non-default database

2023-04-26 Thread via GitHub
huangxiaopingRD commented on PR #8355: URL: https://github.com/apache/hudi/pull/8355#issuecomment-1523305496 > Can you rebase with the latest master and resolve the conflicts? I found that this [PR](https://github.com/apache/hudi/pull/8488) fixed the issue yesterday. I will close

[GitHub] [hudi] hudi-bot commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523276755 ## CI report: * 37f4d036c3236384b944fdeaf1bd09fba18822f1 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1523274852 ## CI report: * 85b25f5cda4ccd8189a1607259e1732a910c3262 UNKNOWN * bfb9fbbed9a2423ba1781962cea8ccc277a84880 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8550: [HUDI-6127]Flink Hudi Write support commit on an empty batch

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8550: URL: https://github.com/apache/hudi/pull/8550#issuecomment-1523265839 ## CI report: * 563e10e0492a8194d789772de6bb9ced9f8c0721 UNKNOWN * 7f3f4aa438927aa50346ac5dbbb38f3e5241135d Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523265415 ## CI report: * 4a932bed1750432693a2a8b56a8599eead076d01 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1523263066 ## CI report: * 85b25f5cda4ccd8189a1607259e1732a910c3262 UNKNOWN * bfb9fbbed9a2423ba1781962cea8ccc277a84880 Azure:

[GitHub] [hudi] PrabhuJoseph commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
PrabhuJoseph commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523258734 > Thanks @PrabhuJoseph for making this contribution. Besides the CI testing, I was wondering if you ran anything manually? @rahil-c I have verified manually running Flink Streaming

[GitHub] [hudi] hudi-bot commented on pull request #8520: [HUDI-6115] Hardening expectation of corruptRecordColumn in ChainedTransformer.

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8520: URL: https://github.com/apache/hudi/pull/8520#issuecomment-1523256074 ## CI report: * aea97f3976876ed811329b4e7892b99ae97e0bab Azure:

[GitHub] [hudi] PrabhuJoseph commented on a diff in pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
PrabhuJoseph commented on code in PR #8512: URL: https://github.com/apache/hudi/pull/8512#discussion_r1177726966 ## scripts/release/deploy_staging_jars.sh: ## @@ -75,6 +75,7 @@ declare -a ALL_VERSION_OPTS=( "-Dscala-2.12 -Dflink1.14 -Davro.version=1.10.0 -pl

[GitHub] [hudi] PrabhuJoseph commented on a diff in pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
PrabhuJoseph commented on code in PR #8512: URL: https://github.com/apache/hudi/pull/8512#discussion_r1177726006 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/sort/SortOperator.java: ## @@ -89,7 +95,7 @@ public void open() throws Exception {

[GitHub] [hudi] Mulavar commented on pull request #8529: [HUDI-6120]filter base file when there is only one file slice fetched

2023-04-26 Thread via GitHub
Mulavar commented on PR #8529: URL: https://github.com/apache/hudi/pull/8529#issuecomment-1523235110 > > I agreed with you that IncrementalInputSplits has no error with this patch, and I made a mistake about it. However I'm thinking about whether the logic of this method itself is correct,

[jira] [Created] (HUDI-6141) Fix config issue for PostgresDebeziumSource

2023-04-26 Thread Aditya Goenka (Jira)
Aditya Goenka created HUDI-6141: --- Summary: Fix config issue for PostgresDebeziumSource Key: HUDI-6141 URL: https://issues.apache.org/jira/browse/HUDI-6141 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] ad1happy2go commented on issue #8521: [SUPPORT] Deltastreamer not recognizing config `hoodie.deltastreamer.source.kafka.value.deserializer.class` with PostgresDebeziumSource

2023-04-26 Thread via GitHub
ad1happy2go commented on issue #8521: URL: https://github.com/apache/hudi/issues/8521#issuecomment-1523233107 JIRA to track - https://issues.apache.org/jira/browse/HUDI-6141

[GitHub] [hudi] ad1happy2go commented on issue #8474: [SUPPORT] Duplicate records caused by misclassification as insert during upsert after spark executor loss

2023-04-26 Thread via GitHub
ad1happy2go commented on issue #8474: URL: https://github.com/apache/hudi/issues/8474#issuecomment-1523227232 @coffee34 Can you please provide more details on this issue?

[GitHub] [hudi] voonhous commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
voonhous commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1177710126 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -401,4 +408,35 @@ private static void

[GitHub] [hudi] ad1happy2go commented on issue #8486: [SUPPORT] AvroRecordConverter throws NoSuchMethodError(Avro defaultValue) on schema change

2023-04-26 Thread via GitHub
ad1happy2go commented on issue #8486: URL: https://github.com/apache/hudi/issues/8486#issuecomment-1523224879 You need to build the code using Spark 3.2: `mvn clean package -T2C -DskipTests -Dspark.version=3.2`. Can you provide the steps to reproduce this issue?

[GitHub] [hudi] hbgstc123 commented on a diff in pull request #8568: [HUDI-6134] prevent two clean run concurrently in flink.

2023-04-26 Thread via GitHub
hbgstc123 commented on code in PR #8568: URL: https://github.com/apache/hudi/pull/8568#discussion_r1177699120 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/clustering/ClusteringCommitSink.java: ## @@ -179,7 +179,7 @@ private void doCommit(String

[GitHub] [hudi] voonhous commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
voonhous commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523206446 > > If preCombine is invoked with the same key when an old record {price: 11.00, _ts: 999} is received together with a new record {price: null, _ts: 1001}, the old record's column value might

[GitHub] [hudi] hudi-bot commented on pull request #8546: [MINOR] Add log in flink compact/cluster commit sink for troubleshoot…

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8546: URL: https://github.com/apache/hudi/pull/8546#issuecomment-1523204823 ## CI report: * 71bf09ba661b1a296233f14f116894c83788fc64 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8501: URL: https://github.com/apache/hudi/pull/8501#issuecomment-1523204539 ## CI report: * d8a60f864ecc2906ad70d4791badde3bec7a3e98 Azure:

[GitHub] [hudi] voonhous commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
voonhous commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1177685237 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -401,4 +408,35 @@ private static void

[GitHub] [hudi] hudi-bot commented on pull request #8546: [MINOR] Add log in flink compact/cluster commit sink for troubleshoot…

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8546: URL: https://github.com/apache/hudi/pull/8546#issuecomment-1523195141 ## CI report: * 2914fd9a3052f735733c8a212644918349943618 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8501: URL: https://github.com/apache/hudi/pull/8501#issuecomment-1523194773 ## CI report: * d8a60f864ecc2906ad70d4791badde3bec7a3e98 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8568: [HUDI-6134] prevent two clean run concurrently in flink.

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8568: URL: https://github.com/apache/hudi/pull/8568#issuecomment-1523182801 ## CI report: * 708a1e072cecfe4c78e7c3159a5d6e122d532024 Azure:

[GitHub] [hudi] chenbodeng719 commented on issue #6804: [SUPPORT] Repairing the hudi table from No such file or directory of parquet file.

2023-04-26 Thread via GitHub
chenbodeng719 commented on issue #6804: URL: https://github.com/apache/hudi/issues/6804#issuecomment-1523182272 @nsivabalan I faced the same issue. It happens once a week. What can I do to avoid it?

[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7355: URL: https://github.com/apache/hudi/pull/7355#issuecomment-1523178914 ## CI report: * 7f6f117d4bbab638c02004bcb00ec63778c190a4 Azure:

[GitHub] [hudi] hbgstc123 commented on a diff in pull request #8546: [MINOR] Add log in flink compact/cluster commit sink for troubleshoot…

2023-04-26 Thread via GitHub
hbgstc123 commented on code in PR #8546: URL: https://github.com/apache/hudi/pull/8546#discussion_r1177668814 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionCommitSink.java: ## @@ -101,6 +101,12 @@ public void open(Configuration

[GitHub] [hudi] danny0405 commented on a diff in pull request #8501: [HUDI-6103] Validate required columns when fetching required positions

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8501: URL: https://github.com/apache/hudi/pull/8501#discussion_r1177669046 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -401,4 +408,35 @@ private static void

[GitHub] [hudi] danny0405 commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
danny0405 commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523173556 > If preCombine is invoked with the same key when an old record {price: 11.00, _ts: 999} is received together with a new record {price: null, _ts: 1001}, the old record's column value might

[GitHub] [hudi] danny0405 commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
danny0405 commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523170925 @PrabhuJoseph Can you fix Flink 1.17 to use Avro 1.11 instead?

[GitHub] [hudi] danny0405 commented on a diff in pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8512: URL: https://github.com/apache/hudi/pull/8512#discussion_r1177660676 ## scripts/release/deploy_staging_jars.sh: ## @@ -75,6 +75,7 @@ declare -a ALL_VERSION_OPTS=( "-Dscala-2.12 -Dflink1.14 -Davro.version=1.10.0 -pl

[GitHub] [hudi] shaurya-nwse commented on issue #8486: [SUPPORT] AvroRecordConverter throws NoSuchMethodError(Avro defaultValue) on schema change

2023-04-26 Thread via GitHub
shaurya-nwse commented on issue #8486: URL: https://github.com/apache/hudi/issues/8486#issuecomment-1523166426 Some context: we have 3 topics being ingested via a multi-table DeltaStreamer. Two of them work fine, but after the schema changed for the 3rd table we ran into the problem with

[GitHub] [hudi] danny0405 commented on a diff in pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #7355: URL: https://github.com/apache/hudi/pull/7355#discussion_r1177656027 ## hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestHoodieSanity.java: ## @@ -193,19 +193,6 @@ public void testRunHoodieJavaApp(String command, String

[GitHub] [hudi] danny0405 commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
danny0405 commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1523156575 Can you rebase with the latest master and force push again?

[GitHub] [hudi] shaurya-nwse commented on issue #8486: [SUPPORT] AvroRecordConverter throws NoSuchMethodError(Avro defaultValue) on schema change

2023-04-26 Thread via GitHub
shaurya-nwse commented on issue #8486: URL: https://github.com/apache/hudi/issues/8486#issuecomment-1523155772 Hi @ad1happy2go We already have tables written using 0.11. Nonetheless when I tried to write using the 0.13.0 utilities, this is what I get: ``` Caused by:

[GitHub] [hudi] danny0405 commented on pull request #8529: [HUDI-6120]filter base file when there is only one file slice fetched

2023-04-26 Thread via GitHub
danny0405 commented on PR #8529: URL: https://github.com/apache/hudi/pull/8529#issuecomment-1523151511 > I agreed with you that IncrementalInputSplits has no error with this patch, and I made a mistake about it. However I'm thinking about whether the logic of this method itself is correct,

[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1177650137 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java: ## @@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception {

[jira] [Closed] (HUDI-6131) Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-26 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6131. Resolution: Fixed Fixed via master branch: c332c60ad7b9e43ebbd09e2eb7c6e53ed7e4a95d > Refactor

[jira] [Updated] (HUDI-6131) Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-26 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6131: - Fix Version/s: 0.14.0 > Refactor getWritePathsOfInstants in Flink WriteProfiles >

[hudi] branch master updated (b690346a700 -> c332c60ad7b)

2023-04-26 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from b690346a700 [MINOR] fix misleading configuration value (#8534) add c332c60ad7b [HUDI-6131] Refactor

[GitHub] [hudi] voonhous commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
voonhous commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523138078 > > Results will be different if combineAndUpdateValue is invoked in order without invoking preCombine. > > What is exactly lost here? # PreCombine +

[GitHub] [hudi] danny0405 merged pull request #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-26 Thread via GitHub
danny0405 merged PR #8556: URL: https://github.com/apache/hudi/pull/8556

[GitHub] [hudi] danny0405 commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
danny0405 commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523133981 > Results will be different if combineAndUpdateValue is invoked in order without invoking preCombine. What exactly is lost here?

[GitHub] [hudi] danny0405 commented on a diff in pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
danny0405 commented on code in PR #8579: URL: https://github.com/apache/hudi/pull/8579#discussion_r1177638616 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestPartialUpdateAvroPayload.scala: ## @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] hudi-bot commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523110909 ## CI report: * 4a932bed1750432693a2a8b56a8599eead076d01 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1523109704 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * 5914b1a1fedea0ae708e5e2f96130c2a73dd5b66 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7355: URL: https://github.com/apache/hudi/pull/7355#issuecomment-1523106901 ## CI report: * 7f6f117d4bbab638c02004bcb00ec63778c190a4 Azure:

[GitHub] [hudi] voonhous commented on issue #8571: [ISSUE] spark-sql doesn't read the latest snapshot of MOR table

2023-04-26 Thread via GitHub
voonhous commented on issue #8571: URL: https://github.com/apache/hudi/issues/8571#issuecomment-1523102114 @stayrascal Can you try running this before executing your Spark-SQL query? ``` REFRESH TABLE flink_hudi_mor_streaming_tbl_1_rt; ```

[GitHub] [hudi] hudi-bot commented on pull request #8512: [HUDI-6057] Support Flink 1.17

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8512: URL: https://github.com/apache/hudi/pull/8512#issuecomment-1523097495 ## CI report: * 4a932bed1750432693a2a8b56a8599eead076d01 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8076: URL: https://github.com/apache/hudi/pull/8076#issuecomment-1523096261 ## CI report: * 6a239ada8998fd440f19c0082b26d206ed589870 UNKNOWN * 5914b1a1fedea0ae708e5e2f96130c2a73dd5b66 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8556: [HUDI-6131] Refactor getWritePathsOfInstants in Flink WriteProfiles

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8556: URL: https://github.com/apache/hudi/pull/8556#issuecomment-1523086501 ## CI report: * 65abf463bdb0e4be484251a243fc7300a49c1604 Azure:

[GitHub] [hudi] rohan-uptycs commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-26 Thread via GitHub
rohan-uptycs commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1177555612 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -509,7 +509,15 @@ private Stream

[GitHub] [hudi] boneanxs commented on a diff in pull request #8076: [HUDI-5884] Support bulk_insert for insert_overwrite and insert_overwrite_table

2023-04-26 Thread via GitHub
boneanxs commented on code in PR #8076: URL: https://github.com/apache/hudi/pull/8076#discussion_r1177516056 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -770,66 +770,70 @@ object HoodieSparkSqlWriter { } }

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522992882 ## CI report: * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN *

[GitHub] [hudi] SteNicholas commented on a diff in pull request #8503: [HUDI-6047] Clustering operation on consistent hashing index resulting in duplicate data

2023-04-26 Thread via GitHub
SteNicholas commented on code in PR #8503: URL: https://github.com/apache/hudi/pull/8503#discussion_r1177521748 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java: ## @@ -509,7 +509,15 @@ private Stream

[GitHub] [hudi] hudi-bot commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1522990780 ## CI report: * 85b25f5cda4ccd8189a1607259e1732a910c3262 UNKNOWN * bfb9fbbed9a2423ba1781962cea8ccc277a84880 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
hudi-bot commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1522936046 ## CI report: * 85b25f5cda4ccd8189a1607259e1732a910c3262 UNKNOWN * bfb9fbbed9a2423ba1781962cea8ccc277a84880 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8579: [MINOR] Added docs on gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1522928842 ## CI report: * fa50b514ec994cde256ae1f85778648dc94e5ef6 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522928401 ## CI report: * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN *

[GitHub] [hudi] boneanxs commented on pull request #7627: [HUDI-5517] HoodieTimeline support filter instants by state transition time

2023-04-26 Thread via GitHub
boneanxs commented on PR #7627: URL: https://github.com/apache/hudi/pull/7627#issuecomment-1522926916 @hudi-bot run azure

[GitHub] [hudi] alexone95 commented on issue #8535: [SUPPORT] manually deleting file under .hoodie/archived

2023-04-26 Thread via GitHub
alexone95 commented on issue #8535: URL: https://github.com/apache/hudi/issues/8535#issuecomment-1522925284 @nsivabalan the only problem is that we are using the hudi 12.0.1 version, where the issue is not fixed, so we had to use this workaround to avoid it

[GitHub] [hudi] hudi-bot commented on pull request #8579: [MINOR] Added docs on gotchas when using PartialUpdateAvroPayload

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8579: URL: https://github.com/apache/hudi/pull/8579#issuecomment-1522918267 ## CI report: * fa50b514ec994cde256ae1f85778648dc94e5ef6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
hudi-bot commented on PR #8505: URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522917865 ## CI report: * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN *

[GitHub] [hudi] xushiyan closed issue #8109: [SUPPORT] Spark32PlusHoodieParquetFileFormat should set "SQLConf.LEGACY_PARQUET_NANOS_AS_LONG" ?

2023-04-26 Thread via GitHub
xushiyan closed issue #8109: [SUPPORT] Spark32PlusHoodieParquetFileFormat should set "SQLConf.LEGACY_PARQUET_NANOS_AS_LONG" ? URL: https://github.com/apache/hudi/issues/8109

[GitHub] [hudi] ad1happy2go commented on issue #8502: [SUPPORT] Does spark.sql("MERGE INTO") supports schema evolution write option

2023-04-26 Thread via GitHub
ad1happy2go commented on issue #8502: URL: https://github.com/apache/hudi/issues/8502#issuecomment-1522911733 @jhchee The Spark SQL parser doesn't support this, so I'm not sure we can do anything on our end. All configs come into play during the execution of the SQL. You can do ALTER table
