[GitHub] [hudi] qingyuan18 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

2023-04-04 Thread via GitHub
qingyuan18 commented on issue #8382: URL: https://github.com/apache/hudi/issues/8382#issuecomment-1497007079 yes, indeed

[GitHub] [hudi] hudi-bot commented on pull request #8384: [HUDI-6039] Fixing FS based listing for full cleaning in clean Planner

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8384: URL: https://github.com/apache/hudi/pull/8384#issuecomment-1497006537 ## CI report: * e0515dc272a9c4e810839366bc53310a216540a3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] waitingF commented on a diff in pull request #8378: [HUDI-6031] fix bug: checkpoint lost after changing cow to mor

2023-04-04 Thread via GitHub
waitingF commented on code in PR #8378: URL: https://github.com/apache/hudi/pull/8378#discussion_r1158079124 ## hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java: ## @@ -78,7 +78,7 @@ public JavaRDD compact() throws Exception { publ

[GitHub] [hudi] voonhous commented on issue #8371: [SUPPORT] Flink cant read metafield '_hoodie_commit_time'

2023-04-04 Thread via GitHub
voonhous commented on issue #8371: URL: https://github.com/apache/hudi/issues/8371#issuecomment-1496970307 We've encountered similar issues around this code recently. We can't seem to reproduce your issue; is it possible to provide a minimal example of your table so I can trigger this

[GitHub] [hudi] hudi-bot commented on pull request #8380: [HUDI-6033] Fix rounding exception when performing a float to decimal…

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8380: URL: https://github.com/apache/hudi/pull/8380#issuecomment-1496970046 ## CI report: * 4127079fc6162fee6b08501c700cf9b835a38d3c UNKNOWN * 7d131d8a247226108b978fe920c0f703290b39bd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Updated] (HUDI-6039) Fix FS based listing in clean planner

2023-04-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6039: - Labels: pull-request-available (was: ) > Fix FS based listing in clean planner >

[GitHub] [hudi] nsivabalan opened a new pull request, #8384: [HUDI-6039] Fixing FS based listing for full cleaning in clean Planner

2023-04-04 Thread via GitHub
nsivabalan opened a new pull request, #8384: URL: https://github.com/apache/hudi/pull/8384 ### Change Logs Looks like when we fallback to full partition cleaning in clean planner, we do FS based listing even though metadata is enabled. It was added in https://github.com/apache/hudi/p

[GitHub] [hudi] danny0405 commented on a diff in pull request #8378: [HUDI-6031] fix bug: checkpoint lost after changing cow to mor

2023-04-04 Thread via GitHub
danny0405 commented on code in PR #8378: URL: https://github.com/apache/hudi/pull/8378#discussion_r1158059577 ## hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java: ## @@ -78,7 +78,7 @@ public JavaRDD compact() throws Exception { pub

[jira] [Created] (HUDI-6039) Fix FS based listing in clean planner

2023-04-04 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-6039: - Summary: Fix FS based listing in clean planner Key: HUDI-6039 URL: https://issues.apache.org/jira/browse/HUDI-6039 Project: Apache Hudi Issue Type:

[jira] [Closed] (HUDI-6024) Hotfix in MergeIntoHoodieTableCommand::validate with remove used arguments

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6024. Resolution: Fixed Fixed via master branch: 8d7fc94dc54d5981140f4010160f8b3062f09458 > Hotfix in MergeIntoHo

[hudi] branch master updated (affb11efd44 -> 8d7fc94dc54)

2023-04-04 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from affb11efd44 [HUDI-6013] Support database name for meta sync in bootstrap (#8351) add 8d7fc94dc54 [HUDI-6024] Ho

[GitHub] [hudi] danny0405 merged pull request #8369: [HUDI-6024] Hotfix in MergeIntoHoodieTableCommand::validate with remove used arguments

2023-04-04 Thread via GitHub
danny0405 merged PR #8369: URL: https://github.com/apache/hudi/pull/8369 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache

[GitHub] [hudi] danny0405 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

2023-04-04 Thread via GitHub
danny0405 commented on issue #8382: URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496962270 Did you also clean the .hoodie/archive folder?

[jira] [Closed] (HUDI-6013) Support database name for meta sync in bootstrap

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6013. Fix Version/s: 0.14.0 Resolution: Fixed Fixed via master branch: affb11efd448b7ad9d19a74a5f589d543d35

[hudi] branch master updated (257e1680c1e -> affb11efd44)

2023-04-04 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 257e1680c1e [HUDI-6030] Cleans the ckp meta while the JM restarts (#8374) add affb11efd44 [HUDI-6013] Support d

[GitHub] [hudi] danny0405 merged pull request #8351: [HUDI-6013] Support database name for meta sync in bootstrap

2023-04-04 Thread via GitHub
danny0405 merged PR #8351: URL: https://github.com/apache/hudi/pull/8351

[jira] [Updated] (HUDI-6038) Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6038: - Component/s: flink > Fix async compact/clustering serdes conflicts caused by WatermarkStatus in > multi v

[jira] [Created] (HUDI-6038) Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink

2023-04-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-6038: Summary: Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink Key: HUDI-6038 URL: https://issues.apache.org/jira/browse/HUDI-6038 P

[jira] [Updated] (HUDI-6038) Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6038: - Fix Version/s: 0.14.0 > Fix async compact/clustering serdes conflicts caused by WatermarkStatus in > mult

[GitHub] [hudi] qingyuan18 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

2023-04-04 Thread via GitHub
qingyuan18 commented on issue #8382: URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496957557 No, I have cleaned up the table data and re-run the job. The error still reproduces.

[GitHub] [hudi] danny0405 commented on a diff in pull request #8379: Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink

2023-04-04 Thread via GitHub
danny0405 commented on code in PR #8379: URL: https://github.com/apache/hudi/pull/8379#discussion_r1158052521 ## hudi-flink-datasource/hudi-flink1.13.x/src/main/java/org/apache/hudi/adapter/SafeAsyncOutputAdapter.java: ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] danny0405 commented on issue #8366: [SUPPORT] Flink streaming write to Hudi table using data stream API java.lang.NoClassDefFoundError: org.apache.hudi.configuration.FlinkOptions

2023-04-04 Thread via GitHub
danny0405 commented on issue #8366: URL: https://github.com/apache/hudi/issues/8366#issuecomment-1496953281 This class should be included in your jar, right?

[GitHub] [hudi] hudi-bot commented on pull request #8383: [HUDI-6036] Add more tests for task and coordinator interaction

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8383: URL: https://github.com/apache/hudi/pull/8383#issuecomment-1496952829 ## CI report: * 6e5b49c1d13fc8f0bbf15ea0cbd3e57b39227110 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1613

[GitHub] [hudi] danny0405 commented on issue #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

2023-04-04 Thread via GitHub
danny0405 commented on issue #8382: URL: https://github.com/apache/hudi/issues/8382#issuecomment-1496952221 Did you ever try to write to a legacy table? It seems to be a version compatibility issue.

[GitHub] [hudi] danny0405 commented on pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
danny0405 commented on PR #8326: URL: https://github.com/apache/hudi/pull/8326#issuecomment-1496950173 > It's no longer "pre" combine/ deduplicate incoming batch, but rather combine on write I agree, the `preCombine` is kind of confusing and for the literal meanings, it seems to do t

[GitHub] [hudi] danny0405 commented on pull request #8231: [HUDI-5963] Release 0.13.1 prep

2023-04-04 Thread via GitHub
danny0405 commented on PR #8231: URL: https://github.com/apache/hudi/pull/8231#issuecomment-1496945086 Hope we can also include: https://github.com/apache/hudi/pull/8374

[GitHub] [hudi] yihua commented on a diff in pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
yihua commented on code in PR #8326: URL: https://github.com/apache/hudi/pull/8326#discussion_r1158041585 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodiePayloadConfig.java: ## @@ -60,6 +54,14 @@ public class HoodiePayloadConfig extends HoodieConfig

[GitHub] [hudi] hudi-bot commented on pull request #8383: [HUDI-6036] Add more tests for task and coordinator interaction

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8383: URL: https://github.com/apache/hudi/pull/8383#issuecomment-1496922798 ## CI report: * 6e5b49c1d13fc8f0bbf15ea0cbd3e57b39227110 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] codope commented on a diff in pull request #7929: [HUDI-5754] Add new sources to deltastreamer docs

2023-04-04 Thread via GitHub
codope commented on code in PR #7929: URL: https://github.com/apache/hudi/pull/7929#discussion_r1158026176 ## website/docs/hoodie_deltastreamer.md: ## @@ -340,6 +388,26 @@ to trigger/processing of new or changed data as soon as it is available on S3. Insert code sample from

[jira] [Created] (HUDI-6037) Improve compaction docs

2023-04-04 Thread nadine (Jira)
nadine created HUDI-6037: Summary: Improve compaction docs Key: HUDI-6037 URL: https://issues.apache.org/jira/browse/HUDI-6037 Project: Apache Hudi Issue Type: Improvement Reporter: nadin

[jira] [Updated] (HUDI-6036) Add more tests for task and coordinator interaction

2023-04-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-6036: - Labels: pull-request-available (was: ) > Add more tests for task and coordinator interaction > --

[GitHub] [hudi] danny0405 opened a new pull request, #8383: [HUDI-6036] Add more tests for task and coordinator interaction

2023-04-04 Thread via GitHub
danny0405 opened a new pull request, #8383: URL: https://github.com/apache/hudi/pull/8383 ### Change Logs Add more tests for task partial over, global failover, and job failover. ### Impact none ### Risk level (write none, low medium or high below) none

[jira] [Created] (HUDI-6036) Add more tests for task failover

2023-04-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-6036: Summary: Add more tests for task failover Key: HUDI-6036 URL: https://issues.apache.org/jira/browse/HUDI-6036 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-6036) Add more tests for task and coordinator interaction

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-6036: - Summary: Add more tests for task and coordinator interaction (was: Add more tests for task failover) > A

[GitHub] [hudi] hudi-bot commented on pull request #8380: [HUDI-6033] Fix rounding exception when performing a float to decimal…

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8380: URL: https://github.com/apache/hudi/pull/8380#issuecomment-1496913515 ## CI report: * 4127079fc6162fee6b08501c700cf9b835a38d3c UNKNOWN * 73ab33274af42797b7d5ec9bfef8d7e7d81c8132 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] qingyuan18 opened a new issue, #8382: [SUPPORT] hudi 0.12 spark batch ingestion throw out archive format validation error

2023-04-04 Thread via GitHub
qingyuan18 opened a new issue, #8382: URL: https://github.com/apache/hudi/issues/8382 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-subsc

[GitHub] [hudi] bvaradar commented on pull request #5573: [HUDI-4093] Fix NPE when insert records that partition column is null

2023-04-04 Thread via GitHub
bvaradar commented on PR #5573: URL: https://github.com/apache/hudi/pull/5573#issuecomment-1496886640 @watermelon12138: If this is not an issue in current master, can you kindly close it?

[GitHub] [hudi] hudi-bot commented on pull request #8380: [HUDI-6033] Fix rounding exception when performing a float to decimal…

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8380: URL: https://github.com/apache/hudi/pull/8380#issuecomment-1496886402 ## CI report: * 4127079fc6162fee6b08501c700cf9b835a38d3c UNKNOWN * 73ab33274af42797b7d5ec9bfef8d7e7d81c8132 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[jira] [Updated] (HUDI-6033) Fix to DECIMAL(p, s) schema evolution when reading avro log files when scale is lost

2023-04-04 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] voon updated HUDI-6033: --- Description: This issue only exists in MOR tables.   When performing a DECIMAL/FLOAT to DECIMAL(p, s) casting and wh

[GitHub] [hudi] bvaradar commented on pull request #6799: [HUDI-4920] fix PartialUpdatePayload cannot return deleted record in …

2023-04-04 Thread via GitHub
bvaradar commented on PR #6799: URL: https://github.com/apache/hudi/pull/6799#issuecomment-1496878811 @fengjian428: Can you rebase against the latest master and also check the failing test?

[GitHub] [hudi] bvaradar commented on pull request #6799: [HUDI-4920] fix PartialUpdatePayload cannot return deleted record in …

2023-04-04 Thread via GitHub
bvaradar commented on PR #6799: URL: https://github.com/apache/hudi/pull/6799#issuecomment-1496877912 LGTM. Will approve once the test passes and I make a final pass.

[GitHub] [hudi] bvaradar commented on pull request #5165: [HUDI-3742] Enable parquet enableVectorizedReader for spark inc query to improve peformance

2023-04-04 Thread via GitHub
bvaradar commented on PR #5165: URL: https://github.com/apache/hudi/pull/5165#issuecomment-1496875101 Good point. It doesn't look like the place where we set spark vectorized reading on has knowledge of read schema to safely enable vectorization for certain cases for MOR. @xiarixiaoyao : T

[jira] [Updated] (HUDI-6033) Fix to DECIMAL(p, s) schema evolution when reading avro log files when scale is lost

2023-04-04 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] voon updated HUDI-6033: --- Summary: Fix to DECIMAL(p, s) schema evolution when reading avro log files when scale is lost (was: Fix DECIMAL/FLOAT

[jira] [Updated] (HUDI-6033) Fix DECIMAL/FLOAT to DECIMAL(p, s) schema evolution when reading avro log files

2023-04-04 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] voon updated HUDI-6033: --- Description: This issue only exists in MOR tables.   When performing a DECIMAL/FLOAT to DECIMAL(p, s) casting and wh

[jira] [Updated] (HUDI-6033) Fix DECIMAL/FLOAT to DECIMAL(p, s) schema evolution when reading avro log files

2023-04-04 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] voon updated HUDI-6033: --- Description: This issue only exists in MOR tables.   When performing a DECIMAL/FLOAT to DECIMAL(p, s) casting and wh

[jira] [Updated] (HUDI-6033) Fix DECIMAL/FLOAT to DECIMAL(p, s) schema evolution when reading avro log files

2023-04-04 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] voon updated HUDI-6033: --- Summary: Fix DECIMAL/FLOAT to DECIMAL(p, s) schema evolution when reading avro log files (was: Fix FLOAT to DECIMAL(p

[GitHub] [hudi] nfarah86 commented on issue #8040: [SUPPORT] Getting error when writing into MOR HUDI table if schema changed (datatype changed / column dropped)

2023-04-04 Thread via GitHub
nfarah86 commented on issue #8040: URL: https://github.com/apache/hudi/issues/8040#issuecomment-1496816796 flagging.

[GitHub] [hudi] sydneyhoran commented on issue #8372: [SUPPORT] Config conflict with Deltastreamer CustomKeyGenerator - PartitionPath

2023-04-04 Thread via GitHub
sydneyhoran commented on issue #8372: URL: https://github.com/apache/hudi/issues/8372#issuecomment-1496812416 @berniedurfee-renaissance this was what I changed to make it work in my fork https://github.com/sydneyhoran/hudi/commit/b1692c6ba3901d40b0523fe5226b5c5bff51ac7f

[GitHub] [hudi] hudi-bot commented on pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8326: URL: https://github.com/apache/hudi/pull/8326#issuecomment-1496811825 ## CI report: * c102f8ebe70e59643a49f854c796a5696e1fe98f UNKNOWN * ea6470d3531faa6f32b731550d6107857819bb42 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] danny0405 commented on issue #8060: [SUPPORT] An instant exception occurs when the flink job is restarted

2023-04-04 Thread via GitHub
danny0405 commented on issue #8060: URL: https://github.com/apache/hudi/issues/8060#issuecomment-1496804288 Should be fixed via #8374; feel free to re-open it if the problem still exists.

[GitHub] [hudi] danny0405 closed issue #8060: [SUPPORT] An instant exception occurs when the flink job is restarted

2023-04-04 Thread via GitHub
danny0405 closed issue #8060: [SUPPORT] An instant exception occurs when the flink job is restarted URL: https://github.com/apache/hudi/issues/8060

[jira] [Closed] (HUDI-6030) Cleans the ckp meta while the JM restarts

2023-04-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-6030. Fix Version/s: 0.12.3 0.14.0 Resolution: Fixed Fixed via master branch: 257e1680c1

[GitHub] [hudi] danny0405 merged pull request #8374: [HUDI-6030] Cleans the ckp meta while the JM restarts

2023-04-04 Thread via GitHub
danny0405 merged PR #8374: URL: https://github.com/apache/hudi/pull/8374

[hudi] branch master updated: [HUDI-6030] Cleans the ckp meta while the JM restarts (#8374)

2023-04-04 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 257e1680c1e [HUDI-6030] Cleans the ckp meta whi

[GitHub] [hudi] nfarah86 commented on pull request #8381: Improve compaction doc

2023-04-04 Thread via GitHub
nfarah86 commented on PR #8381: URL: https://github.com/apache/hudi/pull/8381#issuecomment-1496779645 cc @bhasudha @danny0405 @nsivabalan if any of you want to review. I'll add a jira ticket shortly.

[GitHub] [hudi] nfarah86 opened a new pull request, #8381: Improve compaction doc

2023-04-04 Thread via GitHub
nfarah86 opened a new pull request, #8381: URL: https://github.com/apache/hudi/pull/8381 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact ### Documentation Update Redid the compaction documentation.

[GitHub] [hudi] hudi-bot commented on pull request #6799: [HUDI-4920] fix PartialUpdatePayload cannot return deleted record in …

2023-04-04 Thread via GitHub
hudi-bot commented on PR #6799: URL: https://github.com/apache/hudi/pull/6799#issuecomment-1496777071 ## CI report: * cacd92c2c7fe8885871b5de05fcc31d02d9ebd90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] hudi-bot commented on pull request #8303: [HUDI-5998] Speed up reads from bootstrapped tables in spark

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8303: URL: https://github.com/apache/hudi/pull/8303#issuecomment-1496736168 ## CI report: * 5bca709bbf2690b2cd3a077f8214bd6e627d8420 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] hudi-bot commented on pull request #8102: [HUDI-5880] Support partition pruning for flink streaming source in runtime

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8102: URL: https://github.com/apache/hudi/pull/8102#issuecomment-1496697343 ## CI report: * daabef5271b668f376ff2f7e82071d56c717cbbe Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] hudi-bot commented on pull request #8379: Fix async compact/clustering serdes conflicts caused by WatermarkStatus in multi versions Flink

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8379: URL: https://github.com/apache/hudi/pull/8379#issuecomment-1496649173 ## CI report: * ec51fe8c615982b08d19a8d52c392db42e239494 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] hudi-bot commented on pull request #8380: [HUDI-6033] Fix rounding exception when performing a float to decimal…

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8380: URL: https://github.com/apache/hudi/pull/8380#issuecomment-1496649201 ## CI report: * 4127079fc6162fee6b08501c700cf9b835a38d3c UNKNOWN * 73ab33274af42797b7d5ec9bfef8d7e7d81c8132 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] soumilshah1995 commented on issue #8040: [SUPPORT] Getting error when writing into MOR HUDI table if schema changed (datatype changed / column dropped)

2023-04-04 Thread via GitHub
soumilshah1995 commented on issue #8040: URL: https://github.com/apache/hudi/issues/8040#issuecomment-1496629170 I've just posted this in Slack; ideally, we can reach an agreement sooner on this issue

[jira] [Updated] (HUDI-6035) Make simple index parallelism auto inferred

2023-04-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-6035: - Component/s: index > Make simple index parallelism auto inferred > ---

[jira] [Created] (HUDI-6035) Make simple index parallelism auto inferred

2023-04-04 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-6035: Summary: Make simple index parallelism auto inferred Key: HUDI-6035 URL: https://issues.apache.org/jira/browse/HUDI-6035 Project: Apache Hudi Issue Type: Improvement

[jira] [Commented] (HUDI-3945) After the async compaction operation is complete, the task should exit.

2023-04-04 Thread Rahil Chertara (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17708576#comment-17708576 ] Rahil Chertara commented on HUDI-3945: -- Hi, I currently have my doubts on this pr and

[GitHub] [hudi] hudi-bot commented on pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
hudi-bot commented on PR #7834: URL: https://github.com/apache/hudi/pull/7834#issuecomment-1496536930 ## CI report: * b5a005f667dbcbb03c3c36297e6ba9fd4bad5d1c UNKNOWN * ea2b9b0f25b87cbf7212ce87bb9b6e9ba63ddab4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

[GitHub] [hudi] hudi-bot commented on pull request #8378: [HUDI-6031] fix bug: checkpoint lost after changing cow to mor

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8378: URL: https://github.com/apache/hudi/pull/8378#issuecomment-1496507154 ## CI report: * 145deb82cc57354563d0f0fd163ae8bbb958c808 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] parisni commented on issue #8222: [SUPPORT] Incremental read with MOR does not work as COW

2023-04-04 Thread via GitHub
parisni commented on issue #8222: URL: https://github.com/apache/hudi/issues/8222#issuecomment-1496493568 > The solution would be to perform a base + log merge first (which will consider the precombine fields), then filter for the commit range (increases the cost of the query, but will give

[GitHub] [hudi] berniedurfee-renaissance commented on issue #8372: [SUPPORT] Config conflict with Deltastreamer CustomKeyGenerator - PartitionPath

2023-04-04 Thread via GitHub
berniedurfee-renaissance commented on issue #8372: URL: https://github.com/apache/hudi/issues/8372#issuecomment-1496476454 Oh, good (bad?) timing. I just ran into this today. Any workarounds?

[GitHub] [hudi] vinothchandar commented on a diff in pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
vinothchandar commented on code in PR #8326: URL: https://github.com/apache/hudi/pull/8326#discussion_r1157662922 ## hudi-common/src/test/java/org/apache/hudi/common/testutils/TestPreCombineUtils.java: ## @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] hudi-bot commented on pull request #8374: [HUDI-6030] Cleans the ckp meta while the JM restarts

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8374: URL: https://github.com/apache/hudi/pull/8374#issuecomment-1496443062 ## CI report: * 3e9da8cbd363aca09d3be5790ce289ecbc3183e0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1612

[GitHub] [hudi] hudi-bot commented on pull request #8367: [HUDI-6023] HotFix in HoodieDynamicBoundedBloomFilter with refactor arguments in HoodieDynamicBoundedBloomFilter

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8367: URL: https://github.com/apache/hudi/pull/8367#issuecomment-1496368362 ## CI report: * 38951b92ba068d155efc85b1b38ce860bf3551d4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1609

[GitHub] [hudi] hudi-bot commented on pull request #8369: [HUDI-6024] Hotfix in MergeIntoHoodieTableCommand::validate with remove used arguments

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8369: URL: https://github.com/apache/hudi/pull/8369#issuecomment-1496368426 ## CI report: * 544fc9fba0dbf84c03353dcdaf52b7409d31af40 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1609

[GitHub] [hudi] hudi-bot commented on pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8326: URL: https://github.com/apache/hudi/pull/8326#issuecomment-1496368056 ## CI report: * 4b0c681e00e9ac437a7ff039a0cb827fd5420470 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1609

[GitHub] [hudi] hudi-bot commented on pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8326: URL: https://github.com/apache/hudi/pull/8326#issuecomment-1496318713 ## CI report: * 4b0c681e00e9ac437a7ff039a0cb827fd5420470 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1609

[GitHub] [hudi] hudi-bot commented on pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8326: URL: https://github.com/apache/hudi/pull/8326#issuecomment-1496308013 ## CI report: * 4b0c681e00e9ac437a7ff039a0cb827fd5420470 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1609

[GitHub] [hudi] yihua commented on a diff in pull request #8326: [HUDI-6006] Deprecate hoodie.payload.ordering.field

2023-04-04 Thread via GitHub
yihua commented on code in PR #8326: URL: https://github.com/apache/hudi/pull/8326#discussion_r1157494514 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodiePayloadConfig.java: ## @@ -92,7 +95,7 @@ public Builder fromProperties(Properties props) {

[jira] [Assigned] (HUDI-6025) Incremental read with MOR doesn't give correct results

2023-04-04 Thread Lokesh Jain (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain reassigned HUDI-6025: - Assignee: Lokesh Jain > Incremental read with MOR doesn't give correct results >

[GitHub] [hudi] codope commented on pull request #8289: [HUDI-5987] Fix clustering on bootstrap tables non row writer

2023-04-04 Thread via GitHub
codope commented on PR #8289: URL: https://github.com/apache/hudi/pull/8289#issuecomment-1496263554 Closing in favor of #8342 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [hudi] codope closed pull request #8289: [HUDI-5987] Fix clustering on bootstrap tables non row writer

2023-04-04 Thread via GitHub
codope closed pull request #8289: [HUDI-5987] Fix clustering on bootstrap tables non row writer URL: https://github.com/apache/hudi/pull/8289

[jira] [Assigned] (HUDI-6019) Kafka source support split by count

2023-04-04 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6019: Assignee: Kong Wei (was: Vinoth Chandar) > Kafka source support split by count > -

[jira] [Assigned] (HUDI-6019) Kafka source support split by count

2023-04-04 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-6019: Assignee: Vinoth Chandar > Kafka source support split by count > --

[GitHub] [hudi] wuwenchi commented on a diff in pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on code in PR #7834: URL: https://github.com/apache/hudi/pull/7834#discussion_r1157483783 ## hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/bulkinsert/TestRDDSimpleBucketPartitioner.java: ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apac

[GitHub] [hudi] wuwenchi commented on a diff in pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on code in PR #7834: URL: https://github.com/apache/hudi/pull/7834#discussion_r1157483471 ## hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/bulkinsert/TestRDDSimpleBucketPartitioner.java: ## @@ -0,0 +1,115 @@ +/* + * Licensed to the Apac

[GitHub] [hudi] wuwenchi commented on a diff in pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on code in PR #7834: URL: https://github.com/apache/hudi/pull/7834#discussion_r1157483229 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/RDDSimpleBucketPartitioner.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache S

[GitHub] [hudi] wuwenchi commented on a diff in pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on code in PR #7834: URL: https://github.com/apache/hudi/pull/7834#discussion_r1157482940 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/BucketIndexPartitioner.java: ## @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] wuwenchi commented on a diff in pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on code in PR #7834: URL: https://github.com/apache/hudi/pull/7834#discussion_r1157482664 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/BulkInsertInternalPartitionerFactory.java: ## @@ -38,9 +38,12 @@ public static BulkIns

[GitHub] [hudi] wuwenchi commented on pull request #7834: [HUDI-5690] Add simpleBucketPartitioner to support using the simple bucket index under bulkinsert

2023-04-04 Thread via GitHub
wuwenchi commented on PR #7834: URL: https://github.com/apache/hudi/pull/7834#issuecomment-1496245214 > @wuwenchi : Can you look at the PR comments and address them when you get a chance. @bvaradar Sorry for the delay... Modified some, but there seems to be a conflict, I will solve

[GitHub] [hudi] hudi-bot commented on pull request #8344: [HUDI-5968] Fix global index duplicate when update partition

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8344: URL: https://github.com/apache/hudi/pull/8344#issuecomment-1496243696 ## CI report: * f021bc3227eea58d049420227564b9c98589534e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1611

[GitHub] [hudi] hudi-bot commented on pull request #6799: [HUDI-4920] fix PartialUpdatePayload cannot return deleted record in …

2023-04-04 Thread via GitHub
hudi-bot commented on PR #6799: URL: https://github.com/apache/hudi/pull/6799#issuecomment-1496227262 ## CI report: * 031a41f7274ca2c9451fcdfadb88a0f829e23e49 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1602

[GitHub] [hudi] hudi-bot commented on pull request #6799: [HUDI-4920] fix PartialUpdatePayload cannot return deleted record in …

2023-04-04 Thread via GitHub
hudi-bot commented on PR #6799: URL: https://github.com/apache/hudi/pull/6799#issuecomment-1496216107 ## CI report: * 031a41f7274ca2c9451fcdfadb88a0f829e23e49 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1602

[GitHub] [hudi] danny0405 commented on a diff in pull request #8374: [HUDI-6030] Cleans the ckp meta while the JM restarts

2023-04-04 Thread via GitHub
danny0405 commented on code in PR #8374: URL: https://github.com/apache/hudi/pull/8374#discussion_r1157440398 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java: ## @@ -203,6 +203,11 @@ public boolean isAborted(String instant) { ret

[GitHub] [hudi] xccui commented on a diff in pull request #8374: [HUDI-6030] Cleans the ckp meta while the JM restarts

2023-04-04 Thread via GitHub
xccui commented on code in PR #8374: URL: https://github.com/apache/hudi/pull/8374#discussion_r1157363612 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java: ## @@ -203,6 +203,11 @@ public boolean isAborted(String instant) { return

[GitHub] [hudi] xccui commented on issue #8325: [SUPPORT] spark read hudi error: Unable to instantiate HFileBootstrapIndex

2023-04-04 Thread via GitHub
xccui commented on issue #8325: URL: https://github.com/apache/hudi/issues/8325#issuecomment-1496179790 > > I hit the same exception in a Flink writer job. It happened when the job was trying to recover from a failure. > > Hudi version: 0.13.0 Flink version: 1.16.1 > > have you try

[jira] [Commented] (HUDI-6025) Incremental read with MOR doesn't give correct results

2023-04-04 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17708463#comment-17708463 ] Vinoth Chandar commented on HUDI-6025: -- Added an explanation here. [https://github.c

[GitHub] [hudi] vinothchandar commented on issue #8222: [SUPPORT] Incremental read with MOR does not work as COW

2023-04-04 Thread via GitHub
vinothchandar commented on issue #8222: URL: https://github.com/apache/hudi/issues/8222#issuecomment-1496166005 cc @lokeshj1703 who's owning the JIRA

[GitHub] [hudi] vinothchandar commented on issue #8222: [SUPPORT] Incremental read with MOR does not work as COW

2023-04-04 Thread via GitHub
vinothchandar commented on issue #8222: URL: https://github.com/apache/hudi/issues/8222#issuecomment-1496165517 @parisni To clarify the semantics a bit. Incremental query provides all the records that changed between a start and end commit time range. If there are multiple writes (CoW) or m

[GitHub] [hudi] hudi-bot commented on pull request #8303: [HUDI-5998] Speed up reads from bootstrapped tables in spark

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8303: URL: https://github.com/apache/hudi/pull/8303#issuecomment-1496150469 ## CI report: * 78befd976f513cad8901e3c0e5fc97eabef19da2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1599

[GitHub] [hudi] sydneyhoran commented on issue #8372: [SUPPORT] Config conflict with Deltastreamer CustomKeyGenerator - PartitionPath

2023-04-04 Thread via GitHub
sydneyhoran commented on issue #8372: URL: https://github.com/apache/hudi/issues/8372#issuecomment-1496143784 It seems to be related to splitting the string in [SparkKeyGenUtils.scala#L47](https://github.com/apache/hudi/blob/9288fdc456f9a4215d32908756a4ddaee18abfc4/hudi-client/hudi-spark-cli

[GitHub] [hudi] hudi-bot commented on pull request #8376: [HUDI-6019] support split kafka source by count

2023-04-04 Thread via GitHub
hudi-bot commented on PR #8376: URL: https://github.com/apache/hudi/pull/8376#issuecomment-1496135401 ## CI report: * 8f41636a0533ef541e8da93188a58b9c451fa112 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1611
