Re: [PR] [HUDI-7710] Use compaction.requested during conflict resolution [hudi]
hudi-bot commented on PR #11151: URL: https://github.com/apache/hudi/pull/11151#issuecomment-2094649256 ## CI report: * c68b630ed8b878dffc4df1f1074d6fa3987899d9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23663) * 2985ea2ec2f8a0b62086a6ac9933654051a65738 UNKNOWN * bd09a1b36becfbcdc75195427148eb948e384ac5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23664) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7710] Use compaction.requested during conflict resolution [hudi]
hudi-bot commented on PR #11151: URL: https://github.com/apache/hudi/pull/11151#issuecomment-2094647826 ## CI report: * c68b630ed8b878dffc4df1f1074d6fa3987899d9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23663) * 2985ea2ec2f8a0b62086a6ac9933654051a65738 UNKNOWN * bd09a1b36becfbcdc75195427148eb948e384ac5 UNKNOWN
Re: [PR] [HUDI-7710] Use compaction.requested during conflict resolution [hudi]
hudi-bot commented on PR #11151: URL: https://github.com/apache/hudi/pull/11151#issuecomment-2094646391 ## CI report: * c68b630ed8b878dffc4df1f1074d6fa3987899d9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23663) * 2985ea2ec2f8a0b62086a6ac9933654051a65738 UNKNOWN
Re: [PR] [HUDI-7710] Use compaction.requested during conflict resolution [hudi]
hudi-bot commented on PR #11151: URL: https://github.com/apache/hudi/pull/11151#issuecomment-2094637684 ## CI report: * c68b630ed8b878dffc4df1f1074d6fa3987899d9 UNKNOWN
[PR] [HUDI-7710] Replace compaction.inflight during conflict resolution [hudi]
linliu-code opened a new pull request, #11151:
URL: https://github.com/apache/hudi/pull/11151

### Change Logs

During conflict resolution between an ingestion writer and a compaction, if the compaction is in the `inflight` state, the original logic tries to extract the compaction plan from the inflight file. That plan is null, which causes a NullPointerException (NPE). Therefore, we return the `requested` instant instead.

### Impact

Fixes a bug.

### Risk level (write none, low medium or high below)

Low.

### Documentation Update

_Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none"._

- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
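A minimal sketch of the failure and the fix described in the change log above, under stated assumptions: `Instant`, `State`, and the in-memory `timeline` map are hypothetical stand-ins and not Hudi's classes, and the sketch assumes that only the `requested` file carries a plan.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified, hypothetical stand-ins for timeline concepts; these are not Hudi's classes.
final class CompactionPlanLookupSketch {
  enum State { REQUESTED, INFLIGHT }

  static final class Instant {
    final String timestamp;
    final String action;
    final State state;

    Instant(String timestamp, String action, State state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }

    // Mirrors the "<timestamp>.<action>.<state>" naming used in the discussion.
    String fileName() {
      return timestamp + "." + action + "." + state.name().toLowerCase(Locale.ROOT);
    }
  }

  // File name -> serialized plan. Assumption: only the requested file carries a plan.
  private final Map<String, String> timeline = new HashMap<>();

  void write(Instant instant, String plan) {
    timeline.put(instant.fileName(), plan);
  }

  // Naive lookup: reads whatever file matches the instant's own state.
  // For an inflight compaction this yields null, the source of the NPE.
  String readPlanNaive(Instant instant) {
    return timeline.get(instant.fileName());
  }

  // Lookup that maps an inflight compaction back to its requested instant first.
  String readPlan(Instant instant) {
    Instant lookup = instant;
    if ("compaction".equals(instant.action) && instant.state == State.INFLIGHT) {
      lookup = new Instant(instant.timestamp, instant.action, State.REQUESTED);
    }
    return timeline.get(lookup.fileName());
  }

  public static void main(String[] args) {
    CompactionPlanLookupSketch sketch = new CompactionPlanLookupSketch();
    sketch.write(new Instant("001", "compaction", State.REQUESTED), "plan-for-001");
    sketch.write(new Instant("001", "compaction", State.INFLIGHT), null);

    Instant inflight = new Instant("001", "compaction", State.INFLIGHT);
    System.out.println(sketch.readPlanNaive(inflight)); // prints "null"
    System.out.println(sketch.readPlan(inflight));      // prints "plan-for-001"
  }
}
```

Any caller that dereferences the result of the naive lookup would hit the NPE; redirecting to the requested instant keeps the compaction visible to conflict resolution.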
Re: [PR] [HUDI-7701] Metadata table initialization with pending instants [hudi]
hudi-bot commented on PR #11137: URL: https://github.com/apache/hudi/pull/11137#issuecomment-2094573804 ## CI report: * adc1380cb496881fd2f1c8b30aa059759c7c5c9c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23662)
Re: [PR] [HUDI-7701] Metadata table initialization with pending instants [hudi]
hudi-bot commented on PR #11137: URL: https://github.com/apache/hudi/pull/11137#issuecomment-2094535016 ## CI report: * a668de4b47df64e2d09b8c1bd0a172271c41a7e3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23644) * adc1380cb496881fd2f1c8b30aa059759c7c5c9c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23662)
Re: [PR] [HUDI-7701] Metadata table initialization with pending instants [hudi]
hudi-bot commented on PR #11137: URL: https://github.com/apache/hudi/pull/11137#issuecomment-2094533717 ## CI report: * a668de4b47df64e2d09b8c1bd0a172271c41a7e3 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23644) * adc1380cb496881fd2f1c8b30aa059759c7c5c9c UNKNOWN
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
the-other-tim-brown commented on code in PR #11150:
URL: https://github.com/apache/hudi/pull/11150#discussion_r1590187272

## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java: ##
@@ -457,9 +457,10 @@ private Dataset<Row> readRecordsForGroupAsRow(JavaSparkContext jsc,
 String readPathString = String.join(",", Arrays.stream(paths).map(StoragePath::toString).toArray(String[]::new));
+String globPathString = String.join(",", Arrays.stream(paths).map(StoragePath::getParent).map(StoragePath::toString).distinct().toArray(String[]::new));
 params.put("hoodie.datasource.read.paths", readPathString);
 // Building HoodieFileIndex needs this param to decide query path
-params.put("glob.paths", readPathString);
+params.put("glob.paths", globPathString);

Review Comment:
I can't find a test class matching this class name. Is there a clustering test suite I should look in?
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
danny0405 commented on code in PR #11150:
URL: https://github.com/apache/hudi/pull/11150#discussion_r1590186996

## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java: ##
@@ -457,9 +457,10 @@ private Dataset<Row> readRecordsForGroupAsRow(JavaSparkContext jsc,
 String readPathString = String.join(",", Arrays.stream(paths).map(StoragePath::toString).toArray(String[]::new));
+String globPathString = String.join(",", Arrays.stream(paths).map(StoragePath::getParent).map(StoragePath::toString).distinct().toArray(String[]::new));
 params.put("hoodie.datasource.read.paths", readPathString);
 // Building HoodieFileIndex needs this param to decide query path
-params.put("glob.paths", readPathString);
+params.put("glob.paths", globPathString);

Review Comment:
Do we have any test cases?
Re: [PR] [HUDI-7710] Remove compaction.inflight from conflict resolution [hudi]
danny0405 commented on code in PR #11148:
URL: https://github.com/apache/hudi/pull/11148#discussion_r1590186936

## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/SimpleConcurrentFileWritesConflictResolutionStrategy.java: ##
@@ -68,6 +69,7 @@ public Stream<HoodieInstant> getCandidateInstants(HoodieTableMetaClient metaClie
 .getTimelineOfActions(CollectionUtils.createSet(REPLACE_COMMIT_ACTION, COMPACTION_ACTION))
 .findInstantsAfter(currentInstant.getTimestamp())
 .filterInflightsAndRequested()
+.filter(i -> (!i.getAction().equals(COMPACTION_ACTION)) || i.getState().equals(REQUESTED))
 .getInstantsAsStream();

Review Comment:
I guess that if the compaction has not really executed yet, there is no need to resolve the conflicts, because the log files would be sliced based on their specific completion time. If there are no conflicts on the same file group from multiple writers, then we are good. @linliu-code, we can add some test cases to illustrate this.
Re: [PR] update fork count [hudi]
hudi-bot commented on PR #11107: URL: https://github.com/apache/hudi/pull/11107#issuecomment-2094397899 ## CI report: * 48122188bc0ee8f85d1d14aee3d5c320f2fb7b29 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23661)
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094365713 ## CI report: * 353708c54b454bf3749596f74267970f1c332b7b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23660)
Re: [PR] update fork count [hudi]
hudi-bot commented on PR #11107: URL: https://github.com/apache/hudi/pull/11107#issuecomment-2094354296 ## CI report: * 9757330d1ed3ff1afb3bc1b08b0f3ece78917045 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23539) * 48122188bc0ee8f85d1d14aee3d5c320f2fb7b29 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23661)
Re: [PR] update fork count [hudi]
hudi-bot commented on PR #11107: URL: https://github.com/apache/hudi/pull/11107#issuecomment-2094352389 ## CI report: * 9757330d1ed3ff1afb3bc1b08b0f3ece78917045 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23539) * 48122188bc0ee8f85d1d14aee3d5c320f2fb7b29 UNKNOWN
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142: URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094350107 ## CI report: * 4d3fc1c3ff0254f545803f93ed361e448245ffaa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23659)
Re: [PR] [HUDI-7710] Remove compaction.inflight from conflict resolution [hudi]
yihua commented on code in PR #11148:
URL: https://github.com/apache/hudi/pull/11148#discussion_r1590072865

## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/SimpleConcurrentFileWritesConflictResolutionStrategy.java: ##
@@ -68,6 +69,7 @@ public Stream<HoodieInstant> getCandidateInstants(HoodieTableMetaClient metaClie
 .getTimelineOfActions(CollectionUtils.createSet(REPLACE_COMMIT_ACTION, COMPACTION_ACTION))
 .findInstantsAfter(currentInstant.getTimestamp())
 .filterInflightsAndRequested()
+.filter(i -> (!i.getAction().equals(COMPACTION_ACTION)) || i.getState().equals(REQUESTED))
 .getInstantsAsStream();

Review Comment:
@linliu-code We still need to check the compaction for conflicts, correct? So instead of filtering out `compaction.inflight`, should we convert `instant.compaction.inflight` to `instant.compaction.requested` for the check? Can you write a test case for this?
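The two options weighed in this review thread can be contrasted with a small sketch. It assumes a hypothetical `Pending` type and plain string constants rather than Hudi's `HoodieInstant` and timeline API; only the filter predicate itself follows the diff above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins; not Hudi's HoodieInstant or timeline classes.
final class CandidateInstantSketch {
  static final String COMPACTION = "compaction";
  static final String REQUESTED = "requested";
  static final String INFLIGHT = "inflight";

  static final class Pending {
    final String timestamp;
    final String action;
    final String state;

    Pending(String timestamp, String action, String state) {
      this.timestamp = timestamp;
      this.action = action;
      this.state = state;
    }

    @Override
    public String toString() {
      return timestamp + "." + action + "." + state;
    }
  }

  // Option in the diff: keep non-compaction instants, and compactions only when requested.
  // An inflight compaction disappears from the candidate list.
  static List<String> filterOutInflightCompaction(List<Pending> pending) {
    return pending.stream()
        .filter(i -> !i.action.equals(COMPACTION) || i.state.equals(REQUESTED))
        .map(Pending::toString)
        .collect(Collectors.toList());
  }

  // Option raised in review: keep inflight compactions, but view them as requested
  // so they still take part in the conflict check.
  static List<String> convertInflightCompaction(List<Pending> pending) {
    return pending.stream()
        .map(i -> i.action.equals(COMPACTION) && i.state.equals(INFLIGHT)
            ? new Pending(i.timestamp, i.action, REQUESTED)
            : i)
        .map(Pending::toString)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Pending> pending = Arrays.asList(
        new Pending("002", "replacecommit", INFLIGHT),
        new Pending("003", COMPACTION, INFLIGHT),
        new Pending("004", COMPACTION, REQUESTED));
    System.out.println(filterOutInflightCompaction(pending));
    System.out.println(convertInflightCompaction(pending));
  }
}
```

With the sample input, the filter variant drops instant `003` entirely, while the conversion variant keeps it as `003.compaction.requested`.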
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094339138 ## CI report: * 11abd3eb1b9418d9013f820e3779f56c50810dfd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23658) * 353708c54b454bf3749596f74267970f1c332b7b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23660)
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094337137 ## CI report: * 11abd3eb1b9418d9013f820e3779f56c50810dfd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23658) * 353708c54b454bf3749596f74267970f1c332b7b UNKNOWN
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094335218 ## CI report: * 11abd3eb1b9418d9013f820e3779f56c50810dfd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23658)
Re: [PR] [HUDI-7711] Fix MultiTableStreamer can deal with path of properties files [hudi]
hudi-bot commented on PR #11149: URL: https://github.com/apache/hudi/pull/11149#issuecomment-2094335208 ## CI report: * bb03750d5e951785c6205f501d463614cc3315cf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23657)
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142: URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094335199 ## CI report: * 7bb8334732cdbdd3cba9868ea66c2c0817559981 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23656) * 4d3fc1c3ff0254f545803f93ed361e448245ffaa Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23659)
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094313377 ## CI report: * 11abd3eb1b9418d9013f820e3779f56c50810dfd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23658)
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142: URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094313307 ## CI report: * 7bb8334732cdbdd3cba9868ea66c2c0817559981 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23656) * 4d3fc1c3ff0254f545803f93ed361e448245ffaa UNKNOWN
Re: [PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
hudi-bot commented on PR #11150: URL: https://github.com/apache/hudi/pull/11150#issuecomment-2094309158 ## CI report: * 11abd3eb1b9418d9013f820e3779f56c50810dfd UNKNOWN
Re: [PR] [HUDI-7711] Fix MultiTableStreamer can deal with path of properties files [hudi]
hudi-bot commented on PR #11149: URL: https://github.com/apache/hudi/pull/11149#issuecomment-2094305741 ## CI report: * bb03750d5e951785c6205f501d463614cc3315cf Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23657)
[PR] [MINOR] Use parent as the glob path when full file path specified [hudi]
the-other-tim-brown opened a new pull request, #11150:
URL: https://github.com/apache/hudi/pull/11150

### Change Logs

- Fix usages of the glob paths in clustering and metadata writing so that they take partition-level paths instead of file-level paths

### Impact

- Fixes a bug where listing calls are issued per file instead of per partition

### Risk level (write none, low medium or high below)

low

### Documentation Update

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
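To illustrate why partition-level glob paths reduce listing calls, here is a rough sketch that derives the distinct parent directories from a list of file paths. It works on plain strings; `parentOf` and `parentGlobPaths` are hypothetical helpers, and the real change uses `StoragePath::getParent` as shown in the review diff.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical string-based helper; the actual PR operates on Hudi's StoragePath.
final class GlobPathSketch {
  // Returns everything before the last '/', or the path itself if it has no parent segment.
  static String parentOf(String path) {
    int idx = path.lastIndexOf('/');
    return idx > 0 ? path.substring(0, idx) : path;
  }

  // Joins the distinct parent directories, preserving first-seen order.
  static String parentGlobPaths(List<String> filePaths) {
    return filePaths.stream()
        .map(GlobPathSketch::parentOf)
        .distinct()
        .collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList(
        "/tbl/p1/a.parquet",
        "/tbl/p1/b.parquet",
        "/tbl/p2/c.parquet");
    // Three file paths collapse into two partition paths.
    System.out.println(parentGlobPaths(files)); // prints "/tbl/p1,/tbl/p2"
  }
}
```

A consumer that lists each glob entry would then issue one listing per partition rather than one per file.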
Re: [PR] [HUDI-7711] Fix MultiTableStreamer can deal with path of properties files [hudi]
hudi-bot commented on PR #11149: URL: https://github.com/apache/hudi/pull/11149#issuecomment-2094288851 ## CI report: * bb03750d5e951785c6205f501d463614cc3315cf UNKNOWN
[jira] [Updated] (HUDI-7711) Fix MultiTableStreamer can deal with path of properties file for each streamer
[ https://issues.apache.org/jira/browse/HUDI-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jihwan Lee updated HUDI-7711:
Description:

HoodieMultiTableStreamer initializes the common configs, then deep-copies the related fields into each streamer.
Because _propsFilePath_ is not handled for each streamer, every streamer always falls back to the default value, which is the path of a test properties file.

Also, when MultiTableStreamer is run with {_}--hoodie-conf{_}, each streamer should inherit those configs.

MultiTable configs (kafka-source.properties):
{code:java}
...
hoodie.streamer.ingestion.tablesToBeIngested=db.tbl1,db.tbl2
hoodie.streamer.ingestion.db.tbl1.configFile=hdfs:///tmp/config_1.properties
hoodie.streamer.ingestion.db.tbl2.configFile=hdfs:///tmp/config_2.properties
...
{code}

/tmp/config_1.properties:
{code:java}
...
hoodie.datasource.write.recordkey.field=id
hoodie.streamer.source.kafka.topic=topic1
...
{code}

/tmp/config_2.properties:
{code:java}
...
hoodie.datasource.write.recordkey.field=id
hoodie.streamer.source.kafka.topic=topic2
...
{code}

Error log (the workspace path is replaced with \{RUNNING_PATH}):
{code:java}
24/05/04 21:41:01 ERROR config.DFSPropertiesConfiguration: Error reading in properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
24/05/04 21:41:01 INFO streamer.StreamSync: Shutting down embedded timeline server
24/05/04 21:41:01 ERROR streamer.HoodieMultiTableStreamer: error while running MultiTableDeltaStreamer for table: {TABLE}
org.apache.hudi.exception.HoodieIOException: Cannot read properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
    at org.apache.hudi.common.config.DFSPropertiesConfiguration.addPropsFromFile(DFSPropertiesConfiguration.java:168)
    at org.apache.hudi.common.config.DFSPropertiesConfiguration.<init>(DFSPropertiesConfiguration.java:87)
    at org.apache.hudi.utilities.UtilHelpers.readConfig(UtilHelpers.java:258)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$Config.getProps(HoodieStreamer.java:453)
    at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:714)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:676)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:568)
    at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:540)
    at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:444)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:874)
    at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:216)
    at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.sync(HoodieMultiTableStreamer.java:457)
    at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.main(HoodieMultiTableStreamer.java:282)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: File file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:146)
    at org.apache.hadoop.fs.Chec
[PR] [HUDI-7711] Fix MultiTableStreamer can deal with path of properties files [hudi]
hwani3142 opened a new pull request, #11149:
URL: https://github.com/apache/hudi/pull/11149

### Change Logs

Fix the copy logic in MultiTableStreamer.

### Impact

HoodieMultiTableStreamer

### Risk level (write none, low medium or high below)

low

### Documentation Update

none

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
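The behavior requested in HUDI-7711 (each streamer resolves its own properties file and inherits `--hoodie-conf` values) might be sketched as below. This is an illustration only: `configFileKey` and `effectiveProps` are hypothetical helpers, no file is actually loaded, and the precedence order (table file over common configs, command-line overrides last) is an assumption rather than something the PR states.

```java
import java.util.Properties;

// Hypothetical helpers; not the HoodieMultiTableStreamer implementation.
final class PerTablePropsSketch {
  // Key pattern taken from the Jira example: hoodie.streamer.ingestion.<db>.<table>.configFile
  static String configFileKey(String db, String table) {
    return "hoodie.streamer.ingestion." + db + "." + table + ".configFile";
  }

  // Assumed precedence: common configs first, then the table's own file,
  // then command-line overrides (--hoodie-conf) applied last.
  static Properties effectiveProps(Properties common, Properties tableSpecific, Properties cliOverrides) {
    Properties merged = new Properties();
    merged.putAll(common);
    merged.putAll(tableSpecific);
    merged.putAll(cliOverrides);
    return merged;
  }

  public static void main(String[] args) {
    Properties common = new Properties();
    common.setProperty(configFileKey("db", "tbl1"), "hdfs:///tmp/config_1.properties");

    // Stand-in for the contents of /tmp/config_1.properties.
    Properties tbl1 = new Properties();
    tbl1.setProperty("hoodie.datasource.write.recordkey.field", "id");
    tbl1.setProperty("hoodie.streamer.source.kafka.topic", "topic1");

    Properties merged = effectiveProps(common, tbl1, new Properties());
    System.out.println(common.getProperty(configFileKey("db", "tbl1")));
    System.out.println(merged.getProperty("hoodie.streamer.source.kafka.topic"));
  }
}
```

The point of the sketch is that the per-table path comes from the table's own `configFile` entry instead of a shared default.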
[jira] [Updated] (HUDI-7711) Fix MultiTableStreamer can deal with path of properties file for each streamer
[ https://issues.apache.org/jira/browse/HUDI-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-7711:
---------------------------------
    Labels: pull-request-available  (was: )

> Fix MultiTableStreamer can deal with path of properties file for each streamer
> ------------------------------------------------------------------------------
>
> Key: HUDI-7711
> URL: https://issues.apache.org/jira/browse/HUDI-7711
> Project: Apache Hudi
> Issue Type: Bug
> Components: hudi-utilities
> Environment: hudi0.14.1, Spark3.2
> Reporter: Jihwan Lee
> Priority: Major
> Labels: pull-request-available
>
> HoodieMultiTableStreamer initializes the common configs, then deep-copies the
> related fields into each streamer.
> Because _propsFilePath_ is not handled for each streamer, every streamer falls
> back to the default value, which is the path of a test file.
>
> Also, when MultiTableStreamer is run with {_}--hoodie-conf{_}, each streamer
> should receive those configs as well (similar to inheritance).
>
> MultiTable configs (kafka-source.properties):
>
> {code:java}
> ...
> hoodie.streamer.ingestion.tablesToBeIngested=db.tbl1,db.tbl2
> hoodie.streamer.ingestion.db.tbl1.configFile=hdfs:///tmp/config_1.properties
> hoodie.streamer.ingestion.db.tbl2.configFile=hdfs:///tmp/config_2.properties
> ...
> {code}
>
> /tmp/config_1.properties:
>
> {code:java}
> ...
> hoodie.datasource.write.recordkey.field=id
> hoodie.streamer.source.kafka.topic=topic1
> ...
> {code}
>
> /tmp/config_2.properties:
>
> {code:java}
> ...
> hoodie.datasource.write.recordkey.field=id
> hoodie.streamer.source.kafka.topic=topic2
> ...
> {code}
>
> Error log (the workspace path is replaced with \{RUNNING_PATH}):
>
> {code:java}
> 24/05/04 21:41:01 ERROR config.DFSPropertiesConfiguration: Error reading in properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
> 24/05/04 21:41:01 INFO streamer.StreamSync: Shutting down embedded timeline server
> 24/05/04 21:41:01 ERROR streamer.HoodieMultiTableStreamer: error while running MultiTableDeltaStreamer for table: review_processed_data
> org.apache.hudi.exception.HoodieIOException: Cannot read properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
>     at org.apache.hudi.common.config.DFSPropertiesConfiguration.addPropsFromFile(DFSPropertiesConfiguration.java:168)
>     at org.apache.hudi.common.config.DFSPropertiesConfiguration.<init>(DFSPropertiesConfiguration.java:87)
>     at org.apache.hudi.utilities.UtilHelpers.readConfig(UtilHelpers.java:258)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer$Config.getProps(HoodieStreamer.java:453)
>     at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:714)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:676)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:568)
>     at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:540)
>     at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:444)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:874)
>     at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
>     at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:216)
>     at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.sync(HoodieMultiTableStreamer.java:457)
>     at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.main(HoodieMultiTableStreamer.java:282)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>
[jira] [Created] (HUDI-7711) Fix MultiTableStreamer can deal with path of properties file for each streamer
Jihwan Lee created HUDI-7711:
--------------------------------

Summary: Fix MultiTableStreamer can deal with path of properties file for each streamer
Key: HUDI-7711
URL: https://issues.apache.org/jira/browse/HUDI-7711
Project: Apache Hudi
Issue Type: Bug
Components: hudi-utilities
Environment: hudi0.14.1, Spark3.2
Reporter: Jihwan Lee

HoodieMultiTableStreamer initializes the common configs, then deep-copies the related fields into each streamer.
Because _propsFilePath_ is not handled for each streamer, every streamer falls back to the default value, which is the path of a test file.

Also, when MultiTableStreamer is run with {_}--hoodie-conf{_}, each streamer should receive those configs as well (similar to inheritance).

MultiTable configs (kafka-source.properties):

{code:java}
...
hoodie.streamer.ingestion.tablesToBeIngested=db.tbl1,db.tbl2
hoodie.streamer.ingestion.db.tbl1.configFile=hdfs:///tmp/config_1.properties
hoodie.streamer.ingestion.db.tbl2.configFile=hdfs:///tmp/config_2.properties
...
{code}

/tmp/config_1.properties:

{code:java}
...
hoodie.datasource.write.recordkey.field=id
hoodie.streamer.source.kafka.topic=topic1
...
{code}

/tmp/config_2.properties:

{code:java}
...
hoodie.datasource.write.recordkey.field=id
hoodie.streamer.source.kafka.topic=topic2
...
{code}

Error log (the workspace path is replaced with \{RUNNING_PATH}):

{code:java}
24/05/04 21:41:01 ERROR config.DFSPropertiesConfiguration: Error reading in properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
24/05/04 21:41:01 INFO streamer.StreamSync: Shutting down embedded timeline server
24/05/04 21:41:01 ERROR streamer.HoodieMultiTableStreamer: error while running MultiTableDeltaStreamer for table: review_processed_data
org.apache.hudi.exception.HoodieIOException: Cannot read properties from dfs from file file:{RUNNING_PATH}/src/test/resources/streamer-config/dfs-source.properties
    at org.apache.hudi.common.config.DFSPropertiesConfiguration.addPropsFromFile(DFSPropertiesConfiguration.java:168)
    at org.apache.hudi.common.config.DFSPropertiesConfiguration.<init>(DFSPropertiesConfiguration.java:87)
    at org.apache.hudi.utilities.UtilHelpers.readConfig(UtilHelpers.java:258)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$Config.getProps(HoodieStreamer.java:453)
    at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:714)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:676)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:568)
    at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:540)
    at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:444)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:874)
    at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:216)
    at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.sync(HoodieMultiTableStreamer.java:457)
    at org.apache.hudi.utilities.streamer.HoodieMultiTableStreamer.main(HoodieMultiTableStreamer.java:282)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: File file:/home1/irteam/user/jihwan/hudi-util/multi_review/src/test/resources/streamer-config/dfs-source.properties does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
    at org.apache.hadoop.fs.RawLocalFi
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142:
URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094105336

## CI report:

* 7bb8334732cdbdd3cba9868ea66c2c0817559981 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23656)
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142:
URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094089132

## CI report:

* fd5383cabb77ad3afc075ee1545e65c7e0613855 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23638)
* 7bb8334732cdbdd3cba9868ea66c2c0817559981 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23656)
[jira] [Closed] (HUDI-7710) BugFix: Remove compaction.inflight from conflict resolution
[ https://issues.apache.org/jira/browse/HUDI-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu closed HUDI-7710.
-------------------------
    Resolution: Fixed

> BugFix: Remove compaction.inflight from conflict resolution
> -----------------------------------------------------------
>
> Key: HUDI-7710
> URL: https://issues.apache.org/jira/browse/HUDI-7710
> Project: Apache Hudi
> Issue Type: Improvement
> Components: compaction
> Reporter: Lin Liu
> Assignee: Lin Liu
> Priority: Critical
> Labels: pull-request-available
>
> During conflict resolution, compaction.inflight instants can be picked up; since they
> do not contain any plan information, this can cause an NPE.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
Re: [PR] [HUDI-7652] Add new `HoodieMergeKey` API to support simple and composite keys [hudi]
danny0405 commented on code in PR #11077:
URL: https://github.com/apache/hudi/pull/11077#discussion_r1589929342

## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieSimpleMergeKey.java: ##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.model;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Wraps {@link HoodieKey} and implements the {@link HoodieMergeKey} interface for simple scenarios where the key is a string.
+ */
+public class HoodieSimpleMergeKey implements HoodieMergeKey {

Review Comment:
I'm talking about the notion instead of the physical impl.
Re: [PR] [HUDI-7652] Add new `HoodieMergeKey` API to support simple and composite keys [hudi]
danny0405 commented on code in PR #11077:
URL: https://github.com/apache/hudi/pull/11077#discussion_r1589929165

## hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java: ##
@@ -81,7 +83,7 @@ public class HoodieMergedLogRecordScanner extends AbstractHoodieLogRecordReader
   // A timer for calculating elapsed time in millis
   public final HoodieTimer timer = HoodieTimer.create();
   // Map of compacted/merged records
-  private final ExternalSpillableMap records;
+  private final ExternalSpillableMap records;

Review Comment:
> we do a separate class hierarchy and not overload HoodieKey

Not sure about the specific background here, but making the `ExternalSpillableMap` key `Serializable` does not overload anything, does it?
Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]
hudi-bot commented on PR #11142:
URL: https://github.com/apache/hudi/pull/11142#issuecomment-2094076123

## CI report:

* fd5383cabb77ad3afc075ee1545e65c7e0613855 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23638)
* 7bb8334732cdbdd3cba9868ea66c2c0817559981 UNKNOWN
(hudi) branch master updated (8911aa2d3c7 -> 1c7f8376ade)
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

    from 8911aa2d3c7 [HUDI-7576] Improve efficiency of getRelativePartitionPath, reduce computation of partitionPath in AbstractTableFileSystemView (#11001)
     add 1c7f8376ade [HUDI-7710] Remove compaction.inflight from conflict resolution (#11148)

No new revisions were added by this update.

Summary of changes:
 .../SimpleConcurrentFileWritesConflictResolutionStrategy.java | 2 ++
 1 file changed, 2 insertions(+)
Re: [PR] [HUDI-7710] Remove `compaction.inflight` from conflict resolution [hudi]
danny0405 merged PR #11148:
URL: https://github.com/apache/hudi/pull/11148
Re: [PR] [HUDI-7710] Remove `compaction.inflight` from conflict resolution [hudi]
danny0405 commented on code in PR #11148:
URL: https://github.com/apache/hudi/pull/11148#discussion_r1589927822

## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/SimpleConcurrentFileWritesConflictResolutionStrategy.java: ##
@@ -68,6 +69,7 @@ public Stream getCandidateInstants(HoodieTableMetaClient metaClie
         .getTimelineOfActions(CollectionUtils.createSet(REPLACE_COMMIT_ACTION, COMPACTION_ACTION))
         .findInstantsAfter(currentInstant.getTimestamp())
         .filterInflightsAndRequested()
+        .filter(i -> (!i.getAction().equals(COMPACTION_ACTION)) || i.getState().equals(REQUESTED))
         .getInstantsAsStream();

Review Comment:
Looks reasonable.
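The predicate added in the hunk above keeps every non-compaction instant and, for compaction, only those in the REQUESTED state. A minimal sketch of that behaviour on a simplified model follows; the `Instant` class, the `State` enum and the string values of the action constants are stand-ins assumed for illustration, not Hudi's `HoodieInstant` or timeline API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CompactionCandidateFilterSketch {

    enum State { REQUESTED, INFLIGHT, COMPLETED }

    // Stand-in action names, assumed for this sketch.
    static final String COMPACTION_ACTION = "compaction";
    static final String REPLACE_COMMIT_ACTION = "replacecommit";

    /** Hypothetical, minimal instant model; NOT Hudi's HoodieInstant. */
    static final class Instant {
        final String timestamp;
        final String action;
        final State state;

        Instant(String timestamp, String action, State state) {
            this.timestamp = timestamp;
            this.action = action;
            this.state = state;
        }
    }

    /** Same shape as the predicate in the diff: drop compaction instants unless REQUESTED. */
    static boolean isCandidate(Instant i) {
        return !i.action.equals(COMPACTION_ACTION) || i.state == State.REQUESTED;
    }

    static List<Instant> candidates(List<Instant> pendingInstants) {
        return pendingInstants.stream()
            .filter(CompactionCandidateFilterSketch::isCandidate)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> pending = Arrays.asList(
            new Instant("001", COMPACTION_ACTION, State.REQUESTED),
            new Instant("002", COMPACTION_ACTION, State.INFLIGHT),      // filtered out
            new Instant("003", REPLACE_COMMIT_ACTION, State.INFLIGHT)); // kept, not a compaction

        for (Instant i : candidates(pending)) {
            System.out.println(i.timestamp + " " + i.action + " " + i.state);
        }
    }
}
```

Per HUDI-7710, the inflight compaction file carries no plan information, which is why only the requested one is useful for conflict resolution.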
Re: [PR] [HUDI-7710] Remove `compaction.inflight` from conflict resolution [hudi]
hudi-bot commented on PR #11148:
URL: https://github.com/apache/hudi/pull/11148#issuecomment-2094072154

## CI report:

* 82ace4ec10ccae4108bed6f67674390f905eee7f Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23654)
(hudi-rs) branch main updated: ci: fix failing check and test case (#10)
xushiyan pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hudi-rs.git

The following commit(s) were added to refs/heads/main by this push:
     new 82c1dce ci: fix failing check and test case (#10)
82c1dce is described below

commit 82c1dce7848b117ec95107e08dfeded6f34e0b37
Author: Shiyan Xu <2701446+xushi...@users.noreply.github.com>
AuthorDate: Sat May 4 02:24:54 2024 -0500

    ci: fix failing check and test case (#10)

    fixes #4
---
 .licenserc.yaml                      | 1 +
 crates/core/src/table/meta_client.rs | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/.licenserc.yaml b/.licenserc.yaml
index 8fb45ba..2ec3964 100644
--- a/.licenserc.yaml
+++ b/.licenserc.yaml
@@ -23,6 +23,7 @@ header:
   paths-ignore:
     - 'LICENSE'
     - 'NOTICE'
+    - '**/fixtures/**'

   comment: on-failure
diff --git a/crates/core/src/table/meta_client.rs b/crates/core/src/table/meta_client.rs
index f8c8e41..27f0cf9 100644
--- a/crates/core/src/table/meta_client.rs
+++ b/crates/core/src/table/meta_client.rs
@@ -120,9 +120,10 @@ fn meta_client_get_partition_paths() {
     let target_table_path = extract_test_table(fixture_path);
     let meta_client = MetaClient::new(&target_table_path);
     let partition_paths = meta_client.get_partition_paths().unwrap();
+    let partition_path_set: HashSet<&str> = HashSet::from_iter(partition_paths.iter().map(|p| p.as_str()));
     assert_eq!(
-        partition_paths,
-        vec!["chennai", "sao_paulo", "san_francisco"]
+        partition_path_set,
+        HashSet::from_iter(vec!["chennai", "sao_paulo", "san_francisco"])
     )
 }