[GitHub] [hudi] CaesarWangX commented on issue #6543: [SUPPORT] Unable to load class of UserDefinedMetricsReporter in hudi0.11

2022-08-31 Thread GitBox
CaesarWangX commented on issue #6543: URL: https://github.com/apache/hudi/issues/6543#issuecomment-1233770517 Hi @codope. I am sure that the custom reporter exists in my jar package, because I used the same jar with hudi0.10 and hudi0.9, and it worked -- This is an automated message

[GitHub] [hudi] hudi-bot commented on pull request #6562: [DO_NOT_MERGE] fixing schema for partition path in test data generator

2022-08-31 Thread GitBox
hudi-bot commented on PR #6562: URL: https://github.com/apache/hudi/pull/6562#issuecomment-1233770287 ## CI report: * d5715e1db0ac861c5def0f1632b6cc19fd191617 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6562: [DO_NOT_MERGE] fixing schema for partition path in test data generator

2022-08-31 Thread GitBox
hudi-bot commented on PR #6562: URL: https://github.com/apache/hudi/pull/6562#issuecomment-1233767141 ## CI report: * d5715e1db0ac861c5def0f1632b6cc19fd191617 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6561: [HUDI-4760] Fixing repeated trigger of data file creations w/ clustering

2022-08-31 Thread GitBox
hudi-bot commented on PR #6561: URL: https://github.com/apache/hudi/pull/6561#issuecomment-1233767111 ## CI report: * a6cf2db7c0a449293ac7b50894055b3abb010cde Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split) & concurrent write during resizing

2022-08-31 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1233765813 ## CI report: * 0817fdd44736cb07b39afd3203c14e169bfb2483 Azure:

[GitHub] [hudi] codope commented on issue #6551: [SUPPORT] Hudi only support one table one HoodieJavaWriteClient

2022-08-31 Thread GitBox
codope commented on issue #6551: URL: https://github.com/apache/hudi/issues/6551#issuecomment-1233763768 @xxWSHxx IMO, a write client per table makes sense, as tables could have different configs for different purposes. Could you add a little more detail about your use case? Why would you

[GitHub] [hudi] hudi-bot commented on pull request #6561: [HUDI-4760] Fixing repeated trigger of data file creations w/ clustering

2022-08-31 Thread GitBox
hudi-bot commented on PR #6561: URL: https://github.com/apache/hudi/pull/6561#issuecomment-1233763725 ## CI report: * a6cf2db7c0a449293ac7b50894055b3abb010cde UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] codope commented on issue #6543: [SUPPORT] Unable to load class of UserDefinedMetricsReporter in hudi0.11

2022-08-31 Thread GitBox
codope commented on issue #6543: URL: https://github.com/apache/hudi/issues/6543#issuecomment-1233762585 @CaesarWangX Can you confirm that the jar with the custom reporter class is on the classpath? It should have worked. We even have a [test that exercises that code

[GitHub] [hudi] codope commented on issue #6552: [SUPPORT] AWSDmsAvroPayload does not work correctly with any version above 0.10.0

2022-08-31 Thread GitBox
codope commented on issue #6552: URL: https://github.com/apache/hudi/issues/6552#issuecomment-1233753449 Configuration looks fine to me, except you don't really need to set `HoodieWriteConfig.COMBINE_BEFORE_INSERT.key() -> "true"`. Looking at the implementation of `getInsertValue`, an

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4731: - Labels: pull-request-available (was: ) > Shutdown cloud watch reporter on exit >

[GitHub] [hudi] xushiyan commented on a diff in pull request #6468: [HUDI-4731] Shutdown CloudWatch reporter when query completes

2022-08-31 Thread GitBox
xushiyan commented on code in PR #6468: URL: https://github.com/apache/hudi/pull/6468#discussion_r960218864 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java: ## @@ -208,6 +209,7 @@ public void sync() throws Exception {

[GitHub] [hudi] nsivabalan opened a new pull request, #6562: [DO_NOT_MERGE] fixing schema for partition path in test data generator

2022-08-31 Thread GitBox
nsivabalan opened a new pull request, #6562: URL: https://github.com/apache/hudi/pull/6562 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[GitHub] [hudi] nsivabalan commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-31 Thread GitBox
nsivabalan commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1233734308 Here is the fix: https://github.com/apache/hudi/pull/6561 Can you verify that, with the patch, you no longer see such duplicates?

[GitHub] [hudi] hudi-bot commented on pull request #6510: [HUDI-4724]Add function of skip the _rt suffix for read snapshot

2022-08-31 Thread GitBox
hudi-bot commented on PR #6510: URL: https://github.com/apache/hudi/pull/6510#issuecomment-1233733952 ## CI report: * d0f97dab9dad61e708c7355ad65b82e2e6a59552 Azure:

[jira] [Updated] (HUDI-4760) Clustering results in repeated triggers of clustering execution

2022-08-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4760: - Labels: pull-request-available (was: ) > Clustering results in repeated triggers of clustering

[GitHub] [hudi] nsivabalan opened a new pull request, #6561: [HUDI-4760] Fixing repeated trigger of clustering execution

2022-08-31 Thread GitBox
nsivabalan opened a new pull request, #6561: URL: https://github.com/apache/hudi/pull/6561 ### Change Logs Apparently clustering is being triggered twice: we don't cache the write status, and for some validation we call isEmpty on the JavaRDD, which ends up executing it again.
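
The effect described in this PR summary, a lazily evaluated dataset being executed once per action when its result is not cached, can be illustrated without Spark. The following is only a minimal sketch under stated assumptions: it is not the code from PR #6561, and `LazyRecomputeSketch`, `lazyWriteStatuses`, `cached`, and `runTwoActions` are hypothetical names. A plain `Supplier` stands in for the uncached JavaRDD of write statuses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyRecomputeSketch {

    // Counts how many times the simulated "clustering" work actually runs.
    public static final AtomicInteger executions = new AtomicInteger();

    // Hypothetical stand-in for a lazy, uncached dataset such as a JavaRDD:
    // every call to get() re-runs the expensive work from scratch.
    public static Supplier<List<String>> lazyWriteStatuses() {
        return () -> {
            executions.incrementAndGet();
            List<String> statuses = new ArrayList<>();
            statuses.add("file-1");
            statuses.add("file-2");
            return statuses;
        };
    }

    // Wraps a supplier so the work runs once and the result is reused,
    // loosely analogous to caching a dataset before running several actions on it.
    public static <T> Supplier<T> cached(Supplier<T> source) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;

            @Override
            public synchronized T get() {
                if (!computed) {
                    value = source.get();
                    computed = true;
                }
                return value;
            }
        };
    }

    // Runs a validation step (an isEmpty check) followed by the real consumer.
    // Returns how many times the underlying work was executed.
    public static int runTwoActions(Supplier<List<String>> statuses) {
        executions.set(0);
        boolean empty = statuses.get().isEmpty(); // first action: validation
        if (!empty) {
            statuses.get().size();                // second action: consume the result
        }
        return executions.get();
    }

    public static void main(String[] args) {
        System.out.println("uncached executions: " + runTwoActions(lazyWriteStatuses()));
        System.out.println("cached executions: " + runTwoActions(cached(lazyWriteStatuses())));
    }
}
```

In this sketch the uncached supplier does its work twice, once for the emptiness check and once for the consumer, while the cached wrapper does it once. Whether the actual fix caches the write status or avoids the extra action is not stated in the snippet above.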

[jira] [Updated] (HUDI-4760) Clustering results in repeated triggers of clustering execution

2022-08-31 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4760: -- Sprint: 2022/08/22 > Clustering results in repeated triggers of clustering execution >

[jira] [Assigned] (HUDI-4760) Clustering results in repeated triggers of clustering execution

2022-08-31 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4760: - Assignee: sivabalan narayanan > Clustering results in repeated triggers of

[jira] [Updated] (HUDI-4760) Clustering results in repeated triggers of clustering execution

2022-08-31 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4760: -- Fix Version/s: 0.12.1 > Clustering results in repeated triggers of clustering execution

[jira] [Created] (HUDI-4760) Clustering results in repeated triggers of clustering execution

2022-08-31 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4760: - Summary: Clustering results in repeated triggers of clustering execution Key: HUDI-4760 URL: https://issues.apache.org/jira/browse/HUDI-4760 Project:

[GitHub] [hudi] nsivabalan commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-31 Thread GitBox
nsivabalan commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1233730770 Yes, I was able to reproduce it :( https://issues.apache.org/jira/browse/HUDI-4760 Will put up a fix shortly.

[jira] [Updated] (HUDI-3861) 'path' in CatalogTable#properties failed to be updated when renaming table

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3861: - Status: Patch Available (was: In Progress) > 'path' in CatalogTable#properties failed to be updated when

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4731: - Status: Patch Available (was: In Progress) > Shutdown cloud watch reporter on exit >

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4731: - Status: In Progress (was: Open) > Shutdown cloud watch reporter on exit >

[jira] [Closed] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4742. Resolution: Fixed > Fixing AWS Glue partition's location is wrong when updatePartition >

[hudi] branch master updated: [HUDI-4742] Fix AWS Glue partition's location is wrong when updatePartition (#6545)

2022-08-31 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 5d9db864f1 [HUDI-4742] Fix AWS Glue partition's

[GitHub] [hudi] xushiyan merged pull request #6545: [HUDI-4742] Fix AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread GitBox
xushiyan merged PR #6545: URL: https://github.com/apache/hudi/pull/6545

[GitHub] [hudi] codope commented on issue #6553: [SUPPORT] upgrade 0.5.3 to 0.12.0 in one go?

2022-08-31 Thread GitBox
codope commented on issue #6553: URL: https://github.com/apache/hudi/issues/6553#issuecomment-1233715675 It should be seamless. Hudi's upgrade handler will execute the upgrade steps automatically upon first commit with the newer version. However, the first commit could take slightly more

[GitHub] [hudi] YuweiXiao commented on issue #5777: [SUPPORT] Hudi table has duplicate data.

2022-08-31 Thread GitBox
YuweiXiao commented on issue #5777: URL: https://github.com/apache/hudi/issues/5777#issuecomment-1233701226 @jjtjiang Hey, could you post the content of `hoodie.property` under the `.hoodie` folder? And which version of Hudi are you using? Could you try testing it with the latest master

[GitHub] [hudi] xushiyan commented on issue #6281: [SUPPORT] AwsGlueCatalogSyncTool -The number of partition keys do not match the number of partition values

2022-08-31 Thread GitBox
xushiyan commented on issue #6281: URL: https://github.com/apache/hudi/issues/6281#issuecomment-1233691455 @crutis Some recent fixes for Glue sync have landed in master. If you get a chance, try master to see whether the issue is resolved.

[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split) & concurrent write during resizing

2022-08-31 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1233690676 ## CI report: * 2430be62881a61df2a39ab5ff680603b86e06d54 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split) & concurrent write during resizing

2022-08-31 Thread GitBox
hudi-bot commented on PR #4958: URL: https://github.com/apache/hudi/pull/4958#issuecomment-1233687731 ## CI report: * 2430be62881a61df2a39ab5ff680603b86e06d54 Azure:

[GitHub] [hudi] CaesarWangX commented on issue #6543: [SUPPORT] Unable to load class of UserDefinedMetricsReporter in hudi0.11

2022-08-31 Thread GitBox
CaesarWangX commented on issue #6543: URL: https://github.com/apache/hudi/issues/6543#issuecomment-1233684620 Hi @Zouxxyy , Yes, I know the change. I also used CustomizableMetricsReporter

[GitHub] [hudi] XuQianJin-Stars closed issue #6410: [SUPPORT] MergeInto syntax WHEN MATCHED is not optional but must be set.

2022-08-31 Thread GitBox
XuQianJin-Stars closed issue #6410: [SUPPORT] MergeInto syntax WHEN MATCHED is not optional but must be set. URL: https://github.com/apache/hudi/issues/6410

[GitHub] [hudi] xushiyan commented on issue #6444: [SUPPORT] Timeline Service MarkerDirState thread safe issue

2022-08-31 Thread GitBox
xushiyan commented on issue #6444: URL: https://github.com/apache/hudi/issues/6444#issuecomment-1233676536 > > @novisfff Is this a new marker creation issue at the timeline server after #6383 is landed? > > yes, this may caused by `allMarkers` in `MarkerDirState` just to

[GitHub] [hudi] YuweiXiao commented on a diff in pull request #4958: [HUDI-3558] Consistent bucket index: bucket resizing (split) & concurrent write during resizing

2022-08-31 Thread GitBox
YuweiXiao commented on code in PR #4958: URL: https://github.com/apache/hudi/pull/4958#discussion_r960176114 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/index/ScheduleIndexActionExecutor.java: ## @@ -107,7 +107,7 @@ public Option execute() {

[GitHub] [hudi] nsivabalan commented on issue #6281: [SUPPORT] AwsGlueCatalogSyncTool -The number of partition keys do not match the number of partition values

2022-08-31 Thread GitBox
nsivabalan commented on issue #6281: URL: https://github.com/apache/hudi/issues/6281#issuecomment-1233674813 @xushiyan : can you follow up on this.

[GitHub] [hudi] eric9204 commented on issue #6308: [SUPPORT] Spark multi writer failed ! ! !

2022-08-31 Thread GitBox
eric9204 commented on issue #6308: URL: https://github.com/apache/hudi/issues/6308#issuecomment-1233667672 @nsivabalan Thanks! I've retested with `hoodie.datasource.write.streaming.ignore.failed.batch=false`, and the Spark micro-batch does indeed fail when the Hudi commit fails. So, should I

[GitHub] [hudi] xushiyan closed issue #6194: [SUPPORT] repair deduplicate unable to find `_hoodie_record_key` in data

2022-08-31 Thread GitBox
xushiyan closed issue #6194: [SUPPORT] repair deduplicate unable to find `_hoodie_record_key` in data URL: https://github.com/apache/hudi/issues/6194

[GitHub] [hudi] QuChunhe opened a new issue, #6560: [SUPPORT]Hudi java client

2022-08-31 Thread GitBox
QuChunhe opened a new issue, #6560: URL: https://github.com/apache/hudi/issues/6560 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at

[GitHub] [hudi] hudi-bot commented on pull request #6510: [HUDI-4724]Add function of skip the _rt suffix for read snapshot

2022-08-31 Thread GitBox
hudi-bot commented on PR #6510: URL: https://github.com/apache/hudi/pull/6510#issuecomment-123365 ## CI report: * 9ec55e1452cfdeb3b3a5d80547e5df5c92e98c4f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6510: [HUDI-4724]Add function of skip the _rt suffix for read snapshot

2022-08-31 Thread GitBox
hudi-bot commented on PR #6510: URL: https://github.com/apache/hudi/pull/6510#issuecomment-1233650351 ## CI report: * 9ec55e1452cfdeb3b3a5d80547e5df5c92e98c4f Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6550: [HUDI-4690] Cleaning up duplicated classes in Spark 3.3 module

2022-08-31 Thread GitBox
hudi-bot commented on PR #6550: URL: https://github.com/apache/hudi/pull/6550#issuecomment-1233646212 ## CI report: * 7d9335f49bf37df734f2591845514b5b26f1e6bd Azure:

[jira] [Closed] (HUDI-4723) Add document about Hoodie Catalog

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4723. Resolution: Fixed > Add document about Hoodie Catalog > - > >

[jira] [Closed] (HUDI-4730) Fix batch job cannot clean old commits files

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4730. Resolution: Fixed > Fix batch job cannot clean old commits files >

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4731: - Reviewers: Raymond Xu > Shutdown cloud watch reporter on exit > - > >

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4731: - Sprint: 2022/08/22 > Shutdown cloud watch reporter on exit > - > >

[jira] [Updated] (HUDI-4731) Shutdown cloud watch reporter on exit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4731: - Fix Version/s: 0.12.1 > Shutdown cloud watch reporter on exit > - > >

[GitHub] [hudi] linfey90 commented on a diff in pull request #6510: [HUDI-4724]Add function of skip the _rt suffix for read snapshot

2022-08-31 Thread GitBox
linfey90 commented on code in PR #6510: URL: https://github.com/apache/hudi/pull/6510#discussion_r960150163 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java: ## @@ -117,7 +118,13 @@ private void initTableNameVars(HiveSyncConfig config) {

[jira] [Updated] (HUDI-4733) Flag emitDelete is inconsistent in HoodieTableSource and MergeOnReadInputFormat

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4733: - Fix Version/s: 0.12.1 > Flag emitDelete is inconsistent in HoodieTableSource and >

[jira] [Assigned] (HUDI-4733) Flag emitDelete is inconsistent in HoodieTableSource and MergeOnReadInputFormat

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4733: Assignee: Zhaojing Yu > Flag emitDelete is inconsistent in HoodieTableSource and >

[jira] [Assigned] (HUDI-4734) Add table config change validation in deltastreamer

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4734: Assignee: Vamshi Gudavarthi (was: sivabalan narayanan) > Add table config change validation in

[jira] [Commented] (HUDI-4734) Add table config change validation in deltastreamer

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598693#comment-17598693 ] Raymond Xu commented on HUDI-4734: -- for [~rmahindra] to triage and prioritize > Add table config change

[jira] [Updated] (HUDI-4734) Add table config change validation in deltastreamer

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4734: - Fix Version/s: 0.12.1 > Add table config change validation in deltastreamer >

[jira] [Closed] (HUDI-4737) Fix flaky: TestHoodieSparkMergeOnReadTableRollback.testRollbackWithDeltaAndCompactionCommit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4737. Assignee: xi chaomin Resolution: Fixed > Fix flaky: >

[jira] [Updated] (HUDI-4737) Fix flaky: TestHoodieSparkMergeOnReadTableRollback.testRollbackWithDeltaAndCompactionCommit

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4737: - Fix Version/s: 0.12.0 > Fix flaky: >

[jira] [Assigned] (HUDI-4741) Deadlock when restarting failed TM in AbstractStreamWriteFunction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4741: Assignee: voon > Deadlock when restarting failed TM in AbstractStreamWriteFunction >

[jira] [Commented] (HUDI-4741) Deadlock when restarting failed TM in AbstractStreamWriteFunction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598689#comment-17598689 ] Raymond Xu commented on HUDI-4741: -- great. thanks [~voonhous] [~teng_huo] ! > Deadlock when restarting

[jira] [Assigned] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4739: Assignee: wuwenchi > Wrong value returned when length equals 1 >

[jira] [Updated] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4739: - Reviewers: Ethan Guo > Wrong value returned when length equals 1 >

[jira] [Updated] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4739: - Sprint: 2022/08/22 > Wrong value returned when length equals 1 >

[jira] [Updated] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4739: - Story Points: 1 > Wrong value returned when length equals 1 > - >

[jira] [Updated] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4739: - Fix Version/s: 0.12.1 > Wrong value returned when length equals 1 >

[jira] [Updated] (HUDI-4739) Wrong value returned when length equals 1

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4739: - Component/s: writer-core > Wrong value returned when length equals 1 >

[jira] [Commented] (HUDI-4741) Deadlock when restarting failed TM in AbstractStreamWriteFunction

2022-08-31 Thread voon (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598688#comment-17598688 ] voon commented on HUDI-4741: [~teng_huo] and I are refactoring the *AbstractStreamWriteFunction* class +

[jira] [Closed] (HUDI-4740) Add metadata fields for hive catalog #createTable

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4740. Assignee: Danny Chen Resolution: Fixed > Add metadata fields for hive catalog #createTable >

[jira] [Updated] (HUDI-4741) Deadlock when restarting failed TM in AbstractStreamWriteFunction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4741: - Fix Version/s: 0.12.1 > Deadlock when restarting failed TM in AbstractStreamWriteFunction >

[jira] [Commented] (HUDI-4741) Deadlock when restarting failed TM in AbstractStreamWriteFunction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598686#comment-17598686 ] Raymond Xu commented on HUDI-4741: -- [~danny0405] [~yuzhaojing] any of you want to take this? > Deadlock

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Fix Version/s: 0.12.1 > Fixing AWS Glue partition's location is wrong when updatePartition >

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Story Points: 0.5 > Fixing AWS Glue partition's location is wrong when updatePartition >

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Sprint: 2022/08/22 > Fixing AWS Glue partition's location is wrong when updatePartition >

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Status: In Progress (was: Open) > Fixing AWS Glue partition's location is wrong when updatePartition >

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Status: Patch Available (was: In Progress) > Fixing AWS Glue partition's location is wrong when

[jira] [Updated] (HUDI-4742) Fixing AWS Glue partition's location is wrong when updatePartition

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4742: - Reviewers: Raymond Xu > Fixing AWS Glue partition's location is wrong when updatePartition >

[jira] [Closed] (HUDI-4746) Fix flaky : ITTestDataStreamWrite.testWriteMergeOnReadWithCompaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4746. Fix Version/s: 0.12.1 Resolution: Fixed > Fix flaky :

[jira] [Closed] (HUDI-4747) Fix flaky: ITTestHoodieFlinkCompactor.testHoodieFlinkCompactorWithPlanSelectStrategy

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-4747. Fix Version/s: 0.12.1 Resolution: Fixed > Fix flaky: >

[jira] [Updated] (HUDI-4751) Ensure transaction owner instant is set by all callers of txnManager apis

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4751: - Fix Version/s: 0.12.1 > Ensure transaction owner instant is set by all callers of txnManager apis >

[jira] [Updated] (HUDI-4750) Introduce Hybrid Cleaner policy based on both LATEST_COMMITS and LATEST_FILE_VERSIONS

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4750: - Fix Version/s: 0.13.0 > Introduce Hybrid Cleaner policy based on both LATEST_COMMITS and >

[jira] [Updated] (HUDI-4752) Add dedup support for MOR table in cli

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4752: - Labels: new-to-hudi (was: ) > Add dedup support for MOR table in cli >

[jira] [Updated] (HUDI-4752) Add dedup support for MOR table in cli

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4752: - Fix Version/s: 0.13.0 (was: 1.0.0) > Add dedup support for MOR table in cli >

[GitHub] [hudi] nsivabalan commented on issue #6194: [SUPPORT] repair deduplicate unable to find `_hoodie_record_key` in data

2022-08-31 Thread GitBox
nsivabalan commented on issue #6194: URL: https://github.com/apache/hudi/issues/6194#issuecomment-1233614757 My bad. It's https://issues.apache.org/jira/browse/HUDI-4752

[jira] [Updated] (HUDI-4752) Add dedup support for MOR table in cli

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4752: - Fix Version/s: 1.0.0 > Add dedup support for MOR table in cli > -- >

[jira] [Commented] (HUDI-4753) More accurate evaluation of log record during log writing or compaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598678#comment-17598678 ] Raymond Xu commented on HUDI-4753: -- For [~guoyihua] to triage and set the priority > More accurate

[jira] [Updated] (HUDI-4753) More accurate evaluation of log record during log writing or compaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4753: - Fix Version/s: 0.13.0 > More accurate evaluation of log record during log writing or compaction >

[jira] [Assigned] (HUDI-4753) More accurate evaluation of log record during log writing or compaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4753: Assignee: Ethan Guo > More accurate evaluation of log record during log writing or compaction >

[jira] [Updated] (HUDI-4753) More accurate evaluation of log record during log writing or compaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4753: - Component/s: metadata > More accurate evaluation of log record during log writing or compaction >

[jira] [Updated] (HUDI-4753) More accurate evaluation of log record during log writing or compaction

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4753: - Issue Type: Improvement (was: Bug) > More accurate evaluation of log record during log writing or

[jira] [Closed] (HUDI-2623) Make hudi-bot comment at PR thread bottom

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-2623. > Make hudi-bot comment at PR thread bottom > - > >

[jira] [Updated] (HUDI-4755) INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for older timestamps

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4755: - Sprint: 2022/09/05 > INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for >

[jira] [Updated] (HUDI-4755) INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for older timestamps

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4755: - Priority: Critical (was: Major) > INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel

[jira] [Assigned] (HUDI-4755) INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for older timestamps

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-4755: Assignee: XiaoyuGeng > INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries

[jira] [Updated] (HUDI-4755) INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for older timestamps

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4755: - Fix Version/s: 0.12.1 > INSERT_OVERWRITE(/TABLE) in spark sql should not fail time travel queries for >

[jira] [Updated] (HUDI-4756) Clean up usages of "assume.date.partition" config within hudi

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4756: - Epic Link: HUDI-1239 > Clean up usages of "assume.date.partition" config within hudi >

[jira] [Updated] (HUDI-4756) Clean up usages of "assume.date.partition" config within hudi

2022-08-31 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-4756: - Fix Version/s: 1.0.0 (was: 0.12.1) > Clean up usages of "assume.date.partition"

[GitHub] [hudi] nsivabalan commented on pull request #6559: [DO-NOT-MERGE][WIP] fixing schema for partition path in test data generator

2022-08-31 Thread GitBox
nsivabalan commented on PR #6559: URL: https://github.com/apache/hudi/pull/6559#issuecomment-1233601170 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [hudi] hudi-bot commented on pull request #6550: [HUDI-4690] Cleaning up duplicated classes in Spark 3.3 module

2022-08-31 Thread GitBox
hudi-bot commented on PR #6550: URL: https://github.com/apache/hudi/pull/6550#issuecomment-1233564404 ## CI report: * 7d9335f49bf37df734f2591845514b5b26f1e6bd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6550: [HUDI-4690] Cleaning up duplicated classes in Spark 3.3 module

2022-08-31 Thread GitBox
hudi-bot commented on PR #6550: URL: https://github.com/apache/hudi/pull/6550#issuecomment-1233561446 ## CI report: * 7d9335f49bf37df734f2591845514b5b26f1e6bd UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] nsivabalan commented on issue #6531: [SUPPORT] Insert results different than bulk_insert

2022-08-31 Thread GitBox
nsivabalan commented on issue #6531: URL: https://github.com/apache/hudi/issues/6531#issuecomment-1233550376 Dedup with insert could happen by chance if the new batch is routed to the same file group due to small file handling, so that's just a side effect of small file handling. --

[GitHub] [hudi] nsivabalan commented on pull request #6559: [DO-NOT-MERGE][WIP] fixing schema for partition path in test data generator

2022-08-31 Thread GitBox
nsivabalan commented on PR #6559: URL: https://github.com/apache/hudi/pull/6559#issuecomment-1233483182 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
