[GitHub] [hudi] hudi-bot commented on pull request #6739: [HUDI-4851] Fixing handling of `UTF8String` w/in `InSet` operator

2022-09-21 Thread GitBox
hudi-bot commented on PR #6739: URL: https://github.com/apache/hudi/pull/6739#issuecomment-1254563996 ## CI report: * 6756f0e59418c7de7a7ca0d47a3fd2ff0427f04a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6739: [HUDI-4851] Fixing handling of `UTF8String` w/in `InSet` operator

2022-09-21 Thread GitBox
hudi-bot commented on PR #6739: URL: https://github.com/apache/hudi/pull/6739#issuecomment-1254560139 ## CI report: * 6756f0e59418c7de7a7ca0d47a3fd2ff0427f04a UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6738: [HUDI-4895] Object store based lock provider

2022-09-21 Thread GitBox
hudi-bot commented on PR #6738: URL: https://github.com/apache/hudi/pull/6738#issuecomment-1254556886 ## CI report: * c0c9616166bf46216cdaf9ff8d634770e325e472 Azure:

[GitHub] [hudi] alexeykudinkin opened a new pull request, #6739: [HUDI-4851] Fixing handling of `UTF8String` w/in `InSet` operator

2022-09-21 Thread GitBox
alexeykudinkin opened a new pull request, #6739: URL: https://github.com/apache/hudi/pull/6739 ### Change Logs This is taking up the fix from https://github.com/apache/hudi/pull/6700, and adding the test for it ### Impact **Risk level: None ### Contributor's

[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

2022-09-21 Thread GitBox
hudi-bot commented on PR #6733: URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254523155 ## CI report: * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6734: [HUDI-3478][HUDI-4887] Use Avro as the format of persisted cdc data

2022-09-21 Thread GitBox
hudi-bot commented on PR #6734: URL: https://github.com/apache/hudi/pull/6734#issuecomment-1254520367 ## CI report: * 3d9071b62050a2b72d2522098f2b3263ddf91e40 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6737: [HUDI-4373] Flink Consistent hashing bucket index write path code

2022-09-21 Thread GitBox
hudi-bot commented on PR #6737: URL: https://github.com/apache/hudi/pull/6737#issuecomment-1254516648 ## CI report: * 5e745fc3455ec2ebdf06f1d3068d9c7a112e4987 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6734: [HUDI-3478][HUDI-4887] Use Avro as the format of persisted cdc data

2022-09-21 Thread GitBox
hudi-bot commented on PR #6734: URL: https://github.com/apache/hudi/pull/6734#issuecomment-1254516618 ## CI report: * 3d9071b62050a2b72d2522098f2b3263ddf91e40 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6737: [HUDI-4373] Flink Consistent hashing bucket index write path code

2022-09-21 Thread GitBox
hudi-bot commented on PR #6737: URL: https://github.com/apache/hudi/pull/6737#issuecomment-125451 ## CI report: * 5e745fc3455ec2ebdf06f1d3068d9c7a112e4987 Azure:

[GitHub] [hudi] eshu commented on issue #6283: [SUPPORT] No .marker files

2022-09-21 Thread GitBox
eshu commented on issue #6283: URL: https://github.com/apache/hudi/issues/6283#issuecomment-1254494003 @nsivabalan Workaround is working, but the bug still exists. If workaround is a resolution, then yes, it is resolved. -- This is an automated message from the Apache Git Service. To

[GitHub] [hudi] hudi-bot commented on pull request #6738: [HUDI-4895] Object store based lock provider

2022-09-21 Thread GitBox
hudi-bot commented on PR #6738: URL: https://github.com/apache/hudi/pull/6738#issuecomment-1254476299 ## CI report: * c0c9616166bf46216cdaf9ff8d634770e325e472 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

2022-09-21 Thread GitBox
hudi-bot commented on PR #6733: URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254476263 ## CI report: * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure:

[GitHub] [hudi] IsisPolei commented on issue #6720: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieRemoteException: Connect to 192.168.64.107:34446 [/192.168.64.107] failed: Connection refused (C

2022-09-21 Thread GitBox
IsisPolei commented on issue #6720: URL: https://github.com/apache/hudi/issues/6720#issuecomment-1254475342 The origin problem is offline compaction. The HoodieJavaWriteClient doesn't support compact inline. @Override protected List compact(String compactionInstantTime,

[GitHub] [hudi] hudi-bot commented on pull request #6738: [HUDI-4895] Object store based lock provider

2022-09-21 Thread GitBox
hudi-bot commented on PR #6738: URL: https://github.com/apache/hudi/pull/6738#issuecomment-1254473579 ## CI report: * c0c9616166bf46216cdaf9ff8d634770e325e472 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

2022-09-21 Thread GitBox
hudi-bot commented on PR #6733: URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254473535 ## CI report: * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6737: [HUDI-4373] Flink Consistent hashing bucket index write path code

2022-09-21 Thread GitBox
hudi-bot commented on PR #6737: URL: https://github.com/apache/hudi/pull/6737#issuecomment-1254470224 ## CI report: * 5e745fc3455ec2ebdf06f1d3068d9c7a112e4987 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6736: [HUDI-4894] Fix ClassCastException when using fixed type defining dec…

2022-09-21 Thread GitBox
hudi-bot commented on PR #6736: URL: https://github.com/apache/hudi/pull/6736#issuecomment-1254470203 ## CI report: * 255a6aef08b5f9ee25a556baa31d5c329bd8dcfc Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6284: [HUDI-4526] Improve spillableMapBasePath disk directory is full

2022-09-21 Thread GitBox
hudi-bot commented on PR #6284: URL: https://github.com/apache/hudi/pull/6284#issuecomment-1254469770 ## CI report: * 026dbfc7a6d4d7e489e8c8671a84e143bdb01758 UNKNOWN * 4b0a4e72766491e15dbeb8ed904c9aabae32bb89 Azure:

[hudi] branch release-feature-rfc46 updated: [RFC-46][HUDI-4414] Update the RFC-46 doc to fix comments feedback (#6132)

2022-09-21 Thread yuzhaojing
This is an automated email from the ASF dual-hosted git repository. yuzhaojing pushed a commit to branch release-feature-rfc46 in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/release-feature-rfc46 by this push: new 41392e119f

[GitHub] [hudi] yuzhaojing merged pull request #6132: [RFC-46][HUDI-4414] Update the RFC-46 doc to fix comments feedback

2022-09-21 Thread GitBox
yuzhaojing merged PR #6132: URL: https://github.com/apache/hudi/pull/6132 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] yuzhaojing merged pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-09-21 Thread GitBox
yuzhaojing merged PR #5629: URL: https://github.com/apache/hudi/pull/5629

[GitHub] [hudi] IsisPolei commented on issue #6720: [SUPPORT]Caused by: org.apache.hudi.exception.HoodieRemoteException: Connect to 192.168.64.107:34446 [/192.168.64.107] failed: Connection refused (C

2022-09-21 Thread GitBox
IsisPolei commented on issue #6720: URL: https://github.com/apache/hudi/issues/6720#issuecomment-125445 I think the main reason of this problem is that my app(where SparkRDDWriteClient process hudi data) and the spark cluster which SparkRDDWriteClient connected are deployed in

[jira] [Updated] (HUDI-4895) Object Store based lock provider

2022-09-21 Thread Yuwei Xiao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuwei Xiao updated HUDI-4895: - Component/s: multi-writer > Object Store based lock provider > > >

[jira] [Updated] (HUDI-4812) Lazy partition listing and file groups fetching in Spark Query

2022-09-21 Thread Yuwei Xiao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuwei Xiao updated HUDI-4812: - Component/s: spark > Lazy partition listing and file groups fetching in Spark Query >

[jira] [Created] (HUDI-4896) Consistent hashing index resizing for Flink Engine

2022-09-21 Thread Yuwei Xiao (Jira)
Yuwei Xiao created HUDI-4896: Summary: Consistent hashing index resizing for Flink Engine Key: HUDI-4896 URL: https://issues.apache.org/jira/browse/HUDI-4896 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-4895) Object Store based lock provider

2022-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4895: - Labels: pull-request-available (was: ) > Object Store based lock provider >

[GitHub] [hudi] YuweiXiao opened a new pull request, #6738: [HUDI-4895] Object store based lock provider

2022-09-21 Thread GitBox
YuweiXiao opened a new pull request, #6738: URL: https://github.com/apache/hudi/pull/6738 ### Change Logs Currently, we have `FileSystemBasedLockProvier`, which relies on the atomic guarantee of the underlying file system. Specifically, only with filesystem's atomic rename & atomic

[jira] [Assigned] (HUDI-4895) Object Store based lock provider

2022-09-21 Thread Yuwei Xiao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuwei Xiao reassigned HUDI-4895: Assignee: Yuwei Xiao > Object Store based lock provider > > >

[GitHub] [hudi] loukey-lj commented on pull request #6704: [HUDI-4780] improve test setup

2022-09-21 Thread GitBox
loukey-lj commented on PR #6704: URL: https://github.com/apache/hudi/pull/6704#issuecomment-1254445377 > @xushiyan thank u,this pr supplements the [6602](https://github.com/apache/hudi/pull/6602) test case. You can first look at the review record of

[GitHub] [hudi] xicm commented on a diff in pull request #6715: [HUDI-3983] ClassNotFoundException when using hudi-spark-bundle to write table with hbase index

2022-09-21 Thread GitBox
xicm commented on code in PR #6715: URL: https://github.com/apache/hudi/pull/6715#discussion_r977131538 ## hudi-common/src/main/resources/hbase-site.xml: ## @@ -1699,13 +1699,6 @@ possible configurations would overwhelm and obscure the important. Implementation of the

[GitHub] [hudi] hudi-bot commented on pull request #6737: [HUDI-4373] Flink Consistent hashing bucket index write path code

2022-09-21 Thread GitBox
hudi-bot commented on PR #6737: URL: https://github.com/apache/hudi/pull/6737#issuecomment-1254433425 ## CI report: * 5e745fc3455ec2ebdf06f1d3068d9c7a112e4987 UNKNOWN

[jira] [Updated] (HUDI-4373) Consistent bucket index write path for Flink engine

2022-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4373: - Labels: pull-request-available (was: ) > Consistent bucket index write path for Flink engine >

[GitHub] [hudi] YuweiXiao opened a new pull request, #6737: [HUDI-4373] Flink Consistent hashing bucket index write path code

2022-09-21 Thread GitBox
YuweiXiao opened a new pull request, #6737: URL: https://github.com/apache/hudi/pull/6737 ### Change Logs Implement consistent hashing bucket index for flink. This PR only covers the write core of the index, and the resizing implementation will be in another PR. There are

[GitHub] [hudi] hudi-bot commented on pull request #6736: [HUDI-4894] Fix ClassCastException when using fixed type defining dec…

2022-09-21 Thread GitBox
hudi-bot commented on PR #6736: URL: https://github.com/apache/hudi/pull/6736#issuecomment-1254427155 ## CI report: * 255a6aef08b5f9ee25a556baa31d5c329bd8dcfc Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6736: [HUDI-4894] Fix ClassCastException when using fixed type defining dec…

2022-09-21 Thread GitBox
hudi-bot commented on PR #6736: URL: https://github.com/apache/hudi/pull/6736#issuecomment-1254423537 ## CI report: * 255a6aef08b5f9ee25a556baa31d5c329bd8dcfc UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6735: [HUDI-4892] Fix hudi-spark3-bundle

2022-09-21 Thread GitBox
hudi-bot commented on PR #6735: URL: https://github.com/apache/hudi/pull/6735#issuecomment-1254419744 ## CI report: * 51c0c21c9f5a689943147a1faded74c67fef61a2 Azure:

[GitHub] [hudi] xicm commented on a diff in pull request #6715: [HUDI-3983] ClassNotFoundException when using hudi-spark-bundle to write table with hbase index

2022-09-21 Thread GitBox
xicm commented on code in PR #6715: URL: https://github.com/apache/hudi/pull/6715#discussion_r977117915 ## hudi-common/src/main/resources/hbase-site.xml: ## @@ -1699,13 +1699,6 @@ possible configurations would overwhelm and obscure the important. Implementation of the

[jira] [Created] (HUDI-4895) Object Store based lock provider

2022-09-21 Thread Yuwei Xiao (Jira)
Yuwei Xiao created HUDI-4895: Summary: Object Store based lock provider Key: HUDI-4895 URL: https://issues.apache.org/jira/browse/HUDI-4895 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-4894) Fix ClassCastException when using fixed type defining decimal column

2022-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4894: - Labels: pull-request-available (was: ) > Fix ClassCastException when using fixed type defining

[GitHub] [hudi] wangxianghu opened a new pull request, #6736: [HUDI-4894] Fix ClassCastException when using fixed type defining dec…

2022-09-21 Thread GitBox
wangxianghu opened a new pull request, #6736: URL: https://github.com/apache/hudi/pull/6736 …imal column ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature

[jira] [Updated] (HUDI-4894) Fix ClassCastException when using fixed type defining decimal column

2022-09-21 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianghu Wang updated HUDI-4894: --- Description: schema for decimal column : {code:java} { "name": "column_name", "type":

[jira] [Updated] (HUDI-4894) Fix ClassCastException when using fixed type defining decimal column

2022-09-21 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianghu Wang updated HUDI-4894: --- Description: schema for decimal column : { "name": "column_name", "type": ["null",{

[jira] [Updated] (HUDI-4894) Fix ClassCastException when using fixed type defining decimal column

2022-09-21 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianghu Wang updated HUDI-4894: --- Description: schema for decimal column : { "name": "column_name", "type": ["null", {

[jira] [Created] (HUDI-4894) Fix ClassCastException when using fixed type defining decimal column

2022-09-21 Thread Xianghu Wang (Jira)
Xianghu Wang created HUDI-4894: -- Summary: Fix ClassCastException when using fixed type defining decimal column Key: HUDI-4894 URL: https://issues.apache.org/jira/browse/HUDI-4894 Project: Apache Hudi

[GitHub] [hudi] hudi-bot commented on pull request #6284: [HUDI-4526] Improve spillableMapBasePath disk directory is full

2022-09-21 Thread GitBox
hudi-bot commented on PR #6284: URL: https://github.com/apache/hudi/pull/6284#issuecomment-1254378397 ## CI report: * 026dbfc7a6d4d7e489e8c8671a84e143bdb01758 UNKNOWN * 0ea0766862c16ccec08c7c621f98ca8402f772ff Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #6697: [HUDI-3478] Implement CDC Write in Spark

2022-09-21 Thread GitBox
danny0405 commented on code in PR #6697: URL: https://github.com/apache/hudi/pull/6697#discussion_r977092086 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCDCLogger.java: ## @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] [hudi] nsivabalan commented on pull request #5341: [HUDI-3919] [UBER] Support out of order rollback blocks in AbstractHoodieLogRecordReader

2022-09-21 Thread GitBox
nsivabalan commented on PR #5341: URL: https://github.com/apache/hudi/pull/5341#issuecomment-1254376340 Closing in favor of https://github.com/apache/hudi/pull/5958

[GitHub] [hudi] suryaprasanna closed pull request #5341: [HUDI-3919] [UBER] Support out of order rollback blocks in AbstractHoodieLogRecordReader

2022-09-21 Thread GitBox
suryaprasanna closed pull request #5341: [HUDI-3919] [UBER] Support out of order rollback blocks in AbstractHoodieLogRecordReader URL: https://github.com/apache/hudi/pull/5341

[GitHub] [hudi] hudi-bot commented on pull request #6284: [HUDI-4526] Improve spillableMapBasePath disk directory is full

2022-09-21 Thread GitBox
hudi-bot commented on PR #6284: URL: https://github.com/apache/hudi/pull/6284#issuecomment-1254375965 ## CI report: * 026dbfc7a6d4d7e489e8c8671a84e143bdb01758 UNKNOWN * 0ea0766862c16ccec08c7c621f98ca8402f772ff Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4015: [HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data

2022-09-21 Thread GitBox
hudi-bot commented on PR #4015: URL: https://github.com/apache/hudi/pull/4015#issuecomment-1254374818 ## CI report: * e1cf530fbae41de33cb9cc76a16a2e6dc5425837 Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #6697: [HUDI-3478] Implement CDC Write in Spark

2022-09-21 Thread GitBox
danny0405 commented on code in PR #6697: URL: https://github.com/apache/hudi/pull/6697#discussion_r977089707 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java: ## @@ -93,13 +94,18 @@ public void write(GenericRecord oldRecord) {

[jira] [Updated] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4893: -- Status: In Progress (was: Open) > More than 1 splits are created for a single log file

[jira] [Updated] (HUDI-4884) Fix website docs for default index type in hudi

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4884: -- Reviewers: Ethan Guo > Fix website docs for default index type in hudi >

[jira] [Updated] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4893: -- Story Points: 2 > More than 1 splits are created for a single log file for MOR table >

[jira] [Updated] (HUDI-4848) Fix tooling for deprecated partition

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4848: -- Reviewers: Raymond Xu > Fix tooling for deprecated partition >

[jira] [Assigned] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4893: - Assignee: sivabalan narayanan > More than 1 splits are created for a single log

[jira] [Updated] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4893: -- Sprint: 2022/09/19 > More than 1 splits are created for a single log file for MOR table

[jira] [Updated] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4893: -- Fix Version/s: 0.12.1 > More than 1 splits are created for a single log file for MOR

[jira] [Created] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4893: - Summary: More than 1 splits are created for a single log file for MOR table Key: HUDI-4893 URL: https://issues.apache.org/jira/browse/HUDI-4893 Project:

[jira] [Updated] (HUDI-4893) More than 1 splits are created for a single log file for MOR table

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4893: -- Priority: Blocker (was: Major) > More than 1 splits are created for a single log file

[GitHub] [hudi] nsivabalan commented on pull request #6284: [HUDI-4526] Improve spillableMapBasePath disk directory is full

2022-09-21 Thread GitBox
nsivabalan commented on PR #6284: URL: https://github.com/apache/hudi/pull/6284#issuecomment-1254371638 @xushiyan : can you review this.

[GitHub] [hudi] danny0405 commented on a diff in pull request #6697: [HUDI-3478] Implement CDC Write in Spark

2022-09-21 Thread GitBox
danny0405 commented on code in PR #6697: URL: https://github.com/apache/hudi/pull/6697#discussion_r977087389 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java: ## @@ -292,6 +315,9 @@ protected void writeInsertRecord(HoodieRecord

[jira] [Updated] (HUDI-4892) Fix hudi-spark3-bundle

2022-09-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4892: Sprint: 2022/09/19 > Fix hudi-spark3-bundle > -- > > Key: HUDI-4892 >

[jira] [Updated] (HUDI-4892) Fix hudi-spark3-bundle

2022-09-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4892: Status: Patch Available (was: In Progress) > Fix hudi-spark3-bundle > -- > >

[jira] [Updated] (HUDI-4892) Fix hudi-spark3-bundle

2022-09-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4892: Status: In Progress (was: Open) > Fix hudi-spark3-bundle > -- > > Key:

[GitHub] [hudi] hudi-bot commented on pull request #6735: [HUDI-4892] Fix hudi-spark3-bundle

2022-09-21 Thread GitBox
hudi-bot commented on PR #6735: URL: https://github.com/apache/hudi/pull/6735#issuecomment-1254343526 ## CI report: * 51c0c21c9f5a689943147a1faded74c67fef61a2 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6498: [HUDI-4878] Fix incremental cleaner use case

2022-09-21 Thread GitBox
hudi-bot commented on PR #6498: URL: https://github.com/apache/hudi/pull/6498#issuecomment-1254343298 ## CI report: * 3c05d0af21cc79358b7c0ffb7aad579da19129db Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6735: [HUDI-4892] Fix hudi-spark3-bundle

2022-09-21 Thread GitBox
hudi-bot commented on PR #6735: URL: https://github.com/apache/hudi/pull/6735#issuecomment-1254341123 ## CI report: * 51c0c21c9f5a689943147a1faded74c67fef61a2 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #6498: [HUDI-4878] Fix incremental cleaner use case

2022-09-21 Thread GitBox
hudi-bot commented on PR #6498: URL: https://github.com/apache/hudi/pull/6498#issuecomment-1254340889 ## CI report: * 054e2a560ef080b3591d52f3b2d1cd8b3c2ab0f7 Azure:

[jira] [Updated] (HUDI-4892) Fix hudi-spark3-bundle

2022-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4892: - Labels: pull-request-available (was: ) > Fix hudi-spark3-bundle > -- > >

[GitHub] [hudi] yihua opened a new pull request, #6735: [HUDI-4892] Fix hudi-spark3-bundle

2022-09-21 Thread GitBox
yihua opened a new pull request, #6735: URL: https://github.com/apache/hudi/pull/6735 ### Change Logs This PR fixes the hudi-spark3-bundle. Before this PR, reading a Hudi table with Spark datasource in Spark 3.3 shell with hudi-spark3-bundle throws the following exception. Some

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6734: [HUDI-3478][HUDI-4887] Use Avro as the format of persisted cdc data

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #6734: URL: https://github.com/apache/hudi/pull/6734#discussion_r977061910 ## hudi-common/src/main/java/org/apache/hudi/avro/AvroSchemaUtils.java: ## @@ -109,6 +109,11 @@ public static Schema createNullableSchema(Schema.Type avroType) {

[jira] [Created] (HUDI-4892) Fix hudi-spark3-bundle

2022-09-21 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-4892: --- Summary: Fix hudi-spark3-bundle Key: HUDI-4892 URL: https://issues.apache.org/jira/browse/HUDI-4892 Project: Apache Hudi Issue Type: Bug Reporter: Ethan

[GitHub] [hudi] nsivabalan commented on pull request #6498: [HUDI-4878] Fix incremental cleaner use case

2022-09-21 Thread GitBox
nsivabalan commented on PR #6498: URL: https://github.com/apache/hudi/pull/6498#issuecomment-1254323578 @codope: Can you review this patch. I have overhauled the initial fix put up. But could result in good perf improv for cleaning. I am yet to write tests. but do take a look at my logic

[GitHub] [hudi] CTTY commented on a diff in pull request #5113: [HUDI-3625] [RFC-60] Optimized storage layout for Cloud Object Stores

2022-09-21 Thread GitBox
CTTY commented on code in PR #5113: URL: https://github.com/apache/hudi/pull/5113#discussion_r977051856 ## rfc/rfc-56/rfc-56.md: ## @@ -0,0 +1,226 @@ + + +# RFC-56: Federated Storage Layer + +## Proposers +- @umehrot2 + +## Approvers +- @vinoth +- @shivnarayan + +## Status +

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #5629: URL: https://github.com/apache/hudi/pull/5629#discussion_r977042462 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecord.java: ## @@ -291,59 +284,51 @@ public void checkState() { } } -

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #5629: URL: https://github.com/apache/hudi/pull/5629#discussion_r977041457 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkStructTypeSerializer.scala: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #5629: [HUDI-3384][HUDI-3385] Spark specific file reader/writer.

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #5629: URL: https://github.com/apache/hudi/pull/5629#discussion_r977040996 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkStructTypeSerializer.scala: ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] yihua commented on issue #6640: [SUPPORT] HUDI partition table duplicate data cow hudi 0.10.0 flink 1.13.1

2022-09-21 Thread GitBox
yihua commented on issue #6640: URL: https://github.com/apache/hudi/issues/6640#issuecomment-1254303659 @yuzhaojing @danny0405 Could any one of you chime in here?

[GitHub] [hudi] yihua commented on issue #6644: Hudi Multi Writer DynamoDBBasedLocking issue

2022-09-21 Thread GitBox
yihua commented on issue #6644: URL: https://github.com/apache/hudi/issues/6644#issuecomment-1254302236 @koochiswathiTR Thanks for raising this! The config naming of `partition_key` is confusing to new comers. Here's what you need to do: (1) As @xushiyan already mentioned, you don't

[GitHub] [hudi] hudi-bot commented on pull request #4015: [HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data

2022-09-21 Thread GitBox
hudi-bot commented on PR #4015: URL: https://github.com/apache/hudi/pull/4015#issuecomment-1254300684 ## CI report: * 375927ade5b4b327e44ebc227fb57e64de524fcc Azure:

[GitHub] [hudi] hudi-bot commented on pull request #4015: [HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data

2022-09-21 Thread GitBox
hudi-bot commented on PR #4015: URL: https://github.com/apache/hudi/pull/4015#issuecomment-1254296165 ## CI report: * 375927ade5b4b327e44ebc227fb57e64de524fcc Azure:

[GitHub] [hudi] bhasudha commented on a diff in pull request #6638: [DOCS] Add tags to blog pages

2022-09-21 Thread GitBox
bhasudha commented on code in PR #6638: URL: https://github.com/apache/hudi/pull/6638#discussion_r977030742 ## README.md: ## @@ -156,6 +156,44 @@ Example: When you change any file in `versioned_docs/version-0.7.0/`, it will on ## Configs Configs can be automatically updated

[hudi] branch asf-site updated: [DOCS] Add tags to blog pages (#6638)

2022-09-21 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 12ebe2bdef [DOCS] Add tags to blog pages

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6638: [DOCS] Add tags to blog pages

2022-09-21 Thread GitBox
nsivabalan commented on code in PR #6638: URL: https://github.com/apache/hudi/pull/6638#discussion_r976869703 ## README.md: ## @@ -156,6 +156,44 @@ Example: When you change any file in `versioned_docs/version-0.7.0/`, it will on ## Configs Configs can be automatically

[GitHub] [hudi] nsivabalan merged pull request #6638: [DOCS] Add tags to blog pages

2022-09-21 Thread GitBox
nsivabalan merged PR #6638: URL: https://github.com/apache/hudi/pull/6638

[GitHub] [hudi] yihua commented on issue #6398: [SUPPORT] Metadata table thows hbase exceptions

2022-09-21 Thread GitBox
yihua commented on issue #6398: URL: https://github.com/apache/hudi/issues/6398#issuecomment-1254287346 > @yihua yes this parameter is placed in separate hbase-site.xml which is used by spark. Thanks for the confirmation! I'll also list this as a workaround in our FAQ.

[GitHub] [hudi] nsivabalan commented on pull request #4015: [HUDI-2780] Fix the issue of Mor log skipping complete blocks when reading data

2022-09-21 Thread GitBox
nsivabalan commented on PR #4015: URL: https://github.com/apache/hudi/pull/4015#issuecomment-1254285358 I have pushed a commit myself to address the feedback. I have yet to see whether we can cover the fix with a test.

[GitHub] [hudi] yihua closed issue #6658: [SUPPORT] undrop table

2022-09-21 Thread GitBox
yihua closed issue #6658: [SUPPORT] undrop table URL: https://github.com/apache/hudi/issues/6658

[GitHub] [hudi] yihua commented on issue #6658: [SUPPORT] undrop table

2022-09-21 Thread GitBox
yihua commented on issue #6658: URL: https://github.com/apache/hudi/issues/6658#issuecomment-1254283326 @melin Thank you for raising this feature request! I created a Jira ticket to track the work and let's follow up there: HUDI-4891. Closing this support ticket.

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6516: [HUDI-4729] Fix fq can not be queried in pending compaction when query ro table with spark

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #6516: URL: https://github.com/apache/hudi/pull/6516#discussion_r977025908 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java: ## @@ -665,13 +671,21 @@ public final Stream

[jira] [Updated] (HUDI-4891) Support UNDROP TABLE in Spark SQL

2022-09-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4891: Description: Specifies the identifier for the table to restore. If the identifier contains spaces or

[jira] [Created] (HUDI-4891) Support UNDROP TABLE in Spark SQL

2022-09-21 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-4891: Summary: Support UNDROP TABLE in Spark SQL Key: HUDI-4891 URL: https://issues.apache.org/jira/browse/HUDI-4891 Project: Apache Hudi Issue Type: Improvement

[jira] [Updated] (HUDI-4891) Support UNDROP TABLE in Spark SQL

2022-09-21 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4891: Fix Version/s: 1.0.0 > Support UNDROP TABLE in Spark SQL

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6046: [HUDI-4363] Support Clustering row writer to improve performance

2022-09-21 Thread GitBox
alexeykudinkin commented on code in PR #6046: URL: https://github.com/apache/hudi/pull/6046#discussion_r977019700 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java: ## @@ -275,6 +345,66 @@

[GitHub] [hudi] alexeykudinkin commented on pull request #6046: [HUDI-4363] Support Clustering row writer to improve performance

2022-09-21 Thread GitBox
alexeykudinkin commented on PR #6046: URL: https://github.com/apache/hudi/pull/6046#issuecomment-1254275167 @boneanxs thank you very much for iterating on this one! Truly monumental effort!

[GitHub] [hudi] alexeykudinkin commented on pull request #6046: [HUDI-4363] Support Clustering row writer to improve performance

2022-09-21 Thread GitBox
alexeykudinkin commented on PR #6046: URL: https://github.com/apache/hudi/pull/6046#issuecomment-1254275527 Did you try to re-run your benchmark after the changes we've made? If so, can you please paste the results in here?

[GitHub] [hudi] yihua commented on issue #6686: Apache Hudi Consistency issues with glue and marketplace connector

2022-09-21 Thread GitBox
yihua commented on issue #6686: URL: https://github.com/apache/hudi/issues/6686#issuecomment-1254272868 @asankadarshana007 The consistency check, when enabled, happens when removing invalid data files: (1) check that all paths to delete exist, (2) delete them, (3) wait for all paths to

[jira] [Commented] (HUDI-3796) Implement layout to filter out uncommitted log files without reading the log blocks

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607985#comment-17607985 ] sivabalan narayanan commented on HUDI-3796: changing the name of the log file is a pretty big

[jira] [Updated] (HUDI-3796) Implement layout to filter out uncommitted log files without reading the log blocks

2022-09-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3796: Sprint: (was: 2022/09/19) > Implement layout to filter out uncommitted log files
