[GitHub] [hudi] danny0405 commented on a diff in pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7255: URL: https://github.com/apache/hudi/pull/7255#discussion_r1027671915 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -372,7 +372,7 @@ protected static long

[GitHub] [hudi] danny0405 commented on a diff in pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7226: URL: https://github.com/apache/hudi/pull/7226#discussion_r102767 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -367,9 +368,19 @@ public int getPartition(Object key) {

[GitHub] [hudi] bhasudha commented on pull request #7258: [DOCS] Add more blogs and talks

2022-11-20 Thread GitBox
bhasudha commented on PR #7258: URL: https://github.com/apache/hudi/pull/7258#issuecomment-1321569907 tested in localhost. Screenshot below - https://user-images.githubusercontent.com/2179254/202990302-e36de464-1db9-4dc3-b3bf-326a814e4cb3.png -- This is an automated

[GitHub] [hudi] bhasudha opened a new pull request, #7258: [DOCS] Add more blogs and talks

2022-11-20 Thread GitBox
bhasudha opened a new pull request, #7258: URL: https://github.com/apache/hudi/pull/7258 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[GitHub] [hudi] hudi-bot commented on pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
hudi-bot commented on PR #7220: URL: https://github.com/apache/hudi/pull/7220#issuecomment-1321560447 ## CI report: * f99f32c3326e63682509109271777b8272fcb803 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
hudi-bot commented on PR #7220: URL: https://github.com/apache/hudi/pull/7220#issuecomment-1321555298 ## CI report: * c867e57d17bb9c2109865b4a9c84af11b377355c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
hudi-bot commented on PR #7255: URL: https://github.com/apache/hudi/pull/7255#issuecomment-1321550376 ## CI report: * 98941b2fe0f47e0d3d0846cc7869c29eed4d090d Azure:

[hudi] branch asf-site updated: [DOCS] Change upcoming Community sync image (#7257)

2022-11-20 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new b79f3c0a7b [DOCS] Change upcoming Community

[GitHub] [hudi] xushiyan merged pull request #7257: [DOCS] Change upcoming Community sync image

2022-11-20 Thread GitBox
xushiyan merged PR #7257: URL: https://github.com/apache/hudi/pull/7257 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[hudi] branch master updated: [MINOR] Fix `TestSchemaEvolutionClient` compilation (#7256)

2022-11-20 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new e90ab52a3d [MINOR] Fix

[GitHub] [hudi] xushiyan merged pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
xushiyan merged PR #7256: URL: https://github.com/apache/hudi/pull/7256 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] xushiyan commented on pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
xushiyan commented on PR #7256: URL: https://github.com/apache/hudi/pull/7256#issuecomment-1321535869 locally verified the UT passes. no need to wait for Azure CI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [hudi] honeyaya commented on a diff in pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
honeyaya commented on code in PR #7255: URL: https://github.com/apache/hudi/pull/7255#discussion_r1027630040 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -372,7 +372,7 @@ protected static long

[GitHub] [hudi] xushiyan closed issue #7204: failed to snapshot query in hive when query a empty partition

2022-11-20 Thread GitBox
xushiyan closed issue #7204: failed to snapshot query in hive when query a empty partition URL: https://github.com/apache/hudi/issues/7204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] xushiyan commented on issue #7204: failed to snapshot query in hive when query a empty partition

2022-11-20 Thread GitBox
xushiyan commented on issue #7204: URL: https://github.com/apache/hudi/issues/7204#issuecomment-1321531785 tracked in https://issues.apache.org/jira/browse/HUDI-5220 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [hudi] bhasudha opened a new pull request, #7257: [DOCS] Change upcoming Community sync image

2022-11-20 Thread GitBox
bhasudha opened a new pull request, #7257: URL: https://github.com/apache/hudi/pull/7257 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

[GitHub] [hudi] trushev commented on pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
trushev commented on PR #7256: URL: https://github.com/apache/hudi/pull/7256#issuecomment-1321527609 +1 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [hudi] hudi-bot commented on pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
hudi-bot commented on PR #7256: URL: https://github.com/apache/hudi/pull/7256#issuecomment-1321500865 ## CI report: * 6a04cdd638fc4b809974e84fa1c5a10891b7bbf0 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
hudi-bot commented on PR #7256: URL: https://github.com/apache/hudi/pull/7256#issuecomment-1321497811 ## CI report: * 6a04cdd638fc4b809974e84fa1c5a10891b7bbf0 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7238: [HUDI-3963] Cleaning up `QueueBasedExecutor` impls

2022-11-20 Thread GitBox
hudi-bot commented on PR #7238: URL: https://github.com/apache/hudi/pull/7238#issuecomment-1321495225 ## CI report: * eb19e944d8f426f28c4f7266c9ca7a60ffb68fad Azure:

[GitHub] [hudi] zhangyue19921010 commented on pull request #7238: [HUDI-3963] Cleaning up `QueueBasedExecutor` impls

2022-11-20 Thread GitBox
zhangyue19921010 commented on PR #7238: URL: https://github.com/apache/hudi/pull/7238#issuecomment-1321493055 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [hudi] trushev commented on a diff in pull request #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
trushev commented on code in PR #7256: URL: https://github.com/apache/hudi/pull/7256#discussion_r1027586875 ## hudi-client/hudi-java-client/src/test/java/org/apache/hudi/table/action/commit/TestSchemaEvolutionClient.java: ## @@ -43,7 +42,7 @@ /** * Tests for schema evolution

[GitHub] [hudi] xushiyan opened a new pull request, #7256: [MINOR] Fix `TestSchemaEvolutionClient` compilation

2022-11-20 Thread GitBox
xushiyan opened a new pull request, #7256: URL: https://github.com/apache/hudi/pull/7256 ### Change Logs Fix `TestSchemaEvolutionClient` compilation ### Impact NA ### Risk level (write none, low medium or high below) none ### Documentation Update

[GitHub] [hudi] xicm commented on pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
xicm commented on PR #7226: URL: https://github.com/apache/hudi/pull/7226#issuecomment-1321470483 > Considering this scenario, I will set `COPY_ON_WRITE_RECORD_SIZE_ESTIMATE` smaller than the original to prevent generating a large number of small files when I first load data. But I

[GitHub] [hudi] trushev commented on pull request #7248: [HUDI-5244] Fix bugs in schema evolution client with lost operation field and not found schema

2022-11-20 Thread GitBox
trushev commented on PR #7248: URL: https://github.com/apache/hudi/pull/7248#issuecomment-1321460457 @xiarixiaoyao This PR was merged right after https://github.com/apache/hudi/pull/7250, which contains conflicting code removing `HoodieJavaClientTestBase`. Need to revert and

[GitHub] [hudi] hudi-bot commented on pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
hudi-bot commented on PR #7220: URL: https://github.com/apache/hudi/pull/7220#issuecomment-1321458283 ## CI report: * c867e57d17bb9c2109865b4a9c84af11b377355c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
hudi-bot commented on PR #7220: URL: https://github.com/apache/hudi/pull/7220#issuecomment-1321454721 ## CI report: * c867e57d17bb9c2109865b4a9c84af11b377355c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7238: [HUDI-3963] Cleaning up `QueueBasedExecutor` impls

2022-11-20 Thread GitBox
hudi-bot commented on PR #7238: URL: https://github.com/apache/hudi/pull/7238#issuecomment-1321450971 ## CI report: * eb19e944d8f426f28c4f7266c9ca7a60ffb68fad Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
hudi-bot commented on PR #7226: URL: https://github.com/apache/hudi/pull/7226#issuecomment-1321450889 ## CI report: * 89e408f8aa68139a5f87230f1d3f4f3a5fb838d8 Azure:

[GitHub] [hudi] xicm commented on a diff in pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
xicm commented on code in PR #7226: URL: https://github.com/apache/hudi/pull/7226#discussion_r1027564734 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -367,9 +368,19 @@ public int getPartition(Object key) {

[GitHub] [hudi] TJX2014 commented on a diff in pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
TJX2014 commented on code in PR #7220: URL: https://github.com/apache/hudi/pull/7220#discussion_r1027560453 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/PriorityBasedFileSystemView.java: ## @@ -131,149 +133,159 @@ private void

[GitHub] [hudi] TJX2014 commented on a diff in pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
TJX2014 commented on code in PR #7220: URL: https://github.com/apache/hudi/pull/7220#discussion_r1027560371 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java: ## @@ -26,6 +26,7 @@ import

[GitHub] [hudi] TJX2014 commented on a diff in pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
TJX2014 commented on code in PR #7220: URL: https://github.com/apache/hudi/pull/7220#discussion_r1027560298 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/PriorityBasedFileSystemView.java: ## @@ -25,6 +25,7 @@ import

[hudi] branch master updated: [HUDI-5244] Fix bugs in schema evolution client with lost operation field and not found schema (#7248)

2022-11-20 Thread mengtao
This is an automated email from the ASF dual-hosted git repository. mengtao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 4c218231bd [HUDI-5244] Fix bugs in schema

[GitHub] [hudi] xiarixiaoyao merged pull request #7248: [HUDI-5244] Fix bugs in schema evolution client with lost operation field and not found schema

2022-11-20 Thread GitBox
xiarixiaoyao merged PR #7248: URL: https://github.com/apache/hudi/pull/7248 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] trushev commented on pull request #7248: [HUDI-5244] Fix bugs in schema evolution client with lost operation field and not found schema

2022-11-20 Thread GitBox
trushev commented on PR #7248: URL: https://github.com/apache/hudi/pull/7248#issuecomment-1321426497 > LGTM @trushev Does Spark have this problem? "Not found schema for table" -- no it is just client api bug "Lost operation field in avro schema" -- I believe yes because

[GitHub] [hudi] TJX2014 commented on a diff in pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
TJX2014 commented on code in PR #7220: URL: https://github.com/apache/hudi/pull/7220#discussion_r1027556918 ## hudi-common/src/test/java/org/apache/hudi/common/table/view/TestPriorityBasedFileSystemView.java: ## @@ -75,7 +75,8 @@ public class TestPriorityBasedFileSystemView {

[GitHub] [hudi] xushiyan commented on a diff in pull request #7220: [HUDI-5230] Lazy init secondaryView in PriorityBasedFileSystemView

2022-11-20 Thread GitBox
xushiyan commented on code in PR #7220: URL: https://github.com/apache/hudi/pull/7220#discussion_r1027513868 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java: ## @@ -26,6 +26,7 @@ import

[GitHub] [hudi] danny0405 commented on a diff in pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7255: URL: https://github.com/apache/hudi/pull/7255#discussion_r1027545946 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -372,7 +372,7 @@ protected static long

[GitHub] [hudi] danny0405 commented on a diff in pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7255: URL: https://github.com/apache/hudi/pull/7255#discussion_r1027525224 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -372,7 +372,7 @@ protected static long

[GitHub] [hudi] xiarixiaoyao commented on pull request #7248: [HUDI-5244] Fix bugs in schema evolution client with lost operation field and not found schema

2022-11-20 Thread GitBox
xiarixiaoyao commented on PR #7248: URL: https://github.com/apache/hudi/pull/7248#issuecomment-1321409547 LGTM @trushev Does Spark have this problem? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[hudi] branch master updated: [HUDI-5237] Support for HoodieUnMergedLogRecordScanner with InternalSchema (#7237)

2022-11-20 Thread mengtao
This is an automated email from the ASF dual-hosted git repository. mengtao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 21bcbfd792 [HUDI-5237] Support for

[GitHub] [hudi] xiarixiaoyao merged pull request #7237: [HUDI-5237] Support for HoodieUnMergedLogRecordScanner with InternalSchema

2022-11-20 Thread GitBox
xiarixiaoyao merged PR #7237: URL: https://github.com/apache/hudi/pull/7237 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] xiarixiaoyao commented on pull request #7237: [HUDI-5237] Support for HoodieUnMergedLogRecordScanner with InternalSchema

2022-11-20 Thread GitBox
xiarixiaoyao commented on PR #7237: URL: https://github.com/apache/hudi/pull/7237#issuecomment-1321405491 @trushev thanks for your work -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] danny0405 commented on a diff in pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7226: URL: https://github.com/apache/hudi/pull/7226#discussion_r1027510067 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -367,9 +368,19 @@ public int getPartition(Object key) {

[GitHub] [hudi] hudi-bot commented on pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
hudi-bot commented on PR #7255: URL: https://github.com/apache/hudi/pull/7255#issuecomment-1321401362 ## CI report: * 98941b2fe0f47e0d3d0846cc7869c29eed4d090d Azure:

[GitHub] [hudi] danny0405 commented on a diff in pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
danny0405 commented on code in PR #7226: URL: https://github.com/apache/hudi/pull/7226#discussion_r1027509172 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -367,9 +368,19 @@ public int getPartition(Object key) {

[GitHub] [hudi] hudi-bot commented on pull request #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
hudi-bot commented on PR #7255: URL: https://github.com/apache/hudi/pull/7255#issuecomment-1321397575 ## CI report: * 98941b2fe0f47e0d3d0846cc7869c29eed4d090d UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[jira] [Updated] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5250: - Labels: pull-request-available (was: ) > Using the default value of estimate record size at the

[GitHub] [hudi] honeyaya opened a new pull request, #7255: [HUDI-5250] use the estimate record size when estimation threshold is l…

2022-11-20 Thread GitBox
honeyaya opened a new pull request, #7255: URL: https://github.com/apache/hudi/pull/7255 Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0 ### Change Logs Currently, hudi obtains the average record size
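
The idea described in this pull request (fall back to the configured record size estimate when the estimation threshold is negative, otherwise derive the average from previous commits) can be sketched as follows. This is a minimal illustration, not the actual `UpsertPartitioner` code: the class `RecordSizeEstimatorSketch`, the `CommitStat` holder, and all parameter names are hypothetical, and the rule for choosing a commit (the newest one whose bytes written exceed `thresholdFraction * smallFileLimitBytes`) is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names and selection rule are assumptions, not Hudi's API.
class RecordSizeEstimatorSketch {

    // Minimal stand-in for the write statistics of one commit.
    static final class CommitStat {
        final long totalBytesWritten;
        final long totalRecordsWritten;

        CommitStat(long totalBytesWritten, long totalRecordsWritten) {
            this.totalBytesWritten = totalBytesWritten;
            this.totalRecordsWritten = totalRecordsWritten;
        }
    }

    // commitsNewestFirst: commit statistics ordered from newest to oldest.
    // defaultEstimate: the user-configured record size estimate, in bytes.
    // thresholdFraction: a negative value disables estimation from history.
    static long averageBytesPerRecord(List<CommitStat> commitsNewestFirst,
                                      long defaultEstimate,
                                      double thresholdFraction,
                                      long smallFileLimitBytes) {
        if (thresholdFraction < 0) {
            // Assumed behaviour from the PR title: skip history entirely.
            return defaultEstimate;
        }
        long bytesThreshold = (long) (thresholdFraction * smallFileLimitBytes);
        for (CommitStat stat : commitsNewestFirst) {
            // Only trust commits that wrote enough data to be representative.
            if (stat.totalBytesWritten > bytesThreshold && stat.totalRecordsWritten > 0) {
                return (long) Math.ceil((double) stat.totalBytesWritten / stat.totalRecordsWritten);
            }
        }
        // No usable commit found: keep the configured estimate.
        return defaultEstimate;
    }

    public static void main(String[] args) {
        List<CommitStat> commits = Arrays.asList(
                new CommitStat(50L, 5L),      // too small to be trusted
                new CommitStat(2000L, 16L));  // large enough to use
        System.out.println(averageBytesPerRecord(commits, 1024L, 0.5, 1000L));
        System.out.println(averageBytesPerRecord(commits, 1024L, -1.0, 1000L));
        System.out.println(averageBytesPerRecord(
                Collections.<CommitStat>emptyList(), 1024L, 0.5, 1000L));
    }
}
```

With a negative threshold the history is never consulted, so the result depends only on the configured estimate.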

[jira] [Updated] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread XixiHua (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XixiHua updated HUDI-5250: -- Description: Currently, hudi obtains the average record size based on records written during previous commits.

[jira] [Updated] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread XixiHua (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XixiHua updated HUDI-5250: -- Description: Currently, hudi obtains the average record size based on records written during previous commits.

[jira] [Updated] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread XixiHua (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XixiHua updated HUDI-5250: -- Description: Currently, hudi obtains the average record size based on records written during previous commits.

[jira] [Updated] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread XixiHua (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XixiHua updated HUDI-5250: -- Description: Currently, hudi obtains the average record size based on records written during previous commits.

[GitHub] [hudi] danny0405 commented on pull request #7159: [HUDI-5173]Skip if there is only one file in clusteringGroup

2022-11-20 Thread GitBox
danny0405 commented on PR #7159: URL: https://github.com/apache/hudi/pull/7159#issuecomment-1321385967 Did you apply the patch yet? I didn't see it. And there are test failures. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[hudi] branch master updated: [HUDI-5247] Clean up java client tests (#7250)

2022-11-20 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 926794aa74 [HUDI-5247] Clean up java client

[GitHub] [hudi] xushiyan merged pull request #7250: [HUDI-5247] Clean up java client tests

2022-11-20 Thread GitBox
xushiyan merged PR #7250: URL: https://github.com/apache/hudi/pull/7250 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[hudi] branch master updated: [HUDI-5070] Move flaky cleaner tests to separate class (#7251)

2022-11-20 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 0e1f9653c0 [HUDI-5070] Move flaky cleaner tests

[GitHub] [hudi] xushiyan merged pull request #7251: [HUDI-5070] Move flaky cleaner tests to separate class

2022-11-20 Thread GitBox
xushiyan merged PR #7251: URL: https://github.com/apache/hudi/pull/7251 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[jira] [Created] (HUDI-5250) Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0

2022-11-20 Thread XixiHua (Jira)
XixiHua created HUDI-5250: - Summary: Using the default value of estimate record size at the averageBytesPerRecord() when estimation threshold is less than 0 Key: HUDI-5250 URL:

[GitHub] [hudi] hudi-bot commented on pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
hudi-bot commented on PR #7226: URL: https://github.com/apache/hudi/pull/7226#issuecomment-1321342297 ## CI report: * 89e408f8aa68139a5f87230f1d3f4f3a5fb838d8 Azure:

[GitHub] [hudi] xicm commented on pull request #7226: [HUDI-5018] Make user-provided copyOnWriteRecordSizeEstimate first precedence

2022-11-20 Thread GitBox
xicm commented on PR #7226: URL: https://github.com/apache/hudi/pull/7226#issuecomment-1321341828 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [hudi] hudi-bot commented on pull request #7238: [HUDI-3963] Cleaning up `QueueBasedExecutor` impls

2022-11-20 Thread GitBox
hudi-bot commented on PR #7238: URL: https://github.com/apache/hudi/pull/7238#issuecomment-1321330392 ## CI report: * eb19e944d8f426f28c4f7266c9ca7a60ffb68fad Azure:

[GitHub] [hudi] zhangyue19921010 commented on pull request #7238: [HUDI-3963] Cleaning up `QueueBasedExecutor` impls

2022-11-20 Thread GitBox
zhangyue19921010 commented on PR #7238: URL: https://github.com/apache/hudi/pull/7238#issuecomment-1321322013 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [hudi] jklim96 opened a new issue, #7254: [SUPPORT] Incremental query performance

2022-11-20 Thread GitBox
jklim96 opened a new issue, #7254: URL: https://github.com/apache/hudi/issues/7254 **Describe the problem you faced** Hi all, I have a question about the performance of incremental queries. I'm comparing the performance between running incremental queries and simply doing a

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321232534 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 790c32c99000390fdba07f5428f4a5d4148a6a3c Azure:

[GitHub] [hudi] kazdy commented on pull request #7253: [Docs] Add Spark Structured Streaming docs to Spark quick start guide

2022-11-20 Thread GitBox
kazdy commented on PR #7253: URL: https://github.com/apache/hudi/pull/7253#issuecomment-1321204667 @bhasudha could you take a look and merge? Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [hudi] kazdy opened a new pull request, #7253: [Docs] Add Spark Structured Streaming docs to Spark quick start guide

2022-11-20 Thread GitBox
kazdy opened a new pull request, #7253: URL: https://github.com/apache/hudi/pull/7253 ### Change Logs Add Spark Structured Streaming to quick start guide. Hudi supports Structured Streaming reads starting from Hudi 0.8 and writes with async table services starting from Hudi 0.9.

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321198546 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 3be6021cc3b2a644a1735ba30b396801e3a59dc4 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321197164 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 3be6021cc3b2a644a1735ba30b396801e3a59dc4 Azure:

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027318013 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetReader.java: ## @@ -105,8 +105,11 @@ public long getTotalRecords() { } private

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321176511 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 77451457295f5322edb7ba0a3f4ff29d26ff80de Azure:

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027317029 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/HoodieCatalystExpressionUtils.scala: ## @@ -251,8 +252,16 @@ object HoodieCatalystExpressionUtils {

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321173434 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 77451457295f5322edb7ba0a3f4ff29d26ff80de Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
hudi-bot commented on PR #7003: URL: https://github.com/apache/hudi/pull/7003#issuecomment-1321172075 ## CI report: * 1c1d6e24197b60f243657a892b0591be2256538f UNKNOWN * 77451457295f5322edb7ba0a3f4ff29d26ff80de Azure:

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027312754 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetReader.java: ## @@ -105,8 +105,11 @@ public long getTotalRecords() { } private

[GitHub] [hudi] hudi-bot commented on pull request #7251: [HUDI-5070] Move flaky cleaner tests to separate class

2022-11-20 Thread GitBox
hudi-bot commented on PR #7251: URL: https://github.com/apache/hudi/pull/7251#issuecomment-1321170718 ## CI report: * a0a8d599fd8ad82cb40b2da03251afda5fda2737 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7021: [Minor] fix multi deser avro payload

2022-11-20 Thread GitBox
hudi-bot commented on PR #7021: URL: https://github.com/apache/hudi/pull/7021#issuecomment-1321170527 ## CI report: * a35770322a450468bc5ada74900b554b430c0b22 Azure:

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027312754 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetReader.java: ## @@ -105,8 +105,11 @@ public long getTotalRecords() { } private

[GitHub] [hudi] xushiyan commented on pull request #7240: [HUDI-5239] support HoodieJavaWriteClient compact

2022-11-20 Thread GitBox
xushiyan commented on PR #7240: URL: https://github.com/apache/hudi/pull/7240#issuecomment-1321166053 @ymZhao1001 can you add unit tests pls? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027302760 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java: ## @@ -492,15 +519,40 @@ private Pair>> fetchFromSourc boolean shouldCombine =

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027302467 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java: ## @@ -420,6 +438,15 @@ public Pair>> readFromSource( } private Pair>>

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027301233 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetReader.java: ## @@ -56,6 +63,8 @@ public class HoodieSparkParquetReader

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027301007 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java: ## @@ -194,6 +185,13 @@ private RecordIterator(Schema readerSchema, Schema

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027299190 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieInternalRowUtils.scala: ## @@ -172,24 +214,22 @@ object HoodieInternalRowUtils { } } - /**

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027296254 ## hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/HoodieCatalystExpressionUtils.scala: ## @@ -251,8 +252,16 @@ object HoodieCatalystExpressionUtils {

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027295363 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkParquetReader.java: ## @@ -108,7 +117,12 @@ private ClosableIterator

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027295172 ## hudi-common/src/main/java/org/apache/hudi/common/util/ConfigUtils.java: ## @@ -58,10 +56,8 @@ public static String getPayloadClass(Properties properties) { return

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027295124 ## hudi-common/src/main/java/org/apache/hudi/common/util/IdentityIterator.java: ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027294778 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -461,9 +460,12 @@ abstract class HoodieBaseRelation(val

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027294717 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/HoodieSparkValidateDuplicateKeyRecordMerger.scala: ## @@ -0,0 +1,46 @@ +/* + *

[GitHub] [hudi] wzx140 commented on a diff in pull request #7003: [minor] add more test for rfc46

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7003: URL: https://github.com/apache/hudi/pull/7003#discussion_r1027294579 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestSpark3DDL.scala: ## @@ -531,6 +533,18 @@ class TestSpark3DDL extends

[GitHub] [hudi] hudi-bot commented on pull request #7251: [HUDI-5070] Move flaky cleaner tests to separate class

2022-11-20 Thread GitBox
hudi-bot commented on PR #7251: URL: https://github.com/apache/hudi/pull/7251#issuecomment-1321135968 ## CI report: * a0a8d599fd8ad82cb40b2da03251afda5fda2737 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7021: [Minor] fix multi deser avro payload

2022-11-20 Thread GitBox
hudi-bot commented on PR #7021: URL: https://github.com/apache/hudi/pull/7021#issuecomment-1321135812 ## CI report: * a63d51584048892b705e43ecc0cd705da66ea8be Azure:

[GitHub] [hudi] hudi-bot commented on pull request #7251: [HUDI-5070] Move flaky cleaner tests to separate class

2022-11-20 Thread GitBox
hudi-bot commented on PR #7251: URL: https://github.com/apache/hudi/pull/7251#issuecomment-1321134586 ## CI report: * a0a8d599fd8ad82cb40b2da03251afda5fda2737 UNKNOWN

[GitHub] [hudi] hudi-bot commented on pull request #7021: [Minor] fix multi deser avro payload

2022-11-20 Thread GitBox
hudi-bot commented on PR #7021: URL: https://github.com/apache/hudi/pull/7021#issuecomment-1321134478 ## CI report: * a63d51584048892b705e43ecc0cd705da66ea8be Azure:

[GitHub] [hudi] wzx140 commented on a diff in pull request #7021: [Minor] fix multi deser avro payload

2022-11-20 Thread GitBox
wzx140 commented on code in PR #7021: URL: https://github.com/apache/hudi/pull/7021#discussion_r1027290671 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java: ## @@ -364,7 +364,8 @@ private void processAppendResult(AppendResult result,

[jira] [Created] (HUDI-5249) Support MetadataColumnStatsIndex for Spark record

2022-11-20 Thread Frank Wong (Jira)
Frank Wong created HUDI-5249: Summary: Support MetadataColumnStatsIndex for Spark record Key: HUDI-5249 URL: https://issues.apache.org/jira/browse/HUDI-5249 Project: Apache Hudi Issue Type: New

[jira] [Closed] (HUDI-5248) Support MetadataColumnStatsIndex for Spark record

2022-11-20 Thread Frank Wong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Wong closed HUDI-5248. Resolution: Not A Problem > Support MetadataColumnStatsIndex for Spark record >
