Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-27 Thread via GitHub
yihua commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2024077244 > > We have #7146 which also attempted to solve the same problem. Should we close #7146 and prefer this one? > > That does not solve the problem as the sorting (of the input batch) is

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-25 Thread via GitHub
bhat-vinay commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2019436395 > We have #7146 which also attempted to solve the same problem. Should we close #7146 and prefer this one? That does not solve the problem as the sorting (of the input batch) is

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-25 Thread via GitHub
yihua commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2019412865 We have #7146 which also attempted to solve the same problem. Should we close #7146 and prefer this one?

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-24 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016810146 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * 9329d8d43e9274478e64a0d40cbe7a5a0362ec90 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-24 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016795397 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * e2296a2de6391dee42a83d390410eb71f193d55c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-24 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016793663 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * e2296a2de6391dee42a83d390410eb71f193d55c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-23 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016391528 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * e2296a2de6391dee42a83d390410eb71f193d55c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016360644 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * a84507191a942c5d8c98610958ca48f47188bc48 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2016357819 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * a84507191a942c5d8c98610958ca48f47188bc48 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1536561023 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1536560898 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
vinothchandar commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1536245569 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
vinothchandar commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1536245244 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -230,6 +236,10 @@ protected

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
vinothchandar commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1536245082 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2015437643 ## CI report: * 2c83cfaf2bdaef6b5075989992aeeff8052461ed UNKNOWN * a84507191a942c5d8c98610958ca48f47188bc48 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2015350367 ## CI report: * b802619f011c1d9ef5b334ecf67ab7df74964e08 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2015337228 ## CI report: * b802619f011c1d9ef5b334ecf67ab7df74964e08 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2015327684 > IIUC this adds additional shuffle and a new job? I'd like to understand how we think this impacts the current insert DAG. Yet to review the new partitioner, will do once I hear back

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535756962 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535754607 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535751978 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535749601 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -411,4 +427,90 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535706564 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -394,6 +404,12 @@ public Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535704794 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/BaseSparkCommitActionExecutor.java: ## @@ -230,6 +236,10 @@ protected Partitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535694790 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535695254 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-22 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1535473655 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-21 Thread via GitHub
vinothchandar commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1534279513 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -480,6 +480,20 @@ public class HoodieWriteConfig extends

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007159791 ## CI report: * b802619f011c1d9ef5b334ecf67ab7df74964e08 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007022918 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2007009003 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2006227969 ## CI report: * bd71699ccef3e28be182c2cd5f8093b0cb507694 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
bhat-vinay commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1529876646 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -90,8 +94,11 @@ public class UpsertPartitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2006041219 ## CI report: * 5016a9c8d9daeea9f6f28f63cc090514482571a4 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-19 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2006019061 ## CI report: * 5016a9c8d9daeea9f6f28f63cc090514482571a4 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
rmahindra123 commented on code in PR #10876: URL: https://github.com/apache/hudi/pull/10876#discussion_r1529460065 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java: ## @@ -90,8 +94,11 @@ public class UpsertPartitioner

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003610845 ## CI report: * 5016a9c8d9daeea9f6f28f63cc090514482571a4 Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003452778 ## CI report: * f3c15a77a88d778d532dcc3fbed186441b3fa04c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003422824 ## CI report: * f3c15a77a88d778d532dcc3fbed186441b3fa04c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003139899 ## CI report: * f3c15a77a88d778d532dcc3fbed186441b3fa04c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003066457 ## CI report: * f3c15a77a88d778d532dcc3fbed186441b3fa04c Azure:

Re: [PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
hudi-bot commented on PR #10876: URL: https://github.com/apache/hudi/pull/10876#issuecomment-2003058078 ## CI report: * f3c15a77a88d778d532dcc3fbed186441b3fa04c UNKNOWN

[PR] [HUDI-7512] sort input records for insert operation [hudi]

2024-03-18 Thread via GitHub
bhat-vinay opened a new pull request, #10876: URL: https://github.com/apache/hudi/pull/10876 ### Change Logs Allows for sorting input records in the insert operation. This is still an in-progress PR - uploading to get some test signals. Pending: Custom sort columns, more unit tests