[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-11 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1191850116 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java: ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-11 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1191848563 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java: ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-06 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1186656966 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java: ## @@ -295,4 +301,11 @@ private String getSchemaFromLatestInstant() throws

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-05 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1186146172 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java: ## @@ -269,13 +272,14 @@ private int doCompact(JavaSparkContext jsc) throws

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-05 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1185742751 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java: ## @@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc)

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-04 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1185642315 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java: ## @@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc)

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-04 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1185642208 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java: ## @@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc)

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-05-04 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1185641155 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java: ## @@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc)

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-26 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1177963180 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java: ## @@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-25 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java: ## @@ -245,6 +246,7 @@ private void

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-24 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1175193258 ## hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieCompactor.java: ## @@ -0,0 +1,177 @@ +package org.apache.hudi.utilities; + +import

[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

2023-04-23 Thread via GitHub
zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1174758600 ## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java: ## @@ -292,7 +292,9 @@ protected void