Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]
hudi-bot commented on PR #10802: URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975070520 ## CI report: * 4c37feb88ed56cbc6cb81aedcde0eba21996b84f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22743) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]
hudi-bot commented on PR #10802: URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975068983 ## CI report: * 4c37feb88ed56cbc6cb81aedcde0eba21996b84f UNKNOWN
Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]
yihua commented on PR #10802: URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975068962 @hudi-bot run azure
Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]
hudi-bot commented on PR #10801: URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975067481 ## CI report: * 6524c27e11d40ab23b6248d82a6115a79da6cf49 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22741)
[jira] [Updated] (HUDI-7471) Increase the number of Spark executors in tests
[ https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7471: - Labels: pull-request-available (was: ) > Increase the number of Spark executors in tests > --- > > Key: HUDI-7471 > URL: https://issues.apache.org/jira/browse/HUDI-7471 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0, 1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]
yihua opened a new pull request, #10802:
URL: https://github.com/apache/hudi/pull/10802

### Change Logs

This PR makes two minor changes:
- Increases the number of executors in Spark session in tests.
- Uses the existing util method to get Spark conf for a few tests.

### Impact

Reduces test time

### Risk level

none

### Documentation Update

N/A

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
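For readers unfamiliar with how a test suite controls Spark parallelism, the following is a minimal sketch, not the code from PR #10802. It assumes the tests run Spark in local mode, where the master URL `local[N]` starts N worker threads in a single JVM. The class `TestSparkConfSketch`, the method names, and the thread count of 8 are all hypothetical; the actual util method and values used by Hudi are not shown in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; it is not the util method the PR refers to.
class TestSparkConfSketch {

  // Builds the Spark master URL for local mode with the given number of worker threads.
  static String localMaster(int threads) {
    if (threads < 1) {
      throw new IllegalArgumentException("threads must be >= 1, got " + threads);
    }
    return "local[" + threads + "]";
  }

  // Collects the settings in one place so individual tests do not repeat them.
  static Map<String, String> testSparkConf(String appName, int threads) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("spark.app.name", appName);
    conf.put("spark.master", localMaster(threads));
    return conf;
  }

  public static void main(String[] args) {
    // The value 8 is an assumption chosen for the example.
    testSparkConf("hudi-test", 8).forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

Keeping the settings behind one shared helper means a change such as raising the thread count is made once rather than in every test class, which appears to be the motivation for the second change listed in the PR.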
[jira] [Assigned] (HUDI-7471) Increase the number of Spark executors in tests
[ https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7471: --- Assignee: Ethan Guo > Increase the number of Spark executors in tests > --- > > Key: HUDI-7471 > URL: https://issues.apache.org/jira/browse/HUDI-7471 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7471) Increase the number of Spark executors in tests
[ https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7471: Fix Version/s: 0.15.0 1.0.0 > Increase the number of Spark executors in tests > --- > > Key: HUDI-7471 > URL: https://issues.apache.org/jira/browse/HUDI-7471 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Fix For: 0.15.0, 1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-7471) Increase the number of Spark executors in tests
Ethan Guo created HUDI-7471: --- Summary: Increase the number of Spark executors in tests Key: HUDI-7471 URL: https://issues.apache.org/jira/browse/HUDI-7471 Project: Apache Hudi Issue Type: Improvement Reporter: Ethan Guo -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]
cbomgit commented on issue #10785: URL: https://github.com/apache/hudi/issues/10785#issuecomment-1975053685

> I believe this PR #10065 should fix the problem

Thanks. Is there a particular condition that triggers this? Also, has the patch been backported to older versions? I saw the 0.14.1 label, but I am unsure whether that means I can apply it to my version (0.11).
(hudi) branch master updated: [HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/master by this push:
     new 82f221c7d05 [HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
82f221c7d05 is described below

commit 82f221c7d05436bb1eac4f09e1e675d1c91a7cf1
Author: Y Ethan Guo
AuthorDate: Sat Mar 2 21:56:54 2024 -0800

    [HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
---
 .../apache/hudi/functional/TestCOWDataSource.scala | 72 +++---
 .../apache/hudi/functional/TestMORDataSource.scala | 20 +-
 .../sql/hudi/TestAlterTableDropPartition.scala | 4 +-
 .../spark/sql/hudi/TestCompactionTable.scala | 4 +-
 .../apache/spark/sql/hudi/TestInsertTable.scala | 265 ++---
 .../apache/spark/sql/hudi/TestMergeIntoTable.scala | 24 +-
 .../spark/sql/hudi/TestMergeIntoTable2.scala | 20 +-
 .../TestMergeIntoTableWithNonRecordKeyField.scala | 8 +-
 .../org/apache/spark/sql/hudi/TestSpark3DDL.scala | 16 +-
 .../spark/sql/hudi/TestTimeTravelTable.scala | 12 +-
 .../apache/spark/sql/hudi/TestUpdateTable.scala | 6 +-
 .../deltastreamer/TestHoodieDeltaStreamer.java | 63 +++--
 12 files changed, 246 insertions(+), 268 deletions(-)

diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
index a28a228fd46..5614b414927 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
@@ -23,8 +23,9 @@ import org.apache.hudi.DataSourceWriteOptions.{INLINE_CLUSTERING_ENABLE, KEYGENE
 import org.apache.hudi.HoodieConversionUtils.toJavaOption
 import org.apache.hudi.QuickstartUtils.{convertToStringList, getQuickstartWriteConfigs}
 import org.apache.hudi.client.common.HoodieSparkEngineContext
-import org.apache.hudi.common.config.TimestampKeyGeneratorConfig.{TIMESTAMP_INPUT_DATE_FORMAT, TIMESTAMP_OUTPUT_DATE_FORMAT, TIMESTAMP_TIMEZONE_FORMAT, TIMESTAMP_TYPE_FIELD}
 import org.apache.hudi.common.config.HoodieMetadataConfig
+import org.apache.hudi.common.config.TimestampKeyGeneratorConfig.{TIMESTAMP_INPUT_DATE_FORMAT, TIMESTAMP_OUTPUT_DATE_FORMAT, TIMESTAMP_TIMEZONE_FORMAT, TIMESTAMP_TYPE_FIELD}
+import org.apache.hudi.common.fs.FSUtils
 import org.apache.hudi.common.model.HoodieRecord.HoodieRecordType
 import org.apache.hudi.common.model.{HoodieRecord, WriteOperationType}
 import org.apache.hudi.common.table.timeline.{HoodieInstant, HoodieTimeline, TimelineUtils}
@@ -44,7 +45,6 @@ import org.apache.hudi.metrics.{Metrics, MetricsReporterType}
 import org.apache.hudi.testutils.HoodieSparkClientTestBase
 import org.apache.hudi.util.JFunction
 import org.apache.hudi.{AvroConversionUtils, DataSourceReadOptions, DataSourceWriteOptions, HoodieDataSourceHelpers, QuickstartUtils, ScalaAssertionSupport}
-import org.apache.hudi.common.fs.FSUtils
 import org.apache.spark.sql._
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.hudi.HoodieSparkSessionExtension
@@ -96,10 +96,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     System.gc()
   }

-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testShortNameStorage(recordType: HoodieRecordType) {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testShortNameStorage(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts()

     // Insert Operation
     val records = recordsToStrings(dataGen.generateInserts("000", 100)).toList
@@ -564,10 +563,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
    * archival should kick in and 2 commits should be archived. If schema is valid, no exception will be thrown. If not,
    * NPE will be thrown.
    */
-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testArchivalWithBulkInsert(recordType: HoodieRecordType): Unit = {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testArchivalWithBulkInsert(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts()

     var structType: StructType = null
     for (i <- 1 to 7) {
@@ -696,10 +694,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     }
   }

-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testOverWriteModeUseReplaceAction(recordType: HoodieRecordType): Unit = {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testOverWriteModeUseReplaceAc
Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]
yihua merged PR #10214: URL: https://github.com/apache/hudi/pull/10214
(hudi) branch master updated: [HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/master by this push:
     new 59f1c66848c [HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)
59f1c66848c is described below

commit 59f1c66848c3ddbfff1ea5fe3eacd39f1adf9a3a
Author: Sivabalan Narayanan
AuthorDate: Sat Mar 2 21:57:23 2024 -0800

    [HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)
---
 .../apache/hudi/functional/TestCOWDataSource.scala | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
index 5614b414927..ff87a90cef8 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
@@ -487,6 +487,27 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     assertEquals(snapshotDF2.count(), (validRecordsFromBatch1 + validRecordsFromBatch2))
   }

+  @Test
+  def bulkInsertCompositeKeys(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts(HoodieRecordType.AVRO)
+
+    // Insert Operation
+    val records = recordsToStrings(dataGen.generateInserts("000", 100)).toList
+    val inputDF = spark.read.json(spark.sparkContext.parallelize(records, 2))
+
+    val inputDf1 = inputDF.withColumn("new_col",lit("value1"))
+    val inputDf2 = inputDF.withColumn("new_col", lit(null).cast("String") )
+
+    inputDf1.union(inputDf2).write.format("hudi")
+      .options(writeOpts)
+      .option(DataSourceWriteOptions.RECORDKEY_FIELD.key, "_row_key,new_col")
+      .option(DataSourceWriteOptions.OPERATION.key(),"bulk_insert")
+      .mode(SaveMode.Overwrite)
+      .save(basePath)
+
+    assertEquals(200, spark.read.format("org.apache.hudi").options(readOpts).load(basePath).count())
+  }
+
   /**
    * This tests the case that query by with a specified partition condition on hudi table which is
    * different between the value of the partition field and the actual partition path,
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
yihua merged PR #10800: URL: https://github.com/apache/hudi/pull/10800
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975052350 ## CI report: * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN * 170654037ccf164486aee674542f7b68e7e2714d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22740)
[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7469: Fix Version/s: 0.15.0 1.0.0 > Reduce redundant tests with Hudi record types > - > > Key: HUDI-7469 > URL: https://issues.apache.org/jira/browse/HUDI-7469 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0, 1.0.0 > > > There are lots of tests running with the permutations of both Hudi record > types, e.g., AVRO and SPARK, which are not necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-7469: Priority: Critical (was: Major) > Reduce redundant tests with Hudi record types > - > > Key: HUDI-7469 > URL: https://issues.apache.org/jira/browse/HUDI-7469 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Critical > Labels: pull-request-available > Fix For: 0.15.0, 1.0.0 > > > There are lots of tests running with the permutations of both Hudi record > types, e.g., AVRO and SPARK, which are not necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-7469: --- Assignee: Ethan Guo > Reduce redundant tests with Hudi record types > - > > Key: HUDI-7469 > URL: https://issues.apache.org/jira/browse/HUDI-7469 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Labels: pull-request-available > > There are lots of tests running with the permutations of both Hudi record > types, e.g., AVRO and SPARK, which are not necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-7469. --- Resolution: Fixed > Reduce redundant tests with Hudi record types > - > > Key: HUDI-7469 > URL: https://issues.apache.org/jira/browse/HUDI-7469 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Labels: pull-request-available > > There are lots of tests running with the permutations of both Hudi record > types, e.g., AVRO and SPARK, which are not necessary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]
hudi-bot commented on PR #10801: URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975024791 ## CI report: * 6524c27e11d40ab23b6248d82a6115a79da6cf49 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22741)
Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]
hudi-bot commented on PR #10801: URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975023561 ## CI report: * 6524c27e11d40ab23b6248d82a6115a79da6cf49 UNKNOWN
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975022383 ## CI report: * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739) * 170654037ccf164486aee674542f7b68e7e2714d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22740)
[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable
[ https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7470: - Attachment: CAPTURE_2024-03-03_124553.jpg > Compaction completed not need write to mdt if it is disable > --- > > Key: HUDI-7470 > URL: https://issues.apache.org/jira/browse/HUDI-7470 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Assignee: xy >Priority: Major > Labels: pull-request-available > Attachments: CAPTURE_2024-03-03_123512.jpg, > CAPTURE_2024-03-03_124229.jpg, CAPTURE_2024-03-03_124553.jpg, mdt.jpg > > > Compaction completed not need write to mdt if it is disable. > when sparksql is set hoodie.metadata.enable=false and execute compaction,it > would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable
[ https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7470: - Labels: pull-request-available (was: ) > Compaction completed not need write to mdt if it is disable > --- > > Key: HUDI-7470 > URL: https://issues.apache.org/jira/browse/HUDI-7470 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Assignee: xy >Priority: Major > Labels: pull-request-available > Attachments: CAPTURE_2024-03-03_123512.jpg, > CAPTURE_2024-03-03_124229.jpg, mdt.jpg > > > Compaction completed not need write to mdt if it is disable. > when sparksql is set hoodie.metadata.enable=false and execute compaction,it > would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [HUDI-7470] Compaction completed not need write to mdt if it is disable [hudi]
xuzifu666 opened a new pull request, #10801:
URL: https://github.com/apache/hudi/pull/10801

### Change Logs

A completed compaction does not need to write to the metadata table (mdt) when the metadata table is disabled. When Spark SQL sets hoodie.metadata.enable=false and runs a compaction, the metadata update is still executed, which is not appropriate when the metadata table is meant to be disabled.

### Impact

low

### Risk level (write none, low medium or high below)

low

### Documentation Update

_Describe any necessary documentation update if there is any new feature, config, or user-facing change_

- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
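The behaviour requested in the pull request above can be sketched as a guard around the metadata update. This is an illustration under stated assumptions, not the patch in PR #10801: the class `CompactionMetadataGuardSketch` and its methods are hypothetical, the write config is modelled as a plain map, and treating an unset `hoodie.metadata.enable` as enabled is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; Hudi's real compaction code is not shown in this message.
class CompactionMetadataGuardSketch {

  static final String METADATA_ENABLE_KEY = "hoodie.metadata.enable";

  // Reads the flag from the config map. Defaulting to "true" when the key
  // is absent is an assumption made for this sketch.
  static boolean isMetadataTableEnabled(Map<String, String> writeConfig) {
    return Boolean.parseBoolean(writeConfig.getOrDefault(METADATA_ENABLE_KEY, "true"));
  }

  // Runs the metadata update only when the metadata table is enabled.
  // Returns true if the update was executed.
  static boolean completeCompaction(Map<String, String> writeConfig, Runnable metadataUpdate) {
    if (!isMetadataTableEnabled(writeConfig)) {
      return false;
    }
    metadataUpdate.run();
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(METADATA_ENABLE_KEY, "false");
    boolean updated = completeCompaction(conf, () -> System.out.println("updating metadata table"));
    System.out.println("metadata update ran: " + updated);
  }
}
```

Passing the update as a `Runnable` keeps the guard independent of how the metadata write is implemented, so the same check could wrap any completion step that should be skipped when the table is disabled.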
[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable
[ https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7470: - Attachment: CAPTURE_2024-03-03_124229.jpg > Compaction completed not need write to mdt if it is disable > --- > > Key: HUDI-7470 > URL: https://issues.apache.org/jira/browse/HUDI-7470 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Assignee: xy >Priority: Major > Attachments: CAPTURE_2024-03-03_123512.jpg, > CAPTURE_2024-03-03_124229.jpg, mdt.jpg > > > Compaction completed not need write to mdt if it is disable. > when sparksql is set hoodie.metadata.enable=false and execute compaction,it > would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable
[ https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7470: - Summary: Compaction completed not need write to mdt if it is disable (was: Compaction completed need write to mdt if it is enable) > Compaction completed not need write to mdt if it is disable > --- > > Key: HUDI-7470 > URL: https://issues.apache.org/jira/browse/HUDI-7470 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Assignee: xy >Priority: Major > Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg > > > Compaction completed need write to mdt if it is enable. > when sparksql is set hoodie.metadata.enable=false and execute compaction,it > would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable
[ https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7470: - Description: Compaction completed not need write to mdt if it is disable. when sparksql is set hoodie.metadata.enable=false and execute compaction,it would also execute metadata update. It is not fitable if need disable mdt. was: Compaction completed need write to mdt if it is enable. when sparksql is set hoodie.metadata.enable=false and execute compaction,it would also execute metadata update. It is not fitable if need disable mdt. > Compaction completed not need write to mdt if it is disable > --- > > Key: HUDI-7470 > URL: https://issues.apache.org/jira/browse/HUDI-7470 > Project: Apache Hudi > Issue Type: Improvement > Components: spark-sql >Reporter: xy >Assignee: xy >Priority: Major > Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg > > > Compaction completed not need write to mdt if it is disable. > when sparksql is set hoodie.metadata.enable=false and execute compaction,it > would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-7470) Compaction completed need write to mdt if it is enable
xy created HUDI-7470: Summary: Compaction completed need write to mdt if it is enable Key: HUDI-7470 URL: https://issues.apache.org/jira/browse/HUDI-7470 Project: Apache Hudi Issue Type: Improvement Components: spark-sql Reporter: xy Assignee: xy Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg Compaction completed need write to mdt if it is enable. when sparksql is set hoodie.metadata.enable=false and execute compaction,it would also execute metadata update. It is not fitable if need disable mdt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975014734 ## CI report: * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739) * 170654037ccf164486aee674542f7b68e7e2714d UNKNOWN
Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]
wombatu-kun commented on PR #10728: URL: https://github.com/apache/hudi/pull/10728#issuecomment-1975012557 The documentation update has already been made: https://github.com/apache/hudi/pull/10739
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975011931 ## CI report: * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739)
Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]
CTTY commented on issue #10785: URL: https://github.com/apache/hudi/issues/10785#issuecomment-1975008692 I believe this PR #10065 should fix the problem
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975003929 ## CI report: * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738) * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN * 066f13566c862bb22b9ba5945768a431bf7fdf0c UNKNOWN
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975002647 ## CI report: * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738) * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1975002634 ## CI report: * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1975001765 ## CI report: * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735) * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1974980800 ## CI report: * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738)
Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
hudi-bot commented on PR #10800: URL: https://github.com/apache/hudi/pull/10800#issuecomment-1974979326 ## CI report: * a9e83f5d727a100b39f8da8fde8eda78a9101de8 UNKNOWN
Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]
hudi-bot commented on PR #10214: URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974977841 ## CI report: * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN * 020b0107da7fb19f738d4cb639eedada78299729 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22733)
Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]
hudi-bot commented on PR #10194: URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974977818 ## CI report: * b0608830895508b879e36d8099fdebbb605a4aec Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22734)
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
yihua commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974976881
> Hmm, most PRs do not need a doc update; is it reasonable to run all the validations by default? And the `[MINOR] xxx` style title seems to have been fixed to pass the validation, right?
The author just needs to add `N/A` to the "Documentation Update" section of the PR description. This is a reminder for the author to check whether any documentation update is needed; no update is necessary if the code changes are not user-facing.
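The rule described in the comment above (a filled-in "Documentation Update" section, where `N/A` is enough) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `DocUpdateSectionCheck`, the method `hasDocumentationUpdate`, and the idea of rejecting the untouched template text are all hypothetical and are not taken from the actual validation added in PR #10799.

```java
public class DocUpdateSectionCheck {
    // Opening words of the template placeholder, copied from the PR template text.
    static final String TEMPLATE_PREFIX = "_Describe any necessary documentation update";

    // Hypothetical check: true if the "### Documentation Update" section has
    // author-provided content (for example "N/A"), false if the section is
    // missing, empty, or still holds the untouched template text.
    static boolean hasDocumentationUpdate(String description) {
        StringBuilder body = new StringBuilder();
        boolean inSection = false;
        for (String line : description.split("\\R")) {
            if (line.startsWith("### ")) {
                // Every level-3 heading starts a new section.
                inSection = line.trim().equals("### Documentation Update");
                continue;
            }
            if (inSection) {
                body.append(line.trim()).append('\n');
            }
        }
        String text = body.toString().trim();
        return !text.isEmpty() && !text.startsWith(TEMPLATE_PREFIX);
    }

    public static void main(String[] args) {
        String filled = "### Risk level\nnone\n### Documentation Update\nN/A\n### Contributor's checklist\n- [ ] CI passed";
        String empty = "### Documentation Update\n\n### Contributor's checklist\n- [ ] CI passed";
        System.out.println(hasDocumentationUpdate(filled));
        System.out.println(hasDocumentationUpdate(empty));
    }
}
```

A line-based scan is used here instead of one regular expression so that an empty section cannot accidentally absorb the heading that follows it.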
[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HUDI-7469:
    Labels: pull-request-available (was: )
> Reduce redundant tests with Hudi record types
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Ethan Guo
> Priority: Major
> Labels: pull-request-available
>
> There are lots of tests running with the permutations of both Hudi record
> types, e.g., AVRO and SPARK, which are not necessary.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]
yihua opened a new pull request, #10800: URL: https://github.com/apache/hudi/pull/10800
### Change Logs
Many functional tests run with permutations of both Hudi record types (e.g., AVRO and SPARK) even though they are not directly related to testing the record type. This PR removes those unnecessary permutations to save time in CI.
### Impact
Reduces CI time by no longer running unnecessary tests.
### Risk level
none
### Documentation Update
N/A
### Contributor's checklist
- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types
[ https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-7469:
    Description: There are lots of tests running with the permutations of both Hudi record types, e.g., AVRO and SPARK, which are not necessary.
> Reduce redundant tests with Hudi record types
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Ethan Guo
> Priority: Major
>
> There are lots of tests running with the permutations of both Hudi record
> types, e.g., AVRO and SPARK, which are not necessary.
[jira] [Created] (HUDI-7469) Reduce redundant tests with Hudi record types
Ethan Guo created HUDI-7469:
    Summary: Reduce redundant tests with Hudi record types
    Key: HUDI-7469
    URL: https://issues.apache.org/jira/browse/HUDI-7469
    Project: Apache Hudi
    Issue Type: Improvement
    Reporter: Ethan Guo
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
danny0405 commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974966068 Hmm, most PRs do not need a doc update; is it reasonable to run all the validations by default? And the `[MINOR] xxx` style title seems to have been fixed to pass the validation, right?
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974965902 ## CI report: * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735) * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974964732 ## CI report: * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735) * 1982318df811e9dbbb0458b2219d251ceeae683a UNKNOWN
[jira] [Updated] (HUDI-6089) Handle default insert behaviour to ingest duplicates
[ https://issues.apache.org/jira/browse/HUDI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Danny Chen updated HUDI-6089:
    Status: Open (was: In Progress)
> Handle default insert behaviour to ingest duplicates
>
> Key: HUDI-6089
> URL: https://issues.apache.org/jira/browse/HUDI-6089
> Project: Apache Hudi
> Issue Type: Improvement
> Components: configs, writer-core
> Reporter: Aditya Goenka
> Assignee: Vova Kolmakov
> Priority: Major
> Labels: insert, pull-request-available
> Fix For: 1.1.0
>
> Related to - [https://github.com/apache/hudi/issues/8451]
>
> Make the default value of "hoodie.merge.allow.duplicate.on.inserts" true, to
> avoid the merge stage when the operation type is insert and combine before
> insert is false.
[jira] [Closed] (HUDI-6089) Handle default insert behaviour to ingest duplicates
[ https://issues.apache.org/jira/browse/HUDI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Danny Chen closed HUDI-6089.
    Resolution: Fixed
Fixed via master branch: 3a864ec63598c2919c06ed03422cf54416b31b43
> Handle default insert behaviour to ingest duplicates
>
> Key: HUDI-6089
> URL: https://issues.apache.org/jira/browse/HUDI-6089
> Project: Apache Hudi
> Issue Type: Improvement
> Components: configs, writer-core
> Reporter: Aditya Goenka
> Assignee: Vova Kolmakov
> Priority: Major
> Labels: insert, pull-request-available
> Fix For: 1.1.0
>
> Related to - [https://github.com/apache/hudi/issues/8451]
>
> Make the default value of "hoodie.merge.allow.duplicate.on.inserts" true, to
> avoid the merge stage when the operation type is insert and combine before
> insert is false.
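For readers wondering what the default flip in this ticket means for a job that sets nothing, the sketch below models key/default resolution with plain `java.util.Properties`. It is not Hudi code and does not use Hudi's `ConfigProperty` or `HoodieWriteConfig`; the class `DuplicateInsertDefaultSketch` and its helper are hypothetical, and only the key name and the new default come from the ticket.

```java
import java.util.Properties;

public class DuplicateInsertDefaultSketch {
    // Config key quoted in the Jira description.
    static final String KEY = "hoodie.merge.allow.duplicate.on.inserts";
    // Default after HUDI-6089 according to the ticket (previously "false").
    static final String DEFAULT_VALUE = "true";

    // Hypothetical helper: an explicit setting wins, otherwise the default applies.
    static boolean allowDuplicateInserts(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, DEFAULT_VALUE));
    }

    public static void main(String[] args) {
        // A writer that sets nothing picks up the new default.
        Properties untouched = new Properties();

        // A writer that wants the earlier behaviour sets the key explicitly.
        Properties optOut = new Properties();
        optOut.setProperty(KEY, "false");

        System.out.println(allowDuplicateInserts(untouched));
        System.out.println(allowDuplicateInserts(optOut));
    }
}
```

Jobs that relied on the old default would need an explicit `false`, as in the second case.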
(hudi) branch master updated: [HUDI-6089] Handle default insert behaviour to ingest duplicates (#10728)
This is an automated email from the ASF dual-hosted git repository.
danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/master by this push:
     new 3a864ec6359 [HUDI-6089] Handle default insert behaviour to ingest duplicates (#10728)
3a864ec6359 is described below
commit 3a864ec63598c2919c06ed03422cf54416b31b43
Author: wombatu-kun
AuthorDate: Sun Mar 3 07:44:26 2024 +0700
    [HUDI-6089] Handle default insert behaviour to ingest duplicates (#10728)
    Co-authored-by: Vova Kolmakov
---
 .../src/main/java/org/apache/hudi/config/HoodieWriteConfig.java | 2 +-
 .../main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java | 1 +
 .../src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java | 1 +
 .../org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala | 1 +
 .../src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala | 4 +++-
 .../test/scala/org/apache/spark/sql/hudi/TestMergeIntoTable2.scala | 2 ++
 .../apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java | 2 ++
 7 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
index f4cb386d271..9447069a995 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
@@ -562,7 +562,7 @@ public class HoodieWriteConfig extends HoodieConfig {
   public static final ConfigProperty MERGE_ALLOW_DUPLICATE_ON_INSERTS_ENABLE = ConfigProperty
       .key("hoodie.merge.allow.duplicate.on.inserts")
-      .defaultValue("false")
+      .defaultValue("true")
       .markAdvanced()
       .withDocumentation("When enabled, we allow duplicate keys even if inserts are routed to merge with an existing file (for ensuring file sizing)."
           + " This is only relevant for insert operation, since upsert, delete operations will ensure unique key constraints are maintained.");
diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
index 7c42ccf5016..243b74b9199 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
@@ -86,6 +86,7 @@ public class HoodieMetadataWriteUtils {
     HoodieWriteConfig.Builder builder = HoodieWriteConfig.newBuilder()
         .withEngineType(writeConfig.getEngineType())
         .withTimelineLayoutVersion(TimelineLayoutVersion.CURR_VERSION)
+        .withMergeAllowDuplicateOnInserts(false)
         .withConsistencyGuardConfig(ConsistencyGuardConfig.newBuilder()
             .withConsistencyCheckEnabled(writeConfig.getConsistencyGuardConfig().isConsistencyCheckEnabled())
             .withInitialConsistencyCheckIntervalMs(writeConfig.getConsistencyGuardConfig().getInitialConsistencyCheckIntervalMs())
diff --git a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
index 5c93f924ece..90fcfd4fd7a 100644
--- a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
@@ -89,6 +89,7 @@ public class TestHoodieWriteConfig {
     assertEquals(5, config.getMaxCommitsToKeep());
     assertEquals(2, config.getMinCommitsToKeep());
     assertTrue(config.shouldUseExternalSchemaTransformation());
+    assertTrue(config.allowDuplicateInserts());
   }
 
   @Test
diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
index bdf512d3451..aa6ff39431f 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
@@ -450,6 +450,7 @@ class TestHoodieTableValuedFunction extends HoodieSparkSqlTestBase {
            |""".stripMargin
         )
 
+        spark.sql("set hoodie.merge.allow.duplicate.on.inserts = false")
         spark.sql(
           s"""
              | insert into $tableName
diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala b/hud
Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]
danny0405 merged PR #10728: URL: https://github.com/apache/hudi/pull/10728
Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]
danny0405 commented on issue #10785: URL: https://github.com/apache/hudi/issues/10785#issuecomment-1974960705 cc @umehrot2 for visibility.
Re: [PR] [MINOR] Increase spark num executors in tests [hudi]
yihua closed pull request #10798: [MINOR] Increase spark num executors in tests URL: https://github.com/apache/hudi/pull/10798
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974950031 ## CI report: * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735)
Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]
hudi-bot commented on PR #10214: URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974949802 ## CI report: * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233) * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN * 020b0107da7fb19f738d4cb639eedada78299729 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22733)
Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]
hudi-bot commented on PR #10194: URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974949774 ## CI report: * c6bf629aadfc3b60d94e74ef69c85356248fdffb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21180) * b0608830895508b879e36d8099fdebbb605a4aec Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22734)
Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]
hudi-bot commented on PR #10799: URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974947706 ## CI report: * 25fe17b146b5a519faf87b398aeac917ecdbbad0 UNKNOWN
Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]
hudi-bot commented on PR #10214: URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974947482 ## CI report: * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233) * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN * 020b0107da7fb19f738d4cb639eedada78299729 UNKNOWN
Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]
hudi-bot commented on PR #10194: URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974947465 ## CI report: * c6bf629aadfc3b60d94e74ef69c85356248fdffb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21180) * b0608830895508b879e36d8099fdebbb605a4aec UNKNOWN
Re: [PR] [MINOR] Increase spark num executors in tests [hudi]
hudi-bot commented on PR #10798: URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974946004 ## CI report: * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN * 615d596b29492b6a9b65c8114c2137eb4b84eb70 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22732)
Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]
hudi-bot commented on PR #10214: URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974945872 ## CI report: * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233) * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN
[PR] [MINOR] Add PR description validation on documentation updates [hudi]
yihua opened a new pull request, #10799: URL: https://github.com/apache/hudi/pull/10799
### Change Logs
As above, to make PR description validation strict.
### Impact
As above.
### Risk level
none
### Documentation Update
_Describe any necessary documentation update if there is any new feature, config, or user-facing change_
- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._
### Contributor's checklist
- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
Re: [PR] [MINOR] Increase spark num executors in tests [hudi]
hudi-bot commented on PR #10798: URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974936494 ## CI report: * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN * 615d596b29492b6a9b65c8114c2137eb4b84eb70 UNKNOWN
Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]
bvaradar commented on PR #10728: URL: https://github.com/apache/hudi/pull/10728#issuecomment-1974935143 cc @nsivabalan: this needs a documentation update.
Re: [PR] [MINOR] Increase spark num executors in tests [hudi]
hudi-bot commented on PR #10798: URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974934148 ## CI report: * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN
[PR] [MINOR] Increase spark num executors in tests [hudi]
yihua opened a new pull request, #10798: URL: https://github.com/apache/hudi/pull/10798
### Change Logs
_Describe context and summary for this change. Highlight if any code was copied._
### Impact
_Describe any public API or user-facing feature change or any performance impact._
### Risk level (write none, low medium or high below)
_If medium or high, explain what verification was done to mitigate the risks._
### Documentation Update
_Describe any necessary documentation update if there is any new feature, config, or user-facing change_
- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._
### Contributor's checklist
- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
(hudi) branch master updated: [HUDI-7465] Split tests in CI further to reduce total CI elapsed time (#10795)
This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/master by this push:
     new eeccdf9bb0f [HUDI-7465] Split tests in CI further to reduce total CI elapsed time (#10795)
eeccdf9bb0f is described below
commit eeccdf9bb0f2885c37e0b480c330400fd2f80a1b
Author: Y Ethan Guo
AuthorDate: Sat Mar 2 13:59:58 2024 -0800
    [HUDI-7465] Split tests in CI further to reduce total CI elapsed time (#10795)
---
 .github/workflows/bot.yml    | 139 +++
 azure-pipelines-20230430.yml | 58 ++
 2 files changed, 176 insertions(+), 21 deletions(-)
diff --git a/.github/workflows/bot.yml b/.github/workflows/bot.yml
index 0bfd9541bcc..3007c752534 100644
--- a/.github/workflows/bot.yml
+++ b/.github/workflows/bot.yml
@@ -53,7 +53,7 @@ jobs:
       - name: RAT check
         run: ./scripts/release/validate_source_rat.sh
 
-  test-spark:
+  test-spark-java-tests:
     runs-on: ubuntu-latest
     strategy:
       matrix:
@@ -107,22 +107,87 @@ jobs:
           SPARK_PROFILE: ${{ matrix.sparkProfile }}
         run:
           mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -pl hudi-examples/hudi-examples-spark $MVN_ARGS
-      - name: UT - Common & Spark
+      - name: Java UT - Common & Spark
         env:
           SCALA_PROFILE: ${{ matrix.scalaProfile }}
           SPARK_PROFILE: ${{ matrix.sparkProfile }}
           SPARK_MODULES: ${{ matrix.sparkModules }}
         if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 as it's covered by Azure CI
         run:
-          mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
-      - name: FT - Spark
+          mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -DwildcardSuites=skipScalaTests -DfailIfNoTests=false -pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+      - name: Java FT - Spark
         env:
           SCALA_PROFILE: ${{ matrix.scalaProfile }}
           SPARK_PROFILE: ${{ matrix.sparkProfile }}
           SPARK_MODULES: ${{ matrix.sparkModules }}
         if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 as it's covered by Azure CI
         run:
-          mvn test -Pfunctional-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+          mvn test -Pfunctional-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -DwildcardSuites=skipScalaTests -DfailIfNoTests=false -pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+
+  test-spark-scala-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        include:
+          - scalaProfile: "scala-2.11"
+            sparkProfile: "spark2.4"
+            sparkModules: "hudi-spark-datasource/hudi-spark2"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.0"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.0.x"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.1"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.1.x"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.2"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.2.x"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.3"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.3.x"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.4"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.4.x"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.5"
+            sparkModules: "hudi-spark-datasource/hudi-spark3.5.x"
+
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up JDK 8
+        uses: actions/setup-java@v3
+        with:
+          java-version: '8'
+          distribution: 'adopt'
+          architecture: x64
+          cache: maven
+      - name: Build Project
+        env:
+          SCALA_PROFILE: ${{ matrix.scalaProfile }}
+          SPARK_PROFILE: ${{ matrix.sparkProfile }}
+        run:
+          mvn clean install -T 2 -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -DskipTests=true $MVN_ARGS -am -pl "hudi-examples/hudi-examples-spark,$SPARK_COMMON_MODULES,$SPARK_MODULES"
+      - name: Scala UT - Common & Spark
+        env:
+          SCALA_PROFILE: ${{ matrix.scalaProfile }}
+          SPARK_PROFILE: ${{ matrix.sparkProfile }}
+          SPARK_MODULES: ${{ matrix.sparkModules }}
+        if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 as it's covered by Azure CI
+        run:
+          mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -Dtest=skipJavaTests -DfailIfNoTests=false -pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+      - name: Scala FT - Spark
+        env:
+          SCALA_PROFILE: ${{ matrix.sca
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
yihua merged PR #10795: URL: https://github.com/apache/hudi/pull/10795
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974919265 ## CI report: * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974909152 ## CI report: * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
(hudi) branch master updated: [HUDI-7341] Support unmerged record read (#10632)
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f792643657e [HUDI-7341] Support unmerged record read (#10632) f792643657e is described below commit f792643657ebba69edd2b2aeeb4a37d15c39beba Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com> AuthorDate: Sat Mar 2 12:58:15 2024 -0800 [HUDI-7341] Support unmerged record read (#10632) --- .../hudi/common/engine/HoodieReaderContext.java| 7 + .../table/read/HoodieFileGroupRecordBuffer.java| 8 +- .../read/HoodieKeyBasedFileGroupRecordBuffer.java | 2 +- .../HoodiePositionBasedFileGroupRecordBuffer.java | 2 +- .../read/HoodieUnmergedFileGroupRecordBuffer.java | 146 + .../testutils/reader/HoodieTestReaderContext.java | 9 ++ 6 files changed, 170 insertions(+), 4 deletions(-) diff --git a/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java b/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java index 1d81007c375..86a875c9df3 100644 --- a/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java +++ b/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java @@ -219,4 +219,11 @@ public abstract class HoodieReaderContext { public long extractRecordPosition(T record, Schema schema, String fieldName, long providedPositionIfNeeded) { return providedPositionIfNeeded; } + + /** + * Constructs engine specific delete record. 
+ */ + public T constructRawDeleteRecord(Map metadata) { +return null; + } } diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java index ccc001e79c9..d9ba8bcd90e 100644 --- a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java +++ b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java @@ -34,8 +34,12 @@ import java.util.Map; public interface HoodieFileGroupRecordBuffer { enum BufferType { -KEY_BASED, -POSITION_BASED +// Merging based on record key. +KEY_BASED_MERGE, +// Merging based on record position. +POSITION_BASED_MERGE, +// No Merging at all. +UNMERGED } /** diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java index b4e32be8c65..0430a42e863 100644 --- a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java +++ b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java @@ -65,7 +65,7 @@ public class HoodieKeyBasedFileGroupRecordBuffer extends HoodieBaseFileGroupR @Override public BufferType getBufferType() { -return BufferType.KEY_BASED; +return BufferType.KEY_BASED_MERGE; } @Override diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java index 4412713928f..50e969343e1 100644 --- a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java +++ b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java @@ -72,7 +72,7 @@ 
public class HoodiePositionBasedFileGroupRecordBuffer extends HoodieBaseFileG @Override public BufferType getBufferType() { -return BufferType.POSITION_BASED; +return BufferType.POSITION_BASED_MERGE; } @Override diff --git a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java new file mode 100644 index 000..76aa28308c4 --- /dev/null +++ b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the Licens
Re: [PR] [HUDI-7341] Support unmerged record read [hudi]
yihua merged PR #10632: URL: https://github.com/apache/hudi/pull/10632
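The HUDI-7341 commit renames the buffer types to `KEY_BASED_MERGE` and `POSITION_BASED_MERGE` and adds `UNMERGED`, commented as "No Merging at all". A minimal sketch of what that distinction could look like, assuming a toy record with a key, a position, and a payload: only the three enum constant names come from the commit, while `BufferTypeSketch`, `Rec`, and `read` are hypothetical and are not the actual `HoodieFileGroupRecordBuffer` API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BufferTypeSketch {
  // Same constant names as the enum in the commit; the rest of this class is hypothetical.
  enum BufferType { KEY_BASED_MERGE, POSITION_BASED_MERGE, UNMERGED }

  // Toy record: a key, a position in the base file, and a payload.
  static final class Rec {
    final String key;
    final long position;
    final String value;

    Rec(String key, long position, String value) {
      this.key = key;
      this.position = position;
      this.value = value;
    }
  }

  // Reads base-file records followed by log records according to the buffer type.
  static List<String> read(BufferType type, List<Rec> base, List<Rec> log) {
    List<Rec> all = new ArrayList<>(base);
    all.addAll(log);

    if (type == BufferType.UNMERGED) {
      // No merging: every record is emitted, including older versions.
      List<String> out = new ArrayList<>();
      for (Rec r : all) {
        out.add(r.value);
      }
      return out;
    }

    // Merging: a later record replaces an earlier one with the same identity,
    // where identity is the record key or the record position.
    Map<Object, String> latest = new LinkedHashMap<>();
    for (Rec r : all) {
      Object id;
      if (type == BufferType.KEY_BASED_MERGE) {
        id = r.key;
      } else {
        id = Long.valueOf(r.position);
      }
      latest.put(id, r.value);
    }
    return new ArrayList<>(latest.values());
  }

  public static void main(String[] args) {
    List<Rec> base = Arrays.asList(new Rec("k1", 0L, "a"), new Rec("k2", 1L, "b"));
    List<Rec> log = Arrays.asList(new Rec("k1", 0L, "a2"));
    System.out.println(read(BufferType.UNMERGED, base, log));        // [a, b, a2]
    System.out.println(read(BufferType.KEY_BASED_MERGE, base, log)); // [a2, b]
  }
}
```

In this toy model the unmerged read returns three values because the older version of `k1` is not dropped, while both merging modes collapse it to the latest value.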
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974907215 ## CI report: * fe31edd435cf0c990fa3174db7d43a9412ad012c UNKNOWN
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
yihua commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974906299 Looks like flakiness in Azure CI. Will retry.
Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974893487 ## CI report: * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
[jira] [Assigned] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-60: -- Assignee: xy > [UMBRELLA] Support Apache Beam for incremental tailing > -- > > Key: HUDI-60 > URL: https://issues.apache.org/jira/browse/HUDI-60 > Project: Apache Hudi > Issue Type: Epic > Components: spark, Utilities >Reporter: Vinoth Chandar >Assignee: xy >Priority: Major > Labels: gsoc, gsoc2021, hudi-umbrellas, mentor > > (More details to be added) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing
[ https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-60: --- Fix Version/s: 0.15.0 > [UMBRELLA] Support Apache Beam for incremental tailing > -- > > Key: HUDI-60 > URL: https://issues.apache.org/jira/browse/HUDI-60 > Project: Apache Hudi > Issue Type: Epic > Components: spark, Utilities >Reporter: Vinoth Chandar >Assignee: xy >Priority: Major > Labels: gsoc, gsoc2021, hudi-umbrellas, mentor > Fix For: 0.15.0 > > > (More details to be added) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (HUDI-7398) clarify clustering strategy for java client
[ https://issues.apache.org/jira/browse/HUDI-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu closed HUDI-7398. Assignee: Raymond Xu Resolution: Fixed > clarify clustering strategy for java client > --- > > Key: HUDI-7398 > URL: https://issues.apache.org/jira/browse/HUDI-7398 > Project: Apache Hudi > Issue Type: Improvement > Components: docs > Reporter: Raymond Xu > Assignee: Raymond Xu > Priority: Minor > Labels: pull-request-available > Fix For: 0.14.2 > > > The Java client only does a linear sort (see > org.apache.hudi.client.clustering.run.strategy.JavaExecutionStrategy#getPartitioner), > but it can be extended to perform space-filling curve sorting; that is probably just > not implemented yet. If you are interested, feel free to attempt it with a PR.
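To illustrate the difference HUDI-7398 describes between a linear sort and space-filling curve sorting, here is a minimal sketch using a Z-order (Morton) curve on two integer columns. It is not Hudi code: `ZOrderSketch`, `zValue`, `linearSort`, and `zOrderSort` are hypothetical names, and the sketch assumes both column values fit in 16 bits (0 to 65535).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ZOrderSketch {
  // Interleaves the low 16 bits of x and y: x takes the even bit positions,
  // y the odd ones. Assumes 0 <= x, y <= 65535.
  static long zValue(int x, int y) {
    long z = 0L;
    for (int i = 0; i < 16; i++) {
      z |= ((long) (x >> i) & 1L) << (2 * i);
      z |= ((long) (y >> i) & 1L) << (2 * i + 1);
    }
    return z;
  }

  // Linear sort: order by the first column, then by the second.
  static List<int[]> linearSort(List<int[]> points) {
    List<int[]> out = new ArrayList<>(points);
    out.sort(Comparator.comparingInt((int[] p) -> p[0]).thenComparingInt(p -> p[1]));
    return out;
  }

  // Z-order sort: order by the interleaved value, so points that are close
  // in both columns tend to end up near each other.
  static List<int[]> zOrderSort(List<int[]> points) {
    List<int[]> out = new ArrayList<>(points);
    out.sort(Comparator.comparingLong((int[] p) -> zValue(p[0], p[1])));
    return out;
  }

  public static void main(String[] args) {
    List<int[]> points = Arrays.asList(
        new int[] {1, 1}, new int[] {0, 1}, new int[] {1, 0}, new int[] {0, 0});
    for (int[] p : linearSort(points)) {
      System.out.println("linear  " + Arrays.toString(p));
    }
    for (int[] p : zOrderSort(points)) {
      System.out.println("z-order " + Arrays.toString(p));
    }
  }
}
```

With a linear sort, only the leading column is well clustered; the Z-order key trades some locality on the first column for locality on both, which is the usual motivation for space-filling curve layouts.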
Re: [PR] [HUDI-7465][DNM] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974882533 ## CI report: * d3d782594eea1b77c47e37862b0673c7a1768710 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22727) * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
Re: [PR] [HUDI-7465][DNM] Split tests in CI further to reduce total CI elapsed time [hudi]
hudi-bot commented on PR #10795: URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974880555 ## CI report: * d3d782594eea1b77c47e37862b0673c7a1768710 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22727) * fe31edd435cf0c990fa3174db7d43a9412ad012c UNKNOWN
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
hudi-bot commented on PR #10797: URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974865229 ## CI report: * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN * 435db97306f340b9ab078f154394da691e0354e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22728)
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
hudi-bot commented on PR #10797: URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974855084 ## CI report: * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN * 435db97306f340b9ab078f154394da691e0354e1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22728)
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
hudi-bot commented on PR #10797: URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974852961 ## CI report: * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN * 435db97306f340b9ab078f154394da691e0354e1 UNKNOWN
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
hudi-bot commented on PR #10797: URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974850925 ## CI report: * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN
Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]
cbomgit commented on issue #10785: URL: https://github.com/apache/hudi/issues/10785#issuecomment-1974850671 Running the cleaner as a separate job fails as well. Using the following args:
```
spark-submit --master yarn --deploy-mode cluster \
  --class org.apache.hudi.utilities.HoodieCleaner /usr/lib/hudi/hudi-utilities-bundle.jar \
  --target-base-path s3://table-path \
  --props s3://table-path/.hoodie/hoodie.properties \
  --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_FILE_VERSIONS \
  --hoodie-conf hoodie.cleaner.fileversions.retained=2 \
  --hoodie-conf hoodie.cleaner.policy.failed.writes=LAZY \
  --hoodie-conf hoodie.write.lock.provider=org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider \
  --hoodie-conf hoodie.write.lock.dynamodb.table=HoodieLockTable \
  --hoodie-conf hoodie.metadata.enable=false \
  --hoodie-conf hoodie.write.lock.dynamodb.region=us-east-1
```
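For readers unfamiliar with the two cleaner settings in the command above, the following is a rough sketch of what `KEEP_LATEST_FILE_VERSIONS` with `hoodie.cleaner.fileversions.retained=2` means for a single file group: the newest two versions stay and older ones become candidates for cleaning. It is only an illustration under that assumption, not Hudi's cleaner implementation, and it does not address the `KryoException` reported in this issue. `KeepLatestVersionsSketch`, `versionsToClean`, and the file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeepLatestVersionsSketch {
  // Given the versions of one file group ordered oldest first, returns the
  // versions that fall outside the newest `retained` ones.
  static List<String> versionsToClean(List<String> versionsOldestFirst, int retained) {
    if (retained < 0) {
      throw new IllegalArgumentException("retained must be non-negative");
    }
    int cut = Math.max(0, versionsOldestFirst.size() - retained);
    return new ArrayList<>(versionsOldestFirst.subList(0, cut));
  }

  public static void main(String[] args) {
    // Hypothetical file names for four versions of the same file group.
    List<String> versions = Arrays.asList("fg1_v1", "fg1_v2", "fg1_v3", "fg1_v4");
    System.out.println(versionsToClean(versions, 2)); // [fg1_v1, fg1_v2]
  }
}
```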
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
stayrascal commented on code in PR #10797: URL: https://github.com/apache/hudi/pull/10797#discussion_r150904 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java: ## @@ -130,69 +131,69 @@ public void close() { /** * Create RocksDB based file System view for a table. * - * @param viewConf View Storage Configuration + * @param viewConf View Storage Configuration * @param metaClient HoodieTableMetaClient - * @return + * @return {@link RocksDbBasedFileSystemView} */ private static RocksDbBasedFileSystemView createRocksDBBasedFileSystemView(FileSystemViewStorageConfig viewConf, - HoodieTableMetaClient metaClient) { + HoodieTableMetaClient metaClient) { HoodieTimeline timeline = metaClient.getActiveTimeline().filterCompletedAndCompactionInstants(); return new RocksDbBasedFileSystemView(metaClient, timeline, viewConf); } /** * Create a spillable Map based file System view for a table. * - * @param viewConf View Storage Configuration + * @param viewConf View Storage Configuration * @param metaClient HoodieTableMetaClient - * @return + * @return {@link SpillableMapBasedFileSystemView} */ - private static SpillableMapBasedFileSystemView createSpillableMapBasedFileSystemView(FileSystemViewStorageConfig viewConf, Review Comment: `FileSystemViewStorageConfig viewConf` is never used.
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
stayrascal commented on code in PR #10797: URL: https://github.com/apache/hudi/pull/10797#discussion_r150420 ## hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java: ## @@ -66,17 +65,19 @@ public class FileSystemViewManager { private final SerializableConfiguration conf; // The View Storage config used to store file-system views private final FileSystemViewStorageConfig viewStorageConfig; - // Map from Base-Path to View - private final ConcurrentHashMap globalViewMap; // Factory Map to create file-system views private final Function2 viewCreator; + // Map from Base-Path to View + private final ConcurrentHashMap globalViewMap; Review Comment: It is easier to compare and read if the fields keep the same order as in the constructor.
[PR] [MINOR] Clean code of FileSystemViewManager [hudi]
stayrascal opened a new pull request, #10797: URL: https://github.com/apache/hudi/pull/10797 ### Change Logs Clean unused methods and parameters of `FileSystemViewManager` ### Impact No ### Risk level (write none, low medium or high below) Low ### Documentation Update No ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]
stayrascal closed pull request #10796: [MINOR] Clean code of FileSystemViewManager URL: https://github.com/apache/hudi/pull/10796
[PR] [MINOR] Clean code of FileSystemViewManager [hudi]
stayrascal opened a new pull request, #10796: URL: https://github.com/apache/hudi/pull/10796 ### Change Logs Clean unused methods and parameters of `FileSystemViewManager` ### Impact No ### Risk level (write none, low medium or high below) Low ### Documentation Update No ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[jira] [Assigned] (HUDI-7467) TestHoodieDeltaStreamer. testAutoGenerateRecordKeys
[ https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu reassigned HUDI-7467: - Assignee: (was: Lin Liu) > TestHoodieDeltaStreamer. testAutoGenerateRecordKeys > --- > > Key: HUDI-7467 > URL: https://issues.apache.org/jira/browse/HUDI-7467 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Priority: Major > > https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e > {code:java} > [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: > 2,459.289 s <<< FAILURE! - in > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer > [ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE! > org.opentest4j.AssertionFailedError: expected: <300> but was: <500> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at > org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161) > at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611) > at > org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486) > at > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7467) TestHoodieDeltaStreamer. testAutoGenerateRecordKeys
[ https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7467: -- Summary: TestHoodieDeltaStreamer. testAutoGenerateRecordKeys (was: TestHoodieDeltaStreamer) > TestHoodieDeltaStreamer. testAutoGenerateRecordKeys > --- > > Key: HUDI-7467 > URL: https://issues.apache.org/jira/browse/HUDI-7467 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Assignee: Lin Liu >Priority: Major > > https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e > {code:java} > [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: > 2,459.289 s <<< FAILURE! - in > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer > [ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE! > org.opentest4j.AssertionFailedError: expected: <300> but was: <500> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at > org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161) > at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611) > at > org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486) > at > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7467) TestHoodieDeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7467: -- Description: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e {code:java} [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 2,459.289 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer [ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE! org.opentest4j.AssertionFailedError: expected: <300> but was: <500> at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161) at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611) at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486) at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823) {code} was: {code:java} [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 2,459.289 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer [ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE! 
org.opentest4j.AssertionFailedError: expected: <300> but was: <500> at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161) at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611) at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486) at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823) {code} > TestHoodieDeltaStreamer > --- > > Key: HUDI-7467 > URL: https://issues.apache.org/jira/browse/HUDI-7467 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Assignee: Lin Liu >Priority: Major > > https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e > {code:java} > [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: > 2,459.289 s <<< FAILURE! - in > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer > [ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE! 
> org.opentest4j.AssertionFailedError: expected: <300> but was: <500> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at > org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161) > at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611) > at > org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486) > at > org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick
[ https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu reassigned HUDI-7468:
-
Assignee: Jonathan Vexler (was: Lin Liu)

> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lin Liu
> Assignee: Jonathan Vexler
> Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 514.307 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
> [ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1] Time elapsed: 13.21 s <<< ERROR!
> org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema is not compatible with the table's one
>     at org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
>     at org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
>     at org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
>     at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
>     at org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
>     at org.apache.hudi.common.util.Option.map(Option.java:112)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
>     at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
>     at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
>     at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
>     at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
>     at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick
[ https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu updated HUDI-7468:
--
Description:
https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
{code:java}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 514.307 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1] Time elapsed: 13.21 s <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema is not compatible with the table's one
    at org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
    at org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
    at org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
    at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
    at org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
    at org.apache.hudi.common.util.Option.map(Option.java:112)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
    at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
    at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
    at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
    at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
{code}

was:
{code:java}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 514.307 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1] Time elapsed: 13.21 s <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema is not compatible with the table's one
    at org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
    at org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
    at org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
    at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
    at org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
    at org.apache.hudi.common.util.Option.map(Option.java:112)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
    at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
    at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
    at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
    at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
{code}

> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lin Liu
> Assignee: Lin Liu
> Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Sk
[jira] [Created] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick
Lin Liu created HUDI-7468:
-
Summary: TestHoodieDeltaStreamerSchemaEvolutionQuick
Key: HUDI-7468
URL: https://issues.apache.org/jira/browse/HUDI-7468
Project: Apache Hudi
Issue Type: Bug
Reporter: Lin Liu

{code:java}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 514.307 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1] Time elapsed: 13.21 s <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema is not compatible with the table's one
    at org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
    at org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
    at org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
    at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
    at org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
    at org.apache.hudi.common.util.Option.map(Option.java:112)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
    at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
    at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
    at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
    at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
    at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
    at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick
[ https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu reassigned HUDI-7468:
-
Assignee: Lin Liu

> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Lin Liu
> Assignee: Lin Liu
> Priority: Major
>
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 514.307 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
> [ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1] Time elapsed: 13.21 s <<< ERROR!
> org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema is not compatible with the table's one
>     at org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
>     at org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
>     at org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
>     at org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
>     at org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
>     at org.apache.hudi.common.util.Option.map(Option.java:112)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
>     at org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
>     at org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
>     at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
>     at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
>     at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
>     at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
>     at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-7467) TestHoodieDeltaStreamer
Lin Liu created HUDI-7467:
-
Summary: TestHoodieDeltaStreamer
Key: HUDI-7467
URL: https://issues.apache.org/jira/browse/HUDI-7467
Project: Apache Hudi
Issue Type: Bug
Reporter: Lin Liu
Assignee: Lin Liu

{code:java}
[ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 2,459.289 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
[ERROR] testAutoGenerateRecordKeys Time elapsed: 14.248 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
    at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
    at org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)