Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10802:
URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975070520

   
   ## CI report:
   
   * 4c37feb88ed56cbc6cb81aedcde0eba21996b84f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22743)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10802:
URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975068983

   
   ## CI report:
   
   * 4c37feb88ed56cbc6cb81aedcde0eba21996b84f UNKNOWN
   
   



Re: [PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]

2024-03-02 Thread via GitHub


yihua commented on PR #10802:
URL: https://github.com/apache/hudi/pull/10802#issuecomment-1975068962

   @hudi-bot run azure





Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10801:
URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975067481

   
   ## CI report:
   
   * 6524c27e11d40ab23b6248d82a6115a79da6cf49 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22741)
   
   



[jira] [Updated] (HUDI-7471) Increase the number of Spark executors in tests

2024-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7471:
-
Labels: pull-request-available  (was: )

> Increase the number of Spark executors in tests
> ---
>
> Key: HUDI-7471
> URL: https://issues.apache.org/jira/browse/HUDI-7471
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [HUDI-7471] Increase the number of Spark executors in tests [hudi]

2024-03-02 Thread via GitHub


yihua opened a new pull request, #10802:
URL: https://github.com/apache/hudi/pull/10802

   ### Change Logs
   
   This PR makes two minor changes:
   - Increases the number of executors in Spark session in tests.
   - Uses the existing util method to get Spark conf for a few tests.
   
   ### Impact
   
   Reduces test time
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
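
The first item under Change Logs can be illustrated with a minimal sketch. It assumes the tests run Spark in local mode, where parallelism is set through the `local[N]` master URL (N worker threads in one JVM), so "number of executors" is read here as that thread count. `TestSparkConfSketch`, `localMaster`, and `sparkConfForTest` are hypothetical names, not Hudi's test utilities, and the PR text does not state which value it uses.

```java
// Hypothetical helper for illustration only; not part of Hudi.
final class TestSparkConfSketch {

    /** Builds the Spark master URL for local mode with the given number of worker threads. */
    static String localMaster(int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1, got " + threads);
        }
        return "local[" + threads + "]";
    }

    /**
     * Returns the Spark properties a shared test helper might set.
     * "spark.app.name" and "spark.master" are standard Spark property keys.
     */
    static java.util.Map<String, String> sparkConfForTest(String appName, int threads) {
        java.util.Map<String, String> conf = new java.util.LinkedHashMap<>();
        conf.put("spark.app.name", appName);
        conf.put("spark.master", localMaster(threads));
        return conf;
    }
}
```

Keeping the thread count in one shared helper, as the second Change Logs item suggests, means a single edit changes the parallelism for every test that builds its conf through it.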
   





[jira] [Assigned] (HUDI-7471) Increase the number of Spark executors in tests

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-7471:
---

Assignee: Ethan Guo

> Increase the number of Spark executors in tests
> ---
>
> Key: HUDI-7471
> URL: https://issues.apache.org/jira/browse/HUDI-7471
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>






[jira] [Updated] (HUDI-7471) Increase the number of Spark executors in tests

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7471:

Fix Version/s: 0.15.0
   1.0.0

> Increase the number of Spark executors in tests
> ---
>
> Key: HUDI-7471
> URL: https://issues.apache.org/jira/browse/HUDI-7471
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.15.0, 1.0.0
>
>






[jira] [Created] (HUDI-7471) Increase the number of Spark executors in tests

2024-03-02 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7471:
---

 Summary: Increase the number of Spark executors in tests
 Key: HUDI-7471
 URL: https://issues.apache.org/jira/browse/HUDI-7471
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: Ethan Guo








Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]

2024-03-02 Thread via GitHub


cbomgit commented on issue #10785:
URL: https://github.com/apache/hudi/issues/10785#issuecomment-1975053685

   > I believe this PR #10065 should fix the problem
   
   Thanks. Is there a particular condition that triggers this? Also, has the patch been backported to older versions? I saw the 0.14.1 label, but I'm unsure whether that means I can apply it to my version (0.11).





(hudi) branch master updated: [HUDI-7469] Reduce redundant tests with Hudi record types (#10800)

2024-03-02 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 82f221c7d05 [HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
82f221c7d05 is described below

commit 82f221c7d05436bb1eac4f09e1e675d1c91a7cf1
Author: Y Ethan Guo 
AuthorDate: Sat Mar 2 21:56:54 2024 -0800

[HUDI-7469] Reduce redundant tests with Hudi record types (#10800)
---
 .../apache/hudi/functional/TestCOWDataSource.scala |  72 +++---
 .../apache/hudi/functional/TestMORDataSource.scala |  20 +-
 .../sql/hudi/TestAlterTableDropPartition.scala |   4 +-
 .../spark/sql/hudi/TestCompactionTable.scala   |   4 +-
 .../apache/spark/sql/hudi/TestInsertTable.scala| 265 ++---
 .../apache/spark/sql/hudi/TestMergeIntoTable.scala |  24 +-
 .../spark/sql/hudi/TestMergeIntoTable2.scala   |  20 +-
 .../TestMergeIntoTableWithNonRecordKeyField.scala  |   8 +-
 .../org/apache/spark/sql/hudi/TestSpark3DDL.scala  |  16 +-
 .../spark/sql/hudi/TestTimeTravelTable.scala   |  12 +-
 .../apache/spark/sql/hudi/TestUpdateTable.scala|   6 +-
 .../deltastreamer/TestHoodieDeltaStreamer.java |  63 +++--
 12 files changed, 246 insertions(+), 268 deletions(-)

diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
index a28a228fd46..5614b414927 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
@@ -23,8 +23,9 @@ import org.apache.hudi.DataSourceWriteOptions.{INLINE_CLUSTERING_ENABLE, KEYGENE
 import org.apache.hudi.HoodieConversionUtils.toJavaOption
 import org.apache.hudi.QuickstartUtils.{convertToStringList, getQuickstartWriteConfigs}
 import org.apache.hudi.client.common.HoodieSparkEngineContext
-import org.apache.hudi.common.config.TimestampKeyGeneratorConfig.{TIMESTAMP_INPUT_DATE_FORMAT, TIMESTAMP_OUTPUT_DATE_FORMAT, TIMESTAMP_TIMEZONE_FORMAT, TIMESTAMP_TYPE_FIELD}
 import org.apache.hudi.common.config.HoodieMetadataConfig
+import org.apache.hudi.common.config.TimestampKeyGeneratorConfig.{TIMESTAMP_INPUT_DATE_FORMAT, TIMESTAMP_OUTPUT_DATE_FORMAT, TIMESTAMP_TIMEZONE_FORMAT, TIMESTAMP_TYPE_FIELD}
+import org.apache.hudi.common.fs.FSUtils
 import org.apache.hudi.common.model.HoodieRecord.HoodieRecordType
 import org.apache.hudi.common.model.{HoodieRecord, WriteOperationType}
 import org.apache.hudi.common.table.timeline.{HoodieInstant, HoodieTimeline, TimelineUtils}
@@ -44,7 +45,6 @@ import org.apache.hudi.metrics.{Metrics, MetricsReporterType}
 import org.apache.hudi.testutils.HoodieSparkClientTestBase
 import org.apache.hudi.util.JFunction
 import org.apache.hudi.{AvroConversionUtils, DataSourceReadOptions, DataSourceWriteOptions, HoodieDataSourceHelpers, QuickstartUtils, ScalaAssertionSupport}
-import org.apache.hudi.common.fs.FSUtils
 import org.apache.spark.sql._
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.hudi.HoodieSparkSessionExtension
@@ -96,10 +96,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     System.gc()
   }
 
-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testShortNameStorage(recordType: HoodieRecordType) {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testShortNameStorage(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts()
 
     // Insert Operation
     val records = recordsToStrings(dataGen.generateInserts("000", 100)).toList
@@ -564,10 +563,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
    * archival should kick in and 2 commits should be archived. If schema is valid, no exception will be thrown. If not,
    * NPE will be thrown.
    */
-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testArchivalWithBulkInsert(recordType: HoodieRecordType): Unit = {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testArchivalWithBulkInsert(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts()
 
     var structType: StructType = null
     for (i <- 1 to 7) {
@@ -696,10 +694,9 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     }
   }
 
-  @ParameterizedTest
-  @EnumSource(value = classOf[HoodieRecordType], names = Array("AVRO", "SPARK"))
-  def testOverWriteModeUseReplaceAction(recordType: HoodieRecordType): Unit = {
-    val (writeOpts, readOpts) = getWriterReaderOpts(recordType)
+  @Test
+  def testOverWriteModeUseReplaceAc

Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]

2024-03-02 Thread via GitHub


yihua merged PR #10214:
URL: https://github.com/apache/hudi/pull/10214





(hudi) branch master updated: [HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)

2024-03-02 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 59f1c66848c [HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)
59f1c66848c is described below

commit 59f1c66848c3ddbfff1ea5fe3eacd39f1adf9a3a
Author: Sivabalan Narayanan 
AuthorDate: Sat Mar 2 21:57:23 2024 -0800

[HUDI-6953] Adding test for composite keys with bulk insert row writer (#10214)
---
 .../apache/hudi/functional/TestCOWDataSource.scala  | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
index 5614b414927..ff87a90cef8 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
@@ -487,6 +487,27 @@ class TestCOWDataSource extends HoodieSparkClientTestBase with ScalaAssertionSup
     assertEquals(snapshotDF2.count(), (validRecordsFromBatch1 + validRecordsFromBatch2))
   }
 
+  @Test
+  def bulkInsertCompositeKeys(): Unit = {
+    val (writeOpts, readOpts) = getWriterReaderOpts(HoodieRecordType.AVRO)
+
+    // Insert Operation
+    val records = recordsToStrings(dataGen.generateInserts("000", 100)).toList
+    val inputDF = spark.read.json(spark.sparkContext.parallelize(records, 2))
+
+    val inputDf1 = inputDF.withColumn("new_col",lit("value1"))
+    val inputDf2 = inputDF.withColumn("new_col", lit(null).cast("String") )
+
+    inputDf1.union(inputDf2).write.format("hudi")
+    .options(writeOpts)
+    .option(DataSourceWriteOptions.RECORDKEY_FIELD.key, "_row_key,new_col")
+    .option(DataSourceWriteOptions.OPERATION.key(),"bulk_insert")
+    .mode(SaveMode.Overwrite)
+    .save(basePath)
+
+    assertEquals(200, spark.read.format("org.apache.hudi").options(readOpts).load(basePath).count())
+  }
+
   /**
    * This tests the case that query by with a specified partition condition on hudi table which is
    * different between the value of the partition field and the actual partition path,



Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


yihua merged PR #10800:
URL: https://github.com/apache/hudi/pull/10800





Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975052350

   
   ## CI report:
   
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   * 170654037ccf164486aee674542f7b68e7e2714d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22740)
   
   



[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7469:

Fix Version/s: 0.15.0
   1.0.0

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.
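
One way to read the change described above: a test that does not care about the record type calls a no-argument helper and runs once, instead of being parameterized over every record type. The sketch below models that in plain Java. `WriterReaderOptsSketch`, the `example.record.type` key, and the choice of AVRO as the default are assumptions made for illustration; none of them is confirmed by this thread.

```java
// Hypothetical illustration; not Hudi's test helper.
final class WriterReaderOptsSketch {

    enum RecordType { AVRO, SPARK }

    // Made-up option key used only in this sketch.
    static final String RECORD_TYPE_KEY = "example.record.type";

    /** Options for an explicitly chosen record type (for tests that exercise both types). */
    static java.util.Map<String, String> writerReaderOpts(RecordType recordType) {
        java.util.Map<String, String> opts = new java.util.HashMap<>();
        opts.put(RECORD_TYPE_KEY, recordType.name());
        return opts;
    }

    /** Convenience overload: assumes AVRO as the default record type. */
    static java.util.Map<String, String> writerReaderOpts() {
        return writerReaderOpts(RecordType.AVRO);
    }
}
```

Tests whose behaviour really does depend on the record type can keep passing it explicitly; the rest drop the parameter and halve their run count.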





[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7469:

Priority: Critical  (was: Major)

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.





[jira] [Assigned] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-7469:
---

Assignee: Ethan Guo

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.





[jira] [Closed] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo closed HUDI-7469.
---
Resolution: Fixed

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.





Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10801:
URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975024791

   
   ## CI report:
   
   * 6524c27e11d40ab23b6248d82a6115a79da6cf49 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22741)
   
   



Re: [PR] [HUDI-7470] Compaction completed not need write to mdt if mdt is disable [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10801:
URL: https://github.com/apache/hudi/pull/10801#issuecomment-1975023561

   
   ## CI report:
   
   * 6524c27e11d40ab23b6248d82a6115a79da6cf49 UNKNOWN
   
   



Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975022383

   
   ## CI report:
   
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739)
   * 170654037ccf164486aee674542f7b68e7e2714d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22740)
   
   



[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable

2024-03-02 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy updated HUDI-7470:
-
Attachment: CAPTURE_2024-03-03_124553.jpg

> Compaction completed not need write to mdt if it is disable
> ---
>
> Key: HUDI-7470
> URL: https://issues.apache.org/jira/browse/HUDI-7470
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
> Attachments: CAPTURE_2024-03-03_123512.jpg, 
> CAPTURE_2024-03-03_124229.jpg, CAPTURE_2024-03-03_124553.jpg, mdt.jpg
>
>
> Compaction completed not need write to mdt if it is disable.
> when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
> would also execute metadata update. It is not fitable if need disable mdt.





[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable

2024-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7470:
-
Labels: pull-request-available  (was: )

> Compaction completed not need write to mdt if it is disable
> ---
>
> Key: HUDI-7470
> URL: https://issues.apache.org/jira/browse/HUDI-7470
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
> Attachments: CAPTURE_2024-03-03_123512.jpg, 
> CAPTURE_2024-03-03_124229.jpg, mdt.jpg
>
>
> Compaction completed not need write to mdt if it is disable.
> when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
> would also execute metadata update. It is not fitable if need disable mdt.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [HUDI-7470] Compaction completed not need write to mdt if it is disable [hudi]

2024-03-02 Thread via GitHub


xuzifu666 opened a new pull request, #10801:
URL: https://github.com/apache/hudi/pull/10801

   ### Change Logs
   
   A completed compaction does not need to write to the metadata table (MDT) if the MDT is disabled.
   
   When Spark SQL sets `hoodie.metadata.enable=false` and runs a compaction, the compaction still performs a metadata update. That is not appropriate when the MDT needs to be disabled.
   
   ### Impact
   
   low
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
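
The behaviour requested under Change Logs can be sketched as a guard around the metadata update. This is a simplified model, not the PR's actual change: `CompactionCommitSketch` and its methods are hypothetical, the two lists stand in for the metadata table and the data table timeline, and the boolean stands in for the value of `hoodie.metadata.enable`.

```java
// Hypothetical model for illustration only; not Hudi code.
final class CompactionCommitSketch {

    // Stand-in for the value of hoodie.metadata.enable.
    private final boolean metadataEnabled;
    // Instants written to the simulated metadata table.
    private final java.util.List<String> metadataTableWrites = new java.util.ArrayList<>();
    // Instants completed on the simulated data table timeline.
    private final java.util.List<String> completedCompactions = new java.util.ArrayList<>();

    CompactionCommitSketch(boolean metadataEnabled) {
        this.metadataEnabled = metadataEnabled;
    }

    /** Completes a compaction; the metadata table is updated only when it is enabled. */
    void completeCompaction(String instantTime) {
        if (metadataEnabled) {
            metadataTableWrites.add(instantTime);
        }
        completedCompactions.add(instantTime);
    }

    java.util.List<String> metadataTableWrites() {
        return java.util.Collections.unmodifiableList(metadataTableWrites);
    }

    java.util.List<String> completedCompactions() {
        return java.util.Collections.unmodifiableList(completedCompactions);
    }
}
```

With the flag set to false, the compaction still completes on the data table, but nothing is written to the metadata table.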
   





[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable

2024-03-02 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy updated HUDI-7470:
-
Attachment: CAPTURE_2024-03-03_124229.jpg

> Compaction completed not need write to mdt if it is disable
> ---
>
> Key: HUDI-7470
> URL: https://issues.apache.org/jira/browse/HUDI-7470
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: xy
>Assignee: xy
>Priority: Major
> Attachments: CAPTURE_2024-03-03_123512.jpg, 
> CAPTURE_2024-03-03_124229.jpg, mdt.jpg
>
>
> Compaction completed not need write to mdt if it is disable.
> when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
> would also execute metadata update. It is not fitable if need disable mdt.





[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable

2024-03-02 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy updated HUDI-7470:
-
Summary: Compaction completed not need write to mdt if it is disable  (was: 
Compaction completed need write to mdt if it is enable)

> Compaction completed not need write to mdt if it is disable
> ---
>
> Key: HUDI-7470
> URL: https://issues.apache.org/jira/browse/HUDI-7470
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: xy
>Assignee: xy
>Priority: Major
> Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg
>
>
> Compaction completed need write to mdt if it is enable.
> when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
> would also execute metadata update. It is not fitable if need disable mdt.





[jira] [Updated] (HUDI-7470) Compaction completed not need write to mdt if it is disable

2024-03-02 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy updated HUDI-7470:
-
Description: 
Compaction completed not need write to mdt if it is disable.

when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
would also execute metadata update. It is not fitable if need disable mdt.

  was:
Compaction completed need write to mdt if it is enable.

when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
would also execute metadata update. It is not fitable if need disable mdt.


> Compaction completed not need write to mdt if it is disable
> ---
>
> Key: HUDI-7470
> URL: https://issues.apache.org/jira/browse/HUDI-7470
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: xy
>Assignee: xy
>Priority: Major
> Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg
>
>
> Compaction completed not need write to mdt if it is disable.
> when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
> would also execute metadata update. It is not fitable if need disable mdt.





[jira] [Created] (HUDI-7470) Compaction completed need write to mdt if it is enable

2024-03-02 Thread xy (Jira)
xy created HUDI-7470:


 Summary: Compaction completed need write to mdt if it is enable
 Key: HUDI-7470
 URL: https://issues.apache.org/jira/browse/HUDI-7470
 Project: Apache Hudi
  Issue Type: Improvement
  Components: spark-sql
Reporter: xy
Assignee: xy
 Attachments: CAPTURE_2024-03-03_123512.jpg, mdt.jpg

Compaction completed need write to mdt if it is enable.

when sparksql is set hoodie.metadata.enable=false and execute compaction,it 
would also execute metadata update. It is not fitable if need disable mdt.





Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975014734

   
   ## CI report:
   
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739)
   * 170654037ccf164486aee674542f7b68e7e2714d UNKNOWN
   
   



Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]

2024-03-02 Thread via GitHub


wombatu-kun commented on PR #10728:
URL: https://github.com/apache/hudi/pull/10728#issuecomment-1975012557

   The documentation update has already been made: https://github.com/apache/hudi/pull/10739





Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975011931

   
   ## CI report:
   
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   * 066f13566c862bb22b9ba5945768a431bf7fdf0c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22739)
 
   
   



Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]

2024-03-02 Thread via GitHub


CTTY commented on issue #10785:
URL: https://github.com/apache/hudi/issues/10785#issuecomment-1975008692

   I believe this PR #10065 should fix the problem





Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975003929

   
   ## CI report:
   
   * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738)
 
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   * 066f13566c862bb22b9ba5945768a431bf7fdf0c UNKNOWN
   
   



Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1975002647

   
   ## CI report:
   
   * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738)
 
   * 4f3d78c3f0404bf8e0fb3f2aa8907f8718414d31 UNKNOWN
   
   



Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1975002634

   
   ## CI report:
   
   * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
 
   
   



Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1975001765

   
   ## CI report:
   
   * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735)
 
   * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
 
   
   



Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1974980800

   
   ## CI report:
   
   * a9e83f5d727a100b39f8da8fde8eda78a9101de8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22738)
 
   
   



Re: [PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10800:
URL: https://github.com/apache/hudi/pull/10800#issuecomment-1974979326

   
   ## CI report:
   
   * a9e83f5d727a100b39f8da8fde8eda78a9101de8 UNKNOWN
   
   



Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10214:
URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974977841

   
   ## CI report:
   
   * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN
   * 020b0107da7fb19f738d4cb639eedada78299729 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22733)
 
   
   



Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10194:
URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974977818

   
   ## CI report:
   
   * b0608830895508b879e36d8099fdebbb605a4aec Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22734)
 
   
   



Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


yihua commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974976881

   > Hmm, most PRs do not need a doc update; is it reasonable to run all the 
validations by default? And the `[MINOR] xxx` style title seems to have been 
fixed to pass the validation, right?
   
   Users just need to add `N/A` to the "Documentation Update" section in the PR 
description.  This is a reminder for the author to check whether any 
documentation update is needed; no update is necessary if the code changes are 
not user-facing.
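
The rule described in this comment can be sketched in a few lines of Java. This is a minimal illustration, not the validation that actually runs in Hudi CI: the class `DocUpdateSectionCheck`, its methods, and the rule that an untouched template placeholder counts as a failure are all assumptions made for the example. Only the `### Documentation Update` heading and the `N/A` convention come from the thread.

```java
import java.util.Optional;

/** Hypothetical checker for illustration; not the actual Hudi CI validation. */
class DocUpdateSectionCheck {

  static final String HEADING = "### Documentation Update";
  // Assumed marker: the first words of the unfilled PR template text.
  static final String PLACEHOLDER = "_Describe any necessary documentation update";

  /** Returns the trimmed text between the heading and the next "### " heading, if the heading exists. */
  static Optional<String> sectionBody(String description) {
    StringBuilder body = new StringBuilder();
    boolean found = false;
    for (String line : description.split("\\R")) {
      String trimmed = line.trim();
      if (trimmed.startsWith("### ")) {
        if (found) {
          break; // reached the heading that follows the section
        }
        found = trimmed.equals(HEADING);
      } else if (found) {
        body.append(trimmed).append('\n');
      }
    }
    return found ? Optional.of(body.toString().trim()) : Optional.empty();
  }

  /** Passes when the section exists, is non-empty, and is not the untouched template text. */
  static boolean isValid(String description) {
    return sectionBody(description)
        .filter(body -> !body.isEmpty())
        .filter(body -> !body.startsWith(PLACEHOLDER))
        .isPresent();
  }

  public static void main(String[] args) {
    String filled = "### Risk level\n\nnone\n\n### Documentation Update\n\nN/A\n\n### Contributor's checklist\n";
    String untouched = "### Documentation Update\n\n_Describe any necessary documentation update if there is any new feature_\n";
    System.out.println(isValid(filled));    // true
    System.out.println(isValid(untouched)); // false
  }
}
```

Under these assumptions, a description whose section contains only `N/A` passes, while a missing, empty, or untouched section fails, which matches the "reminder" intent described in the comment.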





[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7469:
-
Labels: pull-request-available  (was: )

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.





[PR] [HUDI-7469] Reduce redundant tests with Hudi record types [hudi]

2024-03-02 Thread via GitHub


yihua opened a new pull request, #10800:
URL: https://github.com/apache/hudi/pull/10800

   ### Change Logs
   
   Many functional tests run with permutations of both Hudi record types, 
e.g., AVRO and SPARK, even though this is unnecessary because the tests are not 
directly related to testing the record type.  This PR removes these redundant 
runs to save time in CI.
   
   ### Impact
   
   Reduces time in CI to avoid running unnecessary tests.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Updated] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7469:

Description: There are lots of tests running with the permutations of both 
Hudi record types, e.g., AVRO and SPARK, which are not necessary.

> Reduce redundant tests with Hudi record types
> -
>
> Key: HUDI-7469
> URL: https://issues.apache.org/jira/browse/HUDI-7469
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Priority: Major
>
> There are lots of tests running with the permutations of both Hudi record 
> types, e.g., AVRO and SPARK, which are not necessary.





[jira] [Created] (HUDI-7469) Reduce redundant tests with Hudi record types

2024-03-02 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7469:
---

 Summary: Reduce redundant tests with Hudi record types
 Key: HUDI-7469
 URL: https://issues.apache.org/jira/browse/HUDI-7469
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: Ethan Guo








Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


danny0405 commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974966068

   Hmm, most PRs do not need a doc update; is it reasonable to run all the 
validations by default? And the `[MINOR] xxx` style title seems to have been 
fixed to pass the validation, right?





Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974965902

   
   ## CI report:
   
   * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735)
 
   * 1982318df811e9dbbb0458b2219d251ceeae683a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22737)
 
   
   



Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974964732

   
   ## CI report:
   
   * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735)
 
   * 1982318df811e9dbbb0458b2219d251ceeae683a UNKNOWN
   
   



[jira] [Updated] (HUDI-6089) Handle default insert behaviour to ingest duplicates

2024-03-02 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated HUDI-6089:
-
Status: Open  (was: In Progress)

> Handle default insert behaviour to ingest duplicates
> 
>
> Key: HUDI-6089
> URL: https://issues.apache.org/jira/browse/HUDI-6089
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: configs, writer-core
>Reporter: Aditya Goenka
>Assignee: Vova Kolmakov
>Priority: Major
>  Labels: insert, pull-request-available
> Fix For: 1.1.0
>
>
> Related to - [https://github.com/apache/hudi/issues/8451]
>  
> Change the default value of "hoodie.merge.allow.duplicate.on.inserts" to true 
> to avoid the merge stage when the operation type is insert and 
> combine-before-insert is false.





[jira] [Closed] (HUDI-6089) Handle default insert behaviour to ingest duplicates

2024-03-02 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen closed HUDI-6089.

Resolution: Fixed

Fixed via master branch: 3a864ec63598c2919c06ed03422cf54416b31b43

> Handle default insert behaviour to ingest duplicates
> 
>
> Key: HUDI-6089
> URL: https://issues.apache.org/jira/browse/HUDI-6089
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: configs, writer-core
>Reporter: Aditya Goenka
>Assignee: Vova Kolmakov
>Priority: Major
>  Labels: insert, pull-request-available
> Fix For: 1.1.0
>
>
> Related to - [https://github.com/apache/hudi/issues/8451]
>  
> Change the default value of "hoodie.merge.allow.duplicate.on.inserts" to true 
> to avoid the merge stage when the operation type is insert and 
> combine-before-insert is false.
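
A minimal sketch of what the default change means for callers, assuming a plain `java.util.Properties` lookup rather than Hudi's own `HoodieWriteConfig`. The class `AllowDuplicateInsertsDefault` and its helper method are hypothetical; only the key name and the old and new default values come from this issue and the linked commit.

```java
import java.util.Properties;

/** Hypothetical stand-in for a config lookup; Hudi's real class is HoodieWriteConfig. */
class AllowDuplicateInsertsDefault {

  static final String KEY = "hoodie.merge.allow.duplicate.on.inserts";
  // Default after HUDI-6089; the default was "false" before the change.
  static final String DEFAULT_VALUE = "true";

  /** Returns the explicit setting if one is present, otherwise the default. */
  static boolean allowDuplicateInserts(Properties props) {
    return Boolean.parseBoolean(props.getProperty(KEY, DEFAULT_VALUE));
  }

  public static void main(String[] args) {
    Properties unset = new Properties();   // relies on the new default
    Properties pinned = new Properties();  // keeps the previous behaviour explicitly
    pinned.setProperty(KEY, "false");

    System.out.println(allowDuplicateInserts(unset));  // true
    System.out.println(allowDuplicateInserts(pinned)); // false
  }
}
```

Jobs that depended on the previous behaviour would need to set the key to `false` explicitly, which is what the updated tests in the commit appear to do.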





(hudi) branch master updated: [HUDI-6089] Handle default insert behaviour to ingest duplicates (#10728)

2024-03-02 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 3a864ec6359 [HUDI-6089] Handle default insert behaviour to ingest 
duplicates (#10728)
3a864ec6359 is described below

commit 3a864ec63598c2919c06ed03422cf54416b31b43
Author: wombatu-kun 
AuthorDate: Sun Mar 3 07:44:26 2024 +0700

[HUDI-6089] Handle default insert behaviour to ingest duplicates (#10728)

Co-authored-by: Vova Kolmakov 
---
 .../src/main/java/org/apache/hudi/config/HoodieWriteConfig.java   | 2 +-
 .../main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java  | 1 +
 .../src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java   | 1 +
 .../org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala | 1 +
 .../src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala| 4 +++-
 .../test/scala/org/apache/spark/sql/hudi/TestMergeIntoTable2.scala| 2 ++
 .../apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java  | 2 ++
 7 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
index f4cb386d271..9447069a995 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
@@ -562,7 +562,7 @@ public class HoodieWriteConfig extends HoodieConfig {
 
   public static final ConfigProperty MERGE_ALLOW_DUPLICATE_ON_INSERTS_ENABLE = ConfigProperty
   .key("hoodie.merge.allow.duplicate.on.inserts")
-  .defaultValue("false")
+  .defaultValue("true")
   .markAdvanced()
   .withDocumentation("When enabled, we allow duplicate keys even if inserts are routed to merge with an existing file (for ensuring file sizing)."
   + " This is only relevant for insert operation, since upsert, delete operations will ensure unique key constraints are maintained.");
diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
index 7c42ccf5016..243b74b9199 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataWriteUtils.java
@@ -86,6 +86,7 @@ public class HoodieMetadataWriteUtils {
 HoodieWriteConfig.Builder builder = HoodieWriteConfig.newBuilder()
 .withEngineType(writeConfig.getEngineType())
 .withTimelineLayoutVersion(TimelineLayoutVersion.CURR_VERSION)
+.withMergeAllowDuplicateOnInserts(false)
 .withConsistencyGuardConfig(ConsistencyGuardConfig.newBuilder()
 .withConsistencyCheckEnabled(writeConfig.getConsistencyGuardConfig().isConsistencyCheckEnabled())
 .withInitialConsistencyCheckIntervalMs(writeConfig.getConsistencyGuardConfig().getInitialConsistencyCheckIntervalMs())
diff --git a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
index 5c93f924ece..90fcfd4fd7a 100644
--- a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
+++ b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
@@ -89,6 +89,7 @@ public class TestHoodieWriteConfig {
 assertEquals(5, config.getMaxCommitsToKeep());
 assertEquals(2, config.getMinCommitsToKeep());
 assertTrue(config.shouldUseExternalSchemaTransformation());
+assertTrue(config.allowDuplicateInserts());
   }
 
   @Test
diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
index bdf512d3451..aa6ff39431f 100644
--- a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
+++ b/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestHoodieTableValuedFunction.scala
@@ -450,6 +450,7 @@ class TestHoodieTableValuedFunction extends HoodieSparkSqlTestBase {
|""".stripMargin
   )
 
+  spark.sql("set hoodie.merge.allow.duplicate.on.inserts = false")
   spark.sql(
 s"""
| insert into $tableName
diff --git a/hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala b/hud

Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]

2024-03-02 Thread via GitHub


danny0405 merged PR #10728:
URL: https://github.com/apache/hudi/pull/10728





Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]

2024-03-02 Thread via GitHub


danny0405 commented on issue #10785:
URL: https://github.com/apache/hudi/issues/10785#issuecomment-1974960705

   cc @umehrot2 for visibility.





Re: [PR] [MINOR] Increase spark num executors in tests [hudi]

2024-03-02 Thread via GitHub


yihua closed pull request #10798: [MINOR] Increase spark num executors in tests
URL: https://github.com/apache/hudi/pull/10798





Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974950031

   
   ## CI report:
   
   * 25fe17b146b5a519faf87b398aeac917ecdbbad0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22735)
 
   
   



Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10214:
URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974949802

   
   ## CI report:
   
   * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233)
 
   * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN
   * 020b0107da7fb19f738d4cb639eedada78299729 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22733)
 
   
   



Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10194:
URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974949774

   
   ## CI report:
   
   * c6bf629aadfc3b60d94e74ef69c85356248fdffb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21180)
 
   * b0608830895508b879e36d8099fdebbb605a4aec Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22734)
 
   
   



Re: [PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10799:
URL: https://github.com/apache/hudi/pull/10799#issuecomment-1974947706

   
   ## CI report:
   
   * 25fe17b146b5a519faf87b398aeac917ecdbbad0 UNKNOWN
   
   



Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10214:
URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974947482

   
   ## CI report:
   
   * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233)
 
   * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN
   * 020b0107da7fb19f738d4cb639eedada78299729 UNKNOWN
   
   



Re: [PR] [HUDI-7150] ExternalSpillableMap support values method [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10194:
URL: https://github.com/apache/hudi/pull/10194#issuecomment-1974947465

   
   ## CI report:
   
   * c6bf629aadfc3b60d94e74ef69c85356248fdffb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21180)
 
   * b0608830895508b879e36d8099fdebbb605a4aec UNKNOWN
   
   



Re: [PR] [MINOR] Increase spark num executors in tests [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10798:
URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974946004

   
   ## CI report:
   
   * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN
   * 615d596b29492b6a9b65c8114c2137eb4b84eb70 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22732)
 
   
   



Re: [PR] [HUDI-6953] Adding test for composite keys with bulk insert row writer [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10214:
URL: https://github.com/apache/hudi/pull/10214#issuecomment-1974945872

   
   ## CI report:
   
   * 0ee77f22a2f213a1c581e443a52eb6965832abc4 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21233)
 
   * 039b6d5e9aef7b31e8e44aeb367e5352d66bbe9c UNKNOWN
   
   



[PR] [MINOR] Add PR description validation on documentation updates [hudi]

2024-03-02 Thread via GitHub


yihua opened a new pull request, #10799:
URL: https://github.com/apache/hudi/pull/10799

   ### Change Logs
   
   As above, to make PR description validation strict.
   
   ### Impact
   
   As above.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] [MINOR] Increase spark num executors in tests [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10798:
URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974936494

   
   ## CI report:
   
   * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN
   * 615d596b29492b6a9b65c8114c2137eb4b84eb70 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6089] Handle default insert behaviour to ingest duplicates [hudi]

2024-03-02 Thread via GitHub


bvaradar commented on PR #10728:
URL: https://github.com/apache/hudi/pull/10728#issuecomment-1974935143

   cc @nsivabalan: this needs an update in the documentation.





Re: [PR] [MINOR] Increase spark num executors in tests [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10798:
URL: https://github.com/apache/hudi/pull/10798#issuecomment-1974934148

   
   ## CI report:
   
   * f17d97c01fc22a1aee1012a1fd494e88a242f57f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[PR] [MINOR] Increase spark num executors in tests [hudi]

2024-03-02 Thread via GitHub


yihua opened a new pull request, #10798:
URL: https://github.com/apache/hudi/pull/10798

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was 
copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance 
impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





(hudi) branch master updated: [HUDI-7465] Split tests in CI further to reduce total CI elapsed time (#10795)

2024-03-02 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new eeccdf9bb0f [HUDI-7465] Split tests in CI further to reduce total CI 
elapsed time (#10795)
eeccdf9bb0f is described below

commit eeccdf9bb0f2885c37e0b480c330400fd2f80a1b
Author: Y Ethan Guo 
AuthorDate: Sat Mar 2 13:59:58 2024 -0800

[HUDI-7465] Split tests in CI further to reduce total CI elapsed time 
(#10795)
---
 .github/workflows/bot.yml| 139 +++
 azure-pipelines-20230430.yml |  58 ++
 2 files changed, 176 insertions(+), 21 deletions(-)

diff --git a/.github/workflows/bot.yml b/.github/workflows/bot.yml
index 0bfd9541bcc..3007c752534 100644
--- a/.github/workflows/bot.yml
+++ b/.github/workflows/bot.yml
@@ -53,7 +53,7 @@ jobs:
   - name: RAT check
 run: ./scripts/release/validate_source_rat.sh
 
-  test-spark:
+  test-spark-java-tests:
 runs-on: ubuntu-latest
 strategy:
   matrix:
@@ -107,22 +107,87 @@ jobs:
   SPARK_PROFILE: ${{ matrix.sparkProfile }}
 run:
   mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -pl 
hudi-examples/hudi-examples-spark $MVN_ARGS
-  - name: UT - Common & Spark
+  - name: Java UT - Common & Spark
 env:
   SCALA_PROFILE: ${{ matrix.scalaProfile }}
   SPARK_PROFILE: ${{ matrix.sparkProfile }}
   SPARK_MODULES: ${{ matrix.sparkModules }}
 if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 
as it's covered by Azure CI
 run:
-  mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" -pl 
"$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
-  - name: FT - Spark
+  mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" 
-DwildcardSuites=skipScalaTests -DfailIfNoTests=false -pl 
"$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+  - name: Java FT - Spark
 env:
   SCALA_PROFILE: ${{ matrix.scalaProfile }}
   SPARK_PROFILE: ${{ matrix.sparkProfile }}
   SPARK_MODULES: ${{ matrix.sparkModules }}
 if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 
as it's covered by Azure CI
 run:
-  mvn test -Pfunctional-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" 
-pl "$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+  mvn test -Pfunctional-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" 
-DwildcardSuites=skipScalaTests -DfailIfNoTests=false -pl 
"$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+
+  test-spark-scala-tests:
+runs-on: ubuntu-latest
+strategy:
+  matrix:
+include:
+  - scalaProfile: "scala-2.11"
+sparkProfile: "spark2.4"
+sparkModules: "hudi-spark-datasource/hudi-spark2"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.0"
+sparkModules: "hudi-spark-datasource/hudi-spark3.0.x"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.1"
+sparkModules: "hudi-spark-datasource/hudi-spark3.1.x"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.2"
+sparkModules: "hudi-spark-datasource/hudi-spark3.2.x"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.3"
+sparkModules: "hudi-spark-datasource/hudi-spark3.3.x"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.4"
+sparkModules: "hudi-spark-datasource/hudi-spark3.4.x"
+
+  - scalaProfile: "scala-2.12"
+sparkProfile: "spark3.5"
+sparkModules: "hudi-spark-datasource/hudi-spark3.5.x"
+
+steps:
+  - uses: actions/checkout@v3
+  - name: Set up JDK 8
+uses: actions/setup-java@v3
+with:
+  java-version: '8'
+  distribution: 'adopt'
+  architecture: x64
+  cache: maven
+  - name: Build Project
+env:
+  SCALA_PROFILE: ${{ matrix.scalaProfile }}
+  SPARK_PROFILE: ${{ matrix.sparkProfile }}
+run:
+  mvn clean install -T 2 -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" 
-DskipTests=true $MVN_ARGS -am -pl 
"hudi-examples/hudi-examples-spark,$SPARK_COMMON_MODULES,$SPARK_MODULES"
+  - name: Scala UT - Common & Spark
+env:
+  SCALA_PROFILE: ${{ matrix.scalaProfile }}
+  SPARK_PROFILE: ${{ matrix.sparkProfile }}
+  SPARK_MODULES: ${{ matrix.sparkModules }}
+if: ${{ !endsWith(env.SPARK_PROFILE, '3.2') }} # skip test spark 3.2 
as it's covered by Azure CI
+run:
+  mvn test -Punit-tests -D"$SCALA_PROFILE" -D"$SPARK_PROFILE" 
-Dtest=skipJavaTests -DfailIfNoTests=false -pl 
"$SPARK_COMMON_MODULES,$SPARK_MODULES" $MVN_ARGS
+  - name: Scala FT - Spark
+env:
+  SCALA_PROFILE: ${{ matrix.sca
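
The diff above (truncated in this archive) splits the Spark tests into a Java-only job and a Scala-only job by passing a filter that matches nothing on the other side: `-DwildcardSuites=skipScalaTests` for the Java steps and `-Dtest=skipJavaTests` for the Scala steps, each with `-DfailIfNoTests=false`. A minimal sketch of how those argument lists differ, assuming a hypothetical helper `mvnTestArgs` (not part of Hudi or its CI scripts) and omitting `$MVN_ARGS`:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real logic lives in .github/workflows/bot.yml.
class TestSplitSketch {
  enum Suite { JAVA, SCALA }

  // Builds the `mvn test` arguments for one side of the Java/Scala split.
  static List<String> mvnTestArgs(Suite suite, String testProfile, String scalaProfile,
                                  String sparkProfile, String modules) {
    List<String> args = new ArrayList<>();
    args.add("mvn");
    args.add("test");
    args.add("-P" + testProfile);
    args.add("-D" + scalaProfile);
    args.add("-D" + sparkProfile);
    // The Java job filters out Scala suites; the Scala job filters out Java tests.
    // Both filter values are taken verbatim from the diff.
    args.add(suite == Suite.JAVA
        ? "-DwildcardSuites=skipScalaTests"
        : "-Dtest=skipJavaTests");
    // Keeps the build from failing when the filter leaves a module with no tests.
    args.add("-DfailIfNoTests=false");
    args.add("-pl");
    args.add(modules);
    return args;
  }

  public static void main(String[] unused) {
    System.out.println(String.join(" ", mvnTestArgs(
        Suite.SCALA, "unit-tests", "scala-2.12", "spark3.5",
        "hudi-spark-datasource/hudi-spark3.5.x")));
  }
}
```

The two jobs otherwise share the same profiles and module list, so the only difference between a Java run and a Scala run is the single filter flag.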

Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


yihua merged PR #10795:
URL: https://github.com/apache/hudi/pull/10795





Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974919265

   
   ## CI report:
   
   * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974909152

   
   ## CI report:
   
   * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





(hudi) branch master updated: [HUDI-7341] Support unmerged record read (#10632)

2024-03-02 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new f792643657e [HUDI-7341] Support unmerged record read (#10632)
f792643657e is described below

commit f792643657ebba69edd2b2aeeb4a37d15c39beba
Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com>
AuthorDate: Sat Mar 2 12:58:15 2024 -0800

[HUDI-7341] Support unmerged record read (#10632)
---
 .../hudi/common/engine/HoodieReaderContext.java|   7 +
 .../table/read/HoodieFileGroupRecordBuffer.java|   8 +-
 .../read/HoodieKeyBasedFileGroupRecordBuffer.java  |   2 +-
 .../HoodiePositionBasedFileGroupRecordBuffer.java  |   2 +-
 .../read/HoodieUnmergedFileGroupRecordBuffer.java  | 146 +
 .../testutils/reader/HoodieTestReaderContext.java  |   9 ++
 6 files changed, 170 insertions(+), 4 deletions(-)

diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
 
b/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
index 1d81007c375..86a875c9df3 100644
--- 
a/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
+++ 
b/hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
@@ -219,4 +219,11 @@ public abstract class HoodieReaderContext {
   public long extractRecordPosition(T record, Schema schema, String fieldName, 
long providedPositionIfNeeded) {
 return providedPositionIfNeeded;
   }
+
+  /**
+   * Constructs engine specific delete record.
+   */
+  public T constructRawDeleteRecord(Map metadata) {
+return null;
+  }
 }
diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java
 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java
index ccc001e79c9..d9ba8bcd90e 100644
--- 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java
+++ 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupRecordBuffer.java
@@ -34,8 +34,12 @@ import java.util.Map;
 
 public interface HoodieFileGroupRecordBuffer {
   enum BufferType {
-KEY_BASED,
-POSITION_BASED
+// Merging based on record key.
+KEY_BASED_MERGE,
+// Merging based on record position.
+POSITION_BASED_MERGE,
+// No Merging at all.
+UNMERGED
   }
 
   /**
diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java
 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java
index b4e32be8c65..0430a42e863 100644
--- 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java
+++ 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieKeyBasedFileGroupRecordBuffer.java
@@ -65,7 +65,7 @@ public class HoodieKeyBasedFileGroupRecordBuffer extends 
HoodieBaseFileGroupR
 
   @Override
   public BufferType getBufferType() {
-return BufferType.KEY_BASED;
+return BufferType.KEY_BASED_MERGE;
   }
 
   @Override
diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java
 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java
index 4412713928f..50e969343e1 100644
--- 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java
+++ 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodiePositionBasedFileGroupRecordBuffer.java
@@ -72,7 +72,7 @@ public class HoodiePositionBasedFileGroupRecordBuffer 
extends HoodieBaseFileG
 
   @Override
   public BufferType getBufferType() {
-return BufferType.POSITION_BASED;
+return BufferType.POSITION_BASED_MERGE;
   }
 
   @Override
diff --git 
a/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java
 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java
new file mode 100644
index 000..76aa28308c4
--- /dev/null
+++ 
b/hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieUnmergedFileGroupRecordBuffer.java
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the Licens
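
The commit above (truncated in this archive) renames the two existing `BufferType` values and adds `UNMERGED`. As a rough sketch of how a caller might branch on the new enum: the three constant names come from the diff, while `BufferTypeSketch` and `mergesRecords` are hypothetical names introduced here for illustration and are not Hudi APIs.

```java
// Hypothetical illustration; only the enum constant names come from the diff.
class BufferTypeSketch {
  enum BufferType {
    // Merging based on record key.
    KEY_BASED_MERGE,
    // Merging based on record position.
    POSITION_BASED_MERGE,
    // No merging at all.
    UNMERGED
  }

  // Returns true when the buffer type implies that records are merged.
  static boolean mergesRecords(BufferType type) {
    switch (type) {
      case KEY_BASED_MERGE:
      case POSITION_BASED_MERGE:
        return true;
      case UNMERGED:
        return false;
      default:
        throw new IllegalArgumentException("Unknown buffer type: " + type);
    }
  }

  public static void main(String[] unused) {
    for (BufferType type : BufferType.values()) {
      System.out.println(type + " merges records: " + mergesRecords(type));
    }
  }
}
```

Spelling out `_MERGE` in the two older names makes the contrast with `UNMERGED` visible at every call site, which is presumably the motivation for the rename.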

Re: [PR] [HUDI-7341] Support unmerged record read [hudi]

2024-03-02 Thread via GitHub


yihua merged PR #10632:
URL: https://github.com/apache/hudi/pull/10632





Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974907215

   
   ## CI report:
   
   * fe31edd435cf0c990fa3174db7d43a9412ad012c UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


yihua commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974906299

   Looks like flakiness in Azure CI.  Will retry.





Re: [PR] [HUDI-7465] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974893487

   
   ## CI report:
   
   * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Assigned] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing

2024-03-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-60:
--

Assignee: xy

> [UMBRELLA] Support Apache Beam for incremental tailing
> --
>
> Key: HUDI-60
> URL: https://issues.apache.org/jira/browse/HUDI-60
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: spark, Utilities
>Reporter: Vinoth Chandar
>Assignee: xy
>Priority: Major
>  Labels: gsoc, gsoc2021, hudi-umbrellas, mentor
>
> (More details to be added)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-60) [UMBRELLA] Support Apache Beam for incremental tailing

2024-03-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-60:
---
Fix Version/s: 0.15.0

> [UMBRELLA] Support Apache Beam for incremental tailing
> --
>
> Key: HUDI-60
> URL: https://issues.apache.org/jira/browse/HUDI-60
> Project: Apache Hudi
>  Issue Type: Epic
>  Components: spark, Utilities
>Reporter: Vinoth Chandar
>Assignee: xy
>Priority: Major
>  Labels: gsoc, gsoc2021, hudi-umbrellas, mentor
> Fix For: 0.15.0
>
>
> (More details to be added)





[jira] [Closed] (HUDI-7398) clarify clustering strategy for java client

2024-03-02 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-7398.

  Assignee: Raymond Xu
Resolution: Fixed

> clarify clustering strategy for java client
> ---
>
> Key: HUDI-7398
> URL: https://issues.apache.org/jira/browse/HUDI-7398
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: docs
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.14.2
>
>
> The Java client only does a linear sort; see
> org.apache.hudi.client.clustering.run.strategy.JavaExecutionStrategy#getPartitioner.
> In fact it could be extended to perform space-filling curve sorting; it seems 
> this is just not implemented yet. If you're interested, feel free to attempt 
> it with a PR.





Re: [PR] [HUDI-7465][DNM] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974882533

   
   ## CI report:
   
   * d3d782594eea1b77c47e37862b0673c7a1768710 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22727)
 
   * fe31edd435cf0c990fa3174db7d43a9412ad012c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22729)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7465][DNM] Split tests in CI further to reduce total CI elapsed time [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10795:
URL: https://github.com/apache/hudi/pull/10795#issuecomment-1974880555

   
   ## CI report:
   
   * d3d782594eea1b77c47e37862b0673c7a1768710 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22727)
 
   * fe31edd435cf0c990fa3174db7d43a9412ad012c UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10797:
URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974865229

   
   ## CI report:
   
   * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN
   * 435db97306f340b9ab078f154394da691e0354e1 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22728)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10797:
URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974855084

   
   ## CI report:
   
   * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN
   * 435db97306f340b9ab078f154394da691e0354e1 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22728)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10797:
URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974852961

   
   ## CI report:
   
   * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN
   * 435db97306f340b9ab078f154394da691e0354e1 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


hudi-bot commented on PR #10797:
URL: https://github.com/apache/hudi/pull/10797#issuecomment-1974850925

   
   ## CI report:
   
   * 05774d87786b8f5101b6953ea769831244544c44 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [I] [SUPPORT] Cleaner fails with com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException [hudi]

2024-03-02 Thread via GitHub


cbomgit commented on issue #10785:
URL: https://github.com/apache/hudi/issues/10785#issuecomment-1974850671

   Running the cleaner as a separate job fails as well. Using the following 
args:
   
   ```
   spark-submit --master yarn --deploy-mode cluster \
     --class org.apache.hudi.utilities.HoodieCleaner \
     /usr/lib/hudi/hudi-utilities-bundle.jar \
     --target-base-path s3://table-path \
     --props s3://table-path/.hoodie/hoodie.properties \
     --hoodie-conf hoodie.cleaner.policy=KEEP_LATEST_FILE_VERSIONS \
     --hoodie-conf hoodie.cleaner.fileversions.retained=2 \
     --hoodie-conf hoodie.cleaner.policy.failed.writes=LAZY \
     --hoodie-conf hoodie.write.lock.provider=org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider \
     --hoodie-conf hoodie.write.lock.dynamodb.table=HoodieLockTable \
     --hoodie-conf hoodie.metadata.enable=false \
     --hoodie-conf hoodie.write.lock.dynamodb.region=us-east-1
   ```
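
   For anyone reproducing this, the repeated `--hoodie-conf key=value` pairs in the command above can be generated from a map so the overrides are easier to review and change. This is only a sketch: `CleanerArgsSketch` and `hoodieConfArgs` are hypothetical names, not Hudi APIs, and the keys and values are copied from the command as posted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; it only formats arguments and does not call Hudi.
class CleanerArgsSketch {
  // Expands each override into a "--hoodie-conf" flag followed by "key=value".
  static List<String> hoodieConfArgs(Map<String, String> overrides) {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> entry : overrides.entrySet()) {
      args.add("--hoodie-conf");
      args.add(entry.getKey() + "=" + entry.getValue());
    }
    return args;
  }

  public static void main(String[] unused) {
    // LinkedHashMap keeps the overrides in the order they were written.
    Map<String, String> overrides = new LinkedHashMap<>();
    overrides.put("hoodie.cleaner.policy", "KEEP_LATEST_FILE_VERSIONS");
    overrides.put("hoodie.cleaner.fileversions.retained", "2");
    overrides.put("hoodie.cleaner.policy.failed.writes", "LAZY");
    overrides.put("hoodie.write.lock.provider",
        "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider");
    overrides.put("hoodie.write.lock.dynamodb.table", "HoodieLockTable");
    overrides.put("hoodie.metadata.enable", "false");
    overrides.put("hoodie.write.lock.dynamodb.region", "us-east-1");
    System.out.println(String.join(" ", hoodieConfArgs(overrides)));
  }
}
```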





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


stayrascal commented on code in PR #10797:
URL: https://github.com/apache/hudi/pull/10797#discussion_r150904


##
hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java:
##
@@ -130,69 +131,69 @@ public void close() {
   /**
* Create RocksDB based file System view for a table.
*
-   * @param viewConf View Storage Configuration
+   * @param viewConf   View Storage Configuration
* @param metaClient HoodieTableMetaClient
-   * @return
+   * @return {@link RocksDbBasedFileSystemView}
*/
   private static RocksDbBasedFileSystemView 
createRocksDBBasedFileSystemView(FileSystemViewStorageConfig viewConf,
-  HoodieTableMetaClient metaClient) {
+ 
HoodieTableMetaClient metaClient) {
 HoodieTimeline timeline = 
metaClient.getActiveTimeline().filterCompletedAndCompactionInstants();
 return new RocksDbBasedFileSystemView(metaClient, timeline, viewConf);
   }
 
   /**
* Create a spillable Map based file System view for a table.
*
-   * @param viewConf View Storage Configuration
+   * @param viewConf   View Storage Configuration
* @param metaClient HoodieTableMetaClient
-   * @return
+   * @return {@link SpillableMapBasedFileSystemView}
*/
-  private static SpillableMapBasedFileSystemView 
createSpillableMapBasedFileSystemView(FileSystemViewStorageConfig viewConf,

Review Comment:
   `FileSystemViewStorageConfig viewConf` is never used.






Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


stayrascal commented on code in PR #10797:
URL: https://github.com/apache/hudi/pull/10797#discussion_r150420


##
hudi-common/src/main/java/org/apache/hudi/common/table/view/FileSystemViewManager.java:
##
@@ -66,17 +65,19 @@ public class FileSystemViewManager {
   private final SerializableConfiguration conf;
   // The View Storage config used to store file-system views
   private final FileSystemViewStorageConfig viewStorageConfig;
-  // Map from Base-Path to View
-  private final ConcurrentHashMap 
globalViewMap;
   // Factory Map to create file-system views
   private final Function2 viewCreator;
+  // Map from Base-Path to View
+  private final ConcurrentHashMap 
globalViewMap;

Review Comment:
   It's easier to compare and read when the fields keep the same order as in the constructor.






[PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


stayrascal opened a new pull request, #10797:
URL: https://github.com/apache/hudi/pull/10797

   ### Change Logs
   
   Clean unused methods and parameters of `FileSystemViewManager`
   
   ### Impact
   
   No
   
   ### Risk level (write none, low medium or high below)
   
   Low
   
   ### Documentation Update
   
   No
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


stayrascal closed pull request #10796: [MINOR] Clean code of FileSystemViewManager
URL: https://github.com/apache/hudi/pull/10796





[PR] [MINOR] Clean code of FileSystemViewManager [hudi]

2024-03-02 Thread via GitHub


stayrascal opened a new pull request, #10796:
URL: https://github.com/apache/hudi/pull/10796

   ### Change Logs
   
   Clean unused methods and parameters of `FileSystemViewManager`
   
   ### Impact
   
   No
   
   ### Risk level (write none, low medium or high below)
   
   Low
   
   ### Documentation Update
   
   No
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Assigned] (HUDI-7467) TestHoodieDeltaStreamer. testAutoGenerateRecordKeys

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu reassigned HUDI-7467:
-

Assignee: (was: Lin Liu)

> TestHoodieDeltaStreamer. testAutoGenerateRecordKeys
> ---
>
> Key: HUDI-7467
> URL: https://issues.apache.org/jira/browse/HUDI-7467
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 2,459.289 s <<< FAILURE! - in 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
> [ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
> org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
>   at 
> org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
>   at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
>   at 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-7467) TestHoodieDeltaStreamer. testAutoGenerateRecordKeys

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7467:
--
Summary: TestHoodieDeltaStreamer. testAutoGenerateRecordKeys  (was: 
TestHoodieDeltaStreamer)

> TestHoodieDeltaStreamer. testAutoGenerateRecordKeys
> ---
>
> Key: HUDI-7467
> URL: https://issues.apache.org/jira/browse/HUDI-7467
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 2,459.289 s <<< FAILURE! - in 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
> [ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
> org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
>   at 
> org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
>   at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
>   at 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
>  {code}





[jira] [Updated] (HUDI-7467) TestHoodieDeltaStreamer

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7467:
--
Description: 
https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
{code:java}
[ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
2,459.289 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
[ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
at 
org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
at 
org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
 {code}

  was:
{code:java}
[ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
2,459.289 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
[ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
at 
org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
at 
org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
 {code}


> TestHoodieDeltaStreamer
> ---
>
> Key: HUDI-7467
> URL: https://issues.apache.org/jira/browse/HUDI-7467
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 2,459.289 s <<< FAILURE! - in 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
> [ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
> org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
>   at 
> org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
>   at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
>   at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
>   at 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
>  {code}





[jira] [Assigned] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu reassigned HUDI-7468:
-

Assignee: Jonathan Vexler  (was: Lin Liu)

> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Jonathan Vexler
>Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
> 514.307 s <<< FAILURE! - in 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
> [ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1]  Time 
> elapsed: 13.21 s  <<< ERROR!
> org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema 
> is not compatible with the table's one
>   at 
> org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
>   at 
> org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
>   at 
> org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
>   at org.apache.hudi.common.util.Option.map(Option.java:112)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
>   at 
> org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
>   at 
> org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
>   at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
>   at 
> org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
>   at 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
>  {code}





[jira] [Updated] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7468:
--
Description: 
https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
{code:java}

[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
514.307 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1]  Time 
elapsed: 13.21 s  <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema 
is not compatible with the table's one
at 
org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
at 
org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
at 
org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
at 
org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
at 
org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
at org.apache.hudi.common.util.Option.map(Option.java:112)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
at 
org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
at 
org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
at 
org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
 {code}

  was:
{code:java}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
514.307 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1]  Time 
elapsed: 13.21 s  <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema 
is not compatible with the table's one
at 
org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
at 
org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
at 
org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
at 
org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
at 
org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
at org.apache.hudi.common.util.Option.map(Option.java:112)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
at 
org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
at 
org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
at 
org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
 {code}


> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>
> https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=22725&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=9df7def4-004b-5fb7-f042-da5d723783ad&s=859b8d9a-8fd6-5a5c-6f5e-f84f1990894e
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Sk

[jira] [Created] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick

2024-03-02 Thread Lin Liu (Jira)
Lin Liu created HUDI-7468:
-

 Summary: TestHoodieDeltaStreamerSchemaEvolutionQuick
 Key: HUDI-7468
 URL: https://issues.apache.org/jira/browse/HUDI-7468
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Lin Liu


{code:java}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
514.307 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
[ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1]  Time 
elapsed: 13.21 s  <<< ERROR!
org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema 
is not compatible with the table's one
at 
org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
at 
org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
at 
org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
at 
org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
at 
org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
at org.apache.hudi.common.util.Option.map(Option.java:112)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
at 
org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
at 
org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
at 
org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
at 
org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
at 
org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
 {code}





[jira] [Assigned] (HUDI-7468) TestHoodieDeltaStreamerSchemaEvolutionQuick

2024-03-02 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu reassigned HUDI-7468:
-

Assignee: Lin Liu

> TestHoodieDeltaStreamerSchemaEvolutionQuick
> ---
>
> Key: HUDI-7468
> URL: https://issues.apache.org/jira/browse/HUDI-7468
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>
> {code:java}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
> 514.307 s <<< FAILURE! - in 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick
> [ERROR] testReorderingColumn{String, Boolean, Boolean, Boolean}[1]  Time 
> elapsed: 13.21 s  <<< ERROR!
> org.apache.hudi.exception.SchemaCompatibilityException: Incoming batch schema 
> is not compatible with the table's one
>   at 
> org.apache.hudi.HoodieSchemaUtils$.deduceWriterSchema(HoodieSchemaUtils.scala:179)
>   at 
> org.apache.hudi.HoodieSparkSqlWriter$.deduceWriterSchema(HoodieSparkSqlWriter.scala:147)
>   at 
> org.apache.hudi.HoodieSparkSqlWriter.deduceWriterSchema(HoodieSparkSqlWriter.scala)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.getDeducedSchemaProvider(StreamSync.java:671)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.lambda$fetchNextBatchFromSource$5(StreamSync.java:612)
>   at org.apache.hudi.common.util.Option.map(Option.java:112)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.fetchNextBatchFromSource(StreamSync.java:612)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.fetchFromSourceAndPrepareRecords(StreamSync.java:524)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.readFromSource(StreamSync.java:497)
>   at 
> org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:400)
>   at 
> org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:855)
>   at 
> org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72)
>   at org.apache.hudi.common.util.Option.ifPresent(Option.java:101)
>   at 
> org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:211)
>   at 
> org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamerSchemaEvolutionQuick.testReorderingColumn(TestHoodieDeltaStreamerSchemaEvolutionQuick.java:327)
>  {code}





[jira] [Created] (HUDI-7467) TestHoodieDeltaStreamer

2024-03-02 Thread Lin Liu (Jira)
Lin Liu created HUDI-7467:
-

 Summary: TestHoodieDeltaStreamer
 Key: HUDI-7467
 URL: https://issues.apache.org/jira/browse/HUDI-7467
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Lin Liu
Assignee: Lin Liu


{code:java}
[ERROR] Tests run: 131, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
2,459.289 s <<< FAILURE! - in 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
[ERROR] testAutoGenerateRecordKeys  Time elapsed: 14.248 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <300> but was: <500>
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
at 
org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
at 
org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:611)
at 
org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerTestBase.assertRecordCount(HoodieDeltaStreamerTestBase.java:486)
at 
org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer.testAutoGenerateRecordKeys(TestHoodieDeltaStreamer.java:2823)
 {code}




