the-other-tim-brown commented on code in PR #13419:
URL: https://github.com/apache/hudi/pull/13419#discussion_r2140561135
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestRecordLevelIndex.scala:
##########
@@ -229,19 +229,14 @@ class TestRecordLevelIndex extends
RecordLevelIndexTestBase {
}
@ParameterizedTest
- @CsvSource(value = Array(
- "COPY_ON_WRITE,6", "COPY_ON_WRITE,8", "MERGE_ON_READ,6", "MERGE_ON_READ,8"
- ))
- def testRLIWithDelete(tableType: String, tableVersion: Int): Unit = {
- val hudiOpts = commonOpts ++ Map(
- DataSourceWriteOptions.TABLE_TYPE.key -> tableType,
- HoodieTableConfig.VERSION.key() -> tableVersion.toString,
- HoodieWriteConfig.WRITE_TABLE_VERSION.key() -> tableVersion.toString)
+ @EnumSource(classOf[HoodieTableType])
Review Comment:
The full suite runs with v6 with `TestRecordLevelIndexTableVersionSix` so we
don't need to override at this level as well. This lead to 4 repeated test
cases.
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestPartitionStatsPruning.scala:
##########
@@ -32,9 +32,11 @@ import org.apache.hudi.storage.StoragePath
import org.apache.spark.sql.SaveMode
import org.junit.jupiter.api.Assertions.assertTrue
+import org.junit.jupiter.api.Tag
import org.junit.jupiter.params.ParameterizedTest
import org.junit.jupiter.params.provider.MethodSource
+@Tag("functional-b")
Review Comment:
Not required since this extends a tagged class but this makes it more
explicit for what is meant to be in each tagged set of tests.
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestRecordLevelIndexTableVersionSix.scala:
##########
@@ -21,6 +21,9 @@ package org.apache.hudi.functional
import org.apache.hudi.common.table.HoodieTableConfig
import org.apache.hudi.config.HoodieWriteConfig
+import org.junit.jupiter.api.Tag
+
+@Tag("functional")
Review Comment:
Not required since this extends a tagged class but this makes it more
explicit for what is meant to be in each tagged set of tests.
##########
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/utils/HoodieWriterClientTestHarness.java:
##########
@@ -255,6 +255,7 @@ public HoodieWriteConfig.Builder getConfigBuilder(String
schemaStr, HoodieIndex.
.withStorageConfig(HoodieStorageConfig.newBuilder().hfileMaxFileSize(1024 *
1024).parquetMaxFileSize(1024 * 1024).orcMaxFileSize(1024 * 1024).build())
.forTable(RAW_TRIPS_TEST_NAME)
.withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(indexType).build())
+ .withMarkersTimelineServerBasedBatchIntervalMs(10)
Review Comment:
In many places in the PR, the batch interval is decreased from the default
of 50ms to 10ms. This shaves the overhead of each write helping trim the
runtime of our tests that execute a series of writes.
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndexWithSQL.scala:
##########
@@ -41,15 +41,16 @@ import org.apache.spark.sql.catalyst.expressions.{And,
AttributeReference, Bitwi
import org.apache.spark.sql.catalyst.expressions.Literal.TrueLiteral
import org.apache.spark.sql.hudi.DataSkippingUtils
import org.apache.spark.sql.types.StringType
+import org.junit.jupiter.api.{Tag, Test}
import org.junit.jupiter.api.Assertions.{assertEquals, assertFalse, assertTrue}
-import org.junit.jupiter.api.Test
import org.junit.jupiter.params.ParameterizedTest
import org.junit.jupiter.params.provider.MethodSource
import java.io.File
import scala.collection.JavaConverters._
+@Tag("functional-b")
Review Comment:
Not required since this extends a tagged class but this makes it more
explicit for what is meant to be in each tagged set of tests.
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/RecordLevelIndexTestBase.scala:
##########
@@ -35,7 +35,8 @@ import scala.util.Using
class RecordLevelIndexTestBase extends HoodieStatsIndexTestBase {
val metadataOpts: Map[String, String] = Map(
HoodieMetadataConfig.ENABLE.key -> "true",
- HoodieMetadataConfig.RECORD_INDEX_ENABLE_PROP.key -> "true"
+ HoodieMetadataConfig.RECORD_INDEX_ENABLE_PROP.key -> "true",
+ HoodieMetadataConfig.ENABLE_METADATA_INDEX_COLUMN_STATS.key -> "false"
Review Comment:
Col stats adds overhead to our tests so disabling for the common case. We
have some cases that explicitly test the combination of RLI and other indexes
like: `testRLIWithOtherMetadataPartitions`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]