This is an automated email from the ASF dual-hosted git repository.

xiangfu0 pushed a commit to branch codex/shared-rich-integration-suite
in repository https://gitbox.apache.org/repos/asf/pinot.git
commit 0d59212405da5ace39a17bf91c9ebbdd9c5d9c1d Author: Xiang Fu <[email protected]> AuthorDate: Fri Apr 24 12:24:32 2026 -0700 Add shared rich integration test suite --- .../INTEGRATION_TEST_SETUP_GROUPS.md | 459 +++++++++++++++++++++ pinot-integration-tests/pom.xml | 23 ++ .../IngestionConfigHybridIntegrationTest.java | 2 +- .../tests/SegmentUploadIntegrationTest.java | 2 +- .../tests/SharedRichClusterIntegrationTest.java | 386 +++++++++++++++++ .../tests/tpch/TPCHQueryIntegrationTest.java | 4 +- .../shared-rich-cluster-integration-test-suite.xml | 30 ++ 7 files changed, 902 insertions(+), 4 deletions(-) diff --git a/pinot-integration-tests/INTEGRATION_TEST_SETUP_GROUPS.md b/pinot-integration-tests/INTEGRATION_TEST_SETUP_GROUPS.md new file mode 100644 index 00000000000..db709d6c7bb --- /dev/null +++ b/pinot-integration-tests/INTEGRATION_TEST_SETUP_GROUPS.md @@ -0,0 +1,459 @@ +<!-- + + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +--> +# Pinot Integration Test Setup Groups + +This inventory groups the `pinot-integration-tests` TestNG tests by the infrastructure +they start today. 
The goal is to make it clear which classes can be moved behind a +single suite-level infrastructure setup and which classes need a dedicated setup +because they override process configuration, start alternate components, use Docker, +or intentionally restart services. + +Current CI wiring is alphabetical for most tests (`integration-tests-set-1` and +`integration-tests-set-2`) plus one shared TestNG suite for +`org.apache.pinot.integration.tests.custom`. Alphabetical execution does not align +with infrastructure compatibility, so most classes still start and tear down their +own clusters. + +## Already Suite-Shared + +`CustomDataQueryClusterIntegrationTest` is the existing model for one infrastructure +setup per suite: + +- `@BeforeSuite`: starts ZK, Kafka, controller, broker, server, and minion once. +- `@BeforeClass`: creates the class-specific table/data. +- `@AfterClass`: drops the class-specific table/data. +- `@AfterSuite`: tears down the shared infrastructure once. + +Classes currently covered by `custom-cluster-integration-test-suite.xml`: + +- `AggregateMetricsTest` +- `ArithmeticFunctionsIntegrationTest` +- `ArrayTest` +- `BitwiseFunctionsIntegrationTest` +- `BytesTypeTest` +- `CLPEncodingRealtimeTest` +- `CpcSketchTest` +- `DistinctQueriesTest` +- `FloatingPointDataTypeTest` +- `FunnelCountTest` +- `GeoSpatialTest` +- `GroupByOptionsTest` +- `GroupByTrimmingTest` +- `IvfFlatVectorTest` +- `IvfPqVectorRealtimeTest` +- `IvfPqVectorTest` +- `JsonPathTest` +- `MapFieldTypeMixedValueIngestingIntegrationTest` +- `MapFieldTypeRealtimeTest` +- `MapFieldTypeTest` +- `MapTypeTest` +- `MultiColumnRealtimeColMajorTextIndicesTest` +- `MultiColumnRealtimeRowMajorTextIndicesTest` +- `MultiColumnTextIndicesTest` +- `MultiTopicRealtimeClusterIntegrationTest` +- `OfflineUpsertTableTest` +- `ProtoBufCodeGenMessageDecoderTest` +- `RefreshSegmentMinionTest` +- `RowExpressionTest` +- `SSBQueryTest` +- `StarTreeTest` +- `SumPrecisionTest` +- `TableSamplerIntegrationTest` +- 
`TextIndicesRealtimeTest` +- `TextIndicesTest` +- `ThetaSketchTest` +- `TimestampTest` +- `TupleSketchTest` +- `ULLTest` +- `UnnestIntegrationTest` +- `VectorTest` +- `WindowFunnelTest` + +`BigNumberOfSegmentsTest` is in the same package but disabled. + +## Setup Signature Matrix + +Legend: + +- `C/B/S/M` means Pinot controllers, Pinot brokers, Pinot servers, and minions + started as the baseline test setup. Most rows also start one ZK; multi-cluster + rows start one ZK per cluster. +- `Kafka` means an embedded Kafka cluster is part of setup. Exact Kafka broker + count comes from `getNumKafkaBrokers()` and can still matter for a final suite. +- `Overrides` lists process-level setup differences: `override*Conf()`, custom + `create*Starter()`, Swagger, fake servers, schema registry, Docker/Kinesis, or + tests that add/restart participants. +- Some tests also mutate Helix cluster/table config after the cluster is running. + Those are not counted as process config overrides, but shared suites still need + to either make those mutations part of suite setup or reset them per class. +- Rows with the same `C/B/S/M`, same external infra, and no overrides are the + highest-confidence candidates for one shared suite run. +- Rows with overrides are still useful buckets, but they should share only when + the actual override values are intentionally compatible. + +## Component Config Override Summary + +This table groups runnable integration test classes only by inherited component +config override methods: + +- `overrideControllerConf()` +- `overrideBrokerConf()` +- `overrideServerConf()` +- `overrideMinionConf()` + +Subclasses inherit the group of their base class. For example, subclasses of +`BaseRealtimeClusterIntegrationTest` count as `server` because that base overrides +server config. 
+ +| Component config override group | Runnable test classes | +| --- | ---: | +| none | 88 | +| broker | 11 | +| server | 8 | +| controller | 3 | +| broker + server | 12 | +| controller + broker | 1 | +| controller + server | 14 | +| controller + broker + server | 9 | +| controller + broker + server + minion | 3 | +| total | 149 | + +## Refactor Plan For No-Override Tests + +The 88 tests without inherited component config overrides are the best place to +start. A single rich shared environment should cover most of them: + +- 1 ZK +- 1 controller +- 1 broker +- 2 servers +- 1 minion +- embedded Kafka started once, preferably without creating a default topic + +This should be treated as a superset environment, not as proof that every test can +immediately run unchanged. Extra servers/minions are usually harmless at the +process level, but they can change routing, assignment, rebalance summaries, +metrics, and `numServersQueried` assertions. + +### Target Coverage + +Likely target for the first shared-rich-cluster suite: + +- 42 `custom/*` tests already use the same broad topology and are suite-shared. +- About 32-33 additional no-override tests should be reasonable first migration + targets after table/topic/tenant cleanup. +- That puts the practical first target around 74-75 of the 88 no-override tests. + +Keep these no-override tests out of the first shared-rich-cluster pass: + +- `ControllerLeaderLocatorIntegrationTest`, `ServerStarterIntegrationTest`: controller-only tests with method-local + component starts. +- `CancelQueryIntegrationTests`: requires 4 servers. +- `PartialUpsertTableRebalanceIntegrationTest`, `KafkaPartitionSubsetChaosIntegrationTest`, + `UpsertTableSegmentUploadIntegrationTest`: add, remove, or restart servers during methods. +- `SegmentCompletionIntegrationTest`: uses a fake Helix server participant. +- `KinesisShardChangeTest`, `RealtimeKinesisIntegrationTest`: Docker LocalStack/Kinesis. 
+- `MultiClusterIntegrationTest`, `SameTableNameMultiClusterIntegrationTest`: two isolated clusters plus extra broker. +- `UdfTest`: manual UDF cluster with known non-daemon thread caveat. +- `AdminConsoleIntegrationTest`: can join only if the shared controller enables Swagger for the whole suite. + +### First-Wave Timing + +The first draft suite moves three low-risk no-override classes behind the shared +rich cluster: + +- `SegmentUploadIntegrationTest` +- `IngestionConfigHybridIntegrationTest` +- `TPCHQueryIntegrationTest` + +On this workstation, the same 25 TestNG tests passed in both modes: + +| Mode | Command | Wall time | +| --- | --- | ---: | +| Per-class lifecycle | `./mvnw -pl pinot-integration-tests -Dtest=SegmentUploadIntegrationTest,IngestionConfigHybridIntegrationTest,TPCHQueryIntegrationTest -Dsurefire.failIfNoSpecifiedTests=false test` | 57.46s | +| Shared rich suite | `./mvnw -pl pinot-integration-tests -Pshared-rich-cluster-integration-test-suite test` | 43.63s | + +That is a 13.83s wall-clock reduction, about 24% for this small first wave. + +### Suite Infrastructure + +Create a shared base similar to `CustomDataQueryClusterIntegrationTest`, but make +it reusable by the non-custom no-override tests: + +1. Add a shared suite holder that starts the rich environment in `@BeforeSuite`. +2. Start Kafka with no default topic; classes create only the topics they need. +3. Start 2 Pinot servers and 1 minion even for tests that do not need minion. +4. Expose shared controller, broker, server, minion, Kafka, Helix, and admin-client state through delegation methods. +5. Tear everything down once in `@AfterSuite`. + +### Tenant Isolation + +Use tenants to make a 2-server physical cluster behave like either a 1-server or +2-server logical test cluster: + +1. Tag server 0 with a one-server tenant, e.g. `SharedOneServerTenant`. +2. Tag both servers with a two-server tenant, e.g. `SharedTwoServerTenant`. +3. 
Default migrated tests to the one-server tenant unless they currently require 2 servers. +4. Map current 2-server tests to the two-server tenant. +5. Keep the broker tenant shared; one broker is enough for these candidates. + +This avoids many failures where a formerly 1-server test observes 2 servers in +routing or assignment. + +### Per-Class Isolation + +Keep data setup class-scoped even though infrastructure is suite-scoped: + +1. `@BeforeClass`: create schema, table config, Kafka topic, segments, H2 data, and query generator. +2. `@AfterClass`: drop offline/realtime/logical tables, delete schemas, clear task metadata where needed, and delete or + uniquify Kafka topics. +3. Wait for ExternalView, IdealState, routing, and table-data-manager cleanup before the next class starts. +4. Run these shared-suite tests sequentially; do not enable TestNG class parallelism. + +Reuse `mytable` only if teardown fully waits for cleanup. Otherwise add a table +name indirection layer and update hard-coded query text/query-file handling. + +### Migration Order + +1. **Extract shared infrastructure** from `CustomDataQueryClusterIntegrationTest` + into a reusable helper/base. +2. **Move the existing custom suite** onto that helper without changing behavior. +3. **Add non-custom 2-server Kafka tests**: `BaseDedupIntegrationTest`, + `CommitTimeCompactionIntegrationTest`, logical-table tests, and + `PauselessRealtimeIngestionWithDedupIntegrationTest`. +4. **Add minion Kafka tests** that do not mutate process config: + merge/rollup, purge, realtime-to-offline, segment-generation realtime, stale + segment check, upsert compact merge, and upsert table. +5. **Add one-server offline/realtime tests** using the one-server tenant mapping. +6. **Evaluate Swagger once** and either include `AdminConsoleIntegrationTest` by + enabling Swagger suite-wide or keep it dedicated. +7. **Handle mutable-participant tests last** with explicit baseline restore + checks after each class. 
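
The tenant mapping described under "Tenant Isolation" can be sketched in plain Java. This is a minimal illustration, not code from this commit: the class `SharedTenantPlan` and its method names are hypothetical, and only the two tenant names (`SharedOneServerTenant`, `SharedTwoServerTenant`) come from the plan. How the tags would actually be applied to the servers (for example through the controller or Helix APIs) is deliberately left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Illustrative only: maps a test's logical server requirement onto the two shared tenants.
 * The class and method names are hypothetical; only the tenant names come from the plan.
 */
final class SharedTenantPlan {
  static final String ONE_SERVER_TENANT = "SharedOneServerTenant";
  static final String TWO_SERVER_TENANT = "SharedTwoServerTenant";

  private SharedTenantPlan() {
  }

  /** Tenants to tag on the physical server at the given index (0 or 1). */
  static List<String> tenantsForServer(int serverIndex) {
    switch (serverIndex) {
      case 0:
        // Server 0 backs both the one-server and the two-server logical cluster.
        return Arrays.asList(ONE_SERVER_TENANT, TWO_SERVER_TENANT);
      case 1:
        // Server 1 is only visible to tests that ask for two servers.
        return Collections.singletonList(TWO_SERVER_TENANT);
      default:
        throw new IllegalArgumentException("Shared cluster has 2 servers, got index " + serverIndex);
    }
  }

  /** Tenant a migrated class should use, given how many servers it required before migration. */
  static String tenantForTest(int serversRequired) {
    switch (serversRequired) {
      case 1:
        return ONE_SERVER_TENANT;
      case 2:
        return TWO_SERVER_TENANT;
      default:
        // Classes such as the 4-server tests stay in a dedicated setup.
        throw new IllegalArgumentException("No shared tenant for " + serversRequired + " servers");
    }
  }

  public static void main(String[] args) {
    System.out.println("1-server test -> " + tenantForTest(1));
    System.out.println("2-server test -> " + tenantForTest(2));
    System.out.println("server 0 tags -> " + tenantsForServer(0));
    System.out.println("server 1 tags -> " + tenantsForServer(1));
  }
}
```

Failing fast for any other server count mirrors the plan's guidance that tests needing a different topology, such as the 4-server class, stay out of the shared suite rather than being silently mapped to the nearest tenant.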
+ +### Acceptance Criteria + +Before moving a test into the shared-rich-cluster suite, verify: + +- It has no inherited component config override. +- It does not require a fake server, LocalStack/Kinesis, schema registry, UDF cluster, or multi-cluster broker. +- It does not depend on exact physical server count unless tenant isolation preserves that expectation. +- It cleans up all tables, schemas, logical tables, Kafka topics, and task metadata it creates. +- Running it before and after another migrated class gives the same result. + +### Subagent Migration Assessment + +Read-only subagents inspected the no-component-config-override tests against the +shared-rich-cluster target. They did not edit files or run the full shared suite. + +#### Try First + +These are the lowest-risk classes to place behind suite-aware lifecycle first: + +- `SegmentUploadIntegrationTest` +- `TPCHQueryIntegrationTest` +- `IngestionConfigHybridIntegrationTest` +- `LogicalTableWithOneOfflineTableIntegrationTest` +- `LogicalTableWithTwoOfflineTablesIntegrationTest` +- `LogicalTableWithTwelveOfflineTablesIntegrationTest` +- `LogicalTableWithOneRealtimeTableIntegrationTest` +- `LogicalTableWithOneOfflineOneRealtimeTableIntegrationTest` +- `LogicalTableWithTwelveOfflineOneRealtimeTableIntegrationTest` + +The custom query tests are already suite-shared and match the target topology. +They should stay as the control group while the non-custom suite is introduced. + +#### Patch Cleanup Or Isolation First + +These are plausible shared-rich-cluster candidates, but need table/topic/task +cleanup, unique names, sequential execution, or reset hooks before moving: + +- `BaseDedupIntegrationTest`: also drop `DedupTableWithReplicas`. +- `PauselessRealtimeIngestionWithDedupIntegrationTest`: same cleanup as dedup. +- `CommitTimeCompactionIntegrationTest`: exact segment/count assertions and temporary cluster config mutation. 
+- `SegmentPartitionLLCRealtimeClusterIntegrationTest`: hard-coded `mytable` queries and exact segment counts. +- `DimensionTableIntegrationTest`: fixed `mytable`; deletes table during a test method. +- `MultiStageEngineIntegrationTest`: multiple fixed table names and cluster-config toggles. +- `QueryQuotaClusterIntegrationTest`: quota mutations need strict reset and isolation. +- `SegmentWriterUploaderIntegrationTest`: fixed `mytable`; exact segment-count assertions. +- `SparkSegmentMetadataPushIntegrationTest`: fixed `_testTable = "mytable"`. +- `StarTreeFunctionParametersIntegrationTest`: fixed `mytable`, table reloads, cluster config mutation. +- `LogicalTableWithTwoOfflineOneRealtimeTableIntegrationTest`: stateful logical-table time-boundary mutation. +- `LogicalTableWithTwoRealtimeTableIntegrationTest`: fixed Kafka topic and per-instance counters. +- `SegmentGenerationMinionClusterIntegrationTest`: drop all tables it creates. +- `SimpleMinionClusterIntegrationTest`: global minion task state and cluster task config. +- `MergeRollupMinionClusterIntegrationTest`: global task queues and generic table/topic names. +- `PurgeMetadataPushMinionClusterIntegrationTest`: inherited setup creates extra tables/task state. +- `PurgeMinionClusterIntegrationTest`: global `MinionContext` purger and generic table names. +- `RealtimeToOfflineSegmentsMinionClusterIntegrationTest`: currently leaves metadata table cleanup work. +- `SegmentGenerationMinionRealtimeIngestionTest`: `@BeforeTest/@AfterTest`, fixed realtime table. +- `StaleSegmentCheckIntegrationTest`: simple candidate, but currently does not drop table/schema. +- `UpsertCompactMergeTaskIntegrationTest`: fixed table names and Kafka topic isolation. +- `MultiTopicRealtimeClusterIntegrationTest`: fixed topics and table; no topic deletion. 
+- `MultiColumnTextIndicesTest`, `MultiColumnRealtimeRowMajorTextIndicesTest`, + `MultiColumnRealtimeColMajorTextIndicesTest`, `TextIndicesRealtimeTest`: table config mutations and reloads; + keep sequential. +- `SSBQueryTest`: generic table names `customer`, `dates`, `lineorder`, `part`, `supplier`. +- `TimestampTest`: changes JVM default timezone; keep sequential. + +#### Keep Dedicated Initially + +These should not be moved into the first shared-rich-cluster suite: + +- `ControllerLeaderLocatorIntegrationTest`: controller-only flow, starts a second controller inside the method. +- `ServerStarterIntegrationTest`: controller-only flow, starts/stops short-lived servers inside methods. +- `AdminConsoleIntegrationTest`: requires Swagger controller unless Swagger is enabled suite-wide. +- `HelixZNodeSizeLimitTest`: changes ZK/client buffer assumptions and writes very large IdealState data. +- `OfflineClusterIntegrationTest`: destructive instance decommission and exact private-cluster assertions. +- `CancelQueryIntegrationTests`: requires 4 servers. +- `PartialUpsertTableRebalanceIntegrationTest`: starts from 1 server and adds/stops servers with exact assertions. +- `KafkaPartitionSubsetChaosIntegrationTest`: restarts shared servers and has fixed topic/table state. +- `SegmentCompletionIntegrationTest`: fake Helix server participant, no real Pinot server. +- `KinesisShardChangeTest`, `RealtimeKinesisIntegrationTest`: Docker LocalStack/Kinesis. +- `MultiClusterIntegrationTest`, `SameTableNameMultiClusterIntegrationTest`: two isolated clusters plus extra broker. +- `UdfTest`: manual UDF cluster with non-daemon thread caveat. +- `RefreshSegmentMinionTest`: global `RefreshSegmentTask` queues/state. +- `UpsertTableIntegrationTest`: custom extra server/metadata-manager test and exact routing counts. + +### No Process Config Overrides + +These are the best first candidates for shared suite-level infrastructure. 
The +main cleanup work is table/schema/topic isolation plus resetting any Helix config +the class changes at runtime. + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=0 S=0 M=0` | none | `ControllerLeaderLocatorIntegrationTest` *(starts a second controller inside the method)*, `ServerStarterIntegrationTest` *(starts/stops short-lived servers inside methods)* | +| `C=1 B=1 S=1 M=0` | none | `DimensionTableIntegrationTest`, `HelixZNodeSizeLimitTest`, `MultiStageEngineIntegrationTest`, `OfflineClusterIntegrationTest`, `QueryQuotaClusterIntegrationTest`, `SegmentUploadIntegrationTest`, `SegmentWriterUploaderIntegrationTest`, `SparkSegmentMetadataPushIntegrationTest`, `StarTreeFunctionParametersIntegrationTest`, `TPCHQueryIntegrationTest` | +| `C=1 B=1 S=4 M=0` | none | `CancelQueryIntegrationTests` | +| `C=1 B=1 S=1 M=0` | Kafka | `IngestionConfigHybridIntegrationTest`, `PartialUpsertTableRebalanceIntegrationTest` *(adds temporary servers during methods)*, `SegmentPartitionLLCRealtimeClusterIntegrationTest` | +| `C=1 B=1 S=2 M=0` | Kafka | `BaseDedupIntegrationTest`, `CommitTimeCompactionIntegrationTest`, `LogicalTableWithOneOfflineOneRealtimeTableIntegrationTest`, `LogicalTableWithOneOfflineTableIntegrationTest`, `LogicalTableWithOneRealtimeTableIntegrationTest`, `LogicalTableWithTwelveOfflineOneRealtimeTableIntegrationTest`, `LogicalTableWithTwelveOfflineTablesIntegrationTest`, `LogicalTableWithTwoOfflineOneRealtimeTableIntegrationTest`, `LogicalTableWithTwoOfflineTablesIntegrationTest`, `Lo [...] 
+| `C=1 B=1 S=1 M=1` | none | `SegmentGenerationMinionClusterIntegrationTest`, `SimpleMinionClusterIntegrationTest` | +| `C=1 B=1 S=1 M=1` | Kafka | `MergeRollupMinionClusterIntegrationTest`, `PurgeMetadataPushMinionClusterIntegrationTest`, `PurgeMinionClusterIntegrationTest`, `RealtimeToOfflineSegmentsMinionClusterIntegrationTest`, `SegmentGenerationMinionRealtimeIngestionTest`, `StaleSegmentCheckIntegrationTest`, `UpsertCompactMergeTaskIntegrationTest` | +| `C=1 B=1 S=2 M=1` | Kafka | `UpsertTableIntegrationTest` | +| `C=1 B=1 S=2 M=1` | Kafka | `custom/*` tests listed above; already suite-shared | + +### Broker Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=0 M=0` | none | `BrokerServiceDiscoveryIntegrationTest` | +| `C=1 B=1 S=1 M=0` | none | `CursorCronCleanupIntegrationTest`, `CursorFsIntegrationTest`, `CursorIntegrationTest`, `EmptyResponseIntegrationTest` | +| `C=1 B=1 S=2 M=0` | none | `MultiStageEngineExplainIntegrationTest`, `OfflineTimestampIndexIntegrationTest`, `QueryThreadContextIntegrationTest`, `SpoolIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `BrokerQueryLimitTest`, `NullHandlingIntegrationTest` | + +### Server Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | none | `MultiStageWithoutStatsIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `ExactlyOnceKafkaRealtimeClusterIntegrationTest` *(transactional Kafka)*, `KafkaConsumingSegmentToBeMovedSummaryIntegrationTest` *(adds a server during the method)*, `KafkaIncreaseDecreasePartitionsIntegrationTest`, `RealtimeConsumptionRateLimiterClusterIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka + schema registry | `KafkaConfluentSchemaRegistryAvroMessageDecoderRealtimeClusterIntegrationTest` | + +### Broker And Server Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | none | `CpuBasedBrokerQueryKillingIntegrationTest`, 
`CpuBasedServerQueryKillingIntegrationTest`, `JmxMetricsIntegrationTest`, `MemoryBasedServerQueryKillingIntegrationTest`, `OfflineGRPCServerIntegrationTest`, `OfflineGRPCServerMultiStageIntegrationTest`, `OfflineSecureGRPCServerIntegrationTest`, `WindowResourceAccountingTest` | +| `C=1 B=1 S=2 M=0` | none | `GroupByEnableTrimOptionIntegrationTest` | +| `C=1 B=1 S=4 M=0` | none | `MultiStageEngineSmallBufferTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `QueryWorkloadIntegrationTest` | +| `C=1 B=2 S=3 M=0` | none | `MultiNodesOfflineClusterIntegrationTest` *(also adds/stops a broker and restarts a server inside methods)* | + +### Controller Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | none | `MultiStageEngineCustomTenantIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `PinotLLCRealtimeSegmentManagerIntegrationTest` | +| `C=1 B=1 S=4 M=0` | Kafka | `ControllerPeriodicTasksIntegrationTest` | + +### Controller And Server Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | Kafka | `LLCRealtimeClusterIntegrationTest`, `LLCRealtimeKafka3ClusterIntegrationTest`, `LLCRealtimeKafka4ClusterIntegrationTest`, `RetentionManagerIntegrationTest` | +| `C=1 B=1 S=2 M=0` | Kafka | `PeerDownloadLLCRealtimeClusterIntegrationTest` | + +### Controller, Broker, And Server Config Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | none | `ControllerServiceDiscoveryIntegrationTest`, `CursorWithAuthIntegrationTest`, `TimeSeriesAuthIntegrationTest`, `TimeSeriesIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `RowLevelSecurityIntegrationTest` | +| `C=1 B=1 S=2 M=0` | Kafka | `DateTimeFieldSpecHybridClusterIntegrationTest`, `GrpcBrokerClusterIntegrationTest`, `HybridClusterIntegrationTest`, `TableRebalanceIntegrationTest` *(adds/stops servers during methods)*, `TenantRebalanceIntegrationTest` | + +### Controller Starter And Failure Injection Overrides + +| 
Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | Kafka | `PauselessRealtimeIngestionCommitEndMetadataFailureTest`, `PauselessRealtimeIngestionConsumingTransitionFailureTest`, `PauselessRealtimeIngestionIdealStateUpdateFailureTest`, `PauselessRealtimeIngestionIntegrationTest`, `PauselessRealtimeIngestionNewSegmentMetadataCreationFailureTest`, `PauselessRealtimeIngestionSegmentCommitFailureTest`, `TableRebalancePauselessIntegrationTest` | +| `C=1 B=1 S=2 M=0` | Kafka | `PauselessDedupRealtimeIngestionConsumingTransitionFailureTest`, `PauselessDedupRealtimeIngestionSegmentCommitFailureTest` | + +### Minion Config Or Auth/TLS Overrides + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=1` | none | `AdminConsoleIntegrationTest` *(Swagger controller)*, `BasicAuthBatchIntegrationTest` *(controller/broker/server/minion auth overrides)* | +| `C=1 B=1 S=1 M=1` | Kafka | `TlsIntegrationTest`, `UrlAuthRealtimeIntegrationTest` | + +### Restart Or Mutable-Participant Suites + +These can still be grouped by baseline setup, but only if each class restores the +baseline before the next class runs. 
+ +| Baseline setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=1 M=0` | Kafka | `KafkaPartitionSubsetChaosIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Kafka | `DedupPreloadIntegrationTest`, `UpsertTableSegmentPreloadIntegrationTest` *(server config override plus restart)* | +| `C=1 B=1 S=2 M=0` | Kafka | `UpsertTableSegmentUploadIntegrationTest` | + +### Special Infrastructure + +| Setup | External infra | Classes | +| --- | --- | --- | +| `C=1 B=1 S=fake M=0` | Kafka | `SegmentCompletionIntegrationTest` | +| `C=1 B=1 S=1 M=0` | Docker LocalStack/Kinesis | `KinesisShardChangeTest`, `RealtimeKinesisIntegrationTest` | +| `C=1 B=1 S=1 M=1` | manual UDF cluster | `UdfTest` | +| `C=2 B=3 S=2 M=0` | two ZK-backed clusters | `MultiClusterIntegrationTest`, `SameTableNameMultiClusterIntegrationTest` | + +## Must Stay Dedicated Initially + +These tests have setup behavior that is too different to share with the standard +single-cluster suites without a deeper refactor: + +- `KinesisShardChangeTest`: starts Docker-backed LocalStack/Kinesis. +- `RealtimeKinesisIntegrationTest`: starts Docker-backed LocalStack/Kinesis. +- `MultiClusterIntegrationTest`: starts two isolated Pinot clusters plus an extra broker. +- `SameTableNameMultiClusterIntegrationTest`: extends the multi-cluster setup. +- `UdfTest`: starts `IntegrationUdfTestCluster` manually and notes leaked non-daemon threads. +- `ChaosMonkeyIntegrationTest`: disabled test methods and external process management. +- `TPCHGeneratedQueryIntegrationTest`: generated-query test method is disabled. + +## Implementation Notes + +To get one setup/teardown per compatible group: + +1. Introduce suite-level base classes per topology, using the custom test suite pattern. +2. Move infrastructure startup from `@BeforeClass` to `@BeforeSuite` for each group. +3. Keep schema/table/topic/segment setup in `@BeforeClass` and drop table-specific state in `@AfterClass`. +4. 
Make class-specific table names unique where a group would otherwise reuse `mytable`. +5. Split TestNG XML by these groups instead of the current alphabetical Maven profiles. +6. Keep tests with config overrides, custom starters, restarts, TLS/auth, Kinesis, UDF, and multi-cluster in dedicated suites until each has an explicit shared-infra contract. diff --git a/pinot-integration-tests/pom.xml b/pinot-integration-tests/pom.xml index 761513d83ee..5b284a23b23 100644 --- a/pinot-integration-tests/pom.xml +++ b/pinot-integration-tests/pom.xml @@ -161,6 +161,29 @@ </plugins> </build> </profile> + <profile> + <id>shared-rich-cluster-integration-test-suite</id> + <activation> + <activeByDefault>false</activeByDefault> + </activation> + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-surefire-plugin</artifactId> + <configuration> + <skipTests>false</skipTests> + <suiteXmlFiles> + <suiteXmlFile>src/test/resources/shared-rich-cluster-integration-test-suite.xml</suiteXmlFile> + </suiteXmlFiles> + <systemPropertyVariables> + <pinot.integration.sharedRichCluster.enabled>true</pinot.integration.sharedRichCluster.enabled> + </systemPropertyVariables> + </configuration> + </plugin> + </plugins> + </build> + </profile> </profiles> <dependencies> diff --git a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/IngestionConfigHybridIntegrationTest.java b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/IngestionConfigHybridIntegrationTest.java index 7495be1e0f7..d068098f28d 100644 --- a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/IngestionConfigHybridIntegrationTest.java +++ b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/IngestionConfigHybridIntegrationTest.java @@ -42,7 +42,7 @@ import static org.testng.Assert.assertEquals; /** * Tests ingestion configs on a hybrid table */ -public class IngestionConfigHybridIntegrationTest extends 
BaseClusterIntegrationTest { +public class IngestionConfigHybridIntegrationTest extends SharedRichClusterIntegrationTest { private static final int NUM_OFFLINE_SEGMENTS = 8; private static final int NUM_REALTIME_SEGMENTS = 6; private static final String TIME_COLUMN_NAME = "millisSinceEpoch"; diff --git a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SegmentUploadIntegrationTest.java b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SegmentUploadIntegrationTest.java index a3ed151ad95..ba1c2584f60 100644 --- a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SegmentUploadIntegrationTest.java +++ b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SegmentUploadIntegrationTest.java @@ -55,7 +55,7 @@ import org.testng.annotations.Test; * Currently only tests METADATA push type. * todo: add test for URI push */ -public class SegmentUploadIntegrationTest extends BaseClusterIntegrationTest { +public class SegmentUploadIntegrationTest extends SharedRichClusterIntegrationTest { private static String _tableNameSuffix; @Override diff --git a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SharedRichClusterIntegrationTest.java b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SharedRichClusterIntegrationTest.java new file mode 100644 index 00000000000..50fd61f8376 --- /dev/null +++ b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/SharedRichClusterIntegrationTest.java @@ -0,0 +1,386 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.pinot.integration.tests; + +import java.io.File; +import java.io.IOException; +import java.util.List; +import java.util.Map; +import org.apache.commons.io.FileUtils; +import org.apache.pinot.client.admin.PinotAdminClient; +import org.apache.pinot.util.TestUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.testng.annotations.AfterSuite; +import org.testng.annotations.BeforeSuite; + + +/** + * Base class for integration tests that can share the same rich Pinot cluster within a TestNG suite. + * + * <p>The sharing behavior is disabled by default so the existing per-class lifecycle is preserved for direct + * {@code -Dtest=...} runs. Suite XMLs can opt in by setting {@value #SHARED_RICH_CLUSTER_ENABLED_PROPERTY} to + * {@code true}. 
+ */
+public abstract class SharedRichClusterIntegrationTest extends BaseClusterIntegrationTest {
+  public static final String SHARED_RICH_CLUSTER_ENABLED_PROPERTY = "pinot.integration.sharedRichCluster.enabled";
+
+  private static final Logger LOGGER = LoggerFactory.getLogger(SharedRichClusterIntegrationTest.class);
+  private static final String SHARED_CLUSTER_NAME = "SharedRichClusterIntegrationTestSuite";
+
+  protected static SharedRichClusterIntegrationTest _sharedRichClusterTestSuite;
+
+  @BeforeSuite(alwaysRun = true)
+  public void setUpSharedRichClusterSuite()
+      throws Exception {
+    if (!isSharedRichClusterEnabled()) {
+      return;
+    }
+    synchronized (SharedRichClusterIntegrationTest.class) {
+      if (_sharedRichClusterTestSuite != null) {
+        return;
+      }
+      _sharedRichClusterTestSuite = this;
+    }
+
+    LOGGER.warn("Setting up shared rich integration test suite");
+    TestUtils.ensureDirectoriesExistAndEmpty(_tempDir, _segmentDir, _tarDir);
+    super.startZk();
+    super.startKafkaWithoutTopic();
+    super.startController(super.getDefaultControllerConfiguration());
+    super.startBrokers(1);
+    super.startServers(2);
+    super.startMinion();
+    attachSharedRichCluster();
+    LOGGER.warn("Finished setting up shared rich integration test suite");
+  }
+
+  @AfterSuite(alwaysRun = true)
+  public void tearDownSharedRichClusterSuite()
+      throws Exception {
+    if (!isSharedRichClusterOwner()) {
+      return;
+    }
+
+    LOGGER.warn("Tearing down shared rich integration test suite");
+    try {
+      if (_minionStarter != null) {
+        super.stopMinion();
+      }
+      if (!_serverStarters.isEmpty()) {
+        super.stopServer();
+      }
+      if (!_brokerStarters.isEmpty()) {
+        super.stopBroker();
+      }
+      if (_controllerStarter != null) {
+        super.stopController();
+      }
+      super.stopKafka();
+      super.stopZk();
+      FileUtils.deleteDirectory(_tempDir);
+      LOGGER.warn("Finished tearing down shared rich integration test suite");
+    } finally {
+      synchronized (SharedRichClusterIntegrationTest.class) {
+        _sharedRichClusterTestSuite = null;
+      }
+    }
+  }
+
+  protected boolean isSharedRichClusterEnabled() {
+    return Boolean.getBoolean(SHARED_RICH_CLUSTER_ENABLED_PROPERTY);
+  }
+
+  protected boolean isSharedRichClusterOwner() {
+    return isSharedRichClusterEnabled() && _sharedRichClusterTestSuite == this;
+  }
+
+  protected void attachSharedRichCluster() {
+    if (!isSharedRichClusterEnabled()) {
+      return;
+    }
+    SharedRichClusterIntegrationTest sharedSuite = _sharedRichClusterTestSuite;
+    if (sharedSuite == null) {
+      throw new IllegalStateException("Shared rich cluster has not been initialized");
+    }
+    if (sharedSuite == this) {
+      return;
+    }
+
+    _controllerStarter = sharedSuite._controllerStarter;
+    _controllerPort = sharedSuite._controllerPort;
+    _controllerConfig = sharedSuite._controllerConfig;
+    _controllerBaseApiUrl = sharedSuite._controllerBaseApiUrl;
+    _controllerRequestURLBuilder = sharedSuite._controllerRequestURLBuilder;
+    _controllerDataDir = sharedSuite._controllerDataDir;
+    _helixResourceManager = sharedSuite._helixResourceManager;
+    _helixManager = sharedSuite._helixManager;
+    _helixDataAccessor = sharedSuite._helixDataAccessor;
+    _helixAdmin = sharedSuite._helixAdmin;
+    _propertyStore = sharedSuite._propertyStore;
+    _tableRebalanceManager = sharedSuite._tableRebalanceManager;
+    _tableSizeReader = sharedSuite._tableSizeReader;
+
+    _brokerStarters.clear();
+    _brokerStarters.addAll(sharedSuite._brokerStarters);
+    _brokerPorts.clear();
+    _brokerPorts.addAll(sharedSuite._brokerPorts);
+    _brokerBaseApiUrl = sharedSuite._brokerBaseApiUrl;
+    _brokerGrpcEndpoint = sharedSuite._brokerGrpcEndpoint;
+
+    _serverStarters.clear();
+    _serverStarters.addAll(sharedSuite._serverStarters);
+    _serverGrpcPort = sharedSuite._serverGrpcPort;
+    _serverAdminApiPort = sharedSuite._serverAdminApiPort;
+    _serverNettyPort = sharedSuite._serverNettyPort;
+
+    _minionStarter = sharedSuite._minionStarter;
+    _minionBaseApiUrl = sharedSuite._minionBaseApiUrl;
+    _kafkaStarters = sharedSuite._kafkaStarters;
+  }
+
+  @Override
+  public String getHelixClusterName() {
+    if (isSharedRichClusterEnabled()) {
+      return SHARED_CLUSTER_NAME;
+    }
+    return super.getHelixClusterName();
+  }
+
+  @Override
+  public String getZkUrl() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this ? super.getZkUrl() : _sharedRichClusterTestSuite.getZkUrl();
+    }
+    return super.getZkUrl();
+  }
+
+  @Override
+  protected String getBrokerBaseApiUrl() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this
+          ? super.getBrokerBaseApiUrl()
+          : _sharedRichClusterTestSuite.getBrokerBaseApiUrl();
+    }
+    return super.getBrokerBaseApiUrl();
+  }
+
+  @Override
+  protected String getBrokerGrpcEndpoint() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this
+          ? super.getBrokerGrpcEndpoint()
+          : _sharedRichClusterTestSuite.getBrokerGrpcEndpoint();
+    }
+    return super.getBrokerGrpcEndpoint();
+  }
+
+  @Override
+  public int getControllerPort() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this
+          ? super.getControllerPort()
+          : _sharedRichClusterTestSuite.getControllerPort();
+    }
+    return super.getControllerPort();
+  }
+
+  @Override
+  protected int getRandomBrokerPort() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this
+          ? super.getRandomBrokerPort()
+          : _sharedRichClusterTestSuite.getRandomBrokerPort();
+    }
+    return super.getRandomBrokerPort();
+  }
+
+  @Override
+  public String getMinionBaseApiUrl() {
+    if (isSharedRichClusterEnabled()) {
+      return _sharedRichClusterTestSuite == this
+          ? super.getMinionBaseApiUrl()
+          : _sharedRichClusterTestSuite.getMinionBaseApiUrl();
+    }
+    return super.getMinionBaseApiUrl();
+  }
+
+  @Override
+  public void startZk() {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startZk();
+  }
+
+  @Override
+  public void startZk(int port) {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startZk(port);
+  }
+
+  @Override
+  public void startController()
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startController();
+  }
+
+  @Override
+  public void startController(Map<String, Object> properties)
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startController(properties);
+  }
+
+  @Override
+  protected void startBroker()
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startBroker();
+  }
+
+  @Override
+  protected void startBrokers(int numBrokers)
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startBrokers(numBrokers);
+  }
+
+  @Override
+  protected void startServer()
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startServer();
+  }
+
+  @Override
+  protected void startServers(int numServers)
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startServers(numServers);
+  }
+
+  @Override
+  protected void startMinion()
+      throws Exception {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startMinion();
+  }
+
+  @Override
+  protected void startKafka() {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      createKafkaTopic(getKafkaTopic());
+      return;
+    }
+    super.startKafka();
+  }
+
+  @Override
+  protected void startKafkaWithoutTopic() {
+    if (isSharedRichClusterEnabled()) {
+      attachSharedRichCluster();
+      return;
+    }
+    super.startKafkaWithoutTopic();
+  }
+
+  @Override
+  public void stopZk() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopZk();
+    }
+  }
+
+  @Override
+  public void stopController() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopController();
+    }
+  }
+
+  @Override
+  protected void stopBroker() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopBroker();
+    }
+  }
+
+  @Override
+  protected void stopServer() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopServer();
+    }
+  }
+
+  @Override
+  protected void stopMinion() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopMinion();
+    }
+  }
+
+  @Override
+  protected void stopKafka() {
+    if (!isSharedRichClusterEnabled()) {
+      super.stopKafka();
+    }
+  }
+
+  @Override
+  protected void pushAvroIntoKafka(List<File> avroFiles)
+      throws Exception {
+    attachSharedRichCluster();
+    super.pushAvroIntoKafka(avroFiles);
+  }
+
+  @Override
+  public PinotAdminClient getOrCreateAdminClient()
+      throws IOException {
+    attachSharedRichCluster();
+    return super.getOrCreateAdminClient();
+  }
+}
diff --git a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/tpch/TPCHQueryIntegrationTest.java b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/tpch/TPCHQueryIntegrationTest.java
index 0288e616933..16304de409d 100644
--- a/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/tpch/TPCHQueryIntegrationTest.java
+++ b/pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/tpch/TPCHQueryIntegrationTest.java
@@ -35,8 +35,8 @@ import org.apache.commons.collections4.CollectionUtils;
 import org.apache.commons.io.FileUtils;
 import org.apache.commons.io.IOUtils;
 import org.apache.pinot.client.ResultSetGroup;
-import org.apache.pinot.integration.tests.BaseClusterIntegrationTest;
 import org.apache.pinot.integration.tests.ClusterIntegrationTestUtils;
+import org.apache.pinot.integration.tests.SharedRichClusterIntegrationTest;
 import org.apache.pinot.spi.config.table.TableConfig;
 import org.apache.pinot.spi.data.Schema;
 import org.apache.pinot.tools.utils.JarUtils;
@@ -54,7 +54,7 @@ import org.testng.annotations.Test;
  * REAME.md to generate a larger dataset for better testing.
  * Queries are executed against Pinot and H2, and the results are compared.
  */
-public class TPCHQueryIntegrationTest extends BaseClusterIntegrationTest {
+public class TPCHQueryIntegrationTest extends SharedRichClusterIntegrationTest {
   private static final int NUM_TPCH_QUERIES = 24;
 
   // Pinot queries 15, 16, 17 fail due to lack of support for views.
diff --git a/pinot-integration-tests/src/test/resources/shared-rich-cluster-integration-test-suite.xml b/pinot-integration-tests/src/test/resources/shared-rich-cluster-integration-test-suite.xml
new file mode 100644
index 00000000000..164bf662267
--- /dev/null
+++ b/pinot-integration-tests/src/test/resources/shared-rich-cluster-integration-test-suite.xml
@@ -0,0 +1,30 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+<!DOCTYPE suite SYSTEM "https://testng.org/testng-1.0.dtd" >
+<suite name="SharedRichClusterIntegrationTestSuite">
+  <test name="SharedRichClusterIntegrationTests">
+    <classes>
+      <class name="org.apache.pinot.integration.tests.SegmentUploadIntegrationTest"/>
+      <class name="org.apache.pinot.integration.tests.IngestionConfigHybridIntegrationTest"/>
+      <class name="org.apache.pinot.integration.tests.tpch.TPCHQueryIntegrationTest"/>
+    </classes>
+  </test>
+</suite>
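
Reviewer note (illustrative only, not part of the commit): the ownership pattern in `SharedRichClusterIntegrationTest` can be sketched in isolation as below. This is a minimal sketch under stated assumptions, not Pinot code: the class `SharedOwnerSketch`, the property name `sketch.sharedCluster.enabled`, and the `endpoint` and `EVENTS` fields are all hypothetical stand-ins. It only shows the idea that, when a boolean system property is set, the first instance to run setup claims ownership and starts the resources, later instances copy the owner's state, and only the owner tears anything down.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a test base class; not Pinot code. */
class SharedOwnerSketch {
  // Hypothetical property name, chosen only for this sketch.
  static final String ENABLED_PROPERTY = "sketch.sharedCluster.enabled";

  // The first instance to claim this slot owns the shared resources.
  private static SharedOwnerSketch owner;

  // Records which instances actually started or stopped something.
  static final List<String> EVENTS = new ArrayList<>();

  final String name;
  String endpoint;

  SharedOwnerSketch(String name) {
    this.name = name;
  }

  static boolean isEnabled() {
    // Reads the JVM system property; false when unset.
    return Boolean.getBoolean(ENABLED_PROPERTY);
  }

  boolean isOwner() {
    synchronized (SharedOwnerSketch.class) {
      return isEnabled() && owner == this;
    }
  }

  void setUp() {
    if (!isEnabled()) {
      // Shared mode off: every instance starts its own resources.
      endpoint = "private-" + name;
      EVENTS.add("start " + name);
      return;
    }
    synchronized (SharedOwnerSketch.class) {
      if (owner == null) {
        // First caller becomes the owner and starts the shared resources.
        owner = this;
        endpoint = "shared";
        EVENTS.add("start " + name);
        return;
      }
      // Later callers attach by copying the owner's state.
      endpoint = owner.endpoint;
    }
  }

  void tearDown() {
    if (isEnabled() && !isOwner()) {
      // Attached instances must not stop resources they do not own.
      return;
    }
    EVENTS.add("stop " + name);
    synchronized (SharedOwnerSketch.class) {
      if (owner == this) {
        owner = null;
      }
    }
  }
}
```

With the property set to `true`, calling `setUp()` on two instances records a start only for the first one, and calling `tearDown()` on both records a stop only for the owner.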
