xiangfu0 opened a new pull request, #17789: URL: https://github.com/apache/pinot/pull/17789
## Summary

- Extends Apache Pinot's upsert (primary-key deduplication) to OFFLINE tables, so batch-ingested data can use primary-key-based deduplication
- Implements a three-level comparison column fallback: configured comparison columns → time column → segment creation time
- Adds the full upsert lifecycle to `OfflineTableDataManager` (init, addSegment, replaceSegment, getSegmentContexts, shutdown)
- Updates the query executor, server admin APIs (`TablesResource`, `PrimaryKeyCount`), and validation to support offline upsert tables

## Files Changed

| File | Change |
|------|--------|
| `TableConfigUtils.java` | Relax REALTIME-only validation for upsert/dedup |
| `BaseTableUpsertMetadataManager.java` | Three-level comparison column fallback |
| `UpsertContext.java` | Allow empty comparison columns |
| `UpsertUtils.java` | Add `ConstantComparisonColumnReader` for segment-creation-time comparison |
| `BasePartitionUpsertMetadataManager.java` | `createRecordInfoReader()` helper with segment creation time fallback |
| `OfflineTableDataManager.java` | Full upsert support for offline tables |
| `SingleTableExecutionInfo.java` | Support `OfflineTableDataManager` in query path |
| `TablesResource.java` | Add offline upsert primary key count in table metadata API |
| `PrimaryKeyCount.java` | Add offline upsert primary key count computation |
| `TableConfigUtilsTest.java` | Update tests for relaxed validation |
| `OfflineUpsertTableIntegrationTest.java` | New integration test for offline upsert |

## Test plan

- [x] `TableConfigUtilsTest`: 45 tests pass (updated expected error messages for OFFLINE tables)
- [x] `BasePartitionUpsertMetadataManagerTest` + `ConcurrentMapPartitionUpsertMetadataManager*`: 50 tests pass
- [x] `TablesResourceTest`: 16 tests pass
- [x] `PrimaryKeyCountTest`: 7 tests pass
- [x] `OfflineUpsertTableIntegrationTest`: new integration test validates end-to-end offline upsert (dedup query results, segment replacement, skipUpsert option)

🤖 Generated
with [Claude Code](https://claude.com/claude-code)
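
For reviewers, the three-level comparison column fallback listed in the summary can be sketched roughly as follows. This is only an illustration of the selection order, not the code in `BaseTableUpsertMetadataManager.java`: the class `ComparisonFallbackSketch`, the enum `ComparisonSource`, and the method `resolve` are hypothetical names, and the real implementation may differ in how it reads the table config and schema.

```java
import java.util.List;

// Hypothetical sketch of the fallback order described in the PR summary:
// configured comparison columns -> time column -> segment creation time.
// None of these names are taken from the Pinot code base.
public final class ComparisonFallbackSketch {

  /** Which source provides the comparison value for upsert. */
  public enum ComparisonSource {
    CONFIGURED_COLUMNS,
    TIME_COLUMN,
    SEGMENT_CREATION_TIME
  }

  private ComparisonFallbackSketch() {
  }

  /**
   * Picks the comparison source.
   *
   * @param comparisonColumns columns configured for upsert comparison; may be null or empty
   * @param timeColumn the table's time column name; may be null or empty
   */
  public static ComparisonSource resolve(List<String> comparisonColumns, String timeColumn) {
    // Level 1: explicitly configured comparison columns take priority.
    if (comparisonColumns != null && !comparisonColumns.isEmpty()) {
      return ComparisonSource.CONFIGURED_COLUMNS;
    }
    // Level 2: fall back to the table's time column when one is set.
    if (timeColumn != null && !timeColumn.isEmpty()) {
      return ComparisonSource.TIME_COLUMN;
    }
    // Level 3: no usable column, so fall back to the segment creation time.
    return ComparisonSource.SEGMENT_CREATION_TIME;
  }

  public static void main(String[] args) {
    System.out.println(resolve(List.of("updatedAt"), "eventTime"));
    System.out.println(resolve(List.of(), "eventTime"));
    System.out.println(resolve(null, null));
  }
}
```

At the third level every row in a segment would share one comparison value, which is presumably why the PR adds a `ConstantComparisonColumnReader` in `UpsertUtils.java`.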
