xiangfu0 opened a new pull request, #17789:
URL: https://github.com/apache/pinot/pull/17789

   ## Summary
   - Extends Apache Pinot's upsert (primary-key deduplication) to OFFLINE tables, so batch-ingested data can be deduplicated by primary key
   - Implements a three-level comparison column fallback: configured comparison columns → time column → segment creation time
   - Adds the full upsert lifecycle to `OfflineTableDataManager` (init, addSegment, replaceSegment, getSegmentContexts, shutdown)
   - Updates the query executor, the server admin APIs (`TablesResource`, `PrimaryKeyCount`), and table config validation to support offline upsert tables
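
The fallback order above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class `ComparisonFallbackSketch`, the enum `Source`, and the method `resolve` are hypothetical names, and the actual logic lives in `BaseTableUpsertMetadataManager`.

```java
import java.util.List;

public class ComparisonFallbackSketch {
  // Hypothetical labels for the three fallback levels named in the summary.
  enum Source { CONFIGURED_COLUMNS, TIME_COLUMN, SEGMENT_CREATION_TIME }

  // Pick the comparison source: configured comparison columns first,
  // then the table's time column, then the segment creation time.
  static Source resolve(List<String> comparisonColumns, String timeColumn) {
    if (comparisonColumns != null && !comparisonColumns.isEmpty()) {
      return Source.CONFIGURED_COLUMNS;
    }
    if (timeColumn != null && !timeColumn.isEmpty()) {
      return Source.TIME_COLUMN;
    }
    return Source.SEGMENT_CREATION_TIME;
  }

  public static void main(String[] args) {
    // Column names below are made up for the example.
    System.out.println(resolve(List.of("updatedAt"), "ts"));
    System.out.println(resolve(List.of(), "ts"));
    System.out.println(resolve(null, null));
  }
}
```

The last level is what allows an offline table with neither comparison columns nor a time column to still order records, which is presumably why `UpsertContext` now accepts an empty comparison column list.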
   
   ## Files Changed
   | File | Change |
   |------|--------|
   | `TableConfigUtils.java` | Relax REALTIME-only validation for upsert/dedup |
   | `BaseTableUpsertMetadataManager.java` | Three-level comparison column fallback |
   | `UpsertContext.java` | Allow empty comparison columns |
   | `UpsertUtils.java` | Add `ConstantComparisonColumnReader` for segment-creation-time comparison |
   | `BasePartitionUpsertMetadataManager.java` | `createRecordInfoReader()` helper with segment creation time fallback |
   | `OfflineTableDataManager.java` | Full upsert support for offline tables |
   | `SingleTableExecutionInfo.java` | Support `OfflineTableDataManager` in query path |
   | `TablesResource.java` | Add offline upsert primary key count in table metadata API |
   | `PrimaryKeyCount.java` | Add offline upsert primary key count computation |
   | `TableConfigUtilsTest.java` | Update tests for relaxed validation |
   | `OfflineUpsertTableIntegrationTest.java` | New integration test for offline upsert |
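
To illustrate what a constant, segment-level comparison value implies for deduplication, here is a rough sketch. It assumes that every document in a segment reports the segment's creation time as its comparison value and that ties go to the record seen later; both are assumptions for the example, not behavior confirmed by this PR. `SegmentCreationTimeSketch`, `ConstantReader`, `Segment`, and `latestByKey` are hypothetical names and are not the signatures of `ConstantComparisonColumnReader` or `createRecordInfoReader()`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SegmentCreationTimeSketch {
  // Hypothetical stand-in for a reader that returns one constant for every doc.
  static final class ConstantReader {
    private final long value;
    ConstantReader(long value) { this.value = value; }
    long getComparisonValue(int docId) { return value; } // docId is ignored
  }

  // Hypothetical segment: a creation time plus rows of {primaryKey, payload}.
  static final class Segment {
    final long creationTimeMs;
    final String[][] rows;
    Segment(long creationTimeMs, String[][] rows) {
      this.creationTimeMs = creationTimeMs;
      this.rows = rows;
    }
  }

  // Keep, for each primary key, the payload with the largest comparison value.
  static Map<String, String> latestByKey(List<Segment> segments) {
    Map<String, Long> bestTime = new HashMap<>();
    Map<String, String> bestPayload = new LinkedHashMap<>();
    for (Segment segment : segments) {
      ConstantReader reader = new ConstantReader(segment.creationTimeMs);
      for (int docId = 0; docId < segment.rows.length; docId++) {
        String key = segment.rows[docId][0];
        long candidate = reader.getComparisonValue(docId);
        Long current = bestTime.get(key);
        // Assumption: on a tie, the record seen later replaces the earlier one.
        if (current == null || candidate >= current) {
          bestTime.put(key, candidate);
          bestPayload.put(key, segment.rows[docId][1]);
        }
      }
    }
    return bestPayload;
  }

  public static void main(String[] args) {
    Segment older = new Segment(100L, new String[][] {{"a", "v1"}, {"b", "v1"}});
    Segment newer = new Segment(200L, new String[][] {{"a", "v2"}});
    // Load order does not matter here: the newer segment wins for key "a".
    System.out.println(latestByKey(List.of(newer, older)));
  }
}
```

Under these assumptions the result depends on when each segment was created rather than on the order in which segments are loaded.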
   
   ## Test plan
   - [x] `TableConfigUtilsTest`: 45 tests pass (updated expected error messages for OFFLINE tables)
   - [x] `BasePartitionUpsertMetadataManagerTest` + `ConcurrentMapPartitionUpsertMetadataManager*`: 50 tests pass
   - [x] `TablesResourceTest`: 16 tests pass
   - [x] `PrimaryKeyCountTest`: 7 tests pass
   - [x] `OfflineUpsertTableIntegrationTest`: new integration test validates end-to-end offline upsert (dedup query results, segment replacement, skipUpsert option)
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

