yihua opened a new pull request, #18896:
URL: https://github.com/apache/hudi/pull/18896
### Describe the issue this Pull Request addresses
When writing to a Hudi table at version 6, several streamer sources hardcoded
`new StreamerCheckpointV2(...)` and therefore emitted v2 checkpoint keys in
commit metadata, regardless of the configured write table version.
`KafkaSource` was the original repro.
### Summary and Changelog
This PR routes every affected source (`KafkaSource`, `KinesisSource`,
`JdbcSource`, `SqlFileBasedSource`, `PulsarSource`, `DebeziumSource`,
`GcsEventsSource`, `HiveIncrPullSource`) and selector helper
(`DFSPathSelector`, `DatePartitionPathSelector`, `S3EventsMetaSelector`)
through `CheckpointUtils.createCheckpoint(writeTableVersion, key)`, so the
emitted checkpoint class is determined solely by `WRITE_TABLE_VERSION` (v6 →
v1, v8 → v2). It also removes the now-dead helpers
(`Source#assertCheckpointVersion` and the `InputBatch(Option, String, ...)`
constructors).
Incremental sources (`HoodieIncrSource`, `S3EventsHoodieIncrSource`,
`GcsEventsHoodieIncrSource`) are not affected: they have their own checkpoint
construction logic and already route through
`CheckpointUtils.buildCheckpointFromGeneralSource` and the
`DATASOURCES_NOT_SUPPORTED_WITH_CKPT_V2` allowlist.
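
For reviewers, the version-based routing for the non-incremental sources can be
sketched roughly as follows. This is a minimal, self-contained illustration and
not the Hudi implementation: the `Checkpoint`, `CheckpointV1`, and
`CheckpointV2` types are hypothetical stand-ins, the `int` version parameter
and the `>= 8` threshold are assumptions (the PR only states v6 → v1 and
v8 → v2), and the sample key is made up. Only the two metadata key names are
taken from this description.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only; these are NOT the real Hudi classes.
class CheckpointVersionSketch {

  // Hypothetical minimal checkpoint abstraction.
  interface Checkpoint {
    String getCheckpointKey();

    // Key/value pairs that would be written into commit metadata.
    Map<String, String> getCheckpointCommitMetadata();
  }

  static final class CheckpointV1 implements Checkpoint {
    static final String METADATA_KEY = "deltastreamer.checkpoint.key";
    private final String key;

    CheckpointV1(String key) {
      this.key = key;
    }

    public String getCheckpointKey() {
      return key;
    }

    public Map<String, String> getCheckpointCommitMetadata() {
      Map<String, String> metadata = new HashMap<>();
      metadata.put(METADATA_KEY, key);
      return metadata;
    }
  }

  static final class CheckpointV2 implements Checkpoint {
    static final String METADATA_KEY = "streamer.checkpoint.key.v2";
    private final String key;

    CheckpointV2(String key) {
      this.key = key;
    }

    public String getCheckpointKey() {
      return key;
    }

    public Map<String, String> getCheckpointCommitMetadata() {
      Map<String, String> metadata = new HashMap<>();
      metadata.put(METADATA_KEY, key);
      return metadata;
    }
  }

  // Single decision point: the checkpoint class depends only on the write
  // table version. Assumption: versions >= 8 use V2, older versions use V1.
  static Checkpoint createCheckpoint(int writeTableVersion, String key) {
    if (writeTableVersion >= 8) {
      return new CheckpointV2(key);
    }
    return new CheckpointV1(key);
  }

  public static void main(String[] args) {
    String offsets = "topic,0:42"; // made-up sample checkpoint key
    System.out.println(createCheckpoint(6, offsets).getCheckpointCommitMetadata());
    System.out.println(createCheckpoint(8, offsets).getCheckpointCommitMetadata());
  }
}
```

The point of the sketch is that a source never names a concrete checkpoint
class itself; it hands the key to one factory, which keeps the version decision
in a single place.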
### Impact
Streamer commit metadata for table version 6 now correctly emits v1
checkpoint keys (`deltastreamer.checkpoint.key`) instead of v2 keys
(`streamer.checkpoint.key.v2`). No public API change.
### Risk Level
medium: touches the checkpoint write path in every streamer source. Added
`TestStreamerSourceCheckpointVersion`, which exercises each source class across
{v6, v8} write table versions and {V1, V2} input checkpoints and asserts the
returned checkpoint class is determined solely by the write table version.
Added a `testCreateCheckpoint` parameterized test for
`CheckpointUtils.createCheckpoint` in `TestCheckpointUtils`.
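
The shape of the {v6, v8} × {V1, V2} matrix can be sketched in plain Java as
below. This is an assumption-laden illustration rather than the actual
`TestStreamerSourceCheckpointVersion` code: the checkpoint types, the
`fetchNext` stand-in for a source, and the `>= 8` threshold are all
hypothetical, and the real test presumably uses the project's test framework
and the real source classes.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-ins only; these are NOT the real Hudi classes or tests.
class CheckpointMatrixSketch {

  interface Checkpoint {
    String getCheckpointKey();
  }

  static final class CheckpointV1 implements Checkpoint {
    private final String key;

    CheckpointV1(String key) {
      this.key = key;
    }

    public String getCheckpointKey() {
      return key;
    }
  }

  static final class CheckpointV2 implements Checkpoint {
    private final String key;

    CheckpointV2(String key) {
      this.key = key;
    }

    public String getCheckpointKey() {
      return key;
    }
  }

  // Assumption: versions >= 8 use V2, older versions use V1.
  static Checkpoint createCheckpoint(int writeTableVersion, String key) {
    if (writeTableVersion >= 8) {
      return new CheckpointV2(key);
    }
    return new CheckpointV1(key);
  }

  // Hypothetical source: resumes from the last checkpoint (of either class)
  // and emits the next checkpoint through the factory.
  static Checkpoint fetchNext(int writeTableVersion, Checkpoint lastCheckpoint) {
    String nextKey = lastCheckpoint.getCheckpointKey() + "+1";
    return createCheckpoint(writeTableVersion, nextKey);
  }

  // Runs every (write version, input checkpoint class) combination and
  // returns the number of combinations checked.
  static int checkMatrix() {
    List<Checkpoint> inputs =
        Arrays.asList(new CheckpointV1("start"), new CheckpointV2("start"));
    int checked = 0;
    for (int version : new int[] {6, 8}) {
      Class<?> expected = version >= 8 ? CheckpointV2.class : CheckpointV1.class;
      for (Checkpoint input : inputs) {
        Checkpoint output = fetchNext(version, input);
        // The input checkpoint class must not influence the output class.
        if (!expected.equals(output.getClass())) {
          throw new AssertionError("version " + version + " with input "
              + input.getClass().getSimpleName() + " produced "
              + output.getClass().getSimpleName());
        }
        checked++;
      }
    }
    return checked;
  }

  public static void main(String[] args) {
    System.out.println("combinations checked: " + checkMatrix());
  }
}
```

Crossing both input checkpoint classes with both write versions is what
distinguishes "determined solely by the write table version" from "happens to
match the input checkpoint".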
### Documentation Update
none
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable