featzhang commented on PR #28083:
URL: https://github.com/apache/flink/pull/28083#issuecomment-4468971018
## Checkpoint compatibility
The state format has changed completely: `SequenceGenerator`'s per-generator
`ListState<Long>` is replaced by `NumberSequenceSource`'s internal state, so
existing checkpoints that contain sequence fields can't be restored.
If DataGen is used in production, this needs a migration path. If it is
test-only (likely), document it:
```java
/**
 * DataGen is for testing/dev only. State compatibility not guaranteed.
 * Do not use in production with checkpoint recovery.
 */
```
Add to release notes as breaking change.
## Missing tests
### Checkpoint restore
```java
@Test
void testCheckpointRestore() {
    // Run sequence [0, 100] up to value 50, take a checkpoint, restore,
    // and verify that emission continues from 51
}
```
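To pin down what "continues from 51" means before wiring it into a real Flink test harness, here is a minimal plain-Java model of the contract. `ResumableSequence` and its methods are hypothetical names for illustration only; this is not Flink API and not how `NumberSequenceSource` stores its state.

```java
import java.util.NoSuchElementException;

/** Toy model of a bounded sequence whose position can be snapshotted and restored. */
final class ResumableSequence {
    private final long end; // inclusive upper bound
    private long next;      // next value to emit

    ResumableSequence(long start, long end) {
        // Assumption for this sketch: end < Long.MAX_VALUE, so next++ cannot overflow.
        if (end == Long.MAX_VALUE) {
            throw new IllegalArgumentException("end must be < Long.MAX_VALUE in this sketch");
        }
        this.next = start;
        this.end = end;
    }

    boolean hasNext() {
        return next <= end;
    }

    long next() {
        if (!hasNext()) {
            throw new NoSuchElementException("sequence exhausted");
        }
        return next++;
    }

    /** The "checkpoint": only the next value to emit needs to be stored. */
    long snapshot() {
        return next;
    }

    /** Rebuilds the sequence from a snapshot taken with snapshot(). */
    static ResumableSequence restore(long snapshot, long end) {
        return new ResumableSequence(snapshot, end);
    }

    public static void main(String[] args) {
        ResumableSequence seq = new ResumableSequence(0, 100);
        for (int i = 0; i <= 50; i++) {
            seq.next(); // emits 0..50
        }
        long checkpoint = seq.snapshot();
        ResumableSequence restored = ResumableSequence.restore(checkpoint, 100);
        System.out.println(restored.next()); // prints 51
    }
}
```

The real test would additionally need to assert that no value is emitted twice and none is skipped across the restore.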
### Edge cases
```java
@Test
void testNumberOfRowsZero() {
    // 'number-of-rows' = '0', expect immediate finish
}

@Test
void testNumberOfRowsAtLongMax() {
    // 'number-of-rows' = '9223372036854775807' (Long.MAX_VALUE), verify no overflow
}

@Test
void testSequenceStartEqualsEnd() {
    // start = end = 42, expect single row with value 42
}

@Test
void testMultipleSequenceFieldsDifferentRanges() {
    // field1: [0, 1000000], field2: [0, 10]
    // number-of-rows unset → expect the source to stop after 11 rows
    // (the shortest sequence wins)
}

@Test
void testSequenceTinyintOverflow() {
    // TINYINT sequence [0, 10], number-of-rows = 300
    // expect the source to finish after 11 rows (0..10); it must NOT keep
    // emitting past the end or wrap around to -128
}

@Test
void testParallelismGreaterThanOne() {
    // parallelism = 4, number-of-rows = 100
    // verify each subtask gets ~25, total = 100, no duplicates
}

@Test
void testRateLimiterAccuracy() {
    // rows-per-second = 100, run for 1s, expect 95-105 rows (5% tolerance)
}
```
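The expected values in these edge cases follow from two small rules: the source stops at the smallest of `number-of-rows` and every sequence length, and that total is divided across subtasks without overlap. A rough plain-Java sketch of the arithmetic the tests could assert against is below. `DataGenRowBounds` and its methods are hypothetical helpers, and the even split is an assumption; `NumberSequenceSource` may distribute its splits differently.

```java
import java.util.Arrays;

/** Hypothetical helpers that compute the expected row counts for the edge-case tests. */
final class DataGenRowBounds {

    /** Number of values in the inclusive range [start, end], saturating at Long.MAX_VALUE. */
    static long sequenceLength(long start, long end) {
        if (end < start) {
            return 0L;
        }
        long diff = end - start; // wraps negative if the true difference exceeds Long.MAX_VALUE
        if (diff < 0 || diff == Long.MAX_VALUE) {
            return Long.MAX_VALUE; // saturate instead of overflowing on diff + 1
        }
        return diff + 1;
    }

    /** Expected rows: the minimum of number-of-rows (null = unset) and every sequence length. */
    static long effectiveRowCount(Long numberOfRows, long[][] sequenceRanges) {
        long bound = numberOfRows == null ? Long.MAX_VALUE : numberOfRows;
        for (long[] range : sequenceRanges) {
            bound = Math.min(bound, sequenceLength(range[0], range[1]));
        }
        return bound;
    }

    /** Splits `total` rows into per-subtask counts that differ by at most one. */
    static long[] rowsPerSubtask(long total, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        long[] counts = new long[parallelism];
        long base = total / parallelism;
        long remainder = total % parallelism;
        for (int i = 0; i < parallelism; i++) {
            counts[i] = base + (i < remainder ? 1 : 0);
        }
        return counts;
    }

    public static void main(String[] args) {
        // field1: [0, 1000000], field2: [0, 10], number-of-rows unset
        System.out.println(effectiveRowCount(null, new long[][] {{0, 1_000_000}, {0, 10}})); // 11
        // TINYINT sequence [0, 10], number-of-rows = 300
        System.out.println(effectiveRowCount(300L, new long[][] {{0, 10}})); // 11
        // parallelism = 4, number-of-rows = 100
        System.out.println(Arrays.toString(rowsPerSubtask(100, 4))); // [25, 25, 25, 25]
    }
}
```

Saturating at `Long.MAX_VALUE` keeps the `number-of-rows = 9223372036854775807` case from overflowing when it is compared against sequence lengths.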
### Nullability
```java
@Test
void testNullableFieldsWithRandomGenerator() {
    // Random generator on nullable column, verify nulls generated
}
```
Clarify:
1. Is DataGen ever used in production?
2. Does the FLIP-27 source API provide any automatic state migration?
3. Should a restore from old state be detected and fail fast with a clear error?