github-actions[bot] commented on code in PR #61389:
URL: https://github.com/apache/doris/pull/61389#discussion_r2944101960
##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcSourceOffsetProvider.java:
##########
@@ -116,7 +116,11 @@ public Offset getNextOffset(StreamingJobProperties
jobProps, Map<String, String>
nextOffset.setSplits(snapshotSplits);
return nextOffset;
} else if (currentOffset != null && currentOffset.snapshotSplit()) {
- // snapshot to binlog
+ if (isSnapshotOnlyMode()) {
+ // snapshot-only mode: all splits done, signal job to stop
+ return null;
Review Comment:
**[Medium — Fragile Contract]** `getNextOffset()` returning `null` is a new
behavior not documented in the `SourceOffsetProvider` interface contract. Both
callers (`StreamingInsertTask.before()` line 92 and
`StreamingMultiTblTask.before()` line 113) immediately call `.toString()` on
the result, which would NPE if reached.
In practice, this is currently safe because `onStreamTaskSuccess()` checks
`hasReachedEnd()` before creating a new task (and thus before `before()` is
called). However, this safety depends on callers always checking
`hasReachedEnd()` first — a fragile implicit contract.
Suggestion: Either (a) throw an explicit `IllegalStateException("No more
offsets in snapshot-only mode")` instead of returning null, so any future
caller that forgets the `hasReachedEnd()` check gets a clear error instead of
an NPE, or (b) add a `@Nullable` annotation and document the contract in the
interface.
##########
fs_brokers/cdc_client/src/main/java/org/apache/doris/cdcclient/source/reader/mysql/MySqlSourceReader.java:
##########
@@ -789,8 +789,9 @@ private MySqlSourceConfig generateMySqlConfig(Map<String,
String> cdcConfig, Str
// setting startMode
String startupMode = cdcConfig.get(DataSourceConfigKeys.OFFSET);
if (DataSourceConfigKeys.OFFSET_INITIAL.equalsIgnoreCase(startupMode))
{
Review Comment:
**[Info — Behavioral Change on Existing Mode]** Uncommenting
`configFactory.startupOptions(StartupOptions.initial())` for `OFFSET_INITIAL`
changes the behavior of the existing `initial` mode. Previously the factory
default startup option was used (which is likely `INITIAL` anyway in Flink
CDC). Now it's set explicitly.
This is needed for correctness — `getSourceSplits()` at line 163 reads
`sourceConfig.getStartupOptions().startupMode` to decide between snapshot
splits and binlog split, and with the new `SNAPSHOT` mode branch, `INITIAL`
needs to be explicit too. Just confirming this is intentional and has been
tested with the existing `initial` mode regression tests.
##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/DataSourceConfigValidator.java:
##########
@@ -82,8 +82,9 @@ private static boolean isValidValue(String key, String value)
{
}
if (key.equals(DataSourceConfigKeys.OFFSET)
- && !(value.equals(DataSourceConfigKeys.OFFSET_INITIAL)
- || value.equals(DataSourceConfigKeys.OFFSET_LATEST))) {
+ &&
!(value.equalsIgnoreCase(DataSourceConfigKeys.OFFSET_INITIAL)
Review Comment:
**[Low — Behavioral Side-Effect]** Changing from `equals()` to
`equalsIgnoreCase()` for `OFFSET_INITIAL` and `OFFSET_LATEST` is a behavioral
change for existing offset values. Previously `"INITIAL"`, `"Initial"`,
`"LATEST"` etc. would fail validation; now they pass. This is probably fine
(and arguably a UX improvement), but it's worth noting as a scope expansion
beyond the snapshot feature.
Also note: `OFFSET_EARLIEST` is still missing from this validator. This
appears to be a pre-existing omission (the old code also excluded it), but
since this PR touches this exact block, it would be a good time to add it if
`earliest` is a valid option for the multi-table streaming job path.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]