laughingman7743 commented on PR #156: URL: https://github.com/apache/flink-connector-jdbc/pull/156#issuecomment-4775237639
### CI failure is a pre-existing flaky test on `main`, not caused by this PR

The red job (https://github.com/apache/flink-connector-jdbc/actions/runs/27995114250/job/82864660173) fails in `DerbyDynamicTableSourceITCase.testLimit`:

```
Caused by: java.sql.SQLNonTransientConnectionException: No current connection. (ERROR 08003)
    at ...EmbedPreparedStatement.executeQuery(...)
```

**This is not introduced by this PR:**

- This branch is rebased on `af994651`. CI for that exact commit on `main` (https://github.com/apache/flink-connector-jdbc/actions/runs/27738188350) fails in the **same `DerbyDynamicTableSourceITCase` class with the same `ERROR 08003: No current connection`**, only a different method (`testProject`).
- The latest `main` (https://github.com/apache/flink-connector-jdbc/actions/runs/27966883251) also fails CI on another flaky core ITCase (`JdbcSourceStreamRelatedITCase.testAtLeastOnceWithFailure`, 30s `TimeoutException`).
- None of the failing test files are touched by this PR; changes are isolated to the new `flink-connector-jdbc-spanner` module plus catalog/dialect/lineage in core.

### Root cause of the flaky `testLimit`

`testLimit` runs `SELECT * FROM t LIMIT 1` over a source split into **2 partitions** (`scan.partition.num=2`). With `LIMIT 1` the job completes after the first row and cancels the source while the split-fetcher thread is still opening the **second** split. Cancellation tears the JDBC connection down, so the in-flight `JdbcSourceSplitReader.openResultSetForSplitWhenAtLeastOnce()` -> `statement.executeQuery()` runs against an already-closed connection -> `08003: No current connection`.
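
To make the interleaving above concrete, here is a minimal single-threaded replay of it. This is only a model under stated assumptions: `FakeConnection` and `RaceSketch` are hypothetical names, not Derby or connector classes, and the real failure happens across two threads rather than in one sequential method.

```java
import java.sql.SQLException;
import java.sql.SQLNonTransientConnectionException;

/** Hypothetical stand-in for a JDBC connection; not Derby or Flink code. */
final class FakeConnection {
    private volatile boolean closed;

    void close() {
        closed = true;
    }

    /** Mimics the reported failure: querying after close raises SQLState 08003. */
    String executeQuery(String split) throws SQLException {
        if (closed) {
            throw new SQLNonTransientConnectionException("No current connection.", "08003");
        }
        return "row-from-" + split;
    }
}

final class RaceSketch {
    /**
     * Replays the order of events: read split 0, cancel (close the connection),
     * then open split 1 anyway. Returns the SQLState of the failure, or null if none.
     */
    static String replay() {
        FakeConnection connection = new FakeConnection();
        String[] splits = {"split-0", "split-1"};
        try {
            connection.executeQuery(splits[0]); // first row arrives, LIMIT 1 is satisfied
            connection.close();                 // cancellation tears the connection down
            connection.executeQuery(splits[1]); // in-flight open of the second split
            return null;
        } catch (SQLException e) {
            return e.getSQLState();
        }
    }

    public static void main(String[] args) {
        System.out.println("SQLState after cancellation: " + replay());
    }
}
```

In the real job the second `executeQuery()` only sometimes lands after the close, which is why the test is flaky rather than consistently red.
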

Two gaps in `JdbcSourceSplitReader` make this fatal and racy:

- `wakeUp()` is a no-op (https://github.com/apache/flink-connector-jdbc/blob/main/flink-connector-jdbc-core/src/main/java/org/apache/flink/connector/jdbc/core/datastream/source/reader/JdbcSourceSplitReader.java), so there is no cooperative cancellation: the reader keeps opening the next split during shutdown.
- `fetch()` rethrows any `SQLException` as a fatal `RuntimeException`, so a benign shutdown-time "connection closed" error fails the whole job.

### How the fix differs from the closed PR https://github.com/apache/flink-connector-jdbc/pull/191

That PR guarded only `resultSet.isClosed()` before `extract()` / `resultSet.next()` inside the record loop. But per the stack trace the failure is at `executeQuery()` when **opening the next split**, which that PR does not touch, so it would not reliably fix this. It was closed without merge.

The fix I'd propose is different and targets the actual cause, cooperative cancellation:

1. `wakeUp()` / `close()` set a `volatile boolean wakeup` flag.
2. `checkSplitOrStartNext()` checks the flag before opening a new split (the `executeQuery()` path) and returns "no more records" instead of starting a query during shutdown.
3. In `fetch()`, if a `SQLException` occurs while the reader is shutting down (flag set / thread interrupted), treat it as a graceful end-of-split instead of rethrowing. This closes the residual race window where cancellation lands during the Derby call.

This covers the `executeQuery()` failure point that #191 missed, and removes the "shutdown error -> job failure" behavior.

I'm happy to open a separate, focused PR + JIRA for this so both this PR and `main` go green.
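
As a rough illustration of the three steps proposed above, the following self-contained sketch models a reader with a `volatile` shutdown flag. It is not the connector's code: `SplitQuery` and `CancellableReaderSketch` are hypothetical names, the flag check is folded directly into `fetch()` rather than living in `checkSplitOrStartNext()`, and an empty list stands in for whatever "no more records" result the real reader would return.

```java
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Hypothetical hook standing in for statement.executeQuery(); not the connector's API. */
interface SplitQuery {
    List<String> run(String split) throws SQLException;
}

/** Simplified model of a split reader with cooperative cancellation. */
final class CancellableReaderSketch {
    private final Deque<String> pendingSplits = new ArrayDeque<>();
    private final SplitQuery query;
    private volatile boolean wakeup; // step 1: set by wakeUp() and close()

    CancellableReaderSketch(List<String> splits, SplitQuery query) {
        this.pendingSplits.addAll(splits);
        this.query = query;
    }

    void wakeUp() {
        wakeup = true;
    }

    void close() {
        wakeup = true;
    }

    /** Returns the next batch of records; an empty list means "no more records". */
    List<String> fetch() {
        // Step 2: never start a new query once shutdown has been requested.
        if (wakeup || pendingSplits.isEmpty()) {
            return Collections.emptyList();
        }
        String split = pendingSplits.poll();
        try {
            return query.run(split);
        } catch (SQLException e) {
            // Step 3: a failure that lands during shutdown ends the split quietly.
            if (wakeup || Thread.currentThread().isInterrupted()) {
                return Collections.emptyList();
            }
            // Outside shutdown the error is still fatal, as before.
            throw new RuntimeException("Split " + split + " failed", e);
        }
    }

    public static void main(String[] args) {
        CancellableReaderSketch reader = new CancellableReaderSketch(
                Arrays.asList("split-0", "split-1"),
                split -> Collections.singletonList("row-from-" + split));
        System.out.println(reader.fetch()); // rows of split-0
        reader.wakeUp();                    // the job cancels the source
        System.out.println(reader.fetch()); // empty: split-1 is never queried
    }
}
```

The flag check alone narrows the window but cannot close it, because cancellation can still arrive after the check and before the query returns; that is why the sketch also downgrades a `SQLException` seen while the flag is set.
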
