DanielCarter-stack commented on PR #10372:
URL: https://github.com/apache/seatunnel/pull/10372#issuecomment-3773116547

   <!-- code-pr-reviewer -->
   <!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10372", "part": 1, 
"total": 1} -->
   ### Issue 1: Inconsistent URL option references in JDBC Connector (BLOCKER)
   
   **Location**: 
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mysql/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mysql/source/MySqlIncrementalSource.java:88`
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-postgres/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/postgres/source/PostgresIncrementalSource.java:85`
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-sqlserver/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/sqlserver/source/SqlServerIncrementalSource.java:76`
   
   **Context Code**:
   ```java
   // MySqlIncrementalSource.java line 88
   JdbcUrlUtil.UrlInfo urlInfo = 
JdbcUrlUtil.getUrlInfo(config.get(JdbcCommonOptions.URL));
   ```
   
   **But defined in Factory**:
   ```java
   // MySqlIncrementalSourceFactory.java
   .required(
       MySqlIncrementalSourceOptions.USERNAME,
       MySqlIncrementalSourceOptions.PASSWORD,
       MySqlIncrementalSourceOptions.URL)  // 使用新的URL
   ```
   
   **Issue Description**:
   The PR adds a new `URL` option (withFallbackKeys "base-url") in 
`SourceOptions`, and references `XxxIncrementalSourceOptions.URL` (inherited 
from `SourceOptions.URL`) in the `optionRule()` of each Connector's Factory. 
However, in the actual `IncrementalSource` implementation, it still uses 
`JdbcCommonOptions.URL` to read the configuration.
   
   Although `JdbcCommonOptions.URL` and `SourceOptions.URL` have the same 
definition (both use key "url" withFallbackKeys "base-url"), this inconsistency 
will lead to:
   1. Code confusion - developers are unclear which URL option should be used
   2. If `JdbcCommonOptions.URL` and `SourceOptions.URL` definitions become out 
of sync in the future, configuration read errors will occur
   3. Violates the "Single Source of Truth" principle
   
   **Potential Risks**:
   - If one URL definition is modified in the future without synchronizing the 
other, it will result in configuration validation passing but runtime reading 
failing
   - Difficult code maintenance, confusing for newcomers
   
   **Impact Scope**:
   - Direct impact: MySQL CDC, PostgreSQL CDC, SQL Server CDC
   - Indirect impact: All users relying on these Connectors
   - Affected area: Multiple Connectors
   
   **Severity**: BLOCKER
   
   **Improvement Suggestions**:
   Option A: Uniformly use `SourceOptions.URL`
   ```java
   // MySqlIncrementalSource.java line 88
   JdbcUrlUtil.UrlInfo urlInfo = 
JdbcUrlUtil.getUrlInfo(config.get(SourceOptions.URL));
   ```
   
   Option B: Re-export URL in each `XxxIncrementalSourceOptions`
   ```java
   // 在MySqlIncrementalSourceOptions中添加
   public static final Option<String> URL = SourceOptions.URL;
   
   // 然后使用
   JdbcUrlUtil.UrlInfo urlInfo = 
JdbcUrlUtil.getUrlInfo(config.get(MySqlIncrementalSourceOptions.URL));
   ```
   
   **Rationale**: Ensure consistency between Option definition and usage, avoid 
future maintenance issues.
   
   ---
   
   ### Issue 2: Oracle IncrementalSource uses inconsistent URL reference (MAJOR)
   
   **Location**:
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-oracle/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/oracle/source/OracleIncrementalSource.java:92`
   
   **Context Code**:
   ```java
   // OracleIncrementalSource.java line 92
   configFactory.originUrl(config.get(JdbcCommonOptions.URL));
   ```
   
   **But defined in Factory**:
   ```java
   // OracleIncrementalSourceFactory.java
   .optional(OracleIncrementalSourceOptions.URL)  // 使用继承的URL
   ```
   
   **Issue Description**:
   Similar to other JDBC Connectors, Oracle references 
`OracleIncrementalSourceOptions.URL` in Factory but uses 
`JdbcCommonOptions.URL` in Source implementation. However, Oracle's special 
characteristic is that unlike other Connectors, it doesn't need to parse the 
URL, but directly passes originUrl.
   
   **Potential Risks**:
   - Same risks as Issue 1
   - Oracle's URL handling approach is inconsistent with other Connectors, 
increasing maintenance complexity
   
   **Impact Scope**:
   - Direct impact: Oracle CDC Connector
   - Indirect impact: Oracle CDC users
   - Affected area: Single Connector
   
   **Severity**: MAJOR
   
   **Improvement Suggestions**:
   ```java
   // OracleIncrementalSource.java line 92
   configFactory.originUrl(config.get(OracleIncrementalSourceOptions.URL));
   ```
   
   **Rationale**: Same as Issue 1, consistency needs to be maintained.
   
   ---
   
   ### Issue 3: Incomplete MongoDB constant migration (MINOR)
   
   **Location**:
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mongodb/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mongodb/config/MongodbSourceOptions.java`
 (deleted)
   
   **Issue Description**:
   The PR creates `MongodbSourceConstants` class and moves constants into it, 
but needs confirmation:
   1. Were all constants migrated?
   2. Is there other code directly referencing the old 
`MongodbSourceOptions.常量`?
   
   From the diff, references in the utils class have been updated, but need to 
confirm nothing is missing.
   
   **Potential Risks**:
   - If there are missing references, compilation errors will occur
   - Constant visibility may change
   
   **Impact Scope**:
   - Direct impact: MongoDB CDC Connector
   - Affected area: Single Connector
   
   **Severity**: MINOR
   
   **Improvement Suggestions**:
   Need to add checks in CI to ensure no missing references.
   
   ---
   
   ### Issue 4: Postgres/Opengauss Factory using FQCN indicates architectural 
issues (MINOR)
   
   **Location**:
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-postgres/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/postgres/source/PostgresIncrementalSourceFactory.java:50,84,109`
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-opengauss/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/opengauss/OpengaussIncrementalSourceFactory.java:86,111`
   
   **Context Code**:
   ```java
   // PostgresIncrementalSourceFactory.java line 50
   return org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
           .PostgresIncrementalSource.IDENTIFIER;
   
   // line 84
   return org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
           .PostgresIncrementalSource.class;
   
   // line 109
   new org.apache.seatunnel.connectors.seatunnel.cdc.postgres.source
           .PostgresIncrementalSource<>(context.getOptions(), catalogTables);
   ```
   
   **Issue Description**:
   Using fully qualified class names (FQCN) instead of simple imports usually 
indicates:
   1. Circular dependencies exist
   2. Package structure is unreasonable
   3. Class name conflicts
   
   This reduces code readability.
   
   **Potential Risks**:
   - Decreased code readability
   - May exist deeper design issues
   
   **Impact Scope**:
   - Direct impact: PostgreSQL CDC, OpenGauss CDC
   - Affected area: Multiple Connectors
   
   **Severity**: MINOR
   
   **Improvement Suggestions**:
   Refactor package structure to eliminate circular dependencies, allow using 
normal imports.
   
   ---
   
   ### Issue 5: Missing test validation (MAJOR)
   
   **Location**: Entire PR
   
   **Issue Description**:
   The PR performs large-scale refactoring, but lacks tests:
   1. No validation that the new Options class inheritance relationship is 
correct
   2. No validation that URL configuration actually reads normally
   3. No validation of configuration backward compatibility
   4. Only modified `ConnectorOptionCheckTest` to remove whitelist, didn't add 
new validation
   
   **Potential Risks**:
   - Configuration read failures may occur at runtime
   - May break existing user configurations
   - Difficult to discover regression issues
   
   **Impact Scope**:
   - Direct impact: All CDC Connectors
   - Indirect impact: All CDC users
   - Affected area: Multiple Connectors
   
   **Severity**: MAJOR
   
   **Improvement Suggestions**:
   Add unit tests:
   ```java
   @Test
   public void testMySqlIncrementalSourceOptionsInheritance() {
       // 验证继承关系
       
assertTrue(JdbcSourceOptions.class.isAssignableFrom(MySqlIncrementalSourceOptions.class));
       
assertTrue(SourceOptions.class.isAssignableFrom(MySqlIncrementalSourceOptions.class));
       
       // 验证URL选项可用
       ReadonlyConfig config = ReadonlyConfig.fromMap(Map.of("url", 
"jdbc:mysql://localhost:3306/test"));
       assertEquals("jdbc:mysql://localhost:3306/test", 
config.get(MySqlIncrementalSourceOptions.URL));
   }
   
   @Test
   public void testMongodbIncrementalSourceOptionsInheritance() {
       // 验证继承关系
       
assertTrue(SourceOptions.class.isAssignableFrom(MongodbIncrementalSourceOptions.class));
       
assertFalse(JdbcSourceOptions.class.isAssignableFrom(MongodbIncrementalSourceOptions.class));
   }
   ```
   
   **Rationale**: Ensure correctness and stability of refactoring.
   
   ---
   
   ### Issue 6: Inaccurate PR description (MINOR)
   
   **Location**: PR description
   
   **Issue Description**:
   The PR description states "Does this PR introduce any user-facing change? 
no", but actually:
   1. Multiple public classes were deleted (`MySqlSourceOptions`, etc.)
   2. MongoDB constants were moved to a new class
   3. These are API changes and will affect external code
   
   **Potential Risks**:
   - Users may encounter compilation errors after upgrade
   - Inaccurate documentation
   
   **Impact Scope**:
   - Impact: All code directly using CDC API
   - Affected area: API users
   
   **Severity**: MINOR
   
   **Improvement Suggestions**:
   Update PR description to:
   ```
   Does this PR introduce any user-facing change?
   Yes. This is a breaking change for code that directly references:
   - MySqlSourceOptions (renamed to MySqlIncrementalSourceOptions)
   - OracleSourceOptions (renamed to OracleIncrementalSourceOptions)
   - PostgresOptions (renamed to PostgresIncrementalSourceOptions)
   - SqlServerSourceOptions (renamed to SqlServerIncrementalSourceOptions)
   - MongodbSourceOptions (renamed to MongodbIncrementalSourceOptions)
   - MongodbSourceOptions constants (moved to MongodbSourceConstants)
   
   User configurations are not affected.
   ```
   
   And document this change in `incompatible-changes.md`.
   
   ---
   
   ### Issue 7: MongoDB splitMetaGroupSize configuration option deleted (MAJOR)
   
   **Location**:
   - 
`seatunnel-connectors-v2/connector-cdc/connector-cdc-mongodb/src/main/java/org/apache/seatunnel/connectors/seatunnel/cdc/mongodb/MongodbIncrementalSource.java:89-91`
   
   **Code Before Change**:
   ```java
   
Optional.ofNullable(config.get(MongodbSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
           .ifPresent(builder::heartbeatIntervalMillis);
   
Optional.ofNullable(config.get(MongodbSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
           .ifPresent(builder::splitMetaGroupSize);
   ```
   
   **Code After Change**:
   ```java
   
Optional.ofNullable(config.get(MongodbIncrementalSourceOptions.HEARTBEAT_INTERVAL_MILLIS))
           .ifPresent(builder::heartbeatIntervalMillis);
   // splitMetaGroupSize配置行被删除
   ```
   
   **Issue Description**:
   The `splitMetaGroupSize` configuration option was removed, but from the diff:
   1. `MongodbSourceConfigProvider.Builder` still has `splitMetaGroupSize` 
method
   2. Deleting this line means users can no longer set this parameter through 
configuration
   3. Need to confirm if this change was an intentional feature removal
   
   **Potential Risks**:
   - If `splitMetaGroupSize` has a default value and users could override it 
before, they cannot override it now
   - May affect performance tuning
   
   **Impact Scope**:
   - Direct impact: MongoDB CDC Connector
   - Indirect impact: Users using this parameter for performance tuning
   - Affected area: Single Connector
   
   **Severity**: MAJOR
   
   **Improvement Suggestions**:
   1. If intentionally deleted, need to explain in PR description and changelog
   2. If accidentally deleted, need to restore:
   ```java
   
Optional.ofNullable(config.get(MongodbIncrementalSourceOptions.SPLIT_META_GROUP_SIZE))
           .ifPresent(builder::splitMetaGroupSize);
   ```
   
   And define this Option in `MongodbIncrementalSourceOptions`.
   
   **Rationale**: Need to clarify if this is a breaking change or a bug.
   
   ---
   
   ## Part 4: Overall Assessment
   
   ### Can this be merged
   
   **Conclusion**: ❌ **Not recommended to merge in current state**
   
   **Rationale**:
   1. **BLOCKER issues**: Issue 1 (URL option inconsistency) must be fixed 
before merging, otherwise will leave serious technical debt
   2. **MAJOR issues**: Issues 2, 5, 7 need to be resolved or explicitly 
explained before merging
   3. **Insufficient testing**: Lacks validation tests for the new architecture
   
   ### Improvement Priority
   
   **P0 (Must Fix)**:
   - Issue 1: Unify URL option references
   - Issue 7: Confirm or restore splitMetaGroupSize configuration
   
   **P1 (Strongly Recommended to Fix)**:
   - Issue 2: Oracle's URL reference consistency
   - Issue 5: Add test validation
   - Issue 6: Update PR description and documentation
   
   **P2 (Recommended to Fix)**:
   - Issue 3: Verify MongoDB constant migration completeness
   - Issue 4: Refactor to eliminate FQCN usage
   
   ### Architecture Recommendations
   
   In the long term, recommend:
   1. Unify URL handling approach for all JDBC CDC Connectors
   2. Consider migrating `JdbcCommonOptions.URL` to `JdbcSourceOptions.URL` and 
deprecating the former
   3. Re-examine PostgreSQL/Opengauss package structure to eliminate circular 
dependencies
   
   ### Summary
   
   This is a PR with **correct direction but incomplete execution**. The idea 
of establishing a CDC Options inheritance system is good, but there are 
critical inconsistency issues during implementation, especially the URL option 
dual reference problem. Recommend the author fix the above issues before 
merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to