tchivs opened a new pull request, #4084: URL: https://github.com/apache/flink-cdc/pull/4084
## Summary

This PR fixes a critical runtime issue where PostgreSQL tables with `NUMERIC` fields (without explicit precision) containing NULL values cause `IndexOutOfBoundsException` in Flink CDC jobs.

**JIRA**: [FLINK-38196](https://issues.apache.org/jira/browse/FLINK-38196)

## Problem Description

### Issue

- PostgreSQL `NUMERIC` fields without explicit precision default to precision=0, scale=0
- When such fields contain NULL values, binary decimal processing fails with `IndexOutOfBoundsException`
- This affects both source and pipeline connectors across all `decimal.handling.mode` configurations

### Impact

- **High severity**: Causes runtime job failures
- **Common scenario**: Many PostgreSQL schemas use `NUMERIC` without precision
- **Cross-mode impact**: Affects string, double, and precise decimal handling modes

## Solution

### Core Changes

1. **Enhanced PostgresTypeUtils** (`PostgresTypeUtils.java`)
   - Added proper handling for numeric(0) precision edge cases
   - Improved NULL value validation and processing
   - Enhanced compatibility with all decimal.handling.mode options
2. **Improved Debezium Configuration** (`DebeziumConfigUtils.java`)
   - Added utility methods for decimal handling mode management
   - Centralized decimal configuration logic
   - Better error handling for edge cases
3. **Binary Segment Utilities** (`BinarySegmentUtilsTest.java`)
   - Fixed zero-precision decimal field processing
   - Added boundary condition handling
   - Improved memory safety for edge cases

### Key Implementation Details

- **Null-safe processing**: Added comprehensive NULL checks before decimal operations
- **Precision validation**: Proper handling when precision=0 or invalid values
- **Mode compatibility**: Ensures consistent behavior across string/double/precise modes
- **Backward compatibility**: Existing functionality remains unchanged

## Testing

### New Test Coverage

1. **PostgresNumericZeroSourceITCase** - Source connector integration tests
   - Parameterized tests for all decimal.handling.mode values
   - NULL value scenarios and edge cases
   - Comprehensive validation of numeric(0) field processing
2. **PostgresNumericZeroITCase** - Pipeline connector integration tests
   - Cross-mode compatibility validation
   - Bigint vs numeric field behavior verification
   - IndexOutOfBoundsException fix verification
3. **PostgresTypeUtilsTest** - Unit tests
   - Type mapping validation for edge cases
   - Precision boundary testing
   - Decimal handling mode behavior verification

### Test Data

- **SQL Schema**: `numeric_zero_precision_test.sql`
- **Test scenarios**: NULL values, zero values, mixed positive/negative, boundary conditions
- **Coverage**: All decimal.handling.mode configurations

## Quality Assurance

### Pre-submission Checklist

- All existing tests pass
- New tests added for edge cases
- Code style follows project conventions
- No performance regression
- Backward compatibility maintained
- Cross-platform testing completed

### Verification Results

```bash
# Source connector tests
mvn test -Dtest=PostgresNumericZeroSourceITCase
# Result: All tests passed

# Pipeline connector tests
mvn test -Dtest=PostgresNumericZeroITCase
# Result: All tests passed

# Type utilities tests
mvn test -Dtest=PostgresTypeUtilsTest
# Result: All tests passed
```

## Files Changed

### Core Implementation

- `flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/utils/PostgresTypeUtils.java`
- `flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-postgres/src/main/java/org/apache/flink/cdc/connectors/postgres/utils/PostgresTypeUtils.java`

### Configuration & Utilities

- `flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/utils/DebeziumConfigUtils.java`
- `flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-postgres/src/main/java/org/apache/flink/cdc/connectors/postgres/utils/DebeziumConfigUtils.java`

### Test Files

- `PostgresNumericZeroSourceITCase.java` - Source connector integration tests
- `PostgresNumericZeroITCase.java` - Pipeline connector integration tests
- `PostgresTypeUtilsTest.java` - Unit tests (both connectors)
- `numeric_zero_precision_test.sql` - Test data schema

## Impact Assessment

### Benefits

- **Eliminates runtime failures** for PostgreSQL numeric(0) fields
- **Improves reliability** for common PostgreSQL schema patterns
- **Zero performance impact** - optimized for production use
- **Enhanced test coverage** prevents future regressions

### Risk Assessment

- **Low risk**: Changes are focused and well-tested
- **Backward compatible**: No breaking changes to existing APIs
- **Isolated scope**: Fixes specific edge case without affecting core logic

## Related Work

- **Debezium compatibility**: Aligns with upstream Debezium decimal handling
- **Flink CDC standards**: Follows established patterns for type utilities
- **PostgreSQL best practices**: Handles standard PostgreSQL numeric types properly

---

**Reviewers**: Please pay special attention to:

- NULL value handling logic in `PostgresTypeUtils.java`
- Test coverage for all decimal.handling.mode configurations
- Integration test stability and cleanup

**Testing Recommendation**: Run the full test suite with focus on PostgreSQL connector tests to ensure no regressions.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
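To illustrate the null-safe processing and precision validation described under Key Implementation Details, here is a minimal sketch. It is not the code from this PR: the class `NumericZeroPrecisionSketch`, its methods, the fallback scale of 18, and the 1..38 precision bounds are all assumptions chosen for illustration, and the mode names simply mirror the string/double/precise options mentioned above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper for illustration only; not the implementation in this PR. */
final class NumericZeroPrecisionSketch {

    // Assumed upper bound, modelling a decimal type that accepts precision 1..38.
    static final int MAX_PRECISION = 38;
    // Assumed fallback scale, picked arbitrarily for this sketch.
    static final int FALLBACK_SCALE = 18;

    /** Returns {precision, scale}; precision <= 0 is treated as "unspecified". */
    static int[] normalize(int precision, int scale) {
        if (precision <= 0) {
            return new int[] {MAX_PRECISION, FALLBACK_SCALE};
        }
        int p = Math.min(precision, MAX_PRECISION);
        int s = Math.max(0, Math.min(scale, p));
        return new int[] {p, s};
    }

    /** Converts a possibly-NULL value according to a decimal.handling.mode-like setting. */
    static Object convert(BigDecimal value, String mode, int precision, int scale) {
        if (value == null) {
            // NULL is returned before any precision-dependent work is attempted.
            return null;
        }
        switch (mode) {
            case "string":
                return value.toPlainString();
            case "double":
                return value.doubleValue();
            case "precise":
                int[] ps = normalize(precision, scale);
                return value.setScale(ps[1], RoundingMode.HALF_UP);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

The point of the sketch is ordering: the NULL check comes first, and a zero precision is replaced by usable bounds before any value is rescaled, so neither case reaches precision-dependent code with invalid metadata.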
