yangxiao created FLINK-35701:
--------------------------------

             Summary: SqlServer: when the primary key type is uniqueidentifier, the scan.incremental.snapshot.chunk.size parameter does not take effect during chunk splitting
                 Key: FLINK-35701
                 URL: https://issues.apache.org/jira/browse/FLINK-35701
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
            Reporter: yangxiao
1. The source table in the SQL Server database contains 1,000,000 inventory data records, and scan.incremental.snapshot.chunk.size is left at its default value of 8096.
2. Only one chunk is produced, although the table should be split into roughly 124 chunks (1,000,000 / 8096 ≈ 124).

Problem reproduction:

1. Create a test table in SQL Server and import data:

BEGIN TRANSACTION
USE [testdb];
DROP TABLE [dbo].[testtable];
CREATE TABLE [dbo].[testtable] (
  [TestId] varchar(64),
  [CustomerId] varchar(64),
  [Id] uniqueidentifier NOT NULL,
  PRIMARY KEY CLUSTERED ([Id])
);
ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
COMMIT

declare @Id int;
set @Id = 1;
while @Id <= 1000000
begin
  insert into testtable values (NEWID(), NEWID(), NEWID());
  set @Id = @Id + 1;
end;

2. Use the Flink CDC SQL Server connector to collect the data:

CREATE TABLE testtable (
  TestId STRING,
  CustomerId STRING,
  Id STRING,
  PRIMARY KEY (Id) NOT ENFORCED
) WITH (
  'connector' = 'sqlserver-cdc',
  'hostname' = '',
  'port' = '1433',
  'username' = '',
  'password' = '',
  'database-name' = 'testdb',
  'table-name' = 'dbo.testtable'
);

3. Log output:

2024-06-26 10:04:43,377 | INFO | [SourceCoordinator-Source: testtable[1]] | Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size is 8096 | com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
2024-06-26 10:04:43,385 | INFO | [SourceCoordinator-Source: testtable[1]] | Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. | com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
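Illustrative note (not from the issue itself): one plausible explanation for a single chunk is that SQL Server orders uniqueidentifier values by byte groups from right to left (the last six bytes are most significant), while client-side comparisons such as Java's UUID.compareTo or plain string comparison read the value left to right. If a chunk splitter compares a chunk's high watermark against SELECT MAX(Id) using the client-side order, it can wrongly conclude that the first chunk already reaches the maximum and stop splitting. The sketch below is hypothetical code, not the actual SqlServerChunkSplitter implementation; the sqlServerCompare helper is a simplification that compares the five dash-separated groups of the textual form from right to left (the real engine compares stored bytes, whose within-group order also differs for the first three groups).

```java
import java.util.UUID;

/**
 * Hypothetical sketch: why a client-side comparison of uniqueidentifier
 * chunk boundaries can disagree with SQL Server's ORDER BY, so a splitter
 * that trusts the client-side order may stop after one chunk.
 */
public class GuidOrderMismatch {

    /**
     * Simplified stand-in for SQL Server's uniqueidentifier ordering:
     * compare the 5 dash-separated groups of the textual form from
     * right to left (the trailing 6-byte group is most significant).
     */
    static int sqlServerCompare(UUID x, UUID y) {
        String[] gx = x.toString().split("-");
        String[] gy = y.toString().split("-");
        for (int i = gx.length - 1; i >= 0; i--) {
            int c = gx[i].compareTo(gy[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // 'a' sorts BEFORE 'b' in SQL Server (the last group decides),
        // but AFTER 'b' in Java's left-to-right comparison (the first
        // group decides).
        UUID a = UUID.fromString("7fffffff-0000-0000-0000-000000000001");
        UUID b = UUID.fromString("00000000-0000-0000-0000-000000000002");

        // Java's view: a > b
        System.out.println("UUID.compareTo:    " + Integer.signum(a.compareTo(b)));
        // SQL Server's view: a < b
        System.out.println("SQL Server order:  " + Integer.signum(sqlServerCompare(a, b)));

        // Consequence for splitting: if SELECT MAX(Id) returns a value like
        // 'a' (per server order) while the first chunk's high watermark is a
        // value like 'b', a client-side chunkEnd >= max check succeeds
        // immediately and the table is emitted as a single chunk.
    }
}
```

Under this hypothesis, a fix would need the splitter to compare uniqueidentifier boundaries with the same collation the server uses for ORDER BY, or to push the boundary comparison into the query itself.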