Hello! Maybe you could help with debezium/vervica connectors in optimizing memory consumption for large changes in a single transaction using vervica flink-oracle-cdc-connector.

Environment:
flink - 1.15.1
oracle-cdc - 2.3
oracle - 19.3


When updating large number of rows in a single transaction I get the exception:
"ERROR io.debezium.connector.oracle.logminer.LogMinerHelper         [] - Mining session stopped due to the java.lang.OutOfMemoryError: Java heap space".

For the table with 20 fields of types (int, float, timestamp, date, string)*4 I could get the results listed in table:

For TaskManager heap space=540Mi, I could get 250k rows through, but failed on 300k.
For TaskManager heap space=960Mi, I could get 350k rows through, but failed on 400k.

Using SourceFunction created from OracleSource.builder(), startup-mode set to latest-offset,
and self-defined deserializer, where I added log.info to see if the problem starts after debezium/cdc-connector did it work, but Java Heap Space error occurs before the flow gets to the deserialization.

I've tried to provide the next debezium properties, but got no luck:

dbzProps.setProperty("log.mining.batch.size.max", "10000");
dbzProps.setProperty("log.mining.batch.size.default", "2000");
dbzProps.setProperty("log.mining.batch.size.min", "100");
dbzProps.setProperty("log.mining.view.fetch.size", "1000");
dbzProps.setProperty("max.batch.size", "64");
dbzProps.setProperty("max.queue.size", "256");


After some digging, I could find places in code where debezium consume changes from oracle:
Loop to fetch records:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerStreamingChangeEventSource.java#L154

Inserts/Updates/Deletes records are registered in transaction buffer:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerQueryResultProcessor.java#L267

Commit records cause all events in transaction buffer to be commited - sent forward to dispatcher, ending in batch handler:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerQueryResultProcessor.java#L144

As far as I see, the cause of the problem is that debezium stores all changes locally before the commit of the transaction occurs.

Is there a way to make the connector split a large amount of changes processing for a single transaction? Or maybe any other way to get rid of the large memory consumption problem?

 

Reply via email to