hwasyui commented on issue #15046:
URL: https://github.com/apache/iceberg/issues/15046#issuecomment-4153773241
Hello, I am facing the same issue. I'm using the Debezium connector as the source.
Everything else works, but I still get duplicate rows even with upsert mode
enabled on the connector and upsert writes enabled on the Iceberg table. My setup
is below; I hope this can be solved:
{
  "name": "PI_TEST6",
  "config": {
    "connector.class": "org.apache.iceberg.connect.IcebergSinkConnector",
    "tasks.max": "1",
    "topics": "-",
    "iceberg.catalog.type": "hadoop",
    "iceberg.catalog.warehouse": "/data/iceberg/warehouse",
    "iceberg.tables": "db_testing.testing2",
    "iceberg.tables.upsert-mode-enabled": "true",
    "iceberg.tables.default-id-columns": "nomor",
    "iceberg.tables.cdc-field": "op",
    "iceberg.tables.cdc.ops.insert": "c",
    "iceberg.tables.cdc.ops.update": "u",
    "iceberg.tables.cdc.ops.delete": "d",
    "iceberg.tables.commit.interval-ms": "5000",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "transforms.unwrap.add.fields": "op",
    "transforms.unwrap.drop.tombstones": "true",
    "transforms.unwrap.delete.handling.mode": "rewrite",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "true",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "iceberg.control.topic": "iceberg-control-testing",
    "iceberg.control.group.id": "iceberg-control-group-testing"
  }
}
A Debezium connector message in the topic usually looks like this:
{
  "payload": {
    "before": {...},
    "after": {...},
    "op": "c/u/d",
    "ts_ms": 123456
  }
}
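
To make it easier to see what the sink receives after the `unwrap` transform, here is a minimal Python sketch of the flattening step. It is a simplified model and not Debezium's implementation. It assumes the documented defaults for `ExtractNewRecordState`: fields listed in `add.fields` are added with a `__` prefix, and `delete.handling.mode=rewrite` adds a `__deleted` marker while keeping the row from `before`. The column `nama` and the sample values are hypothetical; only `nomor` comes from the config above.

```python
from typing import Any, Dict


def unwrap(envelope: Dict[str, Any], prefix: str = "__") -> Dict[str, Any]:
    """Simplified model of ExtractNewRecordState with add.fields=op and
    delete.handling.mode=rewrite (assumed behavior, not Debezium code)."""
    payload = envelope["payload"]
    op = payload["op"]
    if op == "d":
        # rewrite mode: keep the delete as a row built from "before"
        row = dict(payload["before"] or {})
        row[prefix + "deleted"] = "true"
    else:
        # create/update: the row is the "after" image
        row = dict(payload["after"] or {})
        row[prefix + "deleted"] = "false"
    # add.fields=op: copy the envelope's op into a prefixed field
    row[prefix + "op"] = op
    return row


# Hypothetical sample events shaped like the envelope above.
create_event = {
    "payload": {"before": None, "after": {"nomor": 1, "nama": "a"},
                "op": "c", "ts_ms": 123456}
}
delete_event = {
    "payload": {"before": {"nomor": 1, "nama": "a"}, "after": None,
                "op": "d", "ts_ms": 123457}
}

for event in (create_event, delete_event):
    print(unwrap(event))
```

Under these assumptions the operation code ends up in a field named `__op` rather than `op`, so it may be worth comparing the flattened record in the topic against the value of `iceberg.tables.cdc-field`.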
I have consulted several websites and the documentation but could not find a
solution. I usually use Spark CDC to write to Iceberg and am currently exploring
this approach as an alternative. Please let me know if there is a solution.