[ https://issues.apache.org/jira/browse/KAFKA-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793062#comment-16793062 ]

Peter Davis commented on KAFKA-5983:
------------------------------------

[~cricket007] said:
{quote}the only reason I see to have two registries would be if you want topics 
of the same name in two clusters with different schemas.
{quote}
The issue is that if we want to mirror data for a single topic from one 
cluster to another, we have to mirror and slave the _entire_ Schema 
Registry (the {{_schemas}} topic, in Confluent's implementation) – and then 
unrelated producers on the target cluster are broken.

Part of the issue is that Confluent's Schema Registry's sequentially generated 
schema IDs are not portable between clusters – there's no way to mirror/slave 
only certain schemas, because the schema ID numbers will collide.
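
For reference, Confluent's Avro serializer prepends a 5-byte header to each 
record: a magic byte 0x0 followed by the 4-byte big-endian schema ID, and that 
ID is meaningful only to the registry that issued it. A minimal sketch (plain 
Java, no Confluent classes) of reading the ID back out of a mirrored record:
{code:java}
import java.nio.ByteBuffer;

// Minimal sketch: extract the schema ID that Confluent's KafkaAvroSerializer
// embeds at the front of each record. The ID indexes the *source* registry,
// so a byte-for-byte mirror carries IDs that the target registry may have
// handed out for entirely different schemas.
public class WireFormat {
    public static int schemaId(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        if (buf.get() != 0x0) {   // magic byte of the Confluent wire format
            throw new IllegalArgumentException("not Confluent wire format");
        }
        return buf.getInt();      // 4-byte big-endian schema ID
    }
}
{code}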

In summary, there are two use cases:
 * Mirroring an entire cluster including the {{_schemas}} topic (e.g., Disaster 
Recovery) – works alright (provided the DR Schema Registry is a slave; after a 
disaster, you have to make it active and reverse the mirror).
 * Mirroring only some topics, or where a slave Schema Registry doesn't make 
sense (example: production to a staging environment, where there are active 
producers in staging with their own schemas) – this is very problematic. A 
possible per-schema workaround is sketched below.
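
To make the second case concrete: the best I can see today is copying schemas 
one at a time – read each schema from the source registry and re-register it on 
the target registry, accepting that the target assigns its own ID (and that the 
mirrored bytes therefore need re-encoding). A rough sketch against the Schema 
Registry REST API; the hosts, subject name, and schema below are illustrative 
placeholders, not anything from this ticket:
{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Rough sketch (Java 11+): register the same Avro schema in two independent
// Schema Registries. Each registry answers with its own sequentially assigned
// ID, so the same schema generally gets DIFFERENT IDs on the two sides --
// which is why bytes that embed a source-cluster ID cannot simply be copied
// to the target cluster.
public class RegisterTwice {

    // POST body per the Schema Registry REST API: {"schema": "<escaped Avro schema>"}
    private static final String BODY =
        "{\"schema\": \"{\\\"type\\\":\\\"record\\\",\\\"name\\\":\\\"Demo\\\","
        + "\\\"fields\\\":[{\\\"name\\\":\\\"x\\\",\\\"type\\\":\\\"string\\\"}]}\"}";

    private static String register(HttpClient http, String registry) throws Exception {
        HttpRequest post = HttpRequest.newBuilder(
                URI.create(registry + "/subjects/MY_TOPIC-value/versions"))
            .header("Content-Type", "application/vnd.schemaregistry.v1+json")
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
        return http.send(post, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        // Each response is {"id": N}; the two N's come from independent counters.
        System.out.println(register(http, "http://source-registry:8081"));
        System.out.println(register(http, "http://target-registry:8081"));
    }
}
{code}
Running this against two fresh registries happens to return the same ID, but as 
soon as either side has registered anything else the counters diverge – exactly 
the collision described above.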

> Cannot mirror Avro-encoded data using the Apache Kafka MirrorMaker
> ------------------------------------------------------------------
>
>                 Key: KAFKA-5983
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5983
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.11.0.0
>         Environment: OS: Linux CentOS 7 and Windows 10
>            Reporter: Giulio Vito de Musso
>            Priority: Major
>              Labels: windows
>
> I'm installing an Apache Kafka MirrorMaker instance to replicate data from 
> one cluster to another. On both the source and the target clusters I'm using 
> the Confluent Avro schema registry, and the data is Avro-encoded.
> I'm using the latest released version of Confluent 3.3.0 (kafka 0.11). 
> Moreover, the source broker is on a Windows machine while the target broker 
> is on a Linux machine.
> The two Kafka clusters are independent, thus they have different schema 
> registries.
> These are my configuration files for the MirrorMaker:
> {code:title=consumer.properties|borderStyle=solid}
> group.id=test-mirrormaker-group
> bootstrap.servers=host01:9092
> exclude.internal.topics=true
> client.id=mirror_maker_consumer0
> auto.commit.enabled=false
> # Avro schema registry properties
> key.converter=io.confluent.connect.avro.AvroConverter
> key.converter.schema.registry.url=http://host01:8081
> value.converter=io.confluent.connect.avro.AvroConverter
> value.converter.schema.registry.url=http://host01:8081
> internal.key.converter=org.apache.kafka.connect.json.JsonConverter
> internal.value.converter=org.apache.kafka.connect.json.JsonConverter
> internal.key.converter.schemas.enable=false
> internal.value.converter.schemas.enable=false
> {code}
> {code:title=producer.properties|borderStyle=solid}
> bootstrap.servers=host02:9093
> compression.type=none
> acks=1
> client.id=mirror_maker_producer0
> # Avro schema registry properties
> key.converter=io.confluent.connect.avro.AvroConverter
> key.converter.schema.registry.url=http://host02:8081
> value.converter=io.confluent.connect.avro.AvroConverter
> value.converter.schema.registry.url=http://host02:8081
> internal.key.converter=org.apache.kafka.connect.json.JsonConverter
> internal.value.converter=org.apache.kafka.connect.json.JsonConverter
> internal.key.converter.schemas.enable=false
> internal.value.converter.schemas.enable=false
> {code}
> I run the MirrorMaker on the host01 Windows machine with this command:
> {code}
> C:\kafka>.\bin\windows\kafka-mirror-maker.bat --consumer.config 
> .\etc\kafka\consumer.properties --producer.config 
> .\etc\kafka\producer.properties --whitelist=MY_TOPIC
> [2017-09-26 10:09:58,555] WARN The configuration 
> 'internal.key.converter.schemas.enable' was supplied but isn't a known 
> config. (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,555] WARN The configuration 
> 'value.converter.schema.registry.url' was supplied but isn't a known config. 
> (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,571] WARN The configuration 'internal.key.converter' was 
> supplied but isn't a known config. 
> (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,586] WARN The configuration 
> 'internal.value.converter.schemas.enable' was supplied but isn't a known 
> config. (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,602] WARN The configuration 'internal.value.converter' 
> was supplied but isn't a known config. 
> (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,633] WARN The configuration 'value.converter' was 
> supplied but isn't a known config. 
> (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,649] WARN The configuration 'key.converter' was supplied 
> but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,649] WARN The configuration 
> 'key.converter.schema.registry.url' was supplied but isn't a known config. 
> (org.apache.kafka.clients.producer.ProducerConfig)
> [2017-09-26 10:09:58,727] WARN The configuration 
> 'internal.key.converter.schemas.enable' was supplied but isn't a known 
> config. (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,727] WARN The configuration 
> 'value.converter.schema.registry.url' was supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,727] WARN The configuration 'internal.key.converter' was 
> supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,742] WARN The configuration 'auto.commit.enabled' was 
> supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,774] WARN The configuration 
> 'internal.value.converter.schemas.enable' was supplied but isn't a known 
> config. (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,789] WARN The configuration 'internal.value.converter' 
> was supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,805] WARN The configuration 'value.converter' was 
> supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,805] WARN The configuration 'key.converter' was supplied 
> but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2017-09-26 10:09:58,821] WARN The configuration 
> 'key.converter.schema.registry.url' was supplied but isn't a known config. 
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> {code}
> Using the topic UI utility (https://github.com/Landoop/kafka-topics-ui) I can 
> see that the data arrives on the target broker, but it is displayed as raw 
> binary, and I think this is caused by the misconfiguration of the schema 
> registry.
> It seems that the MirrorMaker serializes both key and value data with the 
> _ByteArraySerializer_, so it ignores the Avro schema registry entirely:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/MirrorMaker.scala#L237
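> In Java terms, what the tool does here amounts to roughly the following (my 
> paraphrase of the linked Scala, not the actual source):
> {code:java}
> import java.util.Properties;
> 
> import org.apache.kafka.clients.producer.ProducerConfig;
> import org.apache.kafka.common.serialization.ByteArraySerializer;
> 
> // Paraphrase of the linked MirrorMaker.scala: the tool pins the producer's
> // serializers to ByteArraySerializer, so any serializer configured in
> // producer.properties is silently overridden.
> public class WhatMirrorMakerDoes {
>     static Properties pinSerializers(Properties producerProps) {
>         producerProps.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
>             ByteArraySerializer.class.getName());
>         producerProps.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
>             ByteArraySerializer.class.getName());
>         return producerProps;
>     }
> }
> {code}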
> It would be very useful if the Kafka MirrorMaker read the key/value 
> serialization class parameters for producer and consumer, allowing one to 
> configure the Avro schema serde.
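> For example, if MirrorMaker honored the standard client serializer settings, 
> a configuration like this would be enough (hypothetical – these properties 
> exist on the plain producer but are currently overridden by the tool):
> {code:title=producer.properties (desired)|borderStyle=solid}
> key.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
> value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
> schema.registry.url=http://host02:8081
> {code}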


