Hi Pulsar developers,

  I'm writing to bring to your attention a breaking change I've encountered
in Pulsar 4.1.x related to Protobuf schema handling. It seems that the
upgrade to a newer version of Avro in Pulsar has introduced stricter
namespace validation, which causes issues with schemas generated by older
tools.


*The Issue*
  When a Pulsar client that uses a Protobuf schema generated by an older
version of avro-protobuf tries to connect to a Pulsar 4.1.x broker, the
producer creation fails. The broker throws an InvalidSchemaDataException with
a message like:

org.apache.pulsar.broker.service.schema.exceptions.InvalidSchemaDataException:
Invalid schema definition data for PROTOBUF schema


This issue does not occur with Pulsar 4.0.x brokers.


*Root Cause*
  The root cause appears to be a change in Apache Avro, specifically the
introduction of stricter namespace validation in this PR (
https://github.com/apache/avro/pull/2513/files#diff-fee0506b175d3befabdbba7a07dfeb7755c923bed05b876d35d6598dbd7f277f
).

  Some older versions of `avro-protobuf` generate schemas with a $
character in the namespace, which is now considered invalid.
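
  To illustrate the rule, here is a minimal sketch of the namespace grammar
from the Avro specification (a dot-separated sequence of names, each matching
[A-Za-z_][A-Za-z0-9_]*). It is not the actual validation code from Avro or
Pulsar. The class name NamespaceCheck and the sample namespaces are
hypothetical, chosen only to show why a $ is rejected.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Avro or Pulsar.
final class NamespaceCheck {
    // Per the Avro specification, a namespace is a dot-separated sequence of
    // names, and each name must match [A-Za-z_][A-Za-z0-9_]*.
    private static final Pattern AVRO_NAMESPACE =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)*");

    static boolean isValidNamespace(String namespace) {
        // A null or empty namespace means "no namespace", which the spec allows.
        if (namespace == null || namespace.isEmpty()) {
            return true;
        }
        return AVRO_NAMESPACE.matcher(namespace).matches();
    }

    public static void main(String[] args) {
        // Assumed example of a namespace derived from a nested class name.
        System.out.println(isValidNamespace("com.example.proto.Outer$Inner")); // false
        // The same package without the '$' passes.
        System.out.println(isValidNamespace("com.example.proto")); // true
    }
}
```

  A check like this could be run against existing schema definitions to
estimate how many of them would be rejected after the upgrade.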


*How to Reproduce*
  I've created a small project that demonstrates the issue:
https://github.com/mattisonchao/pulsar-avro-schema-breaking

   1. Start a Pulsar 4.0.8 broker and run the client application.
Everything works as expected.
   2. Stop the 4.0.8 broker and start a 4.1.2 broker.
   3. Run the client application again. The producer will fail to be
created.

*Discussion*

  Although this change seems to originate in Avro, it is a significant
breaking change for Pulsar users who rely on schemas generated by older
tools.

  I'd like to start a discussion on how we can address this. Some
possibilities could be:

   * Is it possible to roll back the version of Avro?
   * Could Pulsar provide a workaround to allow for these older schemas to
be used?
   * Should this be documented as a known breaking change with a clear
migration path for affected users?
   * Is there a way to relax the validation in Pulsar while still
benefiting from the newer Avro version?

  I look forward to hearing your thoughts on this.

  Thanks
