Hi Pulsar developers,

I'm writing to bring to your attention a breaking change I've encountered in Pulsar 4.1.x related to Protobuf schema handling. It seems that the upgrade to a newer version of Avro in Pulsar has introduced stricter namespace validation, which causes issues with schemas generated by older tools.

*The Issue*

When a Pulsar client that uses a Protobuf schema generated by an older version of `avro-protobuf` tries to connect to a Pulsar 4.1.x broker, producer creation fails. The broker throws an InvalidSchemaDataException with a message like:

    org.apache.pulsar.broker.service.schema.exceptions.InvalidSchemaDataException: Invalid schema definition data for PROTOBUF schema

This issue does not occur with Pulsar 4.0.x brokers.

*Root Cause*

The root cause appears to be a change in Apache Avro, specifically the introduction of stricter namespace validation in this PR ( https://github.com/apache/avro/pull/2513/files#diff-fee0506b175d3befabdbba7a07dfeb7755c923bed05b876d35d6598dbd7f277f ). Some older versions of `avro-protobuf` generate schemas with a $ character in the namespace, which is now considered invalid.

*How to Reproduce*

I've created a small project that demonstrates the issue: https://github.com/mattisonchao/pulsar-avro-schema-breaking

1. Start a Pulsar 4.0.8 broker and run the client application. Everything works as expected.
2. Stop the 4.0.8 broker and start a 4.1.2 broker.
3. Run the client application again. The producer will fail to be created.

*Discussion*

This change, while seemingly originating in Avro, is a significant breaking change for Pulsar users who rely on schemas generated by older tools. I'd like to start a discussion on how we can address this. Some possibilities:

* Is it possible to roll back the version of Avro?
* Could Pulsar provide a workaround that allows these older schemas to be used?
* Should this be documented as a known breaking change with a clear migration path for affected users?
* Is there a way to relax the validation in Pulsar while still benefiting from the newer Avro version?

I look forward to hearing your thoughts on this.

Thanks
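
P.S. For anyone who wants to check whether their stored schemas might be affected, here is a minimal sketch that scans a schema definition (as a JSON string) for namespaces that do not follow the Avro specification's naming grammar, where each dot-separated part must match `[A-Za-z_][A-Za-z0-9_]*`. The class and method names are hypothetical, the sample namespace is made up for illustration, and the regex-based extraction is a rough approximation rather than what Avro or Pulsar actually run internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pulsar or Avro.
public class NamespaceCheck {
    // Avro spec naming grammar: dot-separated parts, each [A-Za-z_][A-Za-z0-9_]*
    private static final Pattern VALID_NAMESPACE =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)*");

    // Naive extraction of "namespace": "..." values; a real tool would use a JSON parser.
    private static final Pattern NAMESPACE_FIELD =
            Pattern.compile("\"namespace\"\\s*:\\s*\"([^\"]*)\"");

    // An empty namespace is allowed by the spec (it means the null namespace).
    static boolean isValidNamespace(String ns) {
        return ns.isEmpty() || VALID_NAMESPACE.matcher(ns).matches();
    }

    // Returns every namespace value in the schema JSON that violates the grammar.
    static List<String> findInvalidNamespaces(String schemaJson) {
        List<String> invalid = new ArrayList<>();
        Matcher m = NAMESPACE_FIELD.matcher(schemaJson);
        while (m.find()) {
            String ns = m.group(1);
            if (!isValidNamespace(ns)) {
                invalid.add(ns);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        // Made-up schema with a '$' in the namespace, similar in spirit to the reported case.
        String schema = "{\"type\":\"record\",\"name\":\"Inner\","
                + "\"namespace\":\"com.example.Outer$\",\"fields\":[]}";
        System.out.println(findInvalidNamespaces(schema));
    }
}
```

Running this against the schema definitions your clients register should at least flag the `$` case described above, although it may not match the newer Avro validation rule for rule.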
