Copilot commented on code in PR #25193:
URL: https://github.com/apache/pulsar/pull/25193#discussion_r2740428978
##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/validator/StructSchemaDataValidator.java:
##########
@@ -97,4 +100,30 @@ private static void
throwInvalidSchemaDataException(SchemaData schemaData,
throw new InvalidSchemaDataException("Invalid schema definition data
for "
+ schemaData.getType() + " schema", cause);
}
+
+ static class CompatibleNameValidator implements NameValidator {
+
+ @Override
+ public Result validate(String name) {
+ if (name == null) {
+ return new Result("Null name");
+ }
+ final int length = name.length();
+ if (length == 0) {
+ return new Result("Empty name");
+ }
+ final char first = name.charAt(0);
+ if (!(Character.isLetter(first) || first == '_' || first == '$')) {
+ return new Result("Illegal initial character: " + name);
+ }
+ for (int i = 1; i < length; i++) {
+ final char c = name.charAt(i);
+ // we need to allow $ for the special case
+ if (!(Character.isLetterOrDigit(c) || c == '_' || c == '$')) {
+ return new Result("Illegal character in: " + name);
Review Comment:
`CompatibleNameValidator` currently uses
`Character.isLetter/isLetterOrDigit`, which treats many Unicode characters
(e.g. Chinese) as letters. That means names like "名字" will validate
successfully, but the added tests expect them to be rejected and Avro’s name
spec is ASCII-based. Please tighten the allowed character set to ASCII (e.g.
`[A-Za-z_$][A-Za-z0-9_$]*`) so behavior matches the intended compatibility
scope (allow `$` but not arbitrary Unicode identifiers).
##########
pulsar-broker/src/test/java/org/apache/pulsar/broker/service/schema/validator/SchemaDataValidatorTest.java:
##########
@@ -23,14 +23,22 @@
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.module.jsonSchema.JsonSchema;
import com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator;
+import org.apache.avro.protobuf.ProtobufData;
import
org.apache.pulsar.broker.service.schema.exceptions.InvalidSchemaDataException;
+import org.apache.pulsar.broker.service.schema.proto.DataRecordOuterClass;
import org.apache.pulsar.client.api.Schema;
+import org.apache.pulsar.client.impl.schema.ProtobufSchema;
import org.apache.pulsar.common.protocol.schema.SchemaData;
+import org.apache.pulsar.common.schema.SchemaInfo;
import org.apache.pulsar.common.schema.SchemaType;
import org.apache.pulsar.common.util.ObjectMapperFactory;
+import org.junit.jupiter.api.Assertions;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
Review Comment:
This test class uses TestNG annotations (`org.testng.annotations.*`) but the
new assertions import `org.junit.jupiter.api.Assertions`. The broker module
appears to standardize on TestNG (`org.testng.Assert`) and doesn’t include
JUnit 5 dependencies, so this import may fail compilation. Please switch these
assertions to TestNG (`org.testng.Assert` or static imports) and remove the
JUnit Jupiter import.
##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/validator/StructSchemaDataValidator.java:
##########
@@ -49,7 +52,7 @@ public void validate(SchemaData schemaData) throws
InvalidSchemaDataException {
byte[] data = schemaData.getData();
try {
- Schema.Parser avroSchemaParser = new Schema.Parser();
+ Schema.Parser avroSchemaParser = new
Schema.Parser(COMPATIBLE_NAME_VALIDATOR);
avroSchemaParser.setValidateDefaults(false);
Review Comment:
This change fixes schema validation by using `CompatibleNameValidator`, but
other broker-side Avro parsing paths (e.g. schema compatibility checks) still
construct `new Schema.Parser()` with the default validator. If existing
Protobuf-derived Avro schemas contain `$` in record names, compatibility checks
may still fail even though validation now passes. Consider reusing the same
validator (or `NameValidator.NO_VALIDATION`) everywhere the broker parses
stored schema JSON for PROTOBUF/AVRO/JSON-compat paths to fully address the
regression.
##########
pulsar-broker/src/test/java/org/apache/pulsar/broker/service/schema/validator/SchemaDataValidatorTest.java:
##########
@@ -23,14 +23,22 @@
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.module.jsonSchema.JsonSchema;
import com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator;
+import org.apache.avro.protobuf.ProtobufData;
import
org.apache.pulsar.broker.service.schema.exceptions.InvalidSchemaDataException;
+import org.apache.pulsar.broker.service.schema.proto.DataRecordOuterClass;
import org.apache.pulsar.client.api.Schema;
+import org.apache.pulsar.client.impl.schema.ProtobufSchema;
import org.apache.pulsar.common.protocol.schema.SchemaData;
+import org.apache.pulsar.common.schema.SchemaInfo;
Review Comment:
Unused imports were added here (`org.apache.avro.protobuf.ProtobufData`,
`org.apache.pulsar.common.schema.SchemaInfo`). If the build uses
checkstyle/spotless, this will fail. Please remove unused imports.
```suggestion
import
org.apache.pulsar.broker.service.schema.exceptions.InvalidSchemaDataException;
import org.apache.pulsar.broker.service.schema.proto.DataRecordOuterClass;
import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.client.impl.schema.ProtobufSchema;
import org.apache.pulsar.common.protocol.schema.SchemaData;
```
##########
pulsar-broker/src/main/proto/DataRecord.proto:
##########
@@ -0,0 +1,17 @@
+syntax = "proto3";
+
+package pulsar.schema;
+option java_package = "org.apache.pulsar.broker.service.schema.proto";
+
Review Comment:
New proto files in this module typically include the ASF license header (see
e.g. `pulsar-broker/src/main/proto/SchemaRegistryFormat.proto`). Please add the
standard license header comment block at the top of this file to match project
conventions and avoid licensing/audit issues.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]