Hi, Asaf.

1. I guess there's no right or wrong way to code. Your implementation also
uses concepts like canRead, writtenSchema, and readSchema, similar to mine.
You just got rid of the builder and left the validator, though there are
duplicate blocks of code, but I think I can improve on that as well.
I feel that there is no hidden logic in my design, the logic is very clear,
and the logic of the builder and validator is very clear.
But let's talk about bo's question first. If not in a
ProtobufNativeSchemaCompatibilityCheck different vadlitor extension. I can
accept both your design and mine. I don't care how I do it, I don't have to
find common behavior in builder. The builder is not going to be useful, so
we can delete it.
2. According to the question I posed above. Implement a new
ProtobufNativeAdvancedSchemaCompatibilityCheck bo think is better, Rather
than in a ProtobufNativeSchemaCompatibilityCheck extension different
validator implementation.

Sorry I can not use the computer and network in the company, I use my
mobile phone to reply to the email, the format may be a bit messy. Please
understand.

Thanks,
sinan


SiNan Liu <liusinan1...@gmail.com> 于 2023年3月8日周三 下午3:53写道:

> Hi, bo.
>
> 1. I understand what you say, to develop a new
> `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than changing
> existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few small
> problems:
>
> (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and
> ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE. It
> looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's "native"
> implementation, which leads to some problems or "unreasonable" and gives me
> some extended thinking and questions.
>
> (2)In broker.conf
>
> `schemaRegistryCompatibilityCheckers`. If
> ProtobufNativeSchemaCompatibilityCheck and
> ProtobufNativeAdvancedSchemaCompatibilityCheck all set. This is going to
> overwrite each other. Because this is a map:
>
>
> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44
>
> ```java
>
> Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>();
>
> for (String className : checkerClasses) {
>
> SchemaCompatibilityCheck schemaCompatibilityCheck =
> Reflections.createInstance(className,
>
> SchemaCompatibilityCheck.class,
> Thread.currentThread().getContextClassLoader());
>
> checkers.put(schemaCompatibilityCheck.getSchemaType(),
> schemaCompatibilityCheck);
>
> ```
>
> Is this a big problem or a small one? Is it possible or unnecessary? Maybe
> we can write in the documentation that protobufNative checkers can only
> choose one of the two? Why are there two Checkers for different
> implementations of the same schemaType? Why not the checker to create
> different validator, so we don not have to change
> schemaRegistryCompatibilityCheckers.
>
> (3)And after the update to ProtobufNativeAdvancedSchemaCompatibilityCheck.
> Existing topics previously only checked the name of the root message, not
> the content of protobuf.
>
> What if the user wants both Checkers?
>
> Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, affect the topic of
> the existing schema?
>
> Older topics still use the old checker, and newer topics or certain older
> topics use the new advancedchecker.
>
> (4)So should we have one schemaType for a checker? protobufNativeChecker
> can have as many different implementation classes as possible. This
> classname configuration in PIP, let's see if it can be set at the topic
> level. In the current PIP design I just load this parameter into the
> checker when the broker is started and the checkers map is set up. Can I do
> this in the new normal pr if I want to support topic level? Or perfect it
> here?
>
> Add a call PROTOBUF_NATIVE_ADVANCE schemaType corresponding
> ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to be more trouble).
>
> Sorry I can not use the computer and network in the company, I use my
> mobile phone to reply to the email, the format may be a bit messy. Please
> understand.
>
> Thanks,
>
> sinan
>
>
> 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道:
>
>> SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 13:22写道:
>> >
>> > Great to see your comment, bo!
>> >
>> > 1. The first way. The protobuf website has a description of the rules,
>> but
>> > no plans to implement them.
>> > https://protobuf.dev/programming-guides/proto/#updating
>>
>> https://groups.google.com/g/protobuf
>> maybe ask here
>>
>> >
>> > 2. I think this PIP can be divided into two parts.
>> > (1) Add a flag(`ValidatorClassName`), load it into
>> > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts.
>> > ValidatorClassName is empty by default, and the implementation
>> continues as
>> > before, with no change for the user.
>>
>> `ProtobufNativeSchemaCompatibilityCheck` is a plugin in `broker.conf`
>> ```
>>
>> schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck
>> ```
>> I do not recommend that we directly modify this plugin and continue to
>> add configuration items, which will cause trouble for users.
>> We have a lot of configs and it's getting very unwieldy.
>> in my opinion, we don't change
>>
>> `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`,
>> it is a simple implementation, it doesn't go wrong very often, most
>> users will use it. we can add another ProtobufNativeCheck named
>> `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or other. in this
>> way, we don't need to add this flag. There is no need to consider
>> compatibility, it is just a plug-in and will not affect current logic.
>> If the user needs it, just change the plugin to the new implementation
>>
>> > ```java
>> >     ProtobufNativeSchemaValidator DEFAULT = (fromDescriptors,
>> toDescriptor)
>> > -> {
>> >         for (Descriptors.Descriptor fromDescriptor : fromDescriptors) {
>> >             // The default implementation only checks if the root
>> message
>> > has changed.
>> >             if
>> > (!fromDescriptor.getFullName().equals(toDescriptor.getFullName())) {
>> >                 throw new ProtoBufCanReadCheckException("Protobuf root
>> > message isn't allow change!");
>> >             }
>> >         }
>> >     };
>> > ```
>> > `ValidatorClassName` value also can be set to the current
>> implementation of
>> > PIP add
>> >
>> `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`.
>> >
>> > (2) Recoding the `ProtobufNativeSchemaCompatibilityCheck`. Through the
>> flag
>> > (`ValidatorClassName`) to build different
>> `ProtobufNativeSchemaValidator`.
>> > Isn't it just a plug-in? The user can develop and choose a different
>> > `ProtobufNativeSchemaValidator`. I think it didn't change the logic, it
>> > just allowed him to expand it.
>> >
>> >
>> > I think this PIP should be an enhancement and supplement to the
>> function,
>> > and there is no such thing as unnecessary and meaningless.
>> >
>> >
>> > Thanks,
>> > sinan
>> >
>> >
>> >
>> >
>> >
>> > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道:
>> >
>> > > I think we have two ways to do that.
>> > >
>> > > First way: We need to advance the improvement of java in protobuf. Ask
>> > > if they have plans to improve.
>> > >
>> > > Second way: the new PROTOBUF_NATIVE `SchemaCompatibilityCheck` should
>> > > be implemented as a plugin, don't change any existing plugin logic
>> > > (it's simple and already used). I don't recommend adding flags for
>> > > rollback, it adds configuration and makes little sense.
>> > >
>> > > Thanks,
>> > > Bo
>> > >
>> > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 23:00写道:
>> > >
>> > > >
>> > > > Can you convert the code block which is actually a quote in the
>> > > > beginning of the PIP to something which doesn't require to scroll
>> > > > horizontally so much?
>> > > > Use
>> > > >
>> > >
>> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text
>> > > >
>> > > > Let's improve the clarity of what you wrote:
>> > > >
>> > > > "the PROTOBUF uses avro struct to store."
>> > > > -->
>> > > > When Schema type PROTOBUF is used, Pulsar Client assumes the object
>> given
>> > > > to it as message data is an auto-generated POJO containing the
>> > > annotations
>> > > > encoding the schema. The client is using a converter, which
>> converts a
>> > > > Protobuf schema descriptor into an Avro schema and sends that as the
>> > > Schema
>> > > > of the producer/consumer.
>> > > >
>> > > > "On the broker side, protobuf and avro both use SchemaData
>> converted to
>> > > > org.apache.avro.Schema."
>> > > > -->
>> > > > Since the schema is an Avro schema, the implementation of
>> compatibility
>> > > > check on the broker side is to simply re-use the compatibility
>> check of
>> > > the
>> > > > AVRO schema type.
>> > > >
>> > > > "ProtobufSchema is different from ProtobufNativeSchema in schema
>> > > > compatibility check it uses avro-protobuf.
>> > > >
>> > >
>> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
>> > > > But the current implementation of ProtobufNative schema
>> compatibility
>> > > > check only
>> > > > checked if the root message name is changed."
>> > > >
>> > > > -->
>> > > > PROTOBUF_NATIVE schema type is different.
>> > > > The client is actually using Protobuf Descriptor as the schema, as
>> > > opposed
>> > > > to Avro schema of PROTOBUF schema type. In the broker, the
>> > > PROTOBUF_NATIVE
>> > > > compatibility check actually hasn't implemented any rule, besides
>> one:
>> > > > checking if the root message name has changed.
>> > > >
>> > > >
>> > > >
>> > > > >    1. For now, there is no official or third-party solution for
>> > > ProtoBuf
>> > > > >    compatibility. If in the future have better solutions of a
>> third
>> > > party or
>> > > > >    the official, we develop new ProtobufNativeSchemaValidator and
>> use,
>> > > so
>> > > > >    add a flag.
>> > > > >
>> > > > > Who do you need to make that configurable? Once you found a third
>> > > party,
>> > > > just switch to it? Who knows, maybe you never will. Introduce it
>> when you
>> > > > find it, not now.
>> > > >
>> > > >
>> > > > We improve in ProtobufNativeSchemaCompatibilityCheck BACKWARD,
>> FORWARD
>> > > > > these strategies. As with the AVRO implementation, protobuf
>> > > compatibility
>> > > > > checking need implementing the canRead method. *This will check
>> that
>> > > > > the writtenschema can be read by readSchema.*
>> > > >
>> > > >
>> > > > I completely disagree.
>> > > > Avro implementation is confusing for our use case. Don't copy that.
>> > > >
>> > > > You have
>> > > >
>> > > > public void checkCompatible(SchemaData from, SchemaData to,
>> > > > SchemaCompatibilityStrategy strategy)
>> > > >         throws IncompatibleSchemaException {
>> > > >     Descriptor fromDescriptor =
>> > > > ProtobufNativeSchemaUtils.deserialize(from.getData());
>> > > >     Descriptor toDescriptor =
>> > > > ProtobufNativeSchemaUtils.deserialize(to.getData());
>> > > >     switch (strategy) {
>> > > >         case BACKWARD_TRANSITIVE:
>> > > >         case BACKWARD:
>> > > >         case FORWARD_TRANSITIVE:
>> > > >         case FORWARD:
>> > > >         case FULL_TRANSITIVE:
>> > > >         case FULL:
>> > > >             checkRootMessageChange(fromDescriptor, toDescriptor,
>> > > strategy);
>> > > >             return;
>> > > >         case ALWAYS_COMPATIBLE:
>> > > >             return;
>> > > >         default:
>> > > >             throw new IncompatibleSchemaException("Unknown
>> > > > SchemaCompatibilityStrategy.");
>> > > >     }
>> > > > }
>> > > >
>> > > > I would rename :
>> > > > from --> currentSchema
>> > > > to --> newSchema
>> > > >
>> > > > Use that switch case and have a method for each like:
>> > > > validateBackwardsCompatibility(currentSchema, newSchema)
>> > > >
>> > > > I dislike canRead and usage of writtenSchema, since you have two
>> > > completely
>> > > > different use cases: from the producing side and the consumer side.
>> > > >
>> > > > schemaValidatorBuilder
>> > > > >
>> > > > > I dislike this proposal. IMO Avro implementation is way too
>> > > complicated.
>> > > > Why not have a simple function for validation for each switch case
>> above?
>> > > > Why do we need strategy and builder, and all this complexity?
>> > > >
>> > > >
>> > > > *Here are the basic compatibility rules we've defined:*
>> > > >
>> > > >
>> > > > IMO it's impossible to read the validation rules as you described
>> them.
>> > > > I wrote how they should be structured numerous times above.
>> > > > I can't validate them.
>> > > >
>> > > >
>> > > > IMO, the current design is very hard to read.
>> > > > Please try to avoid jumping into code sections.
>> > > > Write a high level design section, in which you describe in words
>> what
>> > > you
>> > > > plan to do.
>> > > > Write the validation rules in the structure that is easy to
>> understand:
>> > > > rules per each compatibility check, and use proper words (current
>> schema,
>> > > > new schema), since new schema can be once used for read and once
>> used for
>> > > > write.
>> > > >
>> > > > In its current form it takes too much time to understand the
>> design, and
>> > > it
>> > > > shouldn't be the case.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Asaf
>> > > >
>> > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu <liusinan1...@gmail.com>
>> wrote:
>> > > >
>> > > > > Hi! I updated the explanation of some things in the PIP issue.
>> And also
>> > > > > added a new “flag” in the conf is used as the different
>> > > > > ProtobufNativeSchemaValidator implementation, also set
>> > > > > ProtobufNativeSchemaValidator default only check whether the name
>> of
>> > > the
>> > > > > root message is the same.
>> > > > >
>> > > > >
>> > > > > Thanks,
>> > > > > sinan
>> > > > >
>> > > > >
>> > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月5日周日 20:21写道:
>> > > > >
>> > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu <
>> liusinan1...@gmail.com>
>> > > wrote:
>> > > > > >
>> > > > > > > >
>> > > > > > > > Can you please explain how a Protobuf Schema descriptor can
>> be
>> > > > > > validated
>> > > > > > > > for backward compatibility check using Avro based
>> compatibility
>> > > > > rules?
>> > > > > > > > Doesn't it expect the schema to be Avro, but it is actually
>> a
>> > > > > Protobuf
>> > > > > > > > descriptor?
>> > > > > > > > Is there some translation happening?
>> > > > > > >
>> > > > > > >
>> > > > > > > 1. *You can take a quick look at the previous design, the
>> PROTOBUF
>> > > uses
>> > > > > > > avro struct to store.*
>> > > > > > > https://github.com/apache/pulsar/pull/1954
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115
>> > > > > >
>> > > > > >
>> > > > > > Ok. So to summarize your code (easier to write it than send
>> links):
>> > > > > > * Pulsar Client, when used with Protobuf Schema, actually
>> converts
>> > > the
>> > > > > > Protobuf descriptor into an Avro Schema (using code found
>> inside Avro
>> > > > > > library) and saves that Avro schema as the schema. It's not
>> saving
>> > > the
>> > > > > > protobuf descriptor at all. Very confusing I have to add - never
>> > > expected
>> > > > > > that.
>> > > > > > This explains why In the ProtobufSchemaCompatibilityCheck they
>> just
>> > > > > extend
>> > > > > > the Avro without doing any translation.
>> > > > > >
>> > > > > > Thanks for that.
>> > > > > >
>> > > > > > Now thatI finally understand this, I can say that: you *must*
>> explain
>> > > > > that
>> > > > > > in the motivation part in your PIP.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > 2. *On the broker side, protobuf and avro both use
>> `SchemaData`
>> > > > > converted
>> > > > > > > to `org.apache.avro.Schema`.*
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70
>> > > > > >
>> > > > > >
>> > > > > > Actually those links don't really help.
>> > > > > > The main link that helps is:
>> > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > I'm sorry - I don't understand.
>> > > > > > > > I understand the different compatibility check strategies.
>> > > > > > > > If you just spell them out here, then as you say, just
>> translate
>> > > the
>> > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
>> > > > > > > > compatibility validation, no?
>> > > > > > > > I believe the answer is no, since you may want to verify
>> > > different
>> > > > > > things
>> > > > > > > > when it comes to Protobuf, which are different then Avro.
>> > > > > > >
>> > > > > > >
>> > > > > > > 1.
>> > > > > > > *ProtobufSchema is different from ProtobufNativeSchema in
>> that it
>> > > uses
>> > > > > > > avro-protobuf.*
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
>> > > > > > > *ProtobufNativeSchema needs a native compatibility check, but
>> > > there is
>> > > > > no
>> > > > > > > official or third party implementation. So this PIP does not
>> use
>> > > > > > > avro-protobuf for protobuf compatibility checking.*
>> > > > > > >
>> > > > > > > 2. *By the way, this is implemented in much the same way that
>> > > Apache
>> > > > > avro
>> > > > > > > does compatibility checking.*
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java
>> > > > > > > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy`
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java
>> > > > > > > *In `ValidateMutualRead.java`, the arguments of `canRead()`
>> are
>> > > > > > > writtenSchema and readSchema. We only need to change the
>> order of
>> > > > > > arguments
>> > > > > > > we pass to `canRead()`.*
>> > > > > > > ```java
>> > > > > > > private void validateWithStrategy(Descriptors.Descriptor
>> > > toValidate,
>> > > > > > > Descriptors.Descriptor fromDescriptor) throws
>> > > > > > ProtoBufCanReadCheckException
>> > > > > > > {
>> > > > > > > switch (strategy) {
>> > > > > > > case CanReadExistingStrategy -> canRead(fromDescriptor,
>> > > toValidate);
>> > > > > > > case CanBeReadByExistingStrategy -> canRead(toValidate,
>> > > > > fromDescriptor);
>> > > > > > > case CanBeReadMutualStrategy -> {
>> > > > > > > canRead(toValidate, fromDescriptor);
>> > > > > > > canRead(fromDescriptor, toValidate);
>> > > > > > > }
>> > > > > > > }
>> > > > > > > }
>> > > > > > >
>> > > > > > > private void canRead(Descriptors.Descriptor writtenSchema,
>> > > > > > > Descriptors.Descriptor readSchema) throws
>> > > > > ProtoBufCanReadCheckException {
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema,
>> > > > > > > readSchema);
>> > > > > > > }
>> > > > > > > ```
>> > > > > > >
>> > > > > > >
>> > > > > > I get that you want to take inspiration from the existing Avro
>> Schema
>> > > > > > compatibility check, to do your code design.
>> > > > > > I also understand you *won't* use any existing avro code for
>> that.
>> > > > > > I also understand, you have to write the validation check on
>> your
>> > > own,
>> > > > > > since there is no 3rd party to explain that.
>> > > > > >
>> > > > > > The only thing I can't understand are the actual rules as I
>> wrote
>> > > before,
>> > > > > > since they are written confusingly.
>> > > > > > So, I repeat what I asked before:
>> > > > > >
>> > > > > > I think you should structure the validation rules differently:
>> > > > > >
>> > > > > > * Backward checks
>> > > > > > ** List down rules, where use newSchema (the schema used by
>> producer
>> > > or
>> > > > > > consumer) and existingSchema (last schema used)
>> > > > > > * Forward
>> > > > > > ** List down rules, where use newSchema (the schema used by
>> producer
>> > > or
>> > > > > > consumer) and existingSchema (last schema used)
>> > > > > >
>> > > > > > Once that's accomplished I will be able to understand the
>> different
>> > > > > > validation rules for each compatibility check.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > sinan
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月1日周三 21:19写道:
>> > > > > > > >
>> > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu <
>> > > liusinan1...@gmail.com>
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I read it and they look identical. What's the difference
>> > > between
>> > > > > > > them?
>> > > > > > > > >
>> > > > > > > > > Current avro,json, and protobuf schemas are all
>> implemented
>> > > based
>> > > > > on
>> > > > > > > AVRO.
>> > > > > > > > > > What do you mean, they are all implemented based on
>> Avro? You
>> > > > > mean
>> > > > > > > the
>> > > > > > > > > > protobuf schema is converted into an Avro Schema, and
>> then
>> > > you
>> > > > > use
>> > > > > > > Avro
>> > > > > > > > > > compatibility validation?
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck`
>> > > > > > > > >
>> > > > > >
>> > > `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck`
>> > > > > > > > >
>> > > > > >
>> > > `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck`
>> > > > > > > > > They all extends `AvroSchemaBasedCompatibilityCheck`, the
>> > > > > > > > > `checkCompatible()` is the same implementation with
>> `AVRO`.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > Can you please explain how a Protobuf Schema descriptor can
>> be
>> > > > > > validated
>> > > > > > > > for backward compatibility check using Avro based
>> compatibility
>> > > > > rules?
>> > > > > > > > Doesn't it expect the schema to be Avro, but it is actually
>> a
>> > > > > Protobuf
>> > > > > > > > descriptor?
>> > > > > > > > Is there some translation happening?
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > I think you should structure the validation rules
>> differently:
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > The Compatibility check strategy is described on the
>> website
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy
>> > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): Consumers using
>> schema
>> > > V3 can
>> > > > > > > process
>> > > > > > > > > data written by producers using the last schema version
>> V2. So
>> > > V2
>> > > > > is
>> > > > > > > > > "writtenSchema" and V3 is "readSchema".
>> > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): Consumers using
>> the
>> > > last
>> > > > > > > schema
>> > > > > > > > > version V2 can process data written by producers using a
>> new
>> > > schema
>> > > > > > V3,
>> > > > > > > > > even though they may not be able to use the full
>> capabilities
>> > > of
>> > > > > the
>> > > > > > > new
>> > > > > > > > > schema. So V3 is "writtenSchema" and V2 is "readSchema".
>> > > > > > > > > 3. FULL(CanBeReadMutualStrategy): Schemas are both
>> backward and
>> > > > > > forward
>> > > > > > > > > compatible.
>> > > > > > > > > Schema can evolve. The old version schema and the new
>> version
>> > > > > schema
>> > > > > > > should
>> > > > > > > > > be well understood.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > I'm sorry - I don't understand.
>> > > > > > > > I understand the different compatibility check strategies.
>> > > > > > > > If you just spell them out here, then as you say, just
>> translate
>> > > the
>> > > > > > > > Protobuf Descriptor into an Avro schema and run the Avro
>> > > > > > > > compatibility validation, no?
>> > > > > > > > I believe the answer is no, since you may want to verify
>> > > different
>> > > > > > things
>> > > > > > > > when it comes to Protobuf, which are different then Avro.
>> > > > > > > >
>> > > > > > > > At the current state, I can't understand your design at all.
>> > > Please
>> > > > > > help
>> > > > > > > > clarify that.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > So each strategy should have its own section.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > The arguments of `canRead()` are writtenSchema and
>> readSchema.
>> > > As
>> > > > > > we've
>> > > > > > > > > just described, we only need to change the order of
>> arguments
>> > > we
>> > > > > pass
>> > > > > > > to
>> > > > > > > > > `canRead()`.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > sinan
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月27日周一
>> 20:49写道:
>> > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > And you can see the difference between ProtoBuf and
>> > > > > > ProtoBufNative:
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > >
>> > >
>> https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > https://pulsar.apache.org/docs/next/schema-get-started/#protobuf
>> > > > > > > > > > >
>> > > > > > > > > >  I read it and they look identical. What's the
>> difference
>> > > between
>> > > > > > > them?
>> > > > > > > > > >
>> > > > > > > > > > Current avro,json, and protobuf schemas are all
>> implemented
>> > > based
>> > > > > > on
>> > > > > > > > > AVRO.
>> > > > > > > > > >
>> > > > > > > > > > What do you mean, they are all implemented based on
>> Avro? You
>> > > > > mean
>> > > > > > > the
>> > > > > > > > > > protobuf schema is converted into an Avro Schema, and
>> then
>> > > you
>> > > > > use
>> > > > > > > Avro
>> > > > > > > > > > compatibility validation?
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > > *Here are the basic compatibility rules we've
>> defined:*
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I think you should structure the validation rules
>> > > differently:
>> > > > > > > > > >
>> > > > > > > > > > * Backward checks
>> > > > > > > > > > ** List down rules, where use newSchema (the schema
>> used by
>> > > > > > producer
>> > > > > > > or
>> > > > > > > > > > consumer) and existingSchema (last schema used)
>> > > > > > > > > > * Forward
>> > > > > > > > > > ** List down rules, where use newSchema (the schema
>> used by
>> > > > > > producer
>> > > > > > > or
>> > > > > > > > > > consumer) and existingSchema (last schema used)
>> > > > > > > > > >
>> > > > > > > > > > So each strategy should have its own section.
>> > > > > > > > > >
>> > > > > > > > > > I'm saying this since you used "writttenSchema" word
>> but it
>> > > > > > > represents
>> > > > > > > > > > something completely different if it's backward or
>> forward
>> > > check.
>> > > > > > > > > >
>> > > > > > > > > > Once you'll have that structure like that, I personally
>> will
>> > > be
>> > > > > > able
>> > > > > > > to
>> > > > > > > > > > read and understand it.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > The motivation and problem statement are now good -
>> thanks
>> > > for
>> > > > > > > improving
>> > > > > > > > > > it.
>> > > > > > > > > >
>> > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan Liu <
>> > > > > liusinan1...@gmail.com>
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi! I updated the PIP issue again. This time I've
>> added
>> > > some
>> > > > > > > background
>> > > > > > > > > > and
>> > > > > > > > > > > some explanations.
>> > > > > > > > > > >
>> > > > > > > > > > > The compatibility check rules are already written in
>> the
>> > > > > > > > > Implementation.
>> > > > > > > > > > > ProtoBufNative implements the same canRead method as
>> Apache
>> > > > > Avro.
>> > > > > > > > > > > It does this by checking whether the schema for
>> writing and
>> > > > > > reading
>> > > > > > > is
>> > > > > > > > > > > compatible. I also indicate whether the writtenSchema
>> and
>> > > > > > > readSchema of
>> > > > > > > > > > the
>> > > > > > > > > > > Backward, Forward, and Full strategies are the old or
>> the
>> > > new
>> > > > > > > version
>> > > > > > > > > of
>> > > > > > > > > > > the schema.
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks,
>> > > > > > > > > > > sinan
>> > > > > > > > > > >
>> > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月26日周日
>> 23:24写道:
>> > > > > > > > > > >
>> > > > > > > > > > > > I'm sorry, but this PIP lacks a lot of background
>> > > knowledge,
>> > > > > so
>> > > > > > > you
>> > > > > > > > > > need
>> > > > > > > > > > > to
>> > > > > > > > > > > > add IMO for people to understand it. You don't need
>> to
>> > > > > explain
>> > > > > > > the
>> > > > > > > > > > entire
>> > > > > > > > > > > > pulsar in this PIP, but at the very least a few
>> > > paragraphs
>> > > > > > > detailing
>> > > > > > > > > > all
>> > > > > > > > > > > > you need to know, to put you in context:
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >    - Start by saying Pulsar as a built-in schema
>> registry
>> > > > > > inside
>> > > > > > > > > Pulsar
>> > > > > > > > > > > >    broker.
>> > > > > > > > > > > >       - Every time the client updates the schema, it
>> > > uploads
>> > > > > it
>> > > > > > > to
>> > > > > > > > > the
>> > > > > > > > > > > >       broker. When that happens, it has a feature
>> which
>> > > > > > validates
>> > > > > > > if
>> > > > > > > > > > the
>> > > > > > > > > > > > new
>> > > > > > > > > > > >       schema version is compatible with the previous
>> > > > > versions.
>> > > > > > > There
>> > > > > > > > > > > > are 4 types
>> > > > > > > > > > > >       of compatibility: Full, ... (complete and
>> explain
>> > > each
>> > > > > > one
>> > > > > > > > > > briefly)
>> > > > > > > > > > > >    - Also explain Pulsar Schema registry supports
>> various
>> > > > > > schema
>> > > > > > > > > > > >    protocols:  Avro, protobuf native, ... (complete
>> the
>> > > > > rest),
>> > > > > > > each
>> > > > > > > > > > > > protocol
>> > > > > > > > > > > >    has a schema which dictates how to serialize and
>> > > > > deserialize
>> > > > > > > the
>> > > > > > > > > > > message
>> > > > > > > > > > > >    content into typed object.
>> > > > > > > > > > > >    - Explain in short what is protobuf native
>> (compare
>> > > > > protobuf
>> > > > > > > > > > > non-native)
>> > > > > > > > > > > >    - Please don't paste code instead of explaining.
>> > > > > > > > > > > >       - Explain that protobuf native current
>> validation
>> > > check
>> > > > > > is
>> > > > > > > only
>> > > > > > > > > > > >       composed of checking the root message name is
>> the
>> > > same
>> > > > > > > between
>> > > > > > > > > > > > the current
>> > > > > > > > > > > >       schema version and the new version.
>> > > > > > > > > > > >          - Explain briefly what is a root message
>> and its
>> > > > > name.
>> > > > > > > > > > > >       - Explain the problem (list scenarios) that
>> we have
>> > > > > > because
>> > > > > > > > > > > protobuf
>> > > > > > > > > > > >       native schema only supports FULL compatibility
>> > > > > > validation.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Regarding high level design - as in what you plan
>> to do.
>> > > > > > > > > > > > I suggest you add "High Level Design" and in it
>> detail
>> > > how
>> > > > > you
>> > > > > > > plan
>> > > > > > > > > to
>> > > > > > > > > > > > validate, per protobuf version, per compatibility
>> check
>> > > > > > > (backward,
>> > > > > > > > > > > forward,
>> > > > > > > > > > > > full,...).
>> > > > > > > > > > > > I tried reading the implementation - for me , it's
>> all
>> > > over
>> > > > > the
>> > > > > > > > > place.
>> > > > > > > > > > > Can
>> > > > > > > > > > > > you please list in order what I wrote above, and
>> list the
>> > > > > > > validation
>> > > > > > > > > > > rules
>> > > > > > > > > > > > with a good explanation why you validate it like
>> that?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Lastly, one you have all the validation rules
>> clearly
>> > > stated,
>> > > > > > you
>> > > > > > > can
>> > > > > > > > > > use
>> > > > > > > > > > > > it to document it properly so users can know what
>> > > validation
>> > > > > to
>> > > > > > > > > expect.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Asaf
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM SiNan Liu <
>> > > > > > > liusinan1...@gmail.com>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Sorry, my mistake. I removed the code and
>> described the
>> > > > > > design
>> > > > > > > to
>> > > > > > > > > > > improve
>> > > > > > > > > > > > > the PROTOBUF_NATIVE schema compatibility checks.
>> You
>> > > can
>> > > > > > have a
>> > > > > > > > > look.
>> > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月22日周三
>> > > 21:16写道:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > I read it but you're almost directly diving
>> into the
>> > > > > code -
>> > > > > > > it
>> > > > > > > > > will
>> > > > > > > > > > > > take
>> > > > > > > > > > > > > me
>> > > > > > > > > > > > > > hours just to reverse engineer your design.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Can you please include a "High Level Design"
>> section
>> > > in
>> > > > > > which
>> > > > > > > you
>> > > > > > > > > > > > explain
>> > > > > > > > > > > > > > how you plan to tackle any issue?
>> > > > > > > > > > > > > > If I can read that section and explain to
>> someone
>> > > else
>> > > > > how
>> > > > > > > this
>> > > > > > > > > > will
>> > > > > > > > > > > > > work,
>> > > > > > > > > > > > > > it means the section is complete.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Let's leave the code to the PRs.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 PM SiNan Liu <
>> > > > > > > > > liusinan1...@gmail.com>
>> > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi all,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > I made a PIP to discuss:
>> > > > > > > > > > > > https://github.com/apache/pulsar/issues/19565
>> > > > > > > > > > > > > .
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > We can talk about the current design here.
>> > > Especially
>> > > > > for
>> > > > > > > the
>> > > > > > > > > > field
>> > > > > > > > > > > > > type
>> > > > > > > > > > > > > > > change check rules, please give your valuable
>> > > advice.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > > > > Sinan
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>>
>

Reply via email to