Hi Sinan,

The doc looks much better!

I have a few additional comments:

Pasting comment from previous emails:

Can you convert the code block which is actually a quote in the
beginning of the PIP to something which doesn't require to scroll
horizontally so much?
Use
https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text

*Validation Rules*

(1) If the root message names of writtenSchema and readSchema are
> different, then incompatible.

Why? The root message name is not written over the wire to the best of my
knowledge. I haven't found it written in the official doc.


>    - The writtenSchema cannot add required fields, but optional or
>    duplicate fields can be added (The field number must be new).
>
> That's not true.
You can have a required field in writeSchema, and not have that field in
readSchema (based on tag number).

The writtenSchema can not change the field number of any field in
readSchema (the
> field name is the same, but the field number is different).

You have to take into account field type as well when comparing.

If I have in readSchema
int32 justAnID = 1
int32 customerId = 2

and in writeSchema I have
int32 justAnID = 1
string customerId = 3

This is valid.

(4) The writtenSchema can not change the field name of any field in
> readSchema (the field number is the same, but the field name is
> different).

This is incorrect.
Fields names are encoded into the wire. I don't see this in any best
practice.

) The writtenSchema does not change the field name and number, but it does
> change the field type.
>
>    - If the field type is ENUM or MESSAGE, the schema is not compatible
>    when the type name is changed
>    - If the type of the field is another type. The schemas under this
>    rule are not incompatible, but warn the user.(There is another way of
>    testing in PIP issue Alternatives)
>
> Small correction: for the same field number you are not allowed to change
types. Name is irrelevant.

(6) The writtenSchema removes fields that do not have default values in
> readSchema. Then the schema is incompatible.

Protobuf gives you its own default if you don't supply one. This is
incorrect.


*Rules that you don't have in the doc, but should IMO*
* There can not be a field which exists both in readSchema and writeSchema,
with same tag number, having different default values
* There can't be a field in writerSchema, that exists in readerSchema (tag
number based), which in writerSchema is repeated and its type is Scalar (
https://protobuf.dev/programming-guides/proto/#scalar) but in readSchema it
is not repeated anymore.
* There can't be a field in writerSchema, that exists in readerSchema (tag
number based), which in writerSchema its type is scalar, but in readSchema
its type is scalar, it's repeated but with packed=true.

*Rules you have , but I would phrase a bit differently*

I would remove the proto2/proto3 sections, since they only differ in 1
rule, and just mention that distinction inside that rule (less work for the
reade).

* readSchema has a field which doesn't exist in writerSchema (based on tag
number).
   * Proto v2:
      * That field must be `optional` or `repeated` (must not be `required`)
   * Proto v3:
      * No problem.
* There can not be a field which exists both in readSchema and writeSchema,
with the same tag number, but having different types.

*Motivation*

Basically in the motivation section you want people to understand the
following:

Pulsar has built-in support for typed messages. It allows specifying an
encoding scheme and its matching schema.
For example, it supports Avro. You specify a schema for a given topic,
using Avro Schema Definition (i.e. a JSON describing the schema). Everytime
you produce a message, you first declare the schema definition you wish to
use for your messages. The message data should be an avro-encoded binary
data (which the client in some SDKs helps encode a given
data-structure/object).
The same applies when you consume a message. You first specify the schema
you use to read the messages, and the client in some SDKs helps by decoding
the message binary data into an object/data-structure.

Each time you specify a schema to be used, either by a producer or a
consumer, the schema is persisted in Pulsar and given an increasing version
number. If the schema was the same as the previous version, it is not
saved. When the message is persisted, the version number is encoded in the
message headers.

Pulsar provides a very useful feature named Schema Evolution
<https://pulsar.apache.org/docs/2.11.x/schema-understand/#schema-evolution>.
It allows us to check if a new schema version is compatible with previous
versions or versions. When you configure the schema for the topic you
decide the strategy to use for doing the validation check. The strategies
validate the following:

   - BACKWARD strategy
      - A consumer with newSchema can read a message written using
      existingSchema
   - BACKWARD_TRANSITIVE strategy
      - A consumer with newSchema can read messages written using all
      existingSchema
   - FORWARD
      - A consumer with existingSchema can read messages written using
      newSchema
   - FORWARD_TRANSITIVE
      - A consumer defined with any of the existingSchema can read messages
      written using newSchema
   - FULL
      - A consumer defined with newSchema can read messages written using
      existingSchema
      - A consumer defined with existingSchema can read messages written
      using newSchema
   - FULL_TRANSITIVE
      - A consumer defined with newSchema can read messages written using
      any of the existingSchema
      - A consumer defined with any of the existingSchema can read messages
      written using newSchema


Aside from Avro, Pulsar also has two additional supported encodings:
PROTOBUF and PROTOBUF_NATIVE.

PROTOBUF is a bit strange. It encodes the messages using Protobuf encoding,
but the schema that is persisted to Pulsar is *not* Protobuf Descriptor as
you would have expected. The saved schema is a translation of the Protobuf
Descriptor to an Avro Schema, so in fact an Avro schema definition is saved
as the schema.

PROTOBUF_NATIVE was created to fix that shortcoming, by actually persisting
the Protobuf Descriptor and using Protobuf for encoding.
The problem is that the authors of PROTOBUF_NATIVE haven't completed it
fully, and the backward compatibility validation code almost does not
exist: It only checks if the root message name is the same between versions.

GOALS
The goal of this PIP is to amend PROTOBUF_NATIVE by adding a fully
functional validation for any of the defined Schema Compatibility
Strategies.
A secondary goal is to allow the user to choose between different
implementations: The new fully functional validation or the existing
barebones validation.

-------- END

I'm ok with having links in the Motivation , as *further reading*.
I'm against stacking up work for the reader to go read 5-6 different links
just to understand the motivation and background knowledge required to
understand the feature.

I'm against putting code in the Motivation. Especially if it is supposed to
replace description in plain English making it easy to understand the
design.
Leave the code to the motivation.
Paste code only if you absolutely can't use plain old descriptions to
explain.



On Sat, Mar 11, 2023 at 11:46 AM SiNan Liu <liusinan1...@gmail.com> wrote:

> *I guess that's right, too! *
>
> But the name `ProtobufNativeAdvancedSchemaCompatibilityCheck` is better,
> because we don't know whether the future will have V2, V3. The official
> solution can be called `ProtobufNativeOfficialSchemaCompatibilityCheck`, or
> is a good `ProtobufNativeXXXXXXXXSchemaCompatibilityCheck` third-party
> solution.
>
> I've updated my design in PIP issue.
> 1. A new ProtobufNativeSchemaAdvanceCompatibilityCheck, rather than a
> ProtobufNativeSchemaCompatibilityCheck different validator implementation.
> 2. Remove the 'builder'
> 3. Clarify the relationship between newSchema, existingSchema, and
> writtenSchema in canRead.
>
> Help to see if the description is comprehensive and what changes and
> improvements need to be made.
>
> Thanks,
> sinan
>
>
>
> Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月9日周四 17:35写道:
>
> > I like Bo's suggestion - I haven't realized each schema type
> > compatibility check is actually a plugin.
> > It makes sense for any schema type checks to evolve, sometimes in a
> > non-backward compatible way hence having two plugins like
> > protobufNativeSchemaCompatabilityCheckV1 and then
> > protobufNativeSchemaCompatabilityCheckV2 and then
> > protobufNativeSchemaCompatabilityCheckV3 makes sense to me.
> >
> >
> >
> > On Thu, Mar 9, 2023 at 5:49 AM 丛搏 <bog...@apache.org> wrote:
> >
> > >  Hi siNan:
> > >
> > > From my point of view, it is just a plug-in. I don't think it is
> > > necessary to add configuration for the plugin.
> > > This is meaningless, and it will increase the difficulty of use for
> > users.
> > >
> > >
> > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月8日周三 15:54写道:
> > > >
> > > > Hi, bo.
> > > >
> > > > 1. I understand what you say, to develop a new
> > > > `ProtobufNativeAdvancedSchemaCompatibilityCheck`, rather than
> changing
> > > > existing `ProtobufNativeSchemaCompatibilityCheck`. But I found a few
> > > small
> > > > problems:
> > > >
> > > > (1)ProtobufNativeAdvancedSchemaCompatibilityCheck and
> > > > ProtobufNativeSchemaCompatibilityCheck schemaType is PROTOBUF_NATIVE.
> > It
> > > > looks like both checkers are PROTOBUF not using AVRO-PROTOBUF's
> > "native"
> > > > implementation, which leads to some problems or "unreasonable" and
> > gives
> > > me
> > > > some extended thinking and questions.
> > > >
> > > `CompatibilityCheck ` its only a plugin.
> > > `ProtobufNativeSchemaCompatibilityCheck` may sooner or later leave the
> > > stage, when `ProtobufNativeAdvancedSchemaCompatibilityCheck` is
> > > stable, we can make it the default Checker.
> > >
> > > It is just a plug-in, users can change it at will and ensure that it
> > > is used correctly
> > > > (2)In broker.conf
> > > >
> > > > `schemaRegistryCompatibilityCheckers`. If
> > > > ProtobufNativeSchemaCompatibilityCheck and
> > > > ProtobufNativeAdvancedSchemaCompatibilityCheck all set. This is going
> > to
> > > > overwrite each other. Because this is a map:
> > > >
> > > >
> > >
> >
> https://github.com/apache/pulsar/blob/af1360fb167c1f9484fda5771df3ea9b21d1440b/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryService.java#L36-L44
> > > >
> > > > ```java
> > > >
> > > > Map<SchemaType, SchemaCompatibilityCheck> checkers = new HashMap<>();
> > > >
> > > > for (String className : checkerClasses) {
> > > >
> > > > SchemaCompatibilityCheck schemaCompatibilityCheck =
> > > > Reflections.createInstance(className,
> > > >
> > > > SchemaCompatibilityCheck.class,
> > > > Thread.currentThread().getContextClassLoader());
> > > >
> > > > checkers.put(schemaCompatibilityCheck.getSchemaType(),
> > > > schemaCompatibilityCheck);
> > > >
> > > > ```
> > > >
> > > > Is this a big problem or a small one? Is it possible or unnecessary?
> > > Maybe
> > > > we can write in the documentation that protobufNative checkers can
> only
> > > > choose one of the two? Why are there two Checkers for different
> > > > implementations of the same schemaType? Why not the checker to create
> > > > different validator, so we don not have to change
> > > > schemaRegistryCompatibilityCheckers.
> > >
> > > users can only use one, not two, which will bring complexity to users
> > >
> > > >
> > > > (3)And after the update to
> > > ProtobufNativeAdvancedSchemaCompatibilityCheck.
> > > > Existing topics previously only checked the name of the root message,
> > not
> > > > the content of protobuf.
> > > >
> > > > What if the user wants both Checkers?
> > > >
> > > > Set to ProtobufNativeAdvancedSchemaCompatibilityCheck, affect the
> topic
> > > of
> > > > the existing schema?
> > > >
> > > > Older topics still use the old checker, and newer topics or certain
> > older
> > > > topics use the new advancedchecker.
> > > >
> > > when `ProtobufNativeAdvancedSchemaCompatibilityCheck` stable,
> > > users will not choose `ProtobufNativeSchemaCompatibilityCheck`.
> > > because it not a complete checker.
> > > > (4)So should we have one schemaType for a checker?
> > protobufNativeChecker
> > > > can have as many different implementation classes as possible. This
> > > > classname configuration in PIP, let's see if it can be set at the
> topic
> > > > level. In the current PIP design I just load this parameter into the
> > > > checker when the broker is started and the checkers map is set up.
> Can
> > I
> > > do
> > > > this in the new normal pr if I want to support topic level? Or
> perfect
> > it
> > > > here?
> > > >
> > > > Add a call PROTOBUF_NATIVE_ADVANCE schemaType corresponding
> > > > ProtobufNativeAdvancedSchemaCompatibilityCheck? (Seems to be more
> > > trouble).
> > > >
> > > > Sorry I can not use the computer and network in the company, I use my
> > > > mobile phone to reply to the email, the format may be a bit messy.
> > Please
> > > > understand.
> > > >
> > > > Thanks,
> > > >
> > > > sinan
> > > >
> > > >
> > > > 丛搏 <bog...@apache.org> 于 2023年3月7日周二 下午11:39写道:
> > > >
> > > > > SiNan Liu <liusinan1...@gmail.com> 于2023年3月7日周二 13:22写道:
> > > > > >
> > > > > > Great to see your comment, bo!
> > > > > >
> > > > > > 1. The first way. The protobuf website has a description of the
> > > rules,
> > > > > but
> > > > > > no plans to implement them.
> > > > > > https://protobuf.dev/programming-guides/proto/#updating
> > > > >
> > > > > https://groups.google.com/g/protobuf
> > > > > maybe ask here
> > > > >
> > > > > >
> > > > > > 2. I think this PIP can be divided into two parts.
> > > > > > (1) Add a flag(`ValidatorClassName`), load it into
> > > > > > `ProtobufNativeSchemaCompatibilityCheck` when the broker starts.
> > > > > > ValidatorClassName is empty by default, and the implementation
> > > continues
> > > > > as
> > > > > > before, with no change for the user.
> > > > >
> > > > > `ProtobufNativeSchemaCompatibilityCheck` is a plugin in
> `broker.conf`
> > > > > ```
> > > > >
> > > > >
> > >
> >
> schemaRegistryCompatibilityCheckers=org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck,org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck
> > > > > ```
> > > > > I do not recommend that we directly modify this plugin and continue
> > to
> > > > > add configuration items, which will cause trouble for users.
> > > > > We have a lot of configs and it's getting very unwieldy.
> > > > > in my opinion, we don't change
> > > > >
> > > > >
> > >
> >
> `org.apache.pulsar.broker.service.schema.ProtobufNativeSchemaCompatibilityCheck`,
> > > > > it is a simple implementation, it doesn't go wrong very often, most
> > > > > users will use it. we can add another ProtobufNativeCheck named
> > > > > `ProtobufNativeAdvancedSchemaCompatibilityCheck ` or other. in this
> > > > > way, we don't need to add this flag. There is no need to consider
> > > > > compatibility, it is just a plug-in and will not affect current
> > logic.
> > > > > If the user needs it, just change the plugin to the new
> > implementation
> > > > >
> > > > > > ```java
> > > > > >     ProtobufNativeSchemaValidator DEFAULT = (fromDescriptors,
> > > > > toDescriptor)
> > > > > > -> {
> > > > > >         for (Descriptors.Descriptor fromDescriptor :
> > > fromDescriptors) {
> > > > > >             // The default implementation only checks if the root
> > > message
> > > > > > has changed.
> > > > > >             if
> > > > > >
> (!fromDescriptor.getFullName().equals(toDescriptor.getFullName()))
> > {
> > > > > >                 throw new ProtoBufCanReadCheckException("Protobuf
> > > root
> > > > > > message isn't allow change!");
> > > > > >             }
> > > > > >         }
> > > > > >     };
> > > > > > ```
> > > > > > `ValidatorClassName` value also can be set to the current
> > > implementation
> > > > > of
> > > > > > PIP add
> > > > > >
> > > > >
> > >
> >
> `org.apache.pulsar.broker.service.schema.validator.ProtobufNativeSchemaBreakValidatorImpl`.
> > > > > >
> > > > > > (2) Recoding the `ProtobufNativeSchemaCompatibilityCheck`.
> Through
> > > the
> > > > > flag
> > > > > > (`ValidatorClassName`) to build different
> > > > > `ProtobufNativeSchemaValidator`.
> > > > > > Isn't it just a plug-in? The user can develop and choose a
> > different
> > > > > > `ProtobufNativeSchemaValidator`. I think it didn't change the
> > logic,
> > > it
> > > > > > just allowed him to expand it.
> > > > > >
> > > > > >
> > > > > > I think this PIP should be an enhancement and supplement to the
> > > function,
> > > > > > and there is no such thing as unnecessary and meaningless.
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > sinan
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > 丛搏 <bog...@apache.org> 于2023年3月7日周二 11:53写道:
> > > > > >
> > > > > > > I think we have two ways to do that.
> > > > > > >
> > > > > > > First way: We need to advance the improvement of java in
> > protobuf.
> > > Ask
> > > > > > > if they have plans to improve.
> > > > > > >
> > > > > > > Second way: the new PROTOBUF_NATIVE `SchemaCompatibilityCheck`
> > > should
> > > > > > > be implemented as a plugin, don't change any existing plugin
> > logic
> > > > > > > (it's simple and already used). I don't recommend adding flags
> > for
> > > > > > > rollback, it adds configuration and makes little sense.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Bo
> > > > > > >
> > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月6日周一 23:00写道:
> > > > > > >
> > > > > > > >
> > > > > > > > Can you convert the code block which is actually a quote in
> the
> > > > > > > > beginning of the PIP to something which doesn't require to
> > scroll
> > > > > > > > horizontally so much?
> > > > > > > > Use
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-text
> > > > > > > >
> > > > > > > > Let's improve the clarity of what you wrote:
> > > > > > > >
> > > > > > > > "the PROTOBUF uses avro struct to store."
> > > > > > > > -->
> > > > > > > > When Schema type PROTOBUF is used, Pulsar Client assumes the
> > > object
> > > > > given
> > > > > > > > to it as message data is an auto-generated POJO containing
> the
> > > > > > > annotations
> > > > > > > > encoding the schema. The client is using a converter, which
> > > converts
> > > > > a
> > > > > > > > Protobuf schema descriptor into an Avro schema and sends that
> > as
> > > the
> > > > > > > Schema
> > > > > > > > of the producer/consumer.
> > > > > > > >
> > > > > > > > "On the broker side, protobuf and avro both use SchemaData
> > > converted
> > > > > to
> > > > > > > > org.apache.avro.Schema."
> > > > > > > > -->
> > > > > > > > Since the schema is an Avro schema, the implementation of
> > > > > compatibility
> > > > > > > > check on the broker side is to simply re-use the
> compatibility
> > > check
> > > > > of
> > > > > > > the
> > > > > > > > AVRO schema type.
> > > > > > > >
> > > > > > > > "ProtobufSchema is different from ProtobufNativeSchema in
> > schema
> > > > > > > > compatibility check it uses avro-protobuf.
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > > > > But the current implementation of ProtobufNative schema
> > > compatibility
> > > > > > > > check only
> > > > > > > > checked if the root message name is changed."
> > > > > > > >
> > > > > > > > -->
> > > > > > > > PROTOBUF_NATIVE schema type is different.
> > > > > > > > The client is actually using Protobuf Descriptor as the
> schema,
> > > as
> > > > > > > opposed
> > > > > > > > to Avro schema of PROTOBUF schema type. In the broker, the
> > > > > > > PROTOBUF_NATIVE
> > > > > > > > compatibility check actually hasn't implemented any rule,
> > besides
> > > > > one:
> > > > > > > > checking if the root message name has changed.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >    1. For now, there is no official or third-party solution
> > for
> > > > > > > ProtoBuf
> > > > > > > > >    compatibility. If in the future have better solutions
> of a
> > > third
> > > > > > > party or
> > > > > > > > >    the official, we develop new
> ProtobufNativeSchemaValidator
> > > and
> > > > > use,
> > > > > > > so
> > > > > > > > >    add a flag.
> > > > > > > > >
> > > > > > > > > Who do you need to make that configurable? Once you found a
> > > third
> > > > > > > party,
> > > > > > > > just switch to it? Who knows, maybe you never will. Introduce
> > it
> > > > > when you
> > > > > > > > find it, not now.
> > > > > > > >
> > > > > > > >
> > > > > > > > We improve in ProtobufNativeSchemaCompatibilityCheck
> BACKWARD,
> > > > > FORWARD
> > > > > > > > > these strategies. As with the AVRO implementation, protobuf
> > > > > > > compatibility
> > > > > > > > > checking need implementing the canRead method. *This will
> > check
> > > > > that
> > > > > > > > > the writtenschema can be read by readSchema.*
> > > > > > > >
> > > > > > > >
> > > > > > > > I completely disagree.
> > > > > > > > Avro implementation is confusing for our use case. Don't copy
> > > that.
> > > > > > > >
> > > > > > > > You have
> > > > > > > >
> > > > > > > > public void checkCompatible(SchemaData from, SchemaData to,
> > > > > > > > SchemaCompatibilityStrategy strategy)
> > > > > > > >         throws IncompatibleSchemaException {
> > > > > > > >     Descriptor fromDescriptor =
> > > > > > > > ProtobufNativeSchemaUtils.deserialize(from.getData());
> > > > > > > >     Descriptor toDescriptor =
> > > > > > > > ProtobufNativeSchemaUtils.deserialize(to.getData());
> > > > > > > >     switch (strategy) {
> > > > > > > >         case BACKWARD_TRANSITIVE:
> > > > > > > >         case BACKWARD:
> > > > > > > >         case FORWARD_TRANSITIVE:
> > > > > > > >         case FORWARD:
> > > > > > > >         case FULL_TRANSITIVE:
> > > > > > > >         case FULL:
> > > > > > > >             checkRootMessageChange(fromDescriptor,
> > toDescriptor,
> > > > > > > strategy);
> > > > > > > >             return;
> > > > > > > >         case ALWAYS_COMPATIBLE:
> > > > > > > >             return;
> > > > > > > >         default:
> > > > > > > >             throw new IncompatibleSchemaException("Unknown
> > > > > > > > SchemaCompatibilityStrategy.");
> > > > > > > >     }
> > > > > > > > }
> > > > > > > >
> > > > > > > > I would rename :
> > > > > > > > from --> currentSchema
> > > > > > > > to --> newSchema
> > > > > > > >
> > > > > > > > Use that switch case and have a method for each like:
> > > > > > > > validateBackwardsCompatibility(currentSchema, newSchema)
> > > > > > > >
> > > > > > > > I dislike canRead and usage of writtenSchema, since you have
> > two
> > > > > > > completely
> > > > > > > > different use cases: from the producing side and the consumer
> > > side.
> > > > > > > >
> > > > > > > > schemaValidatorBuilder
> > > > > > > > >
> > > > > > > > > I dislike this proposal. IMO Avro implementation is way too
> > > > > > > complicated.
> > > > > > > > Why not have a simple function for validation for each switch
> > > case
> > > > > above?
> > > > > > > > Why do we need strategy and builder, and all this complexity?
> > > > > > > >
> > > > > > > >
> > > > > > > > *Here are the basic compatibility rules we've defined:*
> > > > > > > >
> > > > > > > >
> > > > > > > > IMO it's impossible to read the validation rules as you
> > described
> > > > > them.
> > > > > > > > I wrote how they should be structured numerous times above.
> > > > > > > > I can't validate them.
> > > > > > > >
> > > > > > > >
> > > > > > > > IMO, the current design is very hard to read.
> > > > > > > > Please try to avoid jumping into code sections.
> > > > > > > > Write a high level design section, in which you describe in
> > words
> > > > > what
> > > > > > > you
> > > > > > > > plan to do.
> > > > > > > > Write the validation rules in the structure that is easy to
> > > > > understand:
> > > > > > > > rules per each compatibility check, and use proper words
> > (current
> > > > > schema,
> > > > > > > > new schema), since new schema can be once used for read and
> > once
> > > > > used for
> > > > > > > > write.
> > > > > > > >
> > > > > > > > In its current form it takes too much time to understand the
> > > design,
> > > > > and
> > > > > > > it
> > > > > > > > shouldn't be the case.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Asaf
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Mar 5, 2023 at 3:58 PM SiNan Liu <
> > liusinan1...@gmail.com
> > > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi! I updated the explanation of some things in the PIP
> > issue.
> > > And
> > > > > also
> > > > > > > > > added a new “flag” in the conf is used as the different
> > > > > > > > > ProtobufNativeSchemaValidator implementation, also set
> > > > > > > > > ProtobufNativeSchemaValidator default only check whether
> the
> > > name
> > > > > of
> > > > > > > the
> > > > > > > > > root message is the same.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > sinan
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月5日周日 20:21写道:
> > > > > > > > >
> > > > > > > > > > On Wed, Mar 1, 2023 at 4:33 PM SiNan Liu <
> > > liusinan1...@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Can you please explain how a Protobuf Schema
> descriptor
> > > can
> > > > > be
> > > > > > > > > > validated
> > > > > > > > > > > > for backward compatibility check using Avro based
> > > > > compatibility
> > > > > > > > > rules?
> > > > > > > > > > > > Doesn't it expect the schema to be Avro, but it is
> > > actually a
> > > > > > > > > Protobuf
> > > > > > > > > > > > descriptor?
> > > > > > > > > > > > Is there some translation happening?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 1. *You can take a quick look at the previous design,
> the
> > > > > PROTOBUF
> > > > > > > uses
> > > > > > > > > > > avro struct to store.*
> > > > > > > > > > > https://github.com/apache/pulsar/pull/1954
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L59-L61
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L110-L115
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok. So to summarize your code (easier to write it than
> send
> > > > > links):
> > > > > > > > > > * Pulsar Client, when used with Protobuf Schema, actually
> > > > > converts
> > > > > > > the
> > > > > > > > > > Protobuf descriptor into an Avro Schema (using code found
> > > inside
> > > > > Avro
> > > > > > > > > > library) and saves that Avro schema as the schema. It's
> not
> > > > > saving
> > > > > > > the
> > > > > > > > > > protobuf descriptor at all. Very confusing I have to add
> -
> > > never
> > > > > > > expected
> > > > > > > > > > that.
> > > > > > > > > > This explains why In the ProtobufSchemaCompatibilityCheck
> > > they
> > > > > just
> > > > > > > > > extend
> > > > > > > > > > the Avro without doing any translation.
> > > > > > > > > >
> > > > > > > > > > Thanks for that.
> > > > > > > > > >
> > > > > > > > > > Now thatI finally understand this, I can say that: you
> > *must*
> > > > > explain
> > > > > > > > > that
> > > > > > > > > > in the motivation part in your PIP.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2. *On the broker side, protobuf and avro both use
> > > `SchemaData`
> > > > > > > > > converted
> > > > > > > > > > > to `org.apache.avro.Schema`.*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java#L1280-L1293
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/ProtobufSchemaCompatibilityCheck.java#L26-L31
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/579f22c8449be287ee1209a477aeaad346495289/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/AvroSchemaBasedCompatibilityCheck.java#L47-L70
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Actually those links don't really help.
> > > > > > > > > > The main link that helps is:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/pulsar/blob/ec102fb024a6ea2b195826778300f20e330dff06/pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/ProtobufSchema.java#L102-L122
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > > > > I understand the different compatibility check
> > > strategies.
> > > > > > > > > > > > If you just spell them out here, then as you say,
> just
> > > > > translate
> > > > > > > the
> > > > > > > > > > > > Protobuf Descriptor into an Avro schema and run the
> > Avro
> > > > > > > > > > > > compatibility validation, no?
> > > > > > > > > > > > I believe the answer is no, since you may want to
> > verify
> > > > > > > different
> > > > > > > > > > things
> > > > > > > > > > > > when it comes to Protobuf, which are different then
> > Avro.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 1.
> > > > > > > > > > > *ProtobufSchema is different from ProtobufNativeSchema
> in
> > > that
> > > > > it
> > > > > > > uses
> > > > > > > > > > > avro-protobuf.*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://central.sonatype.com/artifact/org.apache.avro/avro-protobuf/1.11.1/overview
> > > > > > > > > > > *ProtobufNativeSchema needs a native compatibility
> check,
> > > but
> > > > > > > there is
> > > > > > > > > no
> > > > > > > > > > > official or third party implementation. So this PIP
> does
> > > not
> > > > > use
> > > > > > > > > > > avro-protobuf for protobuf compatibility checking.*
> > > > > > > > > > >
> > > > > > > > > > > 2. *By the way, this is implemented in much the same
> way
> > > that
> > > > > > > Apache
> > > > > > > > > avro
> > > > > > > > > > > does compatibility checking.*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/SchemaValidatorBuilder.java
> > > > > > > > > > >
> > `canReadStrategy`,`canBeReadStrategy`,`mutualReadStrategy`
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanRead.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateCanBeRead.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/ValidateMutualRead.java
> > > > > > > > > > > *In `ValidateMutualRead.java`, the arguments of
> > > `canRead()` are
> > > > > > > > > > > writtenSchema and readSchema. We only need to change
> the
> > > order
> > > > > of
> > > > > > > > > > arguments
> > > > > > > > > > > we pass to `canRead()`.*
> > > > > > > > > > > ```java
> > > > > > > > > > > private void
> validateWithStrategy(Descriptors.Descriptor
> > > > > > > toValidate,
> > > > > > > > > > > Descriptors.Descriptor fromDescriptor) throws
> > > > > > > > > > ProtoBufCanReadCheckException
> > > > > > > > > > > {
> > > > > > > > > > > switch (strategy) {
> > > > > > > > > > > case CanReadExistingStrategy -> canRead(fromDescriptor,
> > > > > > > toValidate);
> > > > > > > > > > > case CanBeReadByExistingStrategy -> canRead(toValidate,
> > > > > > > > > fromDescriptor);
> > > > > > > > > > > case CanBeReadMutualStrategy -> {
> > > > > > > > > > > canRead(toValidate, fromDescriptor);
> > > > > > > > > > > canRead(fromDescriptor, toValidate);
> > > > > > > > > > > }
> > > > > > > > > > > }
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > private void canRead(Descriptors.Descriptor
> > writtenSchema,
> > > > > > > > > > > Descriptors.Descriptor readSchema) throws
> > > > > > > > > ProtoBufCanReadCheckException {
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> ProtobufNativeSchemaBreakCheckUtils.checkSchemaCompatibility(writtenSchema,
> > > > > > > > > > > readSchema);
> > > > > > > > > > > }
> > > > > > > > > > > ```
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > I get that you want to take inspiration from the existing
> > > Avro
> > > > > Schema
> > > > > > > > > > compatibility check, to do your code design.
> > > > > > > > > > I also understand you *won't* use any existing avro code
> > for
> > > > > that.
> > > > > > > > > > I also understand, you have to write the validation check
> > on
> > > your
> > > > > > > own,
> > > > > > > > > > since there is no 3rd party to explain that.
> > > > > > > > > >
> > > > > > > > > > The only thing I can't understand are the actual rules
> as I
> > > wrote
> > > > > > > before,
> > > > > > > > > > since they are written confusingly.
> > > > > > > > > > So, I repeat what I asked before:
> > > > > > > > > >
> > > > > > > > > > I think you should structure the validation rules
> > > differently:
> > > > > > > > > >
> > > > > > > > > > * Backward checks
> > > > > > > > > > ** List down rules, where use newSchema (the schema used
> by
> > > > > producer
> > > > > > > or
> > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > * Forward
> > > > > > > > > > ** List down rules, where use newSchema (the schema used
> by
> > > > > producer
> > > > > > > or
> > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > >
> > > > > > > > > > Once that's accomplished I will be able to understand the
> > > > > different
> > > > > > > > > > validation rules for each compatibility check.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > sinan
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年3月1日周三
> > 21:19写道:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 27, 2023 at 3:47 PM SiNan Liu <
> > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I read it and they look identical. What's the
> > > difference
> > > > > > > between
> > > > > > > > > > > them?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Current avro,json, and protobuf schemas are all
> > > implemented
> > > > > > > based
> > > > > > > > > on
> > > > > > > > > > > AVRO.
> > > > > > > > > > > > > > What do you mean, they are all implemented based
> on
> > > > > Avro? You
> > > > > > > > > mean
> > > > > > > > > > > the
> > > > > > > > > > > > > > protobuf schema is converted into an Avro Schema,
> > and
> > > > > then
> > > > > > > you
> > > > > > > > > use
> > > > > > > > > > > Avro
> > > > > > > > > > > > > > compatibility validation?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> `org.apache.pulsar.broker.service.schema.ProtobufSchemaCompatibilityCheck`
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > `org.apache.pulsar.broker.service.schema.AvroSchemaCompatibilityCheck`
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > `org.apache.pulsar.broker.service.schema.JsonSchemaCompatibilityCheck`
> > > > > > > > > > > > > They all extends
> `AvroSchemaBasedCompatibilityCheck`,
> > > the
> > > > > > > > > > > > > `checkCompatible()` is the same implementation with
> > > `AVRO`.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Can you please explain how a Protobuf Schema
> descriptor
> > > can
> > > > > be
> > > > > > > > > > validated
> > > > > > > > > > > > for backward compatibility check using Avro based
> > > > > compatibility
> > > > > > > > > rules?
> > > > > > > > > > > > Doesn't it expect the schema to be Avro, but it is
> > > actually a
> > > > > > > > > Protobuf
> > > > > > > > > > > > descriptor?
> > > > > > > > > > > > Is there some translation happening?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think you should structure the validation rules
> > > > > differently:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The Compatibility check strategy is described on
> the
> > > > > website
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy
> > > > > > > > > > > > > 1. BACKWARD(CanReadExistingStrategy): Consumers
> using
> > > > > schema
> > > > > > > V3 can
> > > > > > > > > > > process
> > > > > > > > > > > > > data written by producers using the last schema
> > version
> > > > > V2. So
> > > > > > > V2
> > > > > > > > > is
> > > > > > > > > > > > > "writtenSchema" and V3 is "readSchema".
> > > > > > > > > > > > > 2. FORWARD(CanBeReadByExistingStrategy): Consumers
> > > using
> > > > > the
> > > > > > > last
> > > > > > > > > > > schema
> > > > > > > > > > > > > version V2 can process data written by producers
> > using
> > > a
> > > > > new
> > > > > > > schema
> > > > > > > > > > V3,
> > > > > > > > > > > > > even though they may not be able to use the full
> > > > > capabilities
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > new
> > > > > > > > > > > > > schema. So V3 is "writtenSchema" and V2 is
> > > "readSchema".
> > > > > > > > > > > > > 3. FULL(CanBeReadMutualStrategy): Schemas are both
> > > > > backward and
> > > > > > > > > > forward
> > > > > > > > > > > > > compatible.
> > > > > > > > > > > > > Schema can evolve. The old version schema and the
> new
> > > > > version
> > > > > > > > > schema
> > > > > > > > > > > should
> > > > > > > > > > > > > be well understood.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > I'm sorry - I don't understand.
> > > > > > > > > > > > I understand the different compatibility check
> > > strategies.
> > > > > > > > > > > > If you just spell them out here, then as you say,
> just
> > > > > translate
> > > > > > > the
> > > > > > > > > > > > Protobuf Descriptor into an Avro schema and run the
> > Avro
> > > > > > > > > > > > compatibility validation, no?
> > > > > > > > > > > > I believe the answer is no, since you may want to
> > verify
> > > > > > > different
> > > > > > > > > > things
> > > > > > > > > > > > when it comes to Protobuf, which are different then
> > Avro.
> > > > > > > > > > > >
> > > > > > > > > > > > At the current state, I can't understand your design
> at
> > > all.
> > > > > > > Please
> > > > > > > > > > help
> > > > > > > > > > > > clarify that.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The arguments of `canRead()` are writtenSchema and
> > > > > readSchema.
> > > > > > > As
> > > > > > > > > > we've
> > > > > > > > > > > > > just described, we only need to change the order of
> > > > > arguments
> > > > > > > we
> > > > > > > > > pass
> > > > > > > > > > > to
> > > > > > > > > > > > > `canRead()`.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > sinan
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com> 于2023年2月27日周一
> > > 20:49写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And you can see the difference between ProtoBuf
> > and
> > > > > > > > > > ProtoBufNative:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > >
> > > https://pulsar.apache.org/docs/next/schema-get-started/#protobufnative
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > >
> > > https://pulsar.apache.org/docs/next/schema-get-started/#protobuf
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >  I read it and they look identical. What's the
> > > difference
> > > > > > > between
> > > > > > > > > > > them?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Current avro,json, and protobuf schemas are all
> > > > > implemented
> > > > > > > based
> > > > > > > > > > on
> > > > > > > > > > > > > AVRO.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What do you mean, they are all implemented based
> on
> > > > > Avro? You
> > > > > > > > > mean
> > > > > > > > > > > the
> > > > > > > > > > > > > > protobuf schema is converted into an Avro Schema,
> > and
> > > > > then
> > > > > > > you
> > > > > > > > > use
> > > > > > > > > > > Avro
> > > > > > > > > > > > > > compatibility validation?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > *Here are the basic compatibility rules we've
> > > defined:*
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think you should structure the validation rules
> > > > > > > differently:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > * Backward checks
> > > > > > > > > > > > > > ** List down rules, where use newSchema (the
> schema
> > > used
> > > > > by
> > > > > > > > > > producer
> > > > > > > > > > > or
> > > > > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > > > > > * Forward
> > > > > > > > > > > > > > ** List down rules, where use newSchema (the
> schema
> > > used
> > > > > by
> > > > > > > > > > producer
> > > > > > > > > > > or
> > > > > > > > > > > > > > consumer) and existingSchema (last schema used)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So each strategy should have its own section.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm saying this since you used "writttenSchema"
> > word
> > > but
> > > > > it
> > > > > > > > > > > represents
> > > > > > > > > > > > > > something completely different if it's backward
> or
> > > > > forward
> > > > > > > check.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Once you'll have that structure like that, I
> > > personally
> > > > > will
> > > > > > > be
> > > > > > > > > > able
> > > > > > > > > > > to
> > > > > > > > > > > > > > read and understand it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The motivation and problem statement are now
> good -
> > > > > thanks
> > > > > > > for
> > > > > > > > > > > improving
> > > > > > > > > > > > > > it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Feb 27, 2023 at 8:20 AM SiNan Liu <
> > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi! I updated the PIP issue again. This time
> I've
> > > added
> > > > > > > some
> > > > > > > > > > > background
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > some explanations.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The compatibility check rules are already
> written
> > > in
> > > > > the
> > > > > > > > > > > > > Implementation.
> > > > > > > > > > > > > > > ProtoBufNative implements the same canRead
> method
> > > as
> > > > > Apache
> > > > > > > > > Avro.
> > > > > > > > > > > > > > > It does this by checking whether the schema for
> > > > > writing and
> > > > > > > > > > reading
> > > > > > > > > > > is
> > > > > > > > > > > > > > > compatible. I also indicate whether the
> > > writtenSchema
> > > > > and
> > > > > > > > > > > readSchema of
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > Backward, Forward, and Full strategies are the
> > old
> > > or
> > > > > the
> > > > > > > new
> > > > > > > > > > > version
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > the schema.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > sinan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com>
> > 于2023年2月26日周日
> > > > > 23:24写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm sorry, but this PIP lacks a lot of
> > background
> > > > > > > knowledge,
> > > > > > > > > so
> > > > > > > > > > > you
> > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > add IMO for people to understand it. You
> don't
> > > need
> > > > > to
> > > > > > > > > explain
> > > > > > > > > > > the
> > > > > > > > > > > > > > entire
> > > > > > > > > > > > > > > > pulsar in this PIP, but at the very least a
> few
> > > > > > > paragraphs
> > > > > > > > > > > detailing
> > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > you need to know, to put you in context:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    - Start by saying Pulsar as a built-in
> > schema
> > > > > registry
> > > > > > > > > > inside
> > > > > > > > > > > > > Pulsar
> > > > > > > > > > > > > > > >    broker.
> > > > > > > > > > > > > > > >       - Every time the client updates the
> > > schema, it
> > > > > > > uploads
> > > > > > > > > it
> > > > > > > > > > > to
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > >       broker. When that happens, it has a
> > feature
> > > > > which
> > > > > > > > > > validates
> > > > > > > > > > > if
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > >       schema version is compatible with the
> > > previous
> > > > > > > > > versions.
> > > > > > > > > > > There
> > > > > > > > > > > > > > > > are 4 types
> > > > > > > > > > > > > > > >       of compatibility: Full, ... (complete
> and
> > > > > explain
> > > > > > > each
> > > > > > > > > > one
> > > > > > > > > > > > > > briefly)
> > > > > > > > > > > > > > > >    - Also explain Pulsar Schema registry
> > supports
> > > > > various
> > > > > > > > > > schema
> > > > > > > > > > > > > > > >    protocols:  Avro, protobuf native, ...
> > > (complete
> > > > > the
> > > > > > > > > rest),
> > > > > > > > > > > each
> > > > > > > > > > > > > > > > protocol
> > > > > > > > > > > > > > > >    has a schema which dictates how to
> serialize
> > > and
> > > > > > > > > deserialize
> > > > > > > > > > > the
> > > > > > > > > > > > > > > message
> > > > > > > > > > > > > > > >    content into typed object.
> > > > > > > > > > > > > > > >    - Explain in short what is protobuf native
> > > > > (compare
> > > > > > > > > protobuf
> > > > > > > > > > > > > > > non-native)
> > > > > > > > > > > > > > > >    - Please don't paste code instead of
> > > explaining.
> > > > > > > > > > > > > > > >       - Explain that protobuf native current
> > > > > validation
> > > > > > > check
> > > > > > > > > > is
> > > > > > > > > > > only
> > > > > > > > > > > > > > > >       composed of checking the root message
> > name
> > > is
> > > > > the
> > > > > > > same
> > > > > > > > > > > between
> > > > > > > > > > > > > > > > the current
> > > > > > > > > > > > > > > >       schema version and the new version.
> > > > > > > > > > > > > > > >          - Explain briefly what is a root
> > message
> > > > > and its
> > > > > > > > > name.
> > > > > > > > > > > > > > > >       - Explain the problem (list scenarios)
> > > that we
> > > > > have
> > > > > > > > > > because
> > > > > > > > > > > > > > > protobuf
> > > > > > > > > > > > > > > >       native schema only supports FULL
> > > compatibility
> > > > > > > > > > validation.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regarding high level design - as in what you
> > > plan to
> > > > > do.
> > > > > > > > > > > > > > > > I suggest you add "High Level Design" and in
> it
> > > > > detail
> > > > > > > how
> > > > > > > > > you
> > > > > > > > > > > plan
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > validate, per protobuf version, per
> > compatibility
> > > > > check
> > > > > > > > > > > (backward,
> > > > > > > > > > > > > > > forward,
> > > > > > > > > > > > > > > > full,...).
> > > > > > > > > > > > > > > > I tried reading the implementation - for me ,
> > > it's
> > > > > all
> > > > > > > over
> > > > > > > > > the
> > > > > > > > > > > > > place.
> > > > > > > > > > > > > > > Can
> > > > > > > > > > > > > > > > you please list in order what I wrote above,
> > and
> > > > > list the
> > > > > > > > > > > validation
> > > > > > > > > > > > > > > rules
> > > > > > > > > > > > > > > > with a good explanation why you validate it
> > like
> > > > > that?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Lastly, one you have all the validation rules
> > > clearly
> > > > > > > stated,
> > > > > > > > > > you
> > > > > > > > > > > can
> > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > it to document it properly so users can know
> > what
> > > > > > > validation
> > > > > > > > > to
> > > > > > > > > > > > > expect.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Asaf
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Feb 22, 2023 at 5:10 PM SiNan Liu <
> > > > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Sorry, my mistake. I removed the code and
> > > > > described the
> > > > > > > > > > design
> > > > > > > > > > > to
> > > > > > > > > > > > > > > improve
> > > > > > > > > > > > > > > > > the PROTOBUF_NATIVE schema compatibility
> > > checks.
> > > > > You
> > > > > > > can
> > > > > > > > > > have a
> > > > > > > > > > > > > look.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Asaf Mesika <asaf.mes...@gmail.com>
> > > 于2023年2月22日周三
> > > > > > > 21:16写道:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I read it but you're almost directly
> diving
> > > into
> > > > > the
> > > > > > > > > code -
> > > > > > > > > > > it
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > take
> > > > > > > > > > > > > > > > > me
> > > > > > > > > > > > > > > > > > hours just to reverse engineer your
> design.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Can you please include a "High Level
> > Design"
> > > > > section
> > > > > > > in
> > > > > > > > > > which
> > > > > > > > > > > you
> > > > > > > > > > > > > > > > explain
> > > > > > > > > > > > > > > > > > how you plan to tackle any issue?
> > > > > > > > > > > > > > > > > > If I can read that section and explain to
> > > someone
> > > > > > > else
> > > > > > > > > how
> > > > > > > > > > > this
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > work,
> > > > > > > > > > > > > > > > > > it means the section is complete.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Let's leave the code to the PRs.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Sun, Feb 19, 2023 at 2:59 PM SiNan
> Liu <
> > > > > > > > > > > > > liusinan1...@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I made a PIP to discuss:
> > > > > > > > > > > > > > > >
> https://github.com/apache/pulsar/issues/19565
> > > > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > We can talk about the current design
> > here.
> > > > > > > Especially
> > > > > > > > > for
> > > > > > > > > > > the
> > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > > > > change check rules, please give your
> > > valuable
> > > > > > > advice.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > Sinan
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>

Reply via email to