Hi Yunze,

1. Could the change cause compatibility issues? If so, how do
we solve them? It may be a breaking change.

2. If sorting is enabled by default, is the sorting rule the
same as in the Java client and the other clients?
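
To make both questions concrete, here is a rough sketch of how the two
defaults change the generated Avro definition. It is plain Python, not the
client's actual schema generator: `avro_record_schema` and its parameters are
made up for illustration, and the name ordering assumes Python's default
string comparison, which may or may not match the rule the Java client uses.

```python
import json


def avro_record_schema(name, fields, sorted_fields, required_default):
    """Build an Avro-style record schema as a plain dict.

    `fields` is a list of (field_name, avro_type, required) tuples, where
    required=None means "fall back to required_default".
    Hypothetical helper for illustration only.
    """
    # Assumption: sort by field name using Python's default string ordering.
    items = sorted(fields, key=lambda f: f[0]) if sorted_fields else list(fields)
    out = []
    for field_name, avro_type, required in items:
        if required is None:
            required = required_default
        # An optional field becomes a union with "null", as in the thread.
        out.append({
            "name": field_name,
            "type": avro_type if required else ["null", avro_type],
        })
    return {"type": "record", "name": name, "fields": out}


# The User example from the thread: the string field is explicitly optional,
# the numeric fields follow whatever the default is.
declared = [("name", "string", False), ("age", "int", None), ("score", "double", None)]

python_style = avro_record_schema("User", declared,
                                  sorted_fields=False, required_default=False)
java_style = avro_record_schema("User", declared,
                                sorted_fields=True, required_default=True)

print(json.dumps(python_style, indent=2))
print(json.dumps(java_style, indent=2))
```

With the Python-style defaults every field is a nullable union in declaration
order; with the Java-style defaults the fields come out as age, name, score
and only `name` stays nullable, so the two definitions differ in both field
types and field order.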

Putting aside the above two problems, I think it is
good to be consistent with other clients.

Thanks,
Bo

Eric Hare <eric.h...@datastax.com> wrote on Wed, Mar 29, 2023 at 22:42:
>
> +1 - I think keeping the `_sorted_fields` and `_required` defaults consistent
> between the clients is the way to go.
>
> > On Mar 29, 2023, at 7:09 AM, Yunze Xu <y...@streamnative.io.INVALID> wrote:
> >
> > I found the Python client has two options to control the behavior:
> > 1. Set `_sorted_fields`. It's false by default in the Python client,
> > but it's true in the Java client. i.e. the Java client sorts all
> > fields by default.
> > 2. Set `_required`. It's false by default for all types in the Python
> > client, but it's only false for the string type in the Java client.
> >
> > i.e. given the following Java class:
> >
> > ```java
> > class User {
> >    String name;
> >    int age;
> >    double score;
> > }
> > ```
> >
> > We have to give the following definition in Python:
> >
> > ```python
> > class User(Record):
> >    _sorted_fields = True
> >    name = String()
> >    age = Integer(required=True)
> >    score = Double(required=True)
> > ```
> >
> > I see https://github.com/apache/pulsar/pull/12232 adds the
> > `_sorted_fields` field and disables the field sort by default. It
> > breaks compatibility with the Java client.
> >
> > IMO, we should make `_sorted_fields` true by default and `_required`
> > true for all types other than `String` by default.
> >
> > Thanks,
> > Yunze
> >
> > On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
> >>
> >> Hi all,
> >>
> >> Recently I found the default generated schema definition in the Python
> >> client is different from the Java client, which leads to some
> >> unexpected behavior.
> >>
> >> For example, given the following class definition in Python:
> >>
> >> ```python
> >> class Data(Record):
> >>    i = Integer()
> >> ```
> >>
> >> The type of the `i` field is a union: "type": ["null", "int"]
> >>
> >> While given the following class definition in Java:
> >>
> >> ```java
> >> class Data {
> >>    private final int i;
> >>    /* ... */
> >> }
> >> ```
> >>
> >> The type of the `i` field is an integer: "type": "int"
> >>
> >> This causes a problem: if a Python consumer subscribes to a topic
> >> with the schema defined above, a Java producer will then fail to be
> >> created because of the schema incompatibility.
> >>
> >> Currently, the workaround is to change the schema compatibility
> >> strategy to FORWARD.
> >>
> >> Should we change how the Python client generates the schema
> >> definition so that it is compatible with the Java client? It could
> >> bring breaking changes to old Python clients, but it would guarantee
> >> compatibility with the Java client.
> >>
> >> If not, we still have to introduce an extra configuration to make
> >> Python schema compatible with Java schema. But it requires code
> >> changes. e.g. here is a possible solution:
> >>
> >> ```python
> >> class Data(Record):
> >>    # NOTE: Users might have to add this extra field to control how
> >>    # the schema is generated
> >>    __java_compatible = True
> >>    i = Integer()
> >> ```
> >>
> >> Thanks,
> >> Yunze
>
