Hi Yunze,

1. These changes may cause compatibility issues for existing users. How do we handle them? This may be a breaking change.
2. If sorting is enabled by default, is the sorting rule the same as in the Java client and the other clients?

Putting aside the above two questions, I think it is good to be consistent with the other clients.

Thanks,
Bo

Eric Hare <eric.h...@datastax.com> wrote on Wed, Mar 29, 2023 at 22:42:
>
> +1 - I think keeping the `_sorted_fields` and `_required` defaults consistent
> between the clients is the way to go.
>
> > On Mar 29, 2023, at 7:09 AM, Yunze Xu <y...@streamnative.io.INVALID> wrote:
> >
> > I found the Python client has two options to control the behavior:
> > 1. Set `_sorted_fields`. It's false by default in the Python client,
> > but it's true in the Java client. i.e. the Java client sorts all
> > fields by default.
> > 2. Set `_required`. It's false by default for all types in the Python
> > client, but it's only false for the string type in the Java client.
> >
> > i.e. given the following Java class:
> >
> > ```java
> > class User {
> >     String name;
> >     int age;
> >     double score;
> > }
> > ```
> >
> > We have to give the following definition in Python:
> >
> > ```python
> > class User(Record):
> >     _sorted_fields = True
> >     name = String()
> >     age = Integer(required=True)
> >     score = Double(required=True)
> > ```
> >
> > I see https://github.com/apache/pulsar/pull/12232 adds the
> > `_sorted_fields` field and disables the field sort by default. It
> > breaks compatibility with the Java client.
> >
> > IMO, we should make `_sorted_fields` true by default and `_required`
> > true for all types other than `String` by default.
> >
> > Thanks,
> > Yunze
> >
> > On Wed, Mar 29, 2023 at 4:00 PM Yunze Xu <y...@streamnative.io> wrote:
> >>
> >> Hi all,
> >>
> >> Recently I found the default generated schema definition in the Python
> >> client is different from the Java client, which leads to some
> >> unexpected behavior.
> >>
> >> For example, given the following class definition in Python:
> >>
> >> ```python
> >> class Data(Record):
> >>     i = Integer()
> >> ```
> >>
> >> The type of `i` field is a union: "type": ["null", "int"]
> >>
> >> While given the following class definition in Java:
> >>
> >> ```java
> >> class Data {
> >>     private final int i;
> >>     /* ... */
> >> }
> >> ```
> >>
> >> The type of `i` field is an integer: "type": "int"
> >>
> >> It brings an issue that if a Python consumer subscribes to a topic
> >> with schema defined above, then a Java producer will fail to create
> >> because of the schema incompatibility.
> >>
> >> Currently, the workaround is to change the schema compatibility
> >> strategy to FORWARD.
> >>
> >> Should we change the way to generate schema definition in the Python
> >> client to be compatible with the Java client? It could bring breaking
> >> changes to old Python clients, but it could guarantee compatibility
> >> with the Java client.
> >>
> >> If not, we still have to introduce an extra configuration to make
> >> Python schema compatible with Java schema. But it requires code
> >> changes. e.g. here is a possible solution:
> >>
> >> ```python
> >> class Data(Record):
> >>     # NOTE: Users might have to add this extra field to control how to
> >>     # generate the schema
> >>     __java_compatible = True
> >>     i = Integer()
> >> ```
> >>
> >> Thanks,
> >> Yunze
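
For reference, the schema difference discussed in this thread can be sketched in plain Python without the Pulsar client. The helpers `avro_field` and `avro_record` below are hypothetical and are not the client's implementation. The sketch also assumes that sorting means ordering by field name, which is exactly the rule question 2 above asks about and is not confirmed in this thread.

```python
import json


def avro_field(name, avro_type, required=False):
    """Build one Avro field entry (hypothetical helper).

    A non-required field is emitted as a union with "null",
    a required field as the bare type.
    """
    field_type = avro_type if required else ["null", avro_type]
    return {"name": name, "type": field_type}


def avro_record(name, fields, sorted_fields=False):
    """Build an Avro record schema from (name, type, required) tuples."""
    entries = [avro_field(*f) for f in fields]
    if sorted_fields:
        # Assumption: fields are ordered by name; the real rule may differ.
        entries.sort(key=lambda e: e["name"])
    return {"type": "record", "name": name, "fields": entries}


# Python-style default: nothing is required, so `i` becomes a union.
python_default = avro_record("Data", [("i", "int", False)])
# Java-style default for a primitive: `i` is a plain int.
java_like = avro_record("Data", [("i", "int", True)])

print(json.dumps(python_default["fields"][0]["type"]))  # ["null", "int"]
print(json.dumps(java_like["fields"][0]["type"]))       # "int"

# The User example from the thread, with sorting and required flags set.
user = avro_record(
    "User",
    [("name", "string", False), ("age", "int", True), ("score", "double", True)],
    sorted_fields=True,
)
print(json.dumps(user, indent=2))
```

The two `Data` schemas differ only in the type of `i`, which is the mismatch the original message describes between a Python consumer and a Java producer.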