On Wed, Dec 20, 2023 at 8:41 AM Ben San Nicolas via dev <dev@beam.apache.org> wrote:
> Hi,
>
> I'm looking to make use of https://github.com/apache/beam/issues/23373 so
> I can use a java avro schema with enums xlang from python.
>
> Are there existing ideas on how to implement this?
>
> I tried taking a look and the Python SDK has a very simple map from
> concrete python type to logical type, which doesn't seem sufficient to
> encode additional type info from an argument, since a map from class to
> logical type and the associated _from_typing(class) conversion doesn't
> maintain any arguments.
>
> I tried comparing this with the Java SDK, but it looks like it has a full
> abstraction around a schema model, which directly encodes the logical arg
> type/arg from the protobuf and then handles the conversions implicitly
> somehow/elsewhere. As far as I could tell, this abstraction is missing from
> python, in the sense that schemas_test.py roundtrips proto classes through
> concrete python types like typing.Mapping[typing.Optional[numpy.int64],
> bytes] rather than an SDK-defined schema class.
>
> The simplest solution might be to just update language_type() to return a
> tuple of class, arg, and register that in the logical type mapping. Does
> this make sense?
>

Yes, this'd be a great step forward. Not being able to specify the enum
logical type (that seems widely used in Java) is a major pain point for
Python.
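
To make the (class, arg) idea concrete, here is a rough standalone sketch of what keying the mapping on a tuple could look like. It does not use or mirror the actual Beam code: LogicalTypeRegistry, enum_argument, ENUM_URN and the Color enum are all hypothetical names made up for illustration, and the URN string is a placeholder.

```python
import enum
from typing import Dict, Hashable, Optional, Tuple

# Placeholder URN for illustration only; not taken from Beam.
ENUM_URN = "example:logical_type:enum:v1"


class LogicalTypeRegistry:
    """Toy registry keyed by (language type, argument), not by class alone."""

    def __init__(self) -> None:
        self._urn_by_key: Dict[Tuple[type, Optional[Hashable]], str] = {}

    def add(self, urn: str, language_type: type,
            argument: Optional[Hashable] = None) -> None:
        # The argument is part of the key, so it survives the lookup.
        self._urn_by_key[(language_type, argument)] = urn

    def get_urn(self, language_type: type,
                argument: Optional[Hashable] = None) -> Optional[str]:
        return self._urn_by_key.get((language_type, argument))


def enum_argument(enum_cls) -> Tuple[Tuple[str, int], ...]:
    """Builds a hashable argument from an enum's (name, value) pairs."""
    return tuple((member.name, member.value) for member in enum_cls)


class Color(enum.Enum):
    RED = 0
    GREEN = 1


registry = LogicalTypeRegistry()
registry.add(ENUM_URN, enum.Enum, enum_argument(Color))

# Lookup needs both the class and the argument; the class alone finds nothing.
print(registry.get_urn(enum.Enum, enum_argument(Color)))
print(registry.get_urn(enum.Enum))
```

The only point of the sketch is that the argument has to be hashable to take part in the key, which is why the enum members are flattened into a tuple of pairs here. How that would fit with the real language_type() and _from_typing() is left open.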