Thanks all for the input. It helped a lot with the doc. From the feedback:

- The main concern is that RawType/CoderLogicalType breaks the strong mapping of schema<->coder. This is a valid concern.

- On the other hand, it is a way to make schemas the fundamental concept (which is a goal of Beam 3), given that Beam and its ecosystem have already evolved for years, with many Beam pipelines using (non-portable) coders + custom types.

Based on this feedback, I suggest we proceed with the CoderLogicalType approach, given the requirements noted in the "Requirement" section of the doc. In addition:

- We should clearly document that this approach, if implemented, should not be used to bypass the schema framework. We always encourage schema-fying structured types.

I'll start drafting changes for each supported SDK.

Thanks again!

On Wed, Sep 3, 2025 at 1:51 PM Yi Hu <[email protected]> wrote:

> Hi all,
>
> Please find the following design doc for a portable RAW field type, which
> enables arbitrary (serializable) data types to be included and to take
> advantage of the Beam portable schema framework:
>
> https://s.apache.org/beam-portable-raw-type
>
> It aims to solve https://github.com/apache/beam/issues/23374 (as well as
> https://github.com/apache/beam/issues/19817) as part of schema
> improvement for Beam 3 (https://github.com/apache/beam/issues/34672).
>
> It also includes an appendix of term disambiguation between
> Beam/Flink/Avro schema systems that some might find useful in general.
>
> I proposed two alternative designs. Any feedback is welcome!
>
> Regards,
> Yi
>
> --
> Yi Hu, (he/him/his)
> Software Engineer
