Previously I submitted a proposal for adding schemas as a first-class
concept on Beam PCollections. The proposal engendered quite a bit of
discussion from the community - more discussion than I've seen from almost
any of our proposals to date!

Based on the feedback and comments, I reworked the proposal document quite
a bit. It now talks more explicitly about the difference between dynamic
schemas (where the schema is not fully known at graph-creation time)
and static schemas (which are fully known at graph-creation time). Proposed
APIs are more fleshed out now (again thanks to feedback from community
members), and the document talks in more detail about evolving schemas in
long-running streaming pipelines.

Please take a look. I think this will be very valuable to Beam, and I
welcome any feedback.

https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit#

Reuven
