On Mon, Jun 15, 2020 at 11:12 AM Robert Bradshaw <rober...@google.com> wrote:
> On Fri, Jun 12, 2020 at 4:12 PM Brian Hulette <bhule...@google.com> wrote:
>
>> > are unknown fields propagated through if the user only reads/modifies a
>> > row?
>> I'm not sure I understand this question. Are you asking about handling
>> schema changes?
>> The wire format includes the number of fields in the schema, specifically
>> so that we can detect when the schema changes. This is restricted to added
>> or removed fields at the end of the schema. i.e. if we receive an element
>> that says it has N more fields than the schema this coder was created with
>> we assume the pipeline was updated with a schema that drops the last N
>> fields and ignore the extra fields. Similarly if we receive an element with
>> N fewer fields than we expect we'll just fill the last N fields with nulls.
>> This logic is implemented in Python [1] and Java [2], but it's not
>> exercised since no runners actually support pipeline update with schema
>> changes.
>>
>> > how does it work in a pipeline update scenario (downgrade / upgrade)?
>> It's a standard coder with a defined spec [3] and tests in
>> standard_coders.yaml [4] (although we could certainly use more coverage
>> there) so I think pipeline update should work fine, unless I'm missing
>> something.
>>
>
> The big question is whether the pipeline update will be rejected due to
> the Coder having "changed."
>

Do you mean changed because the schema has changed, or due to the vagaries
of Java serialization?
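
To make sure we are talking about the same behaviour, here is a minimal
sketch of the field-count handling Brian describes above. It is not the code
in row_coder.py or RowCoderGenerator.java. The function name
reconcile_fields and its parameters are hypothetical, and a plain list
stands in for the decoded element (the real coder reads the field count from
the encoded bytes).

```python
from typing import Any, List, Optional, Sequence


def reconcile_fields(
    decoded_values: Sequence[Any], expected_field_count: int
) -> List[Optional[Any]]:
    """Align decoded values with the schema this coder was created with.

    Hypothetical helper: only fields added or removed at the END of the
    schema are handled, as described in the thread.
    """
    received = len(decoded_values)
    if received > expected_field_count:
        # The element carries N extra trailing fields: assume the pipeline
        # was updated with a schema that drops them, and ignore the extras.
        return list(decoded_values[:expected_field_count])
    # The element has N fewer fields than expected (or the same number):
    # fill the last N fields with nulls.
    missing = expected_field_count - received
    return list(decoded_values) + [None] * missing


# Element written with 3 fields, read by a coder whose schema has 2.
print(reconcile_fields([1, "a", 2.5], 2))  # [1, 'a']
# Element written with 1 field, read by a coder whose schema has 3.
print(reconcile_fields([1], 3))  # [1, None, None]
```

Note that this only covers decoding an element once the coder is in place.
It says nothing about whether a runner would accept the updated pipeline.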
> Brian
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
>> [2]
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
>> [3]
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
>> [4]
>> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>>
>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> +Boyuan Zhang <boyu...@google.com>
>>>
>>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> What is the update / compat story around schemas?
>>>> * are unknown fields propagated through if the user only reads/modifies
>>>>   a row?
>>>> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>>>>
>>>> Boyuan has been working on a Kafka via SDF source and has been trying
>>>> to figure out which interchange format to use for the "source descriptors"
>>>> that feed into the SDF. Some obvious choices are json, avro, proto, and
>>>> Beam schemas all with their caveats.
>>>>
>>>> On Fri, Jun 12, 2020 at 1:32 PM Brian Hulette <bhule...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
>>>>> that. Are you planning on using row coder for them?
>>>>> If so I want to make sure you're aware of
>>>>> https://s.apache.org/beam-schema-io (sent to the dev list last week
>>>>> [1]). +Scott Lukas <slu...@google.com> will be working on building
>>>>> out the ideas there this summer. His work could be useful for making these
>>>>> IOs cross-language (and you would get a mapping to SQL out of it without
>>>>> much more effort).
>>>>>
>>>>> Brian
>>>>>
>>>>> [1]
>>>>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>>>>>
>>>>> On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>>>>> piotr.szuber...@polidea.com> wrote:
>>>>>
>>>>>> Sure, I'll do that
>>>>>>
>>>>>> On 2020/05/28 17:54:49, Chamikara Jayalath <chamik...@google.com>
>>>>>> wrote:
>>>>>> > Great. Thanks for working on this. Can you please add these tasks
>>>>>> > and JIRAs to the cross-language transforms roadmap under
>>>>>> > "Connector/transform support".
>>>>>> > https://beam.apache.org/roadmap/connectors-multi-sdk/
>>>>>> >
>>>>>> > Happy to help if you run into any issues during this task.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Cham
>>>>>> >
>>>>>> > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>>>>>> > piotr.szuber...@polidea.com> wrote:
>>>>>> >
>>>>>> > > I added a Jira task for creating cross-language wrappers for Java
>>>>>> > > IOs. It will soon be in progress.