Thank you, Valentyn!
On Fri, Jul 17, 2020 at 3:25 PM Chamikara Jayalath <chamik...@google.com> wrote:
>
> On Fri, Jul 17, 2020 at 3:01 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>>
>> As a general rule, fixes pertaining to new functionality are not good
>> candidates for a cherry-pick.
>>
>> A case for an exception can be made for polishing features related to major
>> wide announcements with a hard deadline, which appears to be the case for
>> xlang on Dataflow.
>>
>> I will prepare an RC2 with xlang fixes and consider other low-risk additions
>> from issues that were brought to my attention.
>
> Thanks Valentyn.
>
>>
>> Thanks
>>
>> On Fri, Jul 17, 2020 at 10:36 AM Chamikara Jayalath <chamik...@google.com>
>> wrote:
>>>
>>> On Fri, Jul 17, 2020 at 10:01 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>>
>>>> Taking a step back, the goal of avoiding cherry-picks is to reduce
>>>> risk and increase the velocity of our releases, as otherwise the
>>>> release manager gets inundated by a never-ending list of features
>>>> people want to get in that puts the releases further and further
>>>> behind (increasing the desire to get features in, in a vicious cycle).
>>>> On the flip side, the reason we have a release process with candidates
>>>> and voting (as opposed to just declaring a commit id every N weeks to
>>>> be "the release") is to give us the flexibility to achieve a level of
>>>> quality and polish that may not ever occur in HEAD itself.
>>>>
>>>> With regard to this specific cross-language fix, the motivation is
>>>> that those working on it at Google want to widely publish this feature
>>>> as newly available on Dataflow. The question to answer here (Cham) is
>>>> whether this bug is debilitating enough that were it not to be in the
>>>> release we would want to hold off advertising this (and related)
>>>> features until the next release. (In my understanding, it would result
>>>> in a poor enough user experience that it is.)
>>>
>>> Yes, I think we will have to either hold off on widely publishing the
>>> feature or list this as a potential issue that will be fixed in the next
>>> release for anybody who tries cross-language pipelines and runs into this.
>>> Note that we are getting in a Python Kafka example [1]. So users will
>>> potentially try this out anyways.
>>>
>>> [1] https://github.com/apache/beam/pull/12188
>>>
>>>>
>>>> On the other hand, there's the question of the cost of getting this
>>>> fix into the release. The change is simple and well contained, so I
>>>> think the risk is low (and, in particular, the cost to include it here
>>>> is low enough that it's worth the value provided above).
>>>>
>>>> Looking at the other proposals,
>>>> https://github.com/apache/beam/pull/12196 also seems to meet this bar
>>>> (there are possible xlang correctness issues at play here), as does
>>>> https://github.com/apache/beam/pull/12175 (mostly due to its
>>>> simplicity and the fact that doing it later would be a backwards
>>>> incompatible change). I'm on the fence about
>>>> https://github.com/apache/beam/pull/12171 (if an RC2 is in the works
>>>> anyway), and IMHO the others are less compelling as having to be done
>>>> now.
>>>
>>> +1
>>>
>>>>
>>>> (On the question of a point release, IMHO anything worth considering
>>>> for an x.y.1 release definitely meets the bar for inclusion into an RC
>>>> of an ongoing release.)
>>>>
>>>> - Robert
>>>>
>>>> On Thu, Jul 16, 2020 at 8:00 PM Chamikara Jayalath <chamik...@google.com>
>>>> wrote:
>>>> >
>>>> > On Thu, Jul 16, 2020 at 7:46 PM Chamikara Jayalath
>>>> > <chamik...@google.com> wrote:
>>>> >>
>>>> >> On Thu, Jul 16, 2020 at 7:28 PM Valentyn Tymofieiev
>>>> >> <valen...@google.com> wrote:
>>>> >>>
>>>> >>> On Thu, Jul 16, 2020, 19:07 Chamikara Jayalath <chamik...@google.com>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> On Thu, Jul 16, 2020 at 6:16 PM Valentyn Tymofieiev
>>>> >>>> <valen...@google.com> wrote:
>>>> >>>>>
>>>> >>>>> Thanks for the feedback, help with release validation, and for
>>>> >>>>> reaching out on dev@ regarding a cherry-pick request.
>>>> >>>>>
>>>> >>>>> BEAM-10397 pertains to new functionality (xlang support on
>>>> >>>>> Dataflow). Are there any reasons that this fix cannot wait until
>>>> >>>>> 2.24.0 (release cut date 4 weeks from now)?
>>>> >>>>>
>>>> >>>>> For transparency, I would like to list other cherry-pick requests
>>>> >>>>> that I received off-list (stakeholders bcc'ed):
>>>> >>>>> - https://github.com/apache/beam/pull/12175
>>>> >>>>> - https://github.com/apache/beam/pull/12196
>>>> >>>>> - https://github.com/apache/beam/pull/12171
>>>> >>>>> - https://issues.apache.org/jira/browse/BEAM-10492 (recently added)
>>>> >>>>> - https://issues.apache.org/jira/browse/BEAM-10385
>>>> >>>>> - https://github.com/apache/beam/pull/12187 (was available before
>>>> >>>>> any of the RC1 artifacts were created and integrated)
>>>> >>>>
>>>> >>>> My main concern is Python changes in
>>>> >>>> https://github.com/apache/beam/pull/12164. Other changes (at least
>>>> >>>> related to x-lang) can wait.
>>>> >>>>
>>>> >>>>>
>>>> >>>>> My response to such requests is guided by the release guide [1]:
>>>> >>>>>
>>>> >>>>> - None of the issues were a regression from a previous release.
>>>> >>>>> - Most are related to new or recently introduced functionality.
>>>> >>>>> - 3 of the requests are related to xlang io, which is very exciting
>>>> >>>>> and important functionality, but arguably does not impact a large
>>>> >>>>> percentage of [existing] users.
>>>> >>>>
>>>> >>>> Agree that this is not a regression from the previous release but it
>>>> >>>> may result in inconsistent behavior when users execute x-lang
>>>> >>>> pipelines. Actually I think this is a pretty serious issue for
>>>> >>>> portability (we are not setting the environment in WindowingStrategy)
>>>> >>>> but for some reason we are not hitting this in other tests.
>>>> >>>>
>>>> >>>>>
>>>> >>>>> So they do not seem to be release-blocking according to the guide.
>>>> >>>>>
>>>> >>>>> At this point creating a new RC would delay 2.23.0 availability by
>>>> >>>>> at least a week. While a new RC will improve the stability of xlang
>>>> >>>>> IO, it will also delay the release of features and bug fixes
>>>> >>>>> available in 2.23.0. It will also create a precedent of
>>>> >>>>> inconsistency with release policy. Should we delay the release if we
>>>> >>>>> discover another xlang issue during validation next week?
>>>> >>>>
>>>> >>>> To be honest, I don't think re-validating after the cherry-pick
>>>> >>>> mentioned above will take a week (unless we find other issues). We
>>>> >>>> just need to rebuild and re-validate the Python distribution and
>>>> >>>> maybe rebuild Dataflow containers. I'm volunteering to help you with
>>>> >>>> this :)
>>>> >>>
>>>> >>> I was taking into account the 72-hour voting window, which must happen
>>>> >>> outside of the weekend, and the fact that I will be OOO for one day.
>>>> >>
>>>> >> Got it.
>>>> >>
>>>> >>>
>>>> >>> If the issue you mention seriously impacts (can cause data loss,
>>>> >>> pipeline failures) all users on the portable stack or another large
>>>> >>> user base (not just cross-language support in Dataflow, a new user
>>>> >>> base), this is definitely a candidate for an ASAP fix.
>>>> >>>
>>>> >>> What is your assessment of the size of the user base that is affected
>>>> >>> by the issue (large, medium, small, does not affect production for any
>>>> >>> existing users)?
>>>> >>
>>>> >> Impact today I think is low but potential for impact in the future is
>>>> >> high. For example, if we update Dataflow service or portable runners to
>>>> >> require environment in WindowingStrategy, we'll have to either fork for
>>>> >> this or require users to upgrade to a Beam version with the fix.
>>>> >
>>>> > Actually, ignore the "portable runners" part. Seems like we already set
>>>> > "context.default_environment_id()" in the WindowingStrategy so impact is
>>>> > likely only for Dataflow where we do not set an environment_id in
>>>> > serialized WindowingStrategy that is set in GBK.
>>>> >
>>>> >>
>>>> >> Thanks,
>>>> >> Cham
>>>> >>
>>>> >>>
>>>> >>> Thanks!
>>>> >>>
>>>> >>>>
>>>> >>>>>
>>>> >>>>> My preferred course of action is to continue with RC1, since release
>>>> >>>>> velocity is important for product health.
>>>> >>>>>
>>>> >>>>> Given that we are having this conversation, we can revise the
>>>> >>>>> cherry-pick policy if we think it does not adequately cover this
>>>> >>>>> situation.
>>>> >>>>
>>>> >>>> Agree. We have a very strong policy currently regarding cherry-picks
>>>> >>>> but it's up to the release manager to look into requests on a
>>>> >>>> case-by-case basis.
>>>> >>>>
>>>> >>>>>
>>>> >>>>> We can also propose a patch-version release with urgent
>>>> >>>>> cherry-picks (release 2.23.1), or consider a faster release cadence
>>>> >>>>> if 6 weeks is too slow.
>>>> >>>>
>>>> >>>> Honestly I don't think this is practical. Making a new patch release,
>>>> >>>> validation, vote etc will take 2 weeks or so. We either should
>>>> >>>> cherry-pick this into current release or wait till the next one. I
>>>> >>>> think patch releases should be reserved for critical updates to LTS
>>>> >>>> releases.
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>> Cham
>>>> >>>>
>>>> >>>>>
>>>> >>>>> Thanks,
>>>> >>>>> Valentyn
>>>> >>>>>
>>>> >>>>> [1] https://beam.apache.org/contribute/release-guide/#review-cherry-picks
>>>> >>>>>
>>>> >>>>> On Wed, Jul 15, 2020 at 5:41 PM Chamikara Jayalath
>>>> >>>>> <chamik...@google.com> wrote:
>>>> >>>>>>
>>>> >>>>>> I agree. I think Dataflow x-lang users could run into flaky
>>>> >>>>>> pipelines due to this. Valentyn, are you OK with creating a new RC
>>>> >>>>>> that includes the fix (already merged -
>>>> >>>>>> https://github.com/apache/beam/pull/12164) and preferably
>>>> >>>>>> https://github.com/apache/beam/pull/12196 ?
>>>> >>>>>>
>>>> >>>>>> Thanks,
>>>> >>>>>> Cham
>>>> >>>>>>
>>>> >>>>>> On Wed, Jul 15, 2020 at 5:27 PM Heejong Lee <heej...@google.com>
>>>> >>>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> I think we need to cherry-pick
>>>> >>>>>>> https://issues.apache.org/jira/browse/BEAM-10397 which fixes
>>>> >>>>>>> missing environment errors for Dataflow xlang pipelines.
>>>> >>>>>>> Internally, we have a flaky xlang kafkaio test because of missing
>>>> >>>>>>> environment errors and any xlang pipelines using GroupByKey could
>>>> >>>>>>> encounter this.
>>>> >>>>>>>
>>>> >>>>>>> On Wed, Jul 15, 2020 at 5:08 PM Ahmet Altay <al...@google.com>
>>>> >>>>>>> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> On Wed, Jul 15, 2020 at 4:55 PM Robert Bradshaw
>>>> >>>>>>>> <rober...@google.com> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>> All the artifacts, signatures, and hashes look good.
>>>> >>>>>>>>>
>>>> >>>>>>>>> I would like to understand the severity of
>>>> >>>>>>>>> https://issues.apache.org/jira/browse/BEAM-10397 before giving my
>>>> >>>>>>>>> vote.
>>>> >>>>>>>>
>>>> >>>>>>>> +Heejong Lee to comment on this.
>>>> >>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Wed, Jul 15, 2020 at 10:51 AM Pablo Estrada
>>>> >>>>>>>>> <pabl...@google.com> wrote:
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > +1
>>>> >>>>>>>>> > I was able to run the python 3.8 quickstart from wheels on
>>>> >>>>>>>>> > DirectRunner.
>>>> >>>>>>>>> > I verified hashes for Python files.
>>>> >>>>>>>>> > -P.
>>>> >>>>>>>>> >
>>>> >>>>>>>>> > On Fri, Jul 10, 2020 at 4:34 PM Ahmet Altay <al...@google.com>
>>>> >>>>>>>>> > wrote:
>>>> >>>>>>>>> >>
>>>> >>>>>>>>> >> I validated the python 3 quickstarts. I had issues with
>>>> >>>>>>>>> >> running with python 3.8 wheel files, but did not have issues
>>>> >>>>>>>>> >> with source distributions, or other python wheel files. I
>>>> >>>>>>>>> >> have not tested python 2 quickstarts.
>>>> >>>>>>>>
>>>> >>>>>>>> Did someone validate python 3.8 wheels on Dataflow? I was not
>>>> >>>>>>>> able to run that.
>>>> >>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> >>
>>>> >>>>>>>>> >> On Thu, Jul 9, 2020 at 10:53 PM Valentyn Tymofieiev
>>>> >>>>>>>>> >> <valen...@google.com> wrote:
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> Hi everyone,
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> Please review and vote on the release candidate #1 for the
>>>> >>>>>>>>> >>> version 2.23.0, as follows:
>>>> >>>>>>>>> >>> [ ] +1, Approve the release
>>>> >>>>>>>>> >>> [ ] -1, Do not approve the release (please provide specific
>>>> >>>>>>>>> >>> comments)
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> The complete staging area is available for your review,
>>>> >>>>>>>>> >>> which includes:
>>>> >>>>>>>>> >>> * JIRA release notes [1],
>>>> >>>>>>>>> >>> * the official Apache source release to be deployed to
>>>> >>>>>>>>> >>> dist.apache.org [2], which is signed with the key with
>>>> >>>>>>>>> >>> fingerprint 1DF50603225D29A4 [3],
>>>> >>>>>>>>> >>> * all artifacts to be deployed to the Maven Central
>>>> >>>>>>>>> >>> Repository [4],
>>>> >>>>>>>>> >>> * source code tag "v2.23.0-RC1" [5],
>>>> >>>>>>>>> >>> * website pull request listing the release [6], publishing
>>>> >>>>>>>>> >>> the API reference manual [7], and the blog post [8].
>>>> >>>>>>>>> >>> * Java artifacts were built with Maven 3.6.0 and Oracle JDK
>>>> >>>>>>>>> >>> 1.8.0_201-b09.
>>>> >>>>>>>>> >>> * Python artifacts are deployed along with the source
>>>> >>>>>>>>> >>> release to the dist.apache.org [2].
>>>> >>>>>>>>> >>> * Validation sheet with a tab for 2.23.0 release to help
>>>> >>>>>>>>> >>> with validation [9].
>>>> >>>>>>>>> >>> * Docker images published to Docker Hub [10].
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> The vote will be open for at least 72 hours. It is adopted
>>>> >>>>>>>>> >>> by majority approval, with at least 3 PMC affirmative votes.
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> Thanks,
>>>> >>>>>>>>> >>> Release Manager
>>>> >>>>>>>>> >>>
>>>> >>>>>>>>> >>> [1] https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12347145
>>>> >>>>>>>>> >>> [2] https://dist.apache.org/repos/dist/dev/beam/2.23.0/
>>>> >>>>>>>>> >>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>>> >>>>>>>>> >>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1105/
>>>> >>>>>>>>> >>> [5] https://github.com/apache/beam/tree/v2.23.0-RC1
>>>> >>>>>>>>> >>> [6] https://github.com/apache/beam/pull/12212
>>>> >>>>>>>>> >>> [7] https://github.com/apache/beam-site/pull/605
>>>> >>>>>>>>> >>> [8] https://github.com/apache/beam/pull/12213
>>>> >>>>>>>>> >>> [9] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=596347973
>>>> >>>>>>>>> >>> [10] https://hub.docker.com/search?q=apache%2Fbeam&type=image