Just adapting my PR commentary to this thread:

Our rollback first policy cannot apply to a change that passes all of
Beam's postcommit tests. It *does* apply to Beam's postcommit suites for
each and every runner; they are equally important in this regard.

The purpose of rapid rollback without discussion is foremost to restore
test signal and not to disrupt the work of other contributors, that is why
it is OK to roll back before figuring out if the change was actually bad.
If that isn't at stake, the policy doesn't make sense to apply.

But...

 - We have at least three examples of runners where there are probably
tests outside the Beam repo: Dataflow, Samza runner, and IBM Streams.
 - We also may have users that try running their production loads against
Beam master branch to learn early whether the next release will break them.

These are success stories for Beam. We should respect these other sources
of information for what they are: users and vendors giving us a heads up
about changes that will be a problem if we release them.

Often rollback is still a good option but IMO it is no longer automatically
the best option, and may not even be OK. I would say that the case must be
made clearly and *publicly* that

(1) something is actually broken
(2) the revert fixes the problem
(3) revert is the best option

In this scenario there is time to consider. An important and common case is
that a perfectly fine change exposes something already broken, so the best
option may be sickbaying downstream or pinning their version/commit of Beam
until they can fix.

Kenn


On Fri, Nov 16, 2018 at 8:15 PM Ahmet Altay <al...@google.com> wrote:
>
> It sounds like we are in agreement that addressing issues sooner is
better. I think reverting is in general the less stressful option because
it allows a solution to be developed in parallel. Even with that, it is not
the only option we have and based on the severity and the complexity of the
problem we can consider other options. Fixing forward might be feasible in
some cases.
>
> We can bring issues back to the mailing list. This would be akin to
bringing any issues to the mailing list. I think JIRA is a better tool for
that. These reverts are happening because of an issue, and JIRA allows
informing all involved parties, creates emails to the issues list for later
searching through mailing archives, and creates a record of things in
structured way with components.
>
> We could establish general policies about for all reverts to have an
issue (which we already do because they are regular PRs), including all
people in the discussion (including the author and reviewers) and follow up
with new tests to expands Beam's test coverage.
>
> On Fri, Nov 16, 2018 at 7:55 PM, Thomas Weise <thomas.we...@gmail.com>
wrote:
>>
>>
>>
>> On Fri, Nov 16, 2018 at 7:39 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>> Thank you for bringing this discussion back to the mailing list.
>>>
>>> On Fri, Nov 16, 2018 at 6:49 PM, Thomas Weise <t...@apache.org> wrote:
>>>>
>>>> We have observed instances of changes being reverted in master that
have been authored following the contributor guidelines and pass all tests
(post commit). While we generally seem to have quite a bit of revert action
happening [1], this thread is about those instances that are outside of our
documented policies.
>>>>
>>>> For a contributor, it isn't a good experience to see reverts
(especially not out of the blue) after a PR has been reviewed, all tests
pass and generally care has been taken to do the right things.
>>>
>>>
>>> I completely agree. Everyone involved needs to have the context about
why a change is being reverted. A JIRA with information is probably a good
way to do it, similar to the any other issue we track.
>>>
>>>>
>>>>
>>>> Changes can have unforeseen effects downstream. Usually releases
provide the opportunity to mitigate such issues, not necessarily by a
revert, but in many cases by another change that keeps everyone happy.
Unexpected reverts can break someone else and are disruptive.
>>>
>>>
>>> I disagree about waiting until release to resolve an identified issue.
Let's say we become aware of an issue through channels other than Beam
tests (e.g. user reports, a contributor running into something not captured
in Beam tests) and we know that it is a credible issue that will block the
release anyway. Addressing the issue sooner will be less painful than
addressing later. (I am not suggesting addressing every such issue similar
to we do not block releases on every open issue. There needs to be a due
diligence to understand the severity and the impact.)
>>>
>>> We can improve on the above process. If we end up reverting a change as
a result of a report that is not covered by existing Beam tests, we could
expand the tests to catch same class of problems even earlier by means of
Beam's own tests.
>>
>>
>> I'm referring to the release process as an example how such issues can
be addressed. I'm not suggesting to wait until a release to address issues;
as you say the sooner they are identified the better.
>>
>> However, I don't agree with taking out the revert hammer on the
slightest sign that there could be a downstream problem. There are better
ways to handle this. First of all, I think that these issues should be
visible on the dev mailing list so that there is more awareness overall
(which will lead to better test coverage and useful feedback in general).
Second, we should make an effort to resolve issues in a forward looking
way. If after all it turns out that a revert is most appropriate for the
situation, it should follow explicit agreement.
>>
>>>
>>>
>>>>
>>>>
>>>> Some discussion already took place on one specific PR [3], but that is
just an example and by no means an isolated incident.
>>>>
>>>> On some of these reverts, "internal" problems
>>>
>>>
>>> This is a communication issue that needs be addressed. Over time we are
getting new contributors and that is great, but it also means these new
folks need to understand operating conditions of this community. Giving
direct feedback would generally be most efficient here.
>>>
>>>>
>>>> with an important runner are cited, with little to no explanation. It
would be nice if folks with more insight can shed more light on this.
>>>
>>>
>>> All runners we accepted at out master branch and include in our
releases is important. I do not find a reference to an important runner in
the examples other than a question about what would the outcome be for a
runner other than Dataflow. I think the outcome would be the same, and in
my opinion it should be. We need to be careful not breaking any runner we
support.
>>>
>>>>
>>>> I hope that as outcome of this discussion, we can arrive at a better
understanding of why and when such reverts were necessary and possibly find
ways to largely avoid them going forward.
>>>>
>>>> Thanks!
>>>> Thomas
>>>>
>>>>
>>>> [1]
https://github.com/apache/beam/search?o=desc&q=Revert&s=committer-date&type=Commits
>>>> [2] https://beam.apache.org/contribute/postcommits-policies/
>>>> [3] https://github.com/apache/beam/pull/7012
>>>>
>>>

Reply via email to