I think it would be reasonable to disable/sickbay any flaky test that is
actively blocking people. The collective cost of flaky tests for such a
large group of contributors is very significant.

Most of these issues are unassigned. IMO, it makes sense to assign each
issue to the most relevant person (whoever added the test or generally
maintains the component). Those people can either fix and re-enable the
tests, or remove them if they no longer provide a valuable signal.

Ahmet

On Wed, Jul 15, 2020 at 4:55 PM Kenneth Knowles <k...@apache.org> wrote:

> The situation is much worse than that IMO. My experience of the last few
> days is that a large portion of time went to *just connecting failing runs
> with the corresponding Jira tickets or filing new ones*.
>
> Summarized on PRs:
>
>  - https://github.com/apache/beam/pull/12272#issuecomment-659050891
>  - https://github.com/apache/beam/pull/12273#issuecomment-659070317
>  - https://github.com/apache/beam/pull/12225#issuecomment-656973073
>  - https://github.com/apache/beam/pull/12225#issuecomment-657743373
>  - https://github.com/apache/beam/pull/12224#issuecomment-657744481
>  - https://github.com/apache/beam/pull/12216#issuecomment-657735289
>  - https://github.com/apache/beam/pull/12216#issuecomment-657780781
>  - https://github.com/apache/beam/pull/12216#issuecomment-657799415
>
> The tickets:
>
>  - https://issues.apache.org/jira/browse/BEAM-10460 SparkPortableExecutionTest
>  - https://issues.apache.org/jira/browse/BEAM-10471 CassandraIOTest > testEstimatedSizeBytes
>  - https://issues.apache.org/jira/browse/BEAM-10504 ElasticSearchIOTest > testWriteFullAddressing and testWriteWithIndexFn
>  - https://issues.apache.org/jira/browse/BEAM-10470 JdbcDriverTest
>  - https://issues.apache.org/jira/browse/BEAM-8025 CassandraIOTest > @BeforeClass (classmethod)
>  - https://issues.apache.org/jira/browse/BEAM-8454 FnHarnessTest
>  - https://issues.apache.org/jira/browse/BEAM-10506 SplunkEventWriterTest
>  - https://issues.apache.org/jira/browse/BEAM-10472 direct runner ParDoLifecycleTest
>  - https://issues.apache.org/jira/browse/BEAM-9187 DefaultJobBundleFactoryTest
>
> Here are our P1 test flake bugs:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22)%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20flake%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>
> It seems quite a few of them are actively hindering people right now.
>
> Kenn
>
> On Wed, Jul 15, 2020 at 4:23 PM Andrew Pilloud <apill...@google.com>
> wrote:
>
>> We have two test suites that are responsible for a large percentage of
>> our flaky tests and both have bugs open for about a year without being
>> fixed. These suites are ParDoLifecycleTest (BEAM-8101
>> <https://issues.apache.org/jira/browse/BEAM-8101>) in Java
>> and BigQueryWriteIntegrationTests in Python (py3 BEAM-9484
>> <https://issues.apache.org/jira/browse/BEAM-9484>, py2 BEAM-9232
>> <https://issues.apache.org/jira/browse/BEAM-9232>, old duplicate
>> BEAM-8197 <https://issues.apache.org/jira/browse/BEAM-8197>).
>>
>> Are there any volunteers to look into these issues? What can we do to
>> mitigate the flakiness until someone has time to investigate?
>>
>> Andrew
>>
>
