A lot of these are good fixit tasks (shameless plug [1]):

  * archeology on the @Ignored tests to normalize them with proper labelling
  * checkstyle validation for @Ignored tests
  * spring cleaning on open JIRA issues

Having a Jira issue for the first two of these and adding the beam-fixit
label would be helpful. The third one is a bit trickier to coordinate and
may involve inventing some new process, or growing the group of dedicated
contributors for triaging/cleaning the backlog.
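
For the checkstyle piece, the eventual rule belongs in Checkstyle itself,
but a rough grep-based sketch of the logic it would enforce (untested; the
paths and regex are assumptions) could be:

  if grep --recursive --exclude-dir build --include='*.java' '@Ignore' . \
      | grep -v 'BEAM-[0-9]'; then
    echo 'Found @Ignore without a BEAM Jira reference' >&2
    exit 1
  fi

This only catches the single-line case; an @Ignore whose reason string
wraps onto the next line would need a real Checkstyle check.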

[1]:
https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E

On Wed, May 20, 2020 at 4:22 PM Kyle Weaver <kcwea...@google.com> wrote:

> > I think that after 2-3 years of being ignored, the tests might have
> > already lost their relevance.
>
> This is true of all Jira issues, and I think the consensus on another
> thread was that we don't want to auto-close issues.
> https://lists.apache.org/thread.html/rb51dfffbc8caf40efe7e1d137402438a05d0375fd945bda8fd7e33d2%40%3Cdev.beam.apache.org%3E
>
> I think the best suggestion from that thread was that we do "spring
> cleaning" on open JIRA issues. I know some contributors are doing that
> already, but it'd be great if we could coordinate a wider-scale effort as
> opposed to just a handful of dedicated contributors trying to do everything.
>
> On Mon, May 18, 2020 at 4:21 AM Jan Lukavský <je...@seznam.cz> wrote:
>
>> Hi,
>>
>> +1 for creating a checkstyle validation that we have an associated JIRA
>> for each Ignored test. But it seems to me we might need something more,
>> because some of the associated JIRAs have been open for years. I think
>> that after 2-3 years of being ignored, the tests might have already lost
>> their relevance.
>>
>> Jan
>> On 5/15/20 10:20 PM, Luke Cwik wrote:
>>
>> For the ones without the label, someone would need to use git blame and
>> track back to why it was sickbayed.
>>
>> On Fri, May 15, 2020 at 1:08 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> There are 101 instances of @Ignore, and I've listed them below. A few
>>> takeaways:
>>>
>>>  - highly concentrated in ZetaSQL, with a second tier in various state
>>> tests specific to a runner
>>>  - there are not that many overall, so I'm not sure a report will add
>>> much
>>>  - they do not all have Jiras
>>>  - they do not even all have an explanation (some do include the string
>>> parameter, but leave it as an empty string!)
>>>
>>> Having a checkstyle check that a Jira is attached seems nice. Then we
>>> could easily grep out the Jiras and not depend on the "sickbay" label.
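>>>
>>> For example (untested sketch; assumes the Jira id appears on the same
>>> line as the @Ignore):
>>>
>>>   % grep --recursive --exclude-dir build '@Ignore' . \
>>>       | grep -o 'BEAM-[0-9][0-9]*' | sort -u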
>>>
>>> Raw data (to see the individual items, just do the grep and not the
>>> processing)
>>>
>>>   % grep --recursive --exclude-dir build '@Ignore' . | cut -d ' ' -f 1 |
>>> sort | uniq -c | sort -r
>>>   27
>>> ./sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java:
>>>   11
>>> ./runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/FlinkBroadcastStateInternalsTest.java:
>>>    7
>>> ./runners/spark/src/test/java/org/apache/beam/runners/spark/stateful/SparkStateInternalsTest.java:
>>>    7
>>> ./runners/apex/src/test/java/org/apache/beam/runners/apex/translation/utils/ApexStateInternalsTest.java:
>>>    4
>>> ./sdks/java/testing/nexmark/src/test/java/org/apache/beam/sdk/nexmark/queries/QueryTest.java:
>>>    4
>>> ./runners/spark/src/test/java/org/apache/beam/runners/spark/structuredstreaming/StructuredStreamingPipelineStateTest.java:
>>>    2
>>> ./sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/XmlSourceTest.java:
>>>    2
>>> ./sdks/java/io/mqtt/src/test/java/org/apache/beam/sdk/io/mqtt/MqttIOTest.java:
>>>    2
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/pubsub/PubsubJsonIT.java:
>>>    2
>>> ./sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java:
>>>    2
>>> ./sdks/java/extensions/euphoria/src/test/java/org/apache/beam/sdk/extensions/euphoria/core/testkit/ReduceByKeyTest.java:
>>>    2
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java:
>>>    2
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/coders/PCollectionCustomCoderTest.java:
>>>    2
>>> ./runners/direct-java/src/test/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutorTest.java:
>>>    1
>>> ./sdks/java/testing/nexmark/src/test/java/org/apache/beam/sdk/nexmark/sources/UnboundedEventSourceTest.java:
>>>    1
>>> ./sdks/java/testing/nexmark/src/test/java/org/apache/beam/sdk/nexmark/queries/sql/SqlQuery5Test.java:
>>>    1
>>> ./sdks/java/io/kudu/src/test/java/org/apache/beam/sdk/io/kudu/KuduIOTest.java:
>>>    1
>>> ./sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java:
>>>    1
>>> ./sdks/java/io/hadoop-file-system/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java:
>>>    1
>>> ./sdks/java/io/clickhouse/src/test/java/org/apache/beam/sdk/io/clickhouse/ClickHouseIOTest.java:
>>>    1
>>> ./sdks/java/io/amazon-web-services2/src/test/java/org/apache/beam/sdk/io/aws2/dynamodb/DynamoDBIOTest.java:@Ignore
>>> ("[BEAM-7794]
>>>    1
>>> ./sdks/java/io/amazon-web-services/src/test/java/org/apache/beam/sdk/io/aws/dynamodb/DynamoDBIOTest.java:@Ignore
>>> ("[BEAM-7794]
>>>    1
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/kafka/KafkaCSVTableIT.java:@Ignore
>>> ("https://issues.apache.org/jira/projects/BEAM/issues/BEAM-7523";)
>>>    1
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/JdbcDriverTest.java:
>>>    1
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlExplainTest.java:
>>>    1
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslSqlStdOperatorsTest.java:
>>>    1
>>> ./sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamComplexTypeTest.java:
>>>    1
>>> ./sdks/java/extensions/euphoria/src/test/java/org/apache/beam/sdk/extensions/euphoria/core/testkit/JoinTest.java:
>>>    1
>>> ./sdks/java/extensions/euphoria/src/test/java/org/apache/beam/sdk/extensions/euphoria/core/docs/DocumentationExamplesTest.java:
>>>    1
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/values/PDoneTest.java:
>>>    1
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/WatchTest.java:
>>>    1
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java:
>>>    1
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java:
>>>    1
>>> ./sdks/java/core/src/test/java/org/apache/beam/sdk/coders/RowCoderTest.java:
>>>    1
>>> ./runners/spark/src/test/java/org/apache/beam/runners/spark/structuredstreaming/translation/streaming/SimpleSourceTest.java:
>>>    1
>>> ./runners/spark/src/test/java/org/apache/beam/runners/spark/structuredstreaming/aggregators/metrics/sink/SparkMetricsSinkTest.java:@Ignore
>>> ("Has
>>>    1
>>> ./runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/BatchStatefulParDoOverridesTest.java:
>>>    1
>>> ./runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java:
>>>    1
>>> ./runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ParDoTranslatorTest.java:
>>>
>>>
>>> Kenn
>>>
>>> On Tue, May 12, 2020 at 1:24 PM Mikhail Gryzykhin <
>>> gryzykhin.mikh...@gmail.com> wrote:
>>>
>>>> I wonder if we can add a graph to the community metrics showing ignored
>>>> tests by language/project/overall. That could be useful for seeing focus
>>>> areas.
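>>>>
>>>> As a rough sketch of the data collection (illustrative paths; @Ignore
>>>> only covers Java, so the other SDKs would need their own skip markers):
>>>>
>>>>   % for d in runners sdks/java; do \
>>>>       echo "$d: $(grep -r '@Ignore' "$d" | wc -l)"; done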
>>>>
>>>> On Tue, May 12, 2020 at 12:28 PM Jan Lukavský <je...@seznam.cz> wrote:
>>>>
>>>>> +1, visualizing the number of ignored tests in a graph seems useful.
>>>>> Even better with some slices (e.g. per runner, module, ...).
>>>>> On 5/12/20 8:02 PM, Ahmet Altay wrote:
>>>>>
>>>>> +1 to generating a report instead of removing these tests. A report
>>>>> like this could help us with prioritization. It is easier to address
>>>>> issues when we can quantify how much of a problem it is.
>>>>>
>>>>> I am curious: what can we do to incentivize reducing the number of
>>>>> flaky/ignored tests? A report itself might provide an incentive; it is
>>>>> rewarding to see the number of ignored tests go down over time.
>>>>>
>>>>> On Mon, May 11, 2020 at 8:30 AM Luke Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> Deleting ignored tests loses the record of why the test case existed,
>>>>>> so I would rather keep them around. I think it would be more valuable
>>>>>> to generate a report that goes on the website/wiki showing the
>>>>>> stability of the modules (num tests, num passed, num skipped, num
>>>>>> failed, as running averages over the past N runs). We had discussed
>>>>>> doing something like this for ValidatesRunner so we could show which
>>>>>> runner supports what automatically.
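>>>>>>
>>>>>> The skipped counts are already in the JUnit XML that Gradle writes, so
>>>>>> such a report could start from something like (sketch; the result
>>>>>> paths and the summing pipeline are assumptions):
>>>>>>
>>>>>>   % grep -rho 'skipped="[0-9]*"' --include='TEST-*.xml' . \
>>>>>>       | grep -o '[0-9][0-9]*' | paste -sd+ - | bc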
>>>>>>
>>>>>> On Mon, May 11, 2020 at 12:53 AM Jan Lukavský <je...@seznam.cz>
>>>>>> wrote:
>>>>>>
>>>>>>> I think that we do have Jira issues for ignored tests, so there
>>>>>>> should be no problem there. The questionable point is that when a
>>>>>>> test gets Ignored, people might consider the problem "less painful"
>>>>>>> and postpone the correct solution until ... forever. I'd just like to
>>>>>>> discuss whether people see this as an issue. If yes, should we do
>>>>>>> something about it? If no, maybe we can create a rule that a test
>>>>>>> marked as Ignored for a long time may be deleted, because it is
>>>>>>> apparently only dead code.
>>>>>>> On 5/6/20 6:30 PM, Kenneth Knowles wrote:
>>>>>>>
>>>>>>> Good point.
>>>>>>>
>>>>>>> The raw numbers are available in the test run output. See
>>>>>>> https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Java_Cron/2718/testReport/
>>>>>>> for the "skipped" column.
>>>>>>> And you get the same on console or Gradle Scan:
>>>>>>> https://scans.gradle.com/s/ml3jv5xctkrmg/tests?collapse-all
>>>>>>> This would be good to review periodically for obvious trouble spots.
>>>>>>>
>>>>>>> But I think you mean something more detailed. Some report with
>>>>>>> columns: Test Suite, Test Method, Jira, Date Ignored, Most Recent Update
>>>>>>>
>>>>>>> I think we can get most of this from Jira, if we just make sure that
>>>>>>> each ignored test has a Jira and they are all labeled in a consistent
>>>>>>> way. That would be the quickest way to get some result, even though
>>>>>>> it is not perfectly automated and audited.
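>>>>>>>
>>>>>>> For example, assuming we standardize on a "sickbay" label, a query
>>>>>>> against the Jira REST API could drive the report (untested sketch;
>>>>>>> the label name and field list are assumptions):
>>>>>>>
>>>>>>>   % curl -G https://issues.apache.org/jira/rest/api/2/search \
>>>>>>>       --data-urlencode 'jql=project = BEAM AND labels = sickbay AND status = Open' \
>>>>>>>       --data-urlencode 'fields=summary,labels,created,updated'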
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Tue, May 5, 2020 at 2:41 PM Jan Lukavský <je...@seznam.cz> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> it seems we are accumulating test cases (see discussion in [1]) that
>>>>>>>> are marked as @Ignored (mostly due to flakiness), which is generally
>>>>>>>> undesirable. The associated JIRAs seem to stay open for a long time,
>>>>>>>> and this might cause us to lose code coverage. Would anyone have an
>>>>>>>> idea on how to better visualize these Ignored tests? My first idea
>>>>>>>> would be something similar to the "Beam dependency check report",
>>>>>>>> but that seems not to be the best example (which is a completely
>>>>>>>> different issue :)).
>>>>>>>>
>>>>>>>> Jan
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/beam/pull/11614
>>>>>>>>
>>>>>>>>
