andreydevyatkin commented on code in PR #30424: URL: https://github.com/apache/beam/pull/30424#discussion_r1504660326
##########
.test-infra/metrics/sync/github/github_runs_prefetcher/code/main.py:
##########
@@ -143,6 +144,41 @@ def enhance_workflow(workflow):
     print(f"No yaml file found for workflow: {workflow.name}")
+async def check_workflow_flakiness(workflow):
+    def filter_workflow_runs(run, issue):
+        started_at = datetime.strptime(run.started_at, "%Y-%m-%dT%H:%M:%SZ")
+        closed_at = datetime.strptime(issue["closed_at"], "%Y-%m-%dT%H:%M:%SZ")
+        if started_at > closed_at:
+            return True
+        return False
+
+    if not len(workflow.runs):
+        return False
+
+    url = f"https://api.github.com/repos/{GIT_ORG}/beam/issues"
+    headers = {"Authorization": get_token()}
+    semaphore = asyncio.Semaphore(5)
+    workflow_runs = workflow.runs
+    params = {
+        "state": "closed",
+        "labels": f"flaky_test,workflow_id: {workflow.id}",
+    }
+    response = await fetch(url, semaphore, params, headers)
+    if len(response):
+        print(f"Found a recently closed issue for the {workflow.name} workflow")
+        workflow_runs = [run for run in workflow_runs if filter_workflow_runs(run, response[0])]
+
+    print(f"Number of workflow runs to consider: {len(workflow_runs)}")
+    success_rate = 1.0
+    if len(workflow_runs):
+        failed_runs = list(filter(lambda r: r.status == "failure", workflow_runs))
+        print(f"Number of failed workflow runs: {len(failed_runs)}")
+        success_rate -= (len(failed_runs) / len(workflow_runs))
+
+    print(f"Success rate: {success_rate}")
+    return True if success_rate < workflow.threshold else False

Review Comment:
Should we limit the number of runs instead of the number of failures? Let's say we have threshold=0.8 and the following sequence of results (s = success, f = failure): s, f, s, f, s. That gives us 5 runs that look suspicious, with an actual success rate of 0.6. Shouldn't that result be a reason to create an issue?
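To make the arithmetic in the example concrete, here is a minimal sketch of a check that looks at a window of the most recent runs. It assumes run statuses are plain strings and that "failure" marks a failed run, as in the diff. The names `success_rate`, `is_flaky`, and the `window` parameter are hypothetical and not part of the PR.

```python
from typing import Optional, Sequence


def success_rate(statuses: Sequence[str], window: Optional[int] = None) -> float:
    """Return the fraction of non-failed runs among the last `window` runs.

    `window` is a hypothetical parameter; None means "use every run".
    """
    if window is not None and window <= 0:
        raise ValueError("window must be a positive integer")
    recent = list(statuses) if window is None else list(statuses)[-window:]
    if not recent:
        # No runs to judge: keep the diff's default success rate of 1.0.
        return 1.0
    failed = sum(1 for status in recent if status == "failure")
    return 1.0 - failed / len(recent)


def is_flaky(statuses: Sequence[str], threshold: float,
             window: Optional[int] = None) -> bool:
    """Flag the workflow when its success rate falls below the threshold."""
    return success_rate(statuses, window) < threshold


# The sequence from the comment: s, f, s, f, s with threshold=0.8.
runs = ["success", "failure", "success", "failure", "success"]
rate = success_rate(runs, window=5)
flaky = is_flaky(runs, threshold=0.8, window=5)
print(f"Success rate: {rate:.1f}, flaky: {flaky}")
```

With two failures in five runs the rate is 1 - 2/5 = 0.6, which is below 0.8, so this sketch would flag the workflow.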