andreydevyatkin commented on code in PR #30424:
URL: https://github.com/apache/beam/pull/30424#discussion_r1504660326


##########
.test-infra/metrics/sync/github/github_runs_prefetcher/code/main.py:
##########
@@ -143,6 +144,41 @@ def enhance_workflow(workflow):
         print(f"No yaml file found for workflow: {workflow.name}")
 
 
+async def check_workflow_flakiness(workflow):
+    def filter_workflow_runs(run, issue):
+        started_at = datetime.strptime(run.started_at, "%Y-%m-%dT%H:%M:%SZ")
+        closed_at = datetime.strptime(issue["closed_at"], "%Y-%m-%dT%H:%M:%SZ")
+        if started_at > closed_at:
+            return True
+        return False
+
+    if not len(workflow.runs):
+        return False
+
+    url = f"https://api.github.com/repos/{GIT_ORG}/beam/issues";
+    headers = {"Authorization": get_token()}
+    semaphore = asyncio.Semaphore(5)
+    workflow_runs = workflow.runs
+    params = {
+        "state": "closed",
+        "labels": f"flaky_test,workflow_id: {workflow.id}",
+    }
+    response = await fetch(url, semaphore, params, headers)
+    if len(response):
+        print(f"Found a recently closed issue for the {workflow.name} workflow")
+        workflow_runs = [run for run in workflow_runs if filter_workflow_runs(run, response[0])]
+
+    print(f"Number of workflow runs to consider: {len(workflow_runs)}")
+    success_rate = 1.0
+    if len(workflow_runs):
+        failed_runs = list(filter(lambda r: r.status == "failure", workflow_runs))
+        print(f"Number of failed workflow runs: {len(failed_runs)}")
+        success_rate -= (len(failed_runs) / len(workflow_runs))
+
+    print(f"Success rate: {success_rate}")
+    return True if success_rate < workflow.threshold else False

Review Comment:
   Should we limit the number of runs instead of the number of failures?
   Say we have threshold=0.8 and the following sequence of results
   (s = success, f = failure): s, f, s, f, s. That gives 5 runs that look
   suspicious, with an actual success rate of 0.6. Shouldn't that result be
   a reason to create an issue?
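
   To make the arithmetic concrete, here is a minimal sketch of the rate calculation for that sequence. The helper names `success_rate` and `is_flaky` are hypothetical and not part of the PR. The status strings follow the `r.status == "failure"` check in the diff, and the empty-input default of 1.0 mirrors the diff's initial `success_rate = 1.0`.

```python
def success_rate(statuses):
    """Return the fraction of runs that did not fail (1.0 for no runs)."""
    if not statuses:
        return 1.0
    failed = sum(1 for status in statuses if status == "failure")
    return 1.0 - failed / len(statuses)


def is_flaky(statuses, threshold):
    """Flag a workflow whose success rate is below the threshold."""
    return success_rate(statuses) < threshold


# The sequence from the comment: s, f, s, f, s
runs = ["success", "failure", "success", "failure", "success"]
print(f"Success rate: {success_rate(runs):.2f}")
print(f"Flaky at threshold 0.8: {is_flaky(runs, threshold=0.8)}")
```

   With 2 failures out of 5 runs the rate is 1 - 2/5 = 0.6, which is below 0.8, so under this reading the workflow would be flagged.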



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

