Copilot commented on code in PR #64590:
URL: https://github.com/apache/airflow/pull/64590#discussion_r3025320184


##########
dev/breeze/src/airflow_breeze/commands/pr_commands.py:
##########
@@ -1842,6 +1850,12 @@ def _fetch_author_profile(token: str, login: str, 
github_repository: str) -> dic
         "contributed_repos_total": contrib_total,
     }
     _author_profile_cache[login] = profile
+
+    # Persist to disk for reuse across sessions
+    from airflow_breeze.utils.pr_cache import save_author_profile
+
+    save_author_profile(github_repository, login, profile)

Review Comment:
   New imports were added inside `_fetch_author_profile`. Per project 
guidelines, keep imports at module scope unless there's a specific reason 
(circular import / worker isolation / lazy loading). Please move 
`save_author_profile` into the existing top-level 
`airflow_breeze.utils.pr_cache` import list and remove this in-function import.



##########
dev/breeze/src/airflow_breeze/utils/pr_cache.py:
##########
@@ -76,6 +76,7 @@ def save(self, github_repository: str, key: str, data: dict) 
-> None:
 classification_cache = CacheStore("classification_cache")
 triage_cache = CacheStore("triage_cache")
 status_cache = CacheStore("status_cache", ttl_seconds=4 * 3600)
+author_cache = CacheStore("author_cache", ttl_seconds=7 * 24 * 3600)
 

Review Comment:
   The PR description currently frames this as a direction proposal “before 
writing the implementation”, but the diff includes an implemented 
author-profile disk cache plus new tests. Please update the PR description 
(and/or title) to reflect the implemented Phase 1 scope so reviewers and 
release notes are aligned.



##########
dev/breeze/src/airflow_breeze/commands/pr_commands.py:
##########
@@ -1773,11 +1773,19 @@ def _fetch_single_pr_graphql(token: str, 
github_repository: str, pr_number: int)
 def _fetch_author_profile(token: str, login: str, github_repository: str) -> 
dict:
     """Fetch author profile info via GraphQL: account age, PR counts, 
contributed repos.
 
-    Results are cached per login so the same author is only queried once.
+    Results are cached in memory (per session) and on disk (across sessions, 
7-day TTL).
     """
     if login in _author_profile_cache:
         return _author_profile_cache[login]
 
+    # Try disk cache before hitting the API
+    from airflow_breeze.utils.pr_cache import get_cached_author_profile
+
+    disk_profile = get_cached_author_profile(github_repository, login)

Review Comment:
   New imports were added inside `_fetch_author_profile`. Project guidelines 
require imports at module scope (unless for circular imports / isolation / lazy 
loading). Since this module already imports from 
`airflow_breeze.utils.pr_cache` at the top, add `get_cached_author_profile` to 
that existing import block and remove this in-function import.



##########
dev/breeze/src/airflow_breeze/utils/pr_cache.py:
##########
@@ -131,6 +132,16 @@ def save_status_cache(github_repository: str, cache_key: 
str, payload: dict | li
     status_cache.save(github_repository, cache_key, {"payload": payload})
 
 
+def get_cached_author_profile(github_repository: str, login: str) -> dict | 
None:
+    """Load a cached author profile. Returns None if missing or expired (7-day 
TTL)."""
+    return author_cache.get(github_repository, f"author_{login}")
+
+
+def save_author_profile(github_repository: str, login: str, profile: dict) -> 
None:
+    """Persist an author profile to disk."""
+    author_cache.save(github_repository, f"author_{login}", profile)

Review Comment:
   `get_cached_author_profile` currently returns the raw JSON dict from 
`CacheStore`, which includes the internal `cached_at` field when TTL is 
enabled. That means callers get a different shape depending on whether the 
profile came from the API (no `cached_at`) or disk (has `cached_at`). Consider 
storing the profile under a dedicated key (e.g. `{ "profile": ... }`) and 
returning only that key, matching the pattern used by `status_cache` and 
avoiding leaking cache metadata into the runtime profile object.
   ```suggestion
       data = author_cache.get(github_repository, f"author_{login}")
       return data.get("profile") if data else None
   
   
   def save_author_profile(github_repository: str, login: str, profile: dict) 
-> None:
       """Persist an author profile to disk."""
       author_cache.save(github_repository, f"author_{login}", {"profile": 
profile})
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to