On Fri Jun 5, 2026 at 8:09 PM UTC, Andres Freund wrote:
> Hi,
>
> I noticed that a handful of CI runs already lead to exceeding the available
> cache space.  One can pay for more cache space, but I think the problem is
> more that what we currently do doesn't work well.
>
> With cirrus-ci all branches shared one cache, but that's not the case with
> github actions. Except for being able to read caches from the default branch
> (master in our case), other branches have completely separate cache
> namespaces.  That's probably the right call, safety-wise, but makes our ccache
> approach .. not great.
>
> We should only upload a new cache when the ccache cache hit ratio of the
> existing cache entry has gotten low.

I had started reviewing this patch the day it was originally sent, but 
due to circumstances I couldn't finish the review before it was 
committed. I had some thoughts with regard to improving the Python 
script itself. Attached are some improvements that make the code 
a little more pythonic as well as more easily usable locally for testing 
purposes. Some of the patches may be more valuable than others.
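
For readers who have not seen the script, here is a minimal sketch of the
decision it makes, pulled out into a pure function so it can be exercised
without ccache or GitHub Actions. The function name and the min_misses
parameter are hypothetical; the thresholds mirror the ones visible in the
patches below (a target rate of 95 or 80, and a floor of 10 misses).

```python
def should_save_cache(hits, misses, target_rate, min_misses=10):
    """Return True when a new cache entry looks worth uploading.

    Hypothetical helper for illustration; the real script inlines this
    logic in main().
    """
    total = hits + misses
    # With no lookups at all, treat the cache as fully effective.
    hit_pct = int((hits / total) * 100) if total > 0 else 100

    if hit_pct >= target_rate:
        # The existing cache entry is still good enough.
        return False
    if misses <= min_misses:
        # Too few misses (e.g. a failed build) to justify a new entry.
        return False
    return True
```

Keeping the decision free of side effects like this is one way to make the
thresholds easy to try out locally.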

-- 
Tristan Partin
PostgreSQL Contributors Team
AWS (https://aws.amazon.com)
From 650f97c15d0bd9e8c5d725afe53d4bb2eab92770 Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:05:43 +0000
Subject: [PATCH v1 1/7] Use long options for ccache commands

-s, -X, and -z don't mean much to someone reading this script unless
they are very familiar with the ccache CLI.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index a8e32310d0..ea9fbe9452 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -49,7 +49,7 @@ def main():
     # Log ccache stats, useful for more in-depth understanding. To avoid
     # swamping the output, collapse it in a group.
     print("::group::ccache_stats")
-    print(run(["ccache", "-s", "-vv"]))
+    print(run(["ccache", "--show-stats", "-vv"]))
     print("::endgroup::")
 
     # compute cache hit ratio
@@ -84,12 +84,12 @@ def main():
     # probably be improved.
     print("::group::ccache_shrink")
     print(run(["ccache", "--evict-older-than", f"{45*60}s"]))
-    print(run(["ccache", "-X", "10"]))
+    print(run(["ccache", "--recompress", "10"]))
 
     # Don't store ccache stats, otherwise we'd need to reset the cache access
     # data after restoring the cache in the next run, to be able to get the
     # hit ratio of the CI run.
-    print(run(["ccache", "-z"]))
+    print(run(["ccache", "--zero-stats"]))
     print("::endgroup::")
 
     # Before continuing, try to kill all ccache instances, otherwise
-- 
Tristan Partin
https://tristan.partin.io

From 86bf2412ecd471dd901dbee5271e5a84fde243fd Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:06:42 +0000
Subject: [PATCH v1 2/7] Use json to parse ccache statistics

Instead of relying on regex to parse ccache CLI output, we can ask
ccache to output its statistics in JSON, which Python natively supports.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index ea9fbe9452..b4874517fb 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
+import json
 import os
-import re
 import shutil
 import subprocess
 
@@ -14,24 +14,6 @@ def run(cmd, check=True):
         stderr=subprocess.STDOUT,
     ).stdout
 
-def parse_ccache_stats():
-    out = run(["ccache", "--print-stats"])
-    hits = 0
-    misses = 0
-
-    for line in out.splitlines():
-        line = line.strip()
-        m = re.match(r"^local_storage_hit\s+(\d+)$", line)
-        if m:
-            hits = int(m.group(1))
-            continue
-        m = re.match(r"^local_storage_miss\s+(\d+)$", line)
-        if m:
-            misses = int(m.group(1))
-            continue
-
-    return hits, misses
-
 def append_github_output(key, value):
     output_path = os.environ["GITHUB_OUTPUT"]
     with open(output_path, "a", encoding="utf-8") as f:
@@ -53,7 +35,9 @@ def main():
     print("::endgroup::")
 
     # compute cache hit ratio
-    hits, misses = parse_ccache_stats()
+    ccache_stats = json.loads(run(["ccache", "--print-stats", "--format", "json"]))
+    hits = ccache_stats["local_storage_hit"]
+    misses = ccache_stats["local_storage_miss"]
     total = hits + misses
     hit_pct = int((hits / total) * 100) if total > 0 else 100
 
-- 
Tristan Partin
https://tristan.partin.io
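
As a rough illustration of what 2/7 relies on, the sketch below parses a
stats payload with the json module. The SAMPLE string is made up for the
example and only contains the two counters the script reads; real ccache
output has many more keys. The hit_stats name and the use of .get() with a
default of 0 are assumptions of this sketch, whereas the patch indexes the
dict directly.

```python
import json

# Made-up payload for illustration; only the two counters used by the
# script are included.
SAMPLE = '{"local_storage_hit": 180, "local_storage_miss": 20}'


def hit_stats(raw_json):
    """Extract (hits, misses) from a JSON stats document."""
    stats = json.loads(raw_json)
    # Defaulting to 0 is a choice made in this sketch, not in the patch.
    hits = stats.get("local_storage_hit", 0)
    misses = stats.get("local_storage_miss", 0)
    return hits, misses


hits, misses = hit_stats(SAMPLE)
```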

From 25881caa9baca7980dda530e27b6025bcf4e8c4a Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:09:29 +0000
Subject: [PATCH v1 3/7] Remove the shutil.which() call

This call is pointless. We were resolving `killall` to a filesystem
path, and immediately discarding it just to re-resolve it again in the
subsequent run() call. This is not a pythonic way to write this code.
Instead, we can wrap the function call with a try/except block and just
ignore any exception that may arise.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index b4874517fb..4b322b8087 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -2,7 +2,6 @@
 
 import json
 import os
-import shutil
 import subprocess
 
 def run(cmd, check=True):
@@ -79,8 +78,10 @@ def main():
     # Before continuing, try to kill all ccache instances, otherwise
     # it's possible that on cancellations there is still running
     # ccaches that cause the upload to fail.
-    if shutil.which("killall"):
+    try:
         print(run(["killall", "ccache"], check=False))
+    except FileNotFoundError:
+        pass
 
     return 0
 
-- 
Tristan Partin
https://tristan.partin.io
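
The try/except approach in 3/7 can be tried in isolation with the sketch
below. run_if_available is a hypothetical wrapper, and the command name in
the first call is deliberately one that should not exist on any system.
subprocess.run() raises FileNotFoundError when the executable cannot be
found, which is what the except clause relies on.

```python
import subprocess
import sys


def run_if_available(cmd):
    """Run cmd and return its combined output, or None if it is missing."""
    try:
        return subprocess.run(
            cmd,
            check=False,
            text=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        ).stdout
    except FileNotFoundError:
        # The executable could not be found; treat that as "nothing to do".
        return None


# A missing executable yields None, an existing one yields its output.
missing = run_if_available(["no-such-command-for-this-sketch"])
present = run_if_available([sys.executable, "-c", "print('ok')"])
```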

From de4cf351d9ce083c3666c7ffc0f1f900eacfa441 Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:20:47 +0000
Subject: [PATCH v1 4/7] Use a context manager to write grouped CI output

With a context manager, the caller no longer has to emit the GitHub
Actions group syntax by hand and can instead focus on logging relevant
output.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index 4b322b8087..58b1d14c96 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -3,6 +3,7 @@
 import json
 import os
 import subprocess
+from contextlib import contextmanager
 
 def run(cmd, check=True):
     return subprocess.run(
@@ -18,6 +19,14 @@ def append_github_output(key, value):
     with open(output_path, "a", encoding="utf-8") as f:
         f.write(f"{key}={value}\n")
 
+@contextmanager
+def group(name):
+    print(f"::group::{name}")
+    try:
+        yield
+    finally:
+        print("::endgroup::")
+
 def main():
     on_default_branch = os.environ["ON_DEFAULT_BRANCH"] == "true"
 
@@ -29,9 +38,8 @@ def main():
 
     # Log ccache stats, useful for more in-depth understanding. To avoid
     # swamping the output, collapse it in a group.
-    print("::group::ccache_stats")
-    print(run(["ccache", "--show-stats", "-vv"]))
-    print("::endgroup::")
+    with group("ccache_stats"):
+        print(run(["ccache", "--show-stats", "-vv"]))
 
     # compute cache hit ratio
     ccache_stats = json.loads(run(["ccache", "--print-stats", "--format", "json"]))
@@ -65,15 +73,14 @@ def main():
     # branch differs a lot). Therefore evict ccache entries that are a
     # bit older. The cutoff here is fairly arbitrary, it could
     # probably be improved.
-    print("::group::ccache_shrink")
-    print(run(["ccache", "--evict-older-than", f"{45*60}s"]))
-    print(run(["ccache", "--recompress", "10"]))
+    with group("ccache_shrink"):
+        print(run(["ccache", "--evict-older-than", f"{45*60}s"]))
+        print(run(["ccache", "--recompress", "10"]))
 
-    # Don't store ccache stats, otherwise we'd need to reset the cache access
-    # data after restoring the cache in the next run, to be able to get the
-    # hit ratio of the CI run.
-    print(run(["ccache", "--zero-stats"]))
-    print("::endgroup::")
+        # Don't store ccache stats, otherwise we'd need to reset the cache
+        # access data after restoring the cache in the next run, to be able to
+        # get the hit ratio of the CI run.
+        print(run(["ccache", "--zero-stats"]))
 
     # Before continuing, try to kill all ccache instances, otherwise
     # it's possible that on cancellations there is still running
-- 
Tristan Partin
https://tristan.partin.io
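
One property of the group() helper from 4/7 that may not be obvious from
the diff is that the closing marker is printed even if the body raises.
The sketch below shows this under a simulated failure. The demo function
and the captured-output approach are only for illustration.

```python
import io
from contextlib import contextmanager, redirect_stdout


@contextmanager
def group(name):
    print(f"::group::{name}")
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        print("::endgroup::")


def demo():
    """Return the lines printed when the grouped body fails."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        try:
            with group("demo"):
                print("before failure")
                raise RuntimeError("simulated failure")
        except RuntimeError:
            pass
    return buf.getvalue().splitlines()
```

With the previous paired print() calls, an exception between them would
have left the group unclosed in the log.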

From 5961484cb19125c87d91ace9df9b250471cd94f5 Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:38:08 +0000
Subject: [PATCH v1 5/7] Provide ccache target rate as an input to the script

This makes it much easier to test locally. Now, we don't depend on an
ON_DEFAULT_BRANCH environment variable within the script. We pull the
decision making up to the CI step itself to tell the script what rate to
target.

Signed-off-by: Tristan Partin <[email protected]>
---
 .github/workflows/pg-ci.yml       |  8 +++++++-
 src/tools/ci/gha_ccache_decide.py | 29 +++++++++++++++--------------
 2 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/.github/workflows/pg-ci.yml b/.github/workflows/pg-ci.yml
index 5bc5292d2a..6ccd8c2475 100644
--- a/.github/workflows/pg-ci.yml
+++ b/.github/workflows/pg-ci.yml
@@ -368,7 +368,13 @@ jobs:
         if: |
           always() &&
           steps.ccache-restore-branch.conclusion == 'success'
-        run: python3 src/tools/ci/gha_ccache_decide.py
+        run: |
+          # Decide the target hit percentage below which we decide to upload a
+          # new cache. On non-default branches a few misses aren't that bad.
+          # But, as the caches of the default branch are shared with all
+          # branches, it's worth aiming for a higher ratio there.
+          python3 src/tools/ci/gha_ccache_decide.py \
+            --target-rate $([ "$ON_DEFAULT_BRANCH" = "true" ] && echo 95 || echo 80)
 
       - &ccache_save_step
         name: "ccache: Upload cache"
diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index 58b1d14c96..fe9db62891 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -1,5 +1,6 @@
 #!/usr/bin/env python3
 
+import argparse
 import json
 import os
 import subprocess
@@ -27,15 +28,7 @@ def group(name):
     finally:
         print("::endgroup::")
 
-def main():
-    on_default_branch = os.environ["ON_DEFAULT_BRANCH"] == "true"
-
-    # Decide the target hit percentage below which we decide to upload a new
-    # cache. On non-default branches a few misses aren't that bad. But, as the
-    # caches of the default branch are shared with all branches, it's worth
-    # aiming for a higher ratio there.
-    target_rate = 95 if on_default_branch else 80
-
+def main(args):
     # Log ccache stats, useful for more in-depth understanding. To avoid
     # swamping the output, collapse it in a group.
     with group("ccache_stats"):
@@ -48,19 +41,19 @@ def main():
     total = hits + misses
     hit_pct = int((hits / total) * 100) if total > 0 else 100
 
-    print(f"hits: {hits}, misses: {misses}, hit_pct: {hit_pct}, target rate: {target_rate}")
+    print(f"hits: {hits}, misses: {misses}, hit_pct: {hit_pct}, target rate: {args.target_rate}")
 
     # If the cache hit ratio was high, or the absolute number of misses
     # (e.g. in case of a failed build) was low, there is no point in
     # generating a new cache entry. We have limited cache space.
-    if hit_pct >= target_rate:
-        print(f"hit rate {hit_pct} is above target of {target_rate}, skip creating new cache entry")
+    if hit_pct >= args.target_rate:
+        print(f"hit rate {hit_pct} is above target of {args.target_rate}, skip creating new cache entry")
         should_save = False
     elif misses <= 10:
         print(f"only {misses} misses, skip creating new cache entry")
         should_save = False
     else:
-        print(f"hit rate {hit_pct} is below target of {target_rate}, create new cache entry")
+        print(f"hit rate {hit_pct} is below target of {args.target_rate}, create new cache entry")
         should_save = True
 
     append_github_output("should_save", str(should_save).lower())
@@ -93,4 +86,12 @@ def main():
     return 0
 
 if __name__ == "__main__":
-    exit(main())
+    parser = argparse.ArgumentParser(description="Decide whether to save cache")
+    parser.add_argument(
+        "--target-rate",
+        type=int,
+        default=80,
+        help="target hit rate below which to save cache",
+    )
+
+    exit(main(parser.parse_args()))
-- 
Tristan Partin
https://tristan.partin.io
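
A small sketch of how the --target-rate option from 5/7 behaves when
parsed. build_parser and parse are hypothetical helpers. The patch builds
the parser inline under the __main__ guard, but passing an explicit argv
list, as here, makes the parsing easy to check without touching
sys.argv.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Decide whether to save cache")
    parser.add_argument(
        "--target-rate",
        type=int,
        default=80,
        help="target hit rate below which to save cache",
    )
    return parser


def parse(argv):
    """Parse an explicit argument list instead of sys.argv."""
    return build_parser().parse_args(argv)


default_rate = parse([]).target_rate
default_branch_rate = parse(["--target-rate", "95"]).target_rate
```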

From 53ef4d0e88fe08f3c9dfa305cbe1cff4ecb521ac Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 19:48:02 +0000
Subject: [PATCH v1 6/7] Check that GITHUB_OUTPUT exists before trying to
 append to it

Running the script without GITHUB_OUTPUT set in the environment failed
with a KeyError. Skipping the write when the variable is unset makes
local testing a bit easier.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
index fe9db62891..1b58990939 100644
--- a/src/tools/ci/gha_ccache_decide.py
+++ b/src/tools/ci/gha_ccache_decide.py
@@ -6,6 +6,8 @@
 import subprocess
 from contextlib import contextmanager
 
+GITHUB_OUTPUT = os.environ.get("GITHUB_OUTPUT")
+
 def run(cmd, check=True):
     return subprocess.run(
         cmd,
@@ -16,9 +18,9 @@ def run(cmd, check=True):
     ).stdout
 
 def append_github_output(key, value):
-    output_path = os.environ["GITHUB_OUTPUT"]
-    with open(output_path, "a", encoding="utf-8") as f:
-        f.write(f"{key}={value}\n")
+    if GITHUB_OUTPUT:
+        with open(GITHUB_OUTPUT, "a", encoding="utf-8") as f:
+            f.write(f"{key}={value}\n")
 
 @contextmanager
 def group(name):
-- 
Tristan Partin
https://tristan.partin.io
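
The effect of 6/7 can be sketched as follows. Unlike the patch, which reads
GITHUB_OUTPUT once at import time into a module-level constant, this
illustrative variant takes the path as a parameter so both cases can be
exercised side by side. append_output and demo are hypothetical names.

```python
import os
import tempfile


def append_output(key, value, output_path):
    """Append key=value to output_path; do nothing if no path is given."""
    if not output_path:
        return False
    with open(output_path, "a", encoding="utf-8") as f:
        f.write(f"{key}={value}\n")
    return True


def demo():
    """Write one entry to a temporary file and return the file contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        append_output("should_save", "true", path)
        with open(path, encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)
```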

From 16b12c125ca98909456f467f7b801edd2e598b19 Mon Sep 17 00:00:00 2001
From: Tristan Partin <[email protected]>
Date: Wed, 17 Jun 2026 22:00:01 +0000
Subject: [PATCH v1 7/7] Make gha_ccache_decide.py executable

The script already had a shebang line, so it seems intended for this
script to be executable.

Signed-off-by: Tristan Partin <[email protected]>
---
 src/tools/ci/gha_ccache_decide.py | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 src/tools/ci/gha_ccache_decide.py

diff --git a/src/tools/ci/gha_ccache_decide.py b/src/tools/ci/gha_ccache_decide.py
old mode 100644
new mode 100755
-- 
Tristan Partin
https://tristan.partin.io
