Just YOLO'ing the entire mail thread into an LLM context is quite
expensive, especially when reviewing a thread against your current
tree state. This skill lets the agent extract just the comments and
tags, saving tokens.

Signed-off-by: Alex Bennée <[email protected]>
---
 .agents/skills/distil-mail-thread/SKILL.md    | 33 ++++++++++
 .../distil-mail-thread/scripts/parse_mail.py  | 64 +++++++++++++++++++
 2 files changed, 97 insertions(+)
 create mode 100644 .agents/skills/distil-mail-thread/SKILL.md
 create mode 100644 .agents/skills/distil-mail-thread/scripts/parse_mail.py
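
For readers skimming the diff, the filtering idea can be sketched as a
small function. This is only an illustration under assumptions, not the
code in the patch: the name `distil_thread` and the `SEPARATOR` constant
are hypothetical, the separator is assumed to match the one parse_mail.py
splits on, and unlike the script it returns a list instead of writing
`parsed_comments.txt` and it drops blank lines.

```python
SEPARATOR = "-" * 40  # assumed to match the separator parse_mail.py splits on


def distil_thread(text):
    """Return (author, subject, comment_lines) for each reply in a thread dump."""
    replies = []
    for msg in text.split(SEPARATOR):
        lines = msg.strip().split("\n")
        author = subject = ""
        body_start = len(lines)
        for i, line in enumerate(lines):
            if line.startswith("From: "):
                author = line[len("From: "):]
            elif line.startswith("Subject: "):
                subject = line[len("Subject: "):]
            elif not line.strip():
                body_start = i + 1  # first blank line ends the headers
                break
        # Keep only replies that have a known author.
        if not author or not subject.startswith("Re:"):
            continue
        # Drop quoted text, patch separators, diff headers and blank lines.
        comments = [
            line for line in lines[body_start:]
            if line.strip() and not line.startswith((">", "---", "diff "))
        ]
        replies.append((author, subject, comments))
    return replies
```

Calling `distil_thread(open(path, encoding="utf-8").read())` on a thread
export would give one tuple per reply, which is roughly the information
the skill hands to the agent.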

diff --git a/.agents/skills/distil-mail-thread/SKILL.md b/.agents/skills/distil-mail-thread/SKILL.md
new file mode 100644
index 00000000000..3c083ec5aa1
--- /dev/null
+++ b/.agents/skills/distil-mail-thread/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: distil-mail-thread
+description: Extract and summarize reviewer comments from a QEMU or kernel mailing list thread dump (like a b4 mbox or text export). Use this when the user asks to "distil", "parse", or "extract feedback" from a mail thread file.
+license: GPL-2.0-or-later
+---
+
+# Distil Mail Thread
+
+This skill helps you extract reviewer comments and feedback from a long mailing list thread file, filtering out quoted text, patch diffs, and headers. It relies on a Python script included in this skill's `scripts/` directory.
+
+## How to use this skill
+
+1. **Locate the target file**: Identify the mail thread file the user wants to parse (e.g., `wxft-rfc-mail-thread.txt`).
+2. **Execute the script**: Run the included Python script against the file. The script is located in the `scripts/` directory of this skill.
+
+   ```bash
+   python /path/to/distil-mail-thread/scripts/parse_mail.py <path_to_mail_thread_file.txt>
+   ```
+   *(Note: Adjust the path to the script based on where this skill is installed. You can use your filesystem tools to locate `distil-mail-thread/scripts/parse_mail.py`.)*
+
+3. **Read the output**: The script will generate a file named `parsed_comments.txt` in the current working directory. Use your file reading tools to examine its contents.
+4. **Analyze and Summarize**: Read through the extracted comments and provide a structured summary to the user, correlating feedback to specific patches if necessary.
+
+## Expected Output
+The `parsed_comments.txt` will look like this:
+```
+--- REPLY FROM Reviewer Name <[email protected]> ---
+Subject: Re: [PATCH 01/10] ...
+Comment text here...
+============================================================
+```
+
+Use this structured text to efficiently analyze the feedback.
diff --git a/.agents/skills/distil-mail-thread/scripts/parse_mail.py b/.agents/skills/distil-mail-thread/scripts/parse_mail.py
new file mode 100644
index 00000000000..2924d07dc56
--- /dev/null
+++ b/.agents/skills/distil-mail-thread/scripts/parse_mail.py
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+import sys
+import os
+
+if len(sys.argv) < 2:
+    print("Usage: python parse_mail.py <mail_thread_file.txt>")
+    sys.exit(1)
+
+input_file = sys.argv[1]
+output_file = "parsed_comments.txt"
+
+try:
+    with open(input_file, "r", encoding="utf-8") as f:
+        text = f.read()
+except FileNotFoundError:
+    print(f"Error: File not found - {input_file}")
+    sys.exit(1)
+
+# Split by the separator used in lore.kernel.org / b4 dumps
+messages = text.split("----------------------------------------")
+
+with open(output_file, "w", encoding="utf-8") as f:
+    for msg in messages:
+        if not msg.strip(): continue
+
+        lines = msg.strip().split('\n')
+        author = ""
+        subject = ""
+        body_start = 0
+        for i, line in enumerate(lines):
+            if line.startswith("From: "): author = line[6:]
+            if line.startswith("Subject: "): subject = line[9:]
+            if not line.strip() and body_start == 0:
+                body_start = i + 1
+                break
+
+        # Filter out the original patch author (assuming they are Alex
+        # Bennée in this specific context, but for a general tool, we
+        # should probably just look for non-patch emails or specific
+        # reviewers).
+        # We will keep it general: exclude the main author if we can guess it,
+        # or just extract everything that doesn't look like code or a patch
+        # description.
+        # Actually, let's keep all comments that are replies (indicated by >
+        # quoting or Re: in subject).
+
+        is_reply = "Re: " in subject or subject.startswith("Re:")
+
+        if is_reply and author != "" and not author.startswith("qemu-devel"):
+            f.write(f"--- REPLY FROM {author} ---\nSubject: {subject}\n")
+
+            # extract comments
+            comments_extracted = False
+            for line in lines[body_start:]:
+                is_metadata = (line.startswith(">") or
+                               line.startswith("---") or
+                               line.startswith("diff "))
+                if not is_metadata:
+                    if line.strip():
+                        comments_extracted = True
+                    f.write(line + "\n")
+            f.write("="*60 + "\n\n")
+
+print(f"Done. Extracted comments saved to {output_file}")
-- 
2.47.3

