Just YOLO'ing the entire mail thread into an LLM context is quite expensive, especially when reviewing a thread against your current tree state. This skill lets the agent extract just the comments and tags, saving tokens.

Signed-off-by: Alex Bennée <[email protected]>
---
 .agents/skills/distil-mail-thread/SKILL.md    | 33 ++++++++++
 .../distil-mail-thread/scripts/parse_mail.py  | 64 +++++++++++++++++++
 2 files changed, 97 insertions(+)
 create mode 100644 .agents/skills/distil-mail-thread/SKILL.md
 create mode 100644 .agents/skills/distil-mail-thread/scripts/parse_mail.py

diff --git a/.agents/skills/distil-mail-thread/SKILL.md b/.agents/skills/distil-mail-thread/SKILL.md
new file mode 100644
index 00000000000..3c083ec5aa1
--- /dev/null
+++ b/.agents/skills/distil-mail-thread/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: distil-mail-thread
+description: Extract and summarize reviewer comments from a QEMU or kernel mailing list thread dump (like a b4 mbox or text export). Use this when the user asks to "distil", "parse", or "extract feedback" from a mail thread file.
+license: GPL-2.0-or-later
+---
+
+# Distil Mail Thread
+
+This skill helps you extract reviewer comments and feedback from a long mailing list thread file, filtering out quoted text, patch diffs, and headers. It relies on a Python script included in this skill's `scripts/` directory.
+
+## How to use this skill
+
+1. **Locate the target file**: Identify the mail thread file the user wants to parse (e.g., `wxft-rfc-mail-thread.txt`).
+2. **Execute the script**: Run the included Python script against the file. The script is located in the `scripts/` directory of this skill.
+
+   ```bash
+   python /path/to/distil-mail-thread/scripts/parse_mail.py <path_to_mail_thread_file.txt>
+   ```
+   *(Note: Adjust the path to the script based on where this skill is installed. You can use your filesystem tools to locate `distil-mail-thread/scripts/parse_mail.py`.)*
+
+3. **Read the output**: The script will generate a file named `parsed_comments.txt` in the current working directory. Use your file reading tools to examine its contents.
+4. **Analyze and Summarize**: Read through the extracted comments and provide a structured summary to the user, correlating feedback to specific patches if necessary.
+
+## Expected Output
+The `parsed_comments.txt` will look like this:
+```
+--- REPLY FROM Reviewer Name <[email protected]> ---
+Subject: Re: [PATCH 01/10] ...
+Comment text here...
+============================================================
+```
+
+Use this structured text to efficiently analyze the feedback.
diff --git a/.agents/skills/distil-mail-thread/scripts/parse_mail.py b/.agents/skills/distil-mail-thread/scripts/parse_mail.py
new file mode 100644
index 00000000000..2924d07dc56
--- /dev/null
+++ b/.agents/skills/distil-mail-thread/scripts/parse_mail.py
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+import sys
+import os
+
+if len(sys.argv) < 2:
+    print("Usage: python parse_mail.py <mail_thread_file.txt>")
+    sys.exit(1)
+
+input_file = sys.argv[1]
+output_file = "parsed_comments.txt"
+
+try:
+    with open(input_file, "r", encoding="utf-8") as f:
+        text = f.read()
+except FileNotFoundError:
+    print(f"Error: File not found - {input_file}")
+    sys.exit(1)
+
+# Split by the separator used in lore.kernel.org / b4 dumps
+messages = text.split("----------------------------------------")
+
+with open(output_file, "w", encoding="utf-8") as f:
+    for msg in messages:
+        if not msg.strip(): continue
+
+        lines = msg.strip().split('\n')
+        author = ""
+        subject = ""
+        body_start = 0
+        for i, line in enumerate(lines):
+            if line.startswith("From: "): author = line[6:]
+            if line.startswith("Subject: "): subject = line[9:]
+            if not line.strip() and body_start == 0:
+                body_start = i + 1
+                break
+
+        # Filter out the original patch author (assuming they are Alex
+        # Bennée in this specific context, but for a general tool, we
+        # should probably just look for non-patch emails or specific
+        # reviewers).
+        # We will keep it general: exclude the main author if we can guess it,
+        # or just extract everything that doesn't look like code or a patch
+        # description.
+        # Actually, let's keep all comments that are replies (indicated by >
+        # quoting or Re: in subject).
+
+        is_reply = "Re: " in subject or subject.startswith("Re:")
+
+        if is_reply and author != "" and not author.startswith("qemu-devel"):
+            f.write(f"--- REPLY FROM {author} ---\nSubject: {subject}\n")
+
+            # extract comments
+            comments_extracted = False
+            for line in lines[body_start:]:
+                is_metadata = (line.startswith(">") or
+                               line.startswith("---") or
+                               line.startswith("diff "))
+                if not is_metadata:
+                    if line.strip():
+                        comments_extracted = True
+                    f.write(line + "\n")
+            f.write("="*60 + "\n\n")
+
+print(f"Done. Extracted comments saved to {output_file}")
-- 
2.47.3
