On 05/20/2015 06:56 PM, Skip Montanaro wrote:
> I have a list of spam HTML messages in pipermail archives. I need to clear
> out their content (no great problem there), but I also want to clean the
> corresponding messages in the raw mbox file (zap subject, message body,
> etc, but leave a placeholder message so future archive regeneration doesn't
> mess up article numbers). Looking at one of these messages (HTML source), I
> see nothing like a message id which would allow me to unambiguously
> identify the corresponding raw message. Does something exist? If not, what
> heuristics have people developed to perform this mapping?

The poster's address in the HTML is a link that looks like:

> <A
> HREF="mailto:list%40example.com?Subject=Re%3A%20%Actual_Subject&In-Reply-To=%3CCAKmAgbSRpqwRU1sR8ij36psvSUyrXMWv-AcEVp%3D1%2BCWRZHh4Rg%40mail.gmail.com%3E"
> TITLE="Actual_Subject">poster at example.com
> </A><BR>

The In-Reply-To fragment is the Message-ID.

--
Mark Sapiro <[email protected]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
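
For anyone who wants to script this mapping, here is a minimal sketch in Python. It is not part of Mailman. The function names `message_id_from_html` and `find_in_mbox` are made up for illustration, and it assumes the article page contains a mailto link with an `In-Reply-To=` parameter like the one shown above. The value is URL-encoded in the HTML, so it has to be decoded before it is compared with the `Message-ID` header in the raw mbox.

```python
import mailbox
import re
from urllib.parse import unquote

# Matches the In-Reply-To parameter of the mailto: link on a pipermail
# article page. The value ends at the closing quote, an ampersand, or
# whitespace.
IN_REPLY_TO = re.compile(r'In-Reply-To=([^"&\s]+)', re.IGNORECASE)


def message_id_from_html(html):
    """Return the decoded Message-ID found in the article HTML, or None."""
    match = IN_REPLY_TO.search(html)
    if match is None:
        return None
    # Plain unquote (not unquote_plus), so a literal "+" in the
    # Message-ID is not turned into a space.
    return unquote(match.group(1))


def find_in_mbox(mbox_path, message_id):
    """Return the keys of all mbox messages whose Message-ID header matches."""
    # create=False: do not silently create an empty mbox on a wrong path.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [
            key
            for key, msg in box.items()
            if str(msg.get("Message-ID", "")).strip() == message_id
        ]
    finally:
        box.close()
```

A caller would read the saved article HTML, pass it to `message_id_from_html`, and hand the result to `find_in_mbox` together with the path to the list's mbox file. The returned keys are positions within that mbox as Python's `mailbox` module numbers them (starting at 0). They are not necessarily the pipermail article numbers. More than one key can come back if the same Message-ID appears twice, so check the list before blanking anything.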
