On Thu, 18 Jul 2019, Michael C Robinson wrote:

> I wrote a simple script that greps an mbox for the subjects of every email in it. Problem is, a lot of these subjects are htmlized or something similar and are not plain text. Any suggestions on alternative approaches to extracting the subjects of every email in the spam box and emailing them to myself?

Those subjects are most likely MIME encoded-words (RFC 2047) rather than HTML. Python, among other tools, will decode such headers for you, e.g.,

Python 2.7.10 (default, Oct  6 2017, 22:29:07)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>> from email.header import decode_header
>>> decode_header(u'=?UTF-8?B?U3RpY2sgdGhpcyB0byB5b3VyIHNraW4gYW5kIG1lbHQgMWxiL2RheS4uLg==?=')
[('Stick this to your skin and melt 1lb/day...', 'utf-8')]
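
To apply the same idea to a whole mbox, something along these lines might work. It is only a rough sketch for Python 3 using the standard-library mailbox module; the helper names decode_subject and list_subjects are made up for illustration, and the fallback to the raw header on a decoding failure is an assumption about how you would want malformed spam subjects handled.

```python
import mailbox
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def decode_subject(raw):
    """Turn a possibly RFC 2047 encoded Subject value into readable text."""
    if raw is None:
        # Message had no Subject header at all.
        return ""
    try:
        return str(make_header(decode_header(str(raw))))
    except (LookupError, UnicodeError, HeaderParseError):
        # Unknown charset or malformed encoding (assumed to be common in
        # spam): fall back to the header exactly as it appears.
        return str(raw)


def list_subjects(mbox_path):
    """Return the decoded subject of every message in the mbox file."""
    # create=False: do not create an empty mbox if the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    return [decode_subject(msg["Subject"]) for msg in box]
```

Printing one subject per line, e.g. print("\n".join(list_subjects(path))) with path set to your spam mbox, gives output you can pipe to whatever mailer you already use.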


--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W
_______________________________________________
PLUG mailing list
PLUG@pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug
