This is not for extracting just the "Content-Type: text/plain" section of an email message, but rather for converting an HTML file to plain text.
I needed to do this on a very limited scale, so I just wrote a few lines of PHP that suited my situation:

    function textify($file) {
        $contents = file_get_contents($file);
        $contents = strip_tags($contents);
        $contents = htmlspecialchars_decode($contents, ENT_QUOTES); // including single and double quotes
        $contents = str_replace('&nbsp;', ' ', $contents); // replace &nbsp; entity with space
        $contents = preg_replace('#\{literal\}.*?\{/literal\}#s', '', $contents); // remove {literal} Smarty blocks (non-greedy, across newlines)
        $contents = preg_replace("/[\t ]+/", " ", $contents); // replace successive blanks with a single blank
        $contents = preg_replace("/^[\t ]+/m", "", $contents); // remove leading blanks
        $contents = preg_replace("/^ *$\n/mU", "", $contents); // remove empty lines
        return $contents;
    }

There is a class[1] to do this in PHP that has been used by several full programs such as PHPMailer. Curiously, PHPMailer *removed* the class because the class is GPL while PHPMailer is LGPL[2].

[1] https://github.com/mtibben/html2text
[2] https://github.com/PHPMailer/PHPMailer/commit/127d26ef3c43118d82c244c15016cf37d67504c6

Greg Rundlett
https://eQuality-Tech.com
https://freephile.org

_______________________________________________
Discuss mailing list
Discuss@blu.org
http://lists.blu.org/mailman/listinfo/discuss