On Wed, Feb 18, 2004 at 09:15:51PM -0800, David Muir Sharnoff wrote:
> I made a mistake.  Based on what spammers are using, &#0x65; is not
> valid but &#x65; is.  There can be extra zeros: e

    $uri =~ s/\&\#0*(3[3-9]|[4-9]\d|1[01]\d|12[0-6]);/sprintf "%c",$1/e;

That's the RE.  It matches &#, any number of leading 0's, the decimal
value of a printable ASCII char (33-126), and the closing semicolon.
Guess I'll have to add another one for the hex versions... :|
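
For what it's worth, here is a rough sketch of how the decimal and hex
forms might be handled in one pass.  This is only an illustration, not
what is in the tree: the sub name decode_printable_entities is made up,
the hex branch (21-7E, optional leading zeros after the x) is my
assumption about what "the hex versions" should accept, and the /g flag
is added so every entity in the string is decoded rather than just the
first one.

```perl
use strict;
use warnings;

# Hypothetical helper (name and hex handling are assumptions, not taken
# from the original mail): decode numeric character references that map
# to printable ASCII (33..126), in either decimal or hex notation.
sub decode_printable_entities {
    my ($uri) = @_;

    # One pass with alternation, so text produced by a replacement is
    # never rescanned (e.g. "&#38;#x65;" becomes "&#x65;", not "e").
    $uri =~ s{
        &\#
        (?:
            0* (3[3-9]|[4-9]\d|1[01]\d|12[0-6])                     # decimal 33..126
          | [xX] 0* (2[1-9a-fA-F]|[3-6][0-9a-fA-F]|7[0-9a-eA-E])    # hex 21..7E
        )
        ;
    }{ defined $1 ? chr($1) : chr(hex($2)) }gex;

    return $uri;
}

# Usage: both notations, with or without leading zeros, decode to "e".
print decode_printable_entities('http://&#101;xampl&#x65;.com/'), "\n";
```

Anything outside the printable range (say &#32; or &#127;) and the
invalid &#0x65; form are left alone, which matches what the decimal RE
above already does.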

http://www.htmlhelp.com/reference/html40/entities/ is a good reference btw. :)

> I really don't like the names body, rawbody, and text.  They are
> most confusing.   Can we please deprecate them in favor of 
> new terms?

Feel free to bring it up in a new thread...  I don't think it's going
to get very far fwiw, but now's the time to talk about this stuff.

-- 
Randomly Generated Tagline:
"If you know the enemy and know yourself, you need not fear the result
 of a hundred battles.  If you know yourself but not the enemy, for every
 victory gained you will also suffer a defeat. If you know neither the
 enemy nor yourself, you will succumb in every battle." - Sun Tzu
