On 2001.10.25, in <[EMAIL PROTECTED]>,
	"Ben Gertzfield" <[EMAIL PROTECTED]> wrote:
>
> Here's a patch that actually throws out all-HTML emails, but just
> removes HTML parts.
>
> Actually, why don't we just decode HTML attachments like any other,
> and let the user beware if they want to click on it?  There are lots
> of legitimate reasons to allow HTML attachments.  I can't think of any
> to allow all-HTML messages. *grin*  We could treat all-HTML messages in
> the same way, just provide a link and let the user beware if they
> click on it.

Unfortunately, I think there are legitimate reasons for allowing HTML
messages (as well as parts) into the record.  But I don't think that
legitimizes passing the HTML through literally -- this poses a big
potential threat to archive viewers.

I don't care to make a full-blown rendering of HTML; I'd argue that
it's not Mailman's job -- but it is Mailman's job (or, more precisely,
the archiver's job) to provide any text available to the archive
viewer.  Whether its display is true to the intentions of the poster
is subject to endless debate, but HTML is widely expected to be
legible even if it's not rendered per specification -- and it almost
always is, if you try hard enough -- so I think that the content
should be available.

I suggested transliterating the HTML with &lt; and &gt; tokens, to make
it harmless but legible, in case there's significant text inside.
But, admittedly, that is pretty ugly.  What about simply stripping out
ALL markup, leaving only bare text -- and perhaps doing some minor
interpretation for <br> and <p> tags, just to improve readability?
Then throw in a link to the original, as Ben suggests, for good
measure.

> The patch also adds a filename to the replacement payload, so that
> users can have an idea of what they're going to see if a description
> was not provided (VERY common).

Ah, filenames.  I'd actually like to see the filename stored on the
server as requested in the MIME Content-Disposition.  I don't think
the archiver needs to guarantee literalism here; a good-faith effort
is sufficient.  But I think it's significant in many cases, where the
transmission filename is really how the file needs to be saved
locally.  Minimally I'd like the filename to be shown on the archive
display, but it'd be nice if I don't need to change the filename in my
browser's "save as..." dialog each time I save an attachment.

I'd suggest a very basic sanitizing of the basename of the MIME
filename.  Something like

    s!.*[:/\\]!!

to remove pathname components for all three major pathname separators,
and then (optionally) to either hex-encode the non-alphanumeric
symbols, a la HTML, or to replace them with some other token.

-- 
 -D.	[EMAIL PROTECTED]	NSIT	University of Chicago

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers
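
For illustration, here is a rough Python sketch of the two suggestions
made in this message: stripping markup down to bare text (with <br> and
<p> turned into line breaks), and sanitizing the basename of a MIME
filename.  None of this comes from Ben's patch or from Mailman itself.
The names strip_markup and sanitize_filename and the "attachment"
fallback are made up for the example, and percent-encoding via
urllib.parse.quote is only one possible reading of "hex-encode".

```python
import re
from html.parser import HTMLParser
from urllib.parse import quote


class _TextOnly(HTMLParser):
    """Keep character data only; map <br> and <p> to line breaks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>, whose text is dropped

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "br":
            self.chunks.append("\n")
        elif tag == "p":
            self.chunks.append("\n\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def strip_markup(html_text):
    """Return the bare text of an HTML string (hypothetical helper)."""
    parser = _TextOnly()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.chunks).strip()


def sanitize_filename(mime_filename, fallback="attachment"):
    """Good-faith cleanup of a MIME filename (hypothetical helper)."""
    # Python counterpart of s!.*[:/\\]!! -- drop everything up to the
    # last ':', '/' or '\'.  DOTALL is an addition so that embedded
    # newlines cannot shield a path component.
    base = re.sub(r".*[:/\\]", "", mime_filename, flags=re.DOTALL)
    # Assumption: names that are empty or pure dot entries are replaced.
    if base in ("", ".", ".."):
        return fallback
    # Percent-encode everything except ASCII letters, digits and "_.-~".
    return quote(base, safe="")


if __name__ == "__main__":
    print(strip_markup("<p>Hello <b>world</b><br>bye</p>"))
    print(sanitize_filename("C:\\My Documents\\report final.doc"))
```

Note that strip_markup returns plain text with entities decoded, so it
would still need escaping (for instance with html.escape) before being
placed into an HTML archive page.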