>>>>> "BG" == Ben Gertzfield <[EMAIL PROTECTED]> writes:

    BG> Here's a port of HyperArch and pipermail to mimelib. This
    BG> allows proper parsing of multipart messages, and will make
    BG> i18n handling much easier. This is a big step forward, I
    BG> think, because now we no longer have two very different
    BG> Message classes in Mailman.

I'm still looking at this patch, and I have some qualms about it. If
I commit it, we'll still need to do the mimelib->email conversion,
but that shouldn't be hard. First...

    BG> This also patches pythonlib/mailbox.py to use mimelib instead
    BG> of rfc822. This is the last use of rfc822 in Mailman, so we
    BG> can now remove pythonlib/rfc822.py completely from the
    BG> archives -- now we use mimelib entirely!

It also modifies pythonlib/cgi.py to use mimelib. Neither change is a
good idea, because it means our copies get farther out of sync with
Python's and we'll always have to carry around our own copies.

The purpose of the Mailman/pythonlib directory is to allow us to
defer requiring newer versions of Python. Right now, Mailman should
work with Python 2.0, but some of the modules that have been patched
since then have useful stuff we need now. So I put copies of the
latest standard library files in Mailman/pythonlib as a form of
forward compatibility. Eventually, I can remove these once I require
a version of Python that includes those patches.

An example is Cookie.py. When MM required only Py1.5.2, I had to
provide a Cookie.py, but because Py2.0 has its own Cookie.py, we can
use that and forget about our own copy. The same goes for cgi.py,
rfc822.py, and others (I do need to do a bit of cleaning up here,
though).

Fortunately, I think your changes to cgi.py aren't necessary, and we
can accomplish your mailbox.py changes by changing Mailman/Mailbox.py
instead. We do still need rfc822.py (I think) because the
email/mimelib package in some cases just wraps rfc822.py code instead
of reimplementing or cutting-and-pasting the source.
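
To make the pythonlib idea concrete, here is a minimal sketch of the
forward-compatibility pattern described above. It is not Mailman's
actual code: the function name load_module and its parameters
min_version and bundled_package are hypothetical, and it is written
for a present-day Python rather than 2.0. It assumes only that the
bundled copy is importable as "<bundled_package>.<name>".

```python
import importlib
import sys


def load_module(name, min_version, bundled_package="Mailman.pythonlib"):
    """Return the stdlib module `name` if the running interpreter is at
    least `min_version`; otherwise return the bundled copy.

    All names here are hypothetical and for illustration only.
    """
    if sys.version_info >= min_version:
        # The interpreter is new enough: its own module already has the
        # fixes we need, so the bundled copy can be ignored (and
        # eventually deleted).
        return importlib.import_module(name)
    # Older interpreter: fall back to the newer copy we ship ourselves.
    return importlib.import_module(bundled_package + "." + name)
```

With something like this, dropping a bundled copy only requires
raising the minimum Python version; callers keep asking for the
module by its standard name, and the bundled file is never edited to
diverge from Python's.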

    BG> This patch depends on the mimelib patch I just sent; it uses
    BG> the get_decoded_payload() function I added to get a nice text
    BG> representation of even a multi-part message. This will let us
    BG> even display a message for non-text parts of a message, and
    BG> eventually will let HyperArch display attachments inline. And
    BG> of course, as I mentioned in my previous mail, this will
    BG> prevent base64 gobbeldygook from showing up in the archives.

    BG> This patch even deals with multiple text/* attachments to a
    BG> message, and will include them all in the archive even if
    BG> they're base64 or quoted-printable encoded.

I think this is a decent patch, and I'm probably going to commit
these, after I rewrite them for the email package.

    BG> It currently does not deal with replacing high-ASCII
    BG> characters with HTML entities in HyperArch; I'm going to deal
    BG> with that next by taking the htmlentitydefs module's hash,
    BG> inverting it, and using that as a big global
    BG> search-and-replace, if the charset is undefined or iso-8859-1.

My biggest question here is why you took most of the code out of
Article._get_body() in HyperArch.py. IIRC, Jeremy added all this
stuff so that charset handling would be saner. The idea is that if
there is a single charset for the message, that would be the charset
used for the web archive page. But if the page had multiple charsets,
then it would pick the most common one. AFAIK, there's no way to
represent multiple charsets in a single HTML page.

An example of the latter is an index page for a list that has
Subject: fields with many different charsets. Which one do you pick?
In your patch, it seems like everything comes out iso-8859-1, and
that doesn't seem right.

-Barry

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers