On Tue, 01 May 2012 13:57:50 +0200, Georg Brandl <g.bra...@gmx.net> wrote:
> Other planned large-scale changes:
>
> * Addition of the "regex" module
> * Email version 6

I guess it's time to talk about my plans for this one :)

RIM/QNX is currently paying me to work on their stuff rather than email6 (but it does leave me with some time for email6). However, while QNX directly funded a big chunk of email6, as a consequence of their current priorities the whole of the email6 spec isn't going to be implemented for Python 3.3.

There is, however, a very useful big chunk of it that is pretty much done: the improved header parsing, header API, and header folding. I covered the primary improvements in my PyCon talk, for those who were there or have seen the video. Even that is not quite complete, but I'm currently planning to finish it before alpha 4. (There may be a couple of details that won't make it in until beta1.)

At the PyCon sprints I finished the folding implementation. It's every bit as ugly as the old folding implementation that I simplified some time ago, but it gets a lot more corner cases right, and implements an important feature that the old folding algorithm got wrong more often than not: folding at "higher level syntactic breaks". So while I'd like to revisit that code and improve it, it *works*, and any further work on it can happen at the bug-fix stage.

Also at the sprints I started on a performance refactoring. It has been bothering me for a while that any program using the new code would have been doing a complete RFC 5322 parse on every header in every message, even if it was processing a boatload of messages, only cared about the content of a few headers, and wanted to just pass the rest through. I was treating fixing that as a premature optimization, though I had some thoughts about how to do so. Well, to my great surprise, the most logical way of fixing it turned out to have two significant benefits: the code got simpler, and it provides a way to maintain pretty much 100% backward compatibility with Python 3.2. I guess some optimizations aren't premature.

The basic scheme (which I have almost completely implemented in the email6 feature repo at this point) is to continue to store the raw data from a parse in the Message just like we always have, and only do the full RFC 5322 parse when either an application program asks for the header, or a generator needs to re-fold that header for some reason. By setting the policy controls appropriately and being aware of the consequences of looking at a header, an application could take advantage of the new header parsing for headers of interest with minimal performance impact compared to 3.2.

Now, here's the tricky bit. The new API for headers has been out on PyPI for review for almost a year now, but hasn't seen what you would call widespread use. In particular, I haven't gotten any feedback about it. It seems to me that introducing this new API in 3.3 would be a perfect application of PEP 411...except that email is already a package in the standard library.

This is where the backward compatibility of my performance refactor comes in. The way this works is that the policy object, which has already been added to the 3.3 codebase and *has* gotten some review and feedback, controls what happens to the headers. The way the code in the 'nemail6' branch of /features/email6 currently works is that the policy used by default is named 'compat32'. (Actually it's compat5 right now in the repository, but I plan to change the name today.) That policy implements the exact same header handling that 3.2 currently uses (bugs and all). The new header handling is introduced by any *other* pre-defined policy an application may select. Thus, if code is not changed to use one of the new named policies, nothing changes and we have full backward compatibility. If one of the new policies is specified, then the new header handling code (and the API it provides) is used.

What I'm currently preparing is two patches.
The first patch will refactor the policy code that was already committed so that the above scheme can be implemented, and so that compat32 is the default policy for 3.3. (This is the 'nemail6base' branch in /features/email6.) The second patch will use the policy hooks introduced by the first patch to add the new policies that use the new header parsing/folding code. My plan is that the first patch will go into 3.3 regardless (and should be ready for review/commit soon).

What I'd like to do is have the second patch introduce the new policies as *provisional policies*. That is, in the spirit but not the letter of PEP 411, I'd like the new header API to be considered provisional and subject to improvement in 3.4 based on what we learn by having it actually out there in the field and getting tested.

--David
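
A minimal sketch of the opt-in behaviour described in this message, for readers who want to see it concretely. It assumes the policy names found in the standard library's `email.policy` module (`compat32` and `default`); the message itself names only `compat32` and does not say what the other pre-defined policies are called. The sample message text is made up for illustration.

```python
import email
from email import policy

# Made-up sample message, used only for illustration.
RAW = (
    "From: Alice Example <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: policy demo\n"
    "\n"
    "Body text.\n"
)

# No policy argument: the backward-compatible compat32 policy is used,
# so header lookups return plain strings, as in 3.2.
legacy = email.message_from_string(RAW)
legacy_from = legacy["From"]

# Explicitly selecting another pre-defined policy opts in to the new
# header handling; lookups return str subclasses with parsed attributes.
modern = email.message_from_string(RAW, policy=policy.default)
modern_from = modern["From"]

print(type(legacy_from).__name__, hasattr(legacy_from, "addresses"))
print(modern_from.addresses[0].addr_spec)
```

Because the new header objects are still `str` instances, code that only treats header values as text should keep working after switching policies; only code that opts in sees the structured attributes.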