Re: [Python-Dev] Python 2.7 Won't Build
On Friday 17 September 2010 00:09:09, Tom Browder wrote:
> I did, and eventually discovered the problem: I tried to "nosy" Barry
> as requested by adding his e-mail address, but that causes an error in
> the tracker. After I finally figured that out, I successfully entered
> the original bug (and reported it on the "tracker bug").
http://bugs.python.org/issue9880
Ah, yes, you have to add nicknames, not email addresses. Barry's nickname
is "barry", and he is already on the nosy list (because he answered your
issue).
--
Victor Stinner
http://www.haypocalc.com/
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
R. David Murray wrote:
> I'm trying one approach in email6: Bytes and String subclasses, where
> the subclasses have an attribute named 'literals' derived from a
> utility module that does this:
>
>     literals = dict(
>         empty = '',
>         colon = ':',
>         newline = '\n',
>         space = ' ',
>         tab = '\t',
>         fws = ' \t',
>         headersep = ': ',
>     )
>
>     class _string_literals: pass
>     class _bytes_literals: pass
>
>     for name, value in literals.items():
>         setattr(_string_literals, name, value)
>         setattr(_bytes_literals, name, bytes(value, 'ASCII'))
>     del literals, name, value
>
> And the subclasses do:
>
>     class BytesHeader(BaseHeader):
>         lit = email.utils._bytes_literals
>
>     class StringHeader(BaseHeader):
>         lit = email.utils._string_literals

I've just written a decorator which applies a similar strategy for
insulated functions, by passing them an appropriate namespace as an
argument. It could be useful in cases where only a few functions are
polymorphic, rather than a full class or module.
http://code.activestate.com/recipes/577393-decorator-for-writing-polymorphic-functions/
Cheers,
B.
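[The linked recipe isn't reproduced here; a minimal sketch of the idea, with purely illustrative names (not the recipe's actual code), could look like:]

```python
import functools

def polymorphic(func):
    """Pass the wrapped function a str or bytes literal namespace,
    chosen from the type of its first argument (illustrative sketch)."""

    class _Str:
        empty, colon, space = '', ':', ' '

    class _Bytes:
        empty, colon, space = b'', b':', b' '

    @functools.wraps(func)
    def wrapper(data, *args, **kwargs):
        lit = _Bytes if isinstance(data, (bytes, bytearray)) else _Str
        return func(data, lit, *args, **kwargs)
    return wrapper

@polymorphic
def split_header(line, lit):
    # The same body works on str and bytes input, because every
    # literal it uses comes from the injected namespace.
    name, _, value = line.partition(lit.colon)
    return name, value.strip()
```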
Re: [Python-Dev] [issue1633863] AIX: configure ignores $CC
Hi Martin,
I have started to correct quite a lot of issues I have with Python on
AIX, and since I had to test quite a lot of patches, I thought it would
be more convenient to set up a buildbot for that platform.
So I now have a buildbot environment with 2 slaves (AIX 5.3 and 6.1)
that builds and tests Python (branch py3k) with both gcc and xlc (the
native AIX compiler). I have 4 builders ("py3k-aix6-xlc",
"py3k-aix5-xlc", "py3k-aix6-gcc", "py3k-aix5-gcc").
I expect to add 4 more builders for branch 2.7 in coming days.
I would like to share the results of this buildbot to the Python
community so that issues with AIX could be addressed more easily.
R. David Murray pointed me to the page on the Python wiki concerning
buildbot. It is stated there that it is possible to connect some slaves
to some official Python buildbot master.
Unfortunately, I don't think this solution is possible for me: I don't
think the security team in my company would appreciate that a server
inside our network runs some arbitrary shell commands provided by some
external source. Neither can I expose the buildbot master web interface.
Also I had to customize the buildbot rules in order to work with some
specificities of AIX (see attached master.cfg), and I can't guarantee
that this buildbot will run 24 hours a day; I may have to schedule it
only once at night for example if it consumes too much resources.
(And the results are very unstable at the moment, mostly because of
issue 9862).
On the other hand, I could upload the build results with rsync or scp
somewhere or setup some MailNotifier if that can help.
How do you think I could share those results?
regards
--
Sébastien Sablé
On 15/09/2010 23:28, R. David Murray wrote:
R. David Murray added the comment:
Sébastien, you could email Martin (tracker id loewis) about adding your
buildbot to our unstable fleet (or even to stable if it is stable; that is, the
tests normally pass and don't randomly fail). As long as you are around to
help fix bugs it would be great to have an aix buildbot in our buildbot fleet.
(NB: see also http://wiki.python.org/moin/BuildBot, which unfortunately is a
bit out of date...)
--
nosy: +r.david.murray
___
Python tracker
___
# -*- python -*-
# ex: set syntax=python:
# This is a sample buildmaster config file. It must be installed as
# 'master.cfg' in your buildmaster's base directory (although the filename
# can be changed with the --basedir option to 'mktap buildbot master').
# It has one job: define a dictionary named BuildmasterConfig. This
# dictionary has a variety of keys to control different aspects of the
# buildmaster. They are documented in docs/config.xhtml .
# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}
### BUILDSLAVES
# the 'slaves' list defines the set of allowable buildslaves. Each element is
# a BuildSlave object, which is created with bot-name, bot-password. These
# correspond to values given to the buildslave's mktap invocation.
from buildbot.buildslave import BuildSlave
c['slaves'] = [BuildSlave("phenix", "bot1passwd", max_builds=1),
               BuildSlave("sirius", "bot2passwd", max_builds=1)]
# to limit to two concurrent builds on a slave, use
# c['slaves'] = [BuildSlave("bot1name", "bot1passwd", max_builds=2)]
# 'slavePortnum' defines the TCP port to listen on. This must match the value
# configured into the buildslaves (with their --master option)
c['slavePortnum'] = 9989
### CHANGESOURCES
# the 'change_source' setting tells the buildmaster how it should find out
# about source code changes. Any class which implements IChangeSource can be
# put here: there are several in buildbot/changes/*.py to choose from.
from buildbot.changes.pb import PBChangeSource
c['change_source'] = PBChangeSource()
# For example, if you had CVSToys installed on your repository, and your
# CVSROOT/freshcfg file had an entry like this:
#pb = ConfigurationSet([
#    (None, None, None, PBService(userpass=('foo', 'bar'), port=4519)),
#])
# then you could use the following buildmaster Change Source to subscribe to
# the FreshCVS daemon and be notified on every commit:
#
#from buildbot.changes.freshcvs import FreshCVSSource
#fc_source = FreshCVSSource("cvs.example.com", 4519, "foo", "bar")
#c['change_source'] = fc_source
# or, use a PBChangeSource, and then have your repository's commit script run
# 'buildbot sendchange', or use contrib/svn_buildbot.py, or
# contrib/arch_buildbot.py :
#
#from buildbot.changes.pb import PBChangeSource
#c['change_source'] = PBChangeSource()
# If you want to use SVNPoller, it might look something like
# # Where to get source code changes
# from buildbot.changes.svnpoller import SVNPoller
# source_code_svn_url='https://svn.myproject.org/bluejay/trunk'
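[The sample config is cut off mid-comment; in a full buildbot sample of this era the commented SVNPoller example continues roughly as follows (a sketch based on the buildbot 0.7/0.8 API; keep it commented out, like the surrounding examples, unless you actually use SVN polling):]

```python
# from buildbot.changes.svnpoller import SVNPoller
# source_code_svn_url = 'https://svn.myproject.org/bluejay/trunk'
# svn_poller = SVNPoller(svnurl=source_code_svn_url,
#                        pollinterval=60*60,  # seconds between polls
#                        histmax=10)          # max changes fetched per poll
# c['change_source'] = svn_poller
```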
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On Thursday 16 September 2010 at 22:51 -0400, R. David Murray wrote:
> > > On disk, using utf-8,
> > > one might store the text representation of the message, rather than
> > > the wire-format (ASCII encoded) version. We might want to write such
> > > messages from scratch.
> >
> > But then the user knows the encoding (by "user" I mean what/whoever
> > calls the email API) and mentions it to the email package.
>
> Yes? And then? The email package still has to parse the file, and it
> can't use its normal parse-the-RFC-data parser because the file could
> contain *legitimate* non-ASCII header data. So there has to be a separate
> parser for this case that will convert the non-ASCII data into RFC2047
> encoded data. At that point you have two parsers that share a bunch of
> code...and my current implementation lets the input to the second parser
> be text, which is the natural representation of that data, the one the
> user or application writer is going to expect.
But you said it yourself: that "e-mail-like data" is not an email.
You could have a separate converter class for these special cases.
Also, I don't understand why an application would want to assemble an
e-mail by itself if it doesn't know how to do so, and produces wrong
data. Why not simply let the application do:
m = Message()
m.add_header("From", "Accented Bàrry ")
m.add_body("Hello Barry")
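[Antoine's add_header/add_body API is hypothetical, but it is close to what the stdlib eventually grew out of the email6 work: EmailMessage (provisional in Python 3.4, final in 3.6). A rough equivalent today, with an illustrative address:]

```python
from email.message import EmailMessage

m = EmailMessage()
# The address is purely illustrative.
m["From"] = "Accented Bàrry <[email protected]>"
m.set_content("Hello Barry")

# The text model serializes to wire format on demand; the non-ASCII
# display name is RFC 2047-encoded during generation.
wire = m.as_bytes()
```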
> > And then you have two separate worlds while ultimately the same
> > concepts are underlying. A library accepting BytesMessage will crash
> > when a program wants to give a StringMessage and vice-versa. That
> > doesn't sound very practical.
>
> Yes, and a library accepting bytes will crash when a program wants
> to give it a string. So? That's how Python3 works. Unless, of
> course, the application decides to be polymorphic :)
Well, the application wants to handle abstracted e-mail messages. I'm
sure people would rather not deal with the difference(s) between
BytesMessages and StringMessages.
That's like saying we should have BytesConfigParser for bytes
configuration files and StringConfigParser for string configuration
files, with incompatible APIs.
("surrogateescape")
> On the other hand, that might be a way to make the current API work
> at least a little bit better with 8bit input data. I'll have to think
> about that...
Yes, that's what I was talking about.
You can even choose ("ascii", "surrogateescape") if you don't want to
wrongly choose an 8-bit encoding such as utf-8 or latin-1.
(I'm deliberately ignoring the case where people would use a non-ASCII
compatible encoding such as utf-16; I hope you don't want to support
that :-))
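[The ("ascii", "surrogateescape") suggestion, as a concrete sketch: non-ASCII bytes survive as lone surrogates and round-trip exactly, without committing to any guess about the 8-bit encoding.]

```python
raw = b"Subject: caf\xe9"   # a header containing one non-ASCII byte

text = raw.decode("ascii", "surrogateescape")
# The 0xE9 byte became the lone surrogate U+DCE9 instead of mojibake.
assert text == "Subject: caf\udce9"

# Parsing the ASCII parts works normally...
name, _, value = text.partition(":")
assert name == "Subject"

# ...and re-encoding restores the original bytes exactly.
assert text.encode("ascii", "surrogateescape") == raw
```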
Regards
Antoine.
Re: [Python-Dev] standards for distribution names
On Thu, Sep 16, 2010 at 12:08:59PM +0100, Chris Withers wrote:
> Hi All,
>
> Following on from this question:
>
> http://twistedmatrix.com/pipermail/twisted-python/2010-September/022877.html
>
> ...I'd thought that the "correct names" for distributions would have
> been documented in one of:
>
> http://www.python.org/dev/peps/pep-0345
> http://www.python.org/dev/peps/pep-0376
> http://www.python.org/dev/peps/pep-0386
>
> ...but having read them, I drew a blank.
>
> Where are the standards for this, or is it still a case of "whatever
> setuptools does"?
>
> Chris
You may also find this thread from the packaging google group useful,
although it may not be quite what you're looking for:
http://bit.ly/96SMuM
Cheers,
--
~Dan
Re: [Python-Dev] [issue1633863] AIX: configure ignores $CC
Hi Sebastien,

> Unfortunately, I don't think this solution is possible for me: I don't
> think the security team in my company would appreciate that a server
> inside our network runs some arbitrary shell commands provided by some
> external source.

I still think this would be the best thing, and I feel that from a
security point of view, it doesn't really differ from what you are
doing now already - see below.

> Neither can I expose the buildbot master web interface.

That shouldn't be necessary.

> Also I had to customize the buildbot rules in order to work with some
> specificities of AIX (see attached master.cfg), and I can't guarantee
> that this buildbot will run 24 hours a day; I may have to schedule it
> only once at night for example if it consumes too much resources.
> (And the results are very unstable at the moment, mostly because of
> issue 9862).

If you are having the build slave compile Python, I'd like to point out
that you *already* run arbitrary shell commands provided by some
external source: if somebody checked some commands into Python's
configure.in, you would unconditionally execute them. So if it's OK for
you to run the Python build process at all, it should (IMO) also be
acceptable to run a build slave. If there are concerns that running it
under your Unix account gives it too much power, you could create a
separate, locked-down account.

> On the other hand, I could upload the build results with rsync or scp
> somewhere or setup some MailNotifier if that can help.
> How do you think I could share those results?

I'd be hesitant to support this as a special case. If the results are
not in the standard locations, people won't look at them anyway. Given
that one often also needs access to the hardware in order to fix
problems, it might be sufficient if only you look at the buildslave
results, and create bug reports whenever you notice a problem.

Regards,
Martin
Re: [Python-Dev] [issue1633863] AIX: configure ignores $CC
On Fri, 17 Sep 2010 11:40:12 +0200
Sébastien Sablé wrote:
> Hi Martin,
>
> I have started to correct quite a lot of issues I have with Python on
> AIX, and since I had to test quite a lot of patchs, I though it would be
> more convenient to setup a buildbot for that platform.
>
> So I now have a buildbot environment with 2 slaves (AIX 5.3 and 6.1)
> that builds and tests Python (branch py3k) with both gcc and xlc (the
> native AIX compiler) (I have 4 builders ("py3k-aix6-xlc",
> "py3k-aix5-xlc", "py3k-aix6-gcc", "py3k-aix5-gcc").
Following on Martin's comments, you might also want to share things
with the ActiveState guys who, AFAIK, maintain an AIX version of Python
(but you have been the most active AIX user on the bug tracker lately;
perhaps they are keeping their patches to themselves).
(see http://www.activestate.com/activepython )
Regards
Antoine.
Re: [Python-Dev] Add PEP 444, Python Web3 Interface.
On 16.09.10 02:02, John Nagle wrote:
> On 9/15/2010 4:44 PM, [email protected] wrote:
>> ``SERVER_PORT`` must be a bytes instance (not an integer).
> What's that supposed to mean? What goes in the "bytes instance"? A
> character string in some format? A long binary number? If the latter,
> with which byte ordering? What problem does this solve?
Just interpreting (i.e. not having participated in the specification):
given the CGI background of all this, SERVER_PORT is an ASCII-encoded
decimal rendering of the port number. As to what problem this solves: I
guess it allows for easy pass-through from the web server.
Regards,
Martin
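[Martin's reading, made concrete (my illustration, not spec text): the bytes instance is just the decimal port number rendered in ASCII, as a CGI gateway would pass it through.]

```python
port = 8080

# What a server would put in the Web3 environ, on Martin's reading:
# the ASCII-encoded decimal rendering, as in CGI's SERVER_PORT.
server_port = str(port).encode("ascii")
assert server_port == b"8080"

# Python 3's int() accepts ASCII digits in bytes, so converting
# back when an application needs the number is cheap.
assert int(server_port) == port
```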
Re: [Python-Dev] (Not) delaying the 3.2 release
On Fri, Sep 17, 2010 at 5:43 AM, Martin (gzlist) wrote:
> In the example I gave, 十 encodes in CP932 as '\x8f\\', and the
> function gets confused by the second byte. Obviously the right answer
> there is just to use unicode, rather than write a function that works
> with weird multibyte codecs.

That does make it clear that "ASCII superset" is an inaccurate term - a
better phrase is "ASCII compatible", since that correctly includes
multibyte codecs like UTF-8 which explicitly ensure that the byte
values in multibyte characters are all outside the 0x00 to 0x7F range
of ASCII.

So the domain of any polymorphic text manipulation functions we define
would be:
- Unicode strings
- byte sequences where the encoding is either:
  - a single-byte ASCII superset (e.g. iso-8859-*, cp1252, koi8*, mac*)
  - an ASCII-compatible multibyte encoding (e.g. UTF-8, EUC-JP)

Passing in byte sequences that are encoded using an ASCII-incompatible
multibyte encoding (e.g. CP932, UTF-7, UTF-16, UTF-32, shift-JIS, big5,
iso-2022-*, EUC-CN/KR/TW) or a single-byte encoding that is not an
ASCII superset (e.g. EBCDIC) will have undefined results.

I think that's still a big enough win to be worth doing, particularly
as more and more of the other variable-width multibyte encodings are
phased out in favour of UTF-8.

Cheers,
Nick.

P.S. Hey Barry, is there anyone at Canonical you can poke about
https://bugs.launchpad.net/xorg-server/+bug/531208? Tinkering with this
stuff on Kubuntu would be significantly less annoying if I could easily
type arbitrary Unicode characters into Konsole ;)
--
Nick Coghlan | [email protected] | Brisbane, Australia
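[The 十 example, checked concretely: CP932 embeds an ASCII byte (0x5C, backslash) inside a multibyte character, while UTF-8 keeps every byte of a multibyte character above 0x7F, which is what makes it "ASCII compatible" in Nick's sense.]

```python
ch = "十"

# CP932: the second byte is 0x5C, the ASCII backslash -- so naive
# byte-level processing that splits on b"\\" would cut this
# character in half.
assert ch.encode("cp932") == b"\x8f\\"

# UTF-8: all three bytes are >= 0x80, so no byte of a multibyte
# character can ever be mistaken for an ASCII character.
utf8 = ch.encode("utf-8")
assert utf8 == b"\xe5\x8d\x81"
assert all(b >= 0x80 for b in utf8)
```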
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On Sep 17, 2010, at 12:10 PM, Antoine Pitrou wrote:
>Also, I don't understand why an application would want to assemble an
>e-mail by itself if it doesn't know how to do so, and produces wrong
>data. Why not simply let the application do:
>
>m = Message()
>m.add_header("From", "Accented Bàrry ")
>m.add_body("Hello Barry")
Very often you'll start with a template of a message your application wants to
send. Then you'll interpolate a few values into it, and you'd like to easily
convert the result into an RFC valid email.
Is that template bytes or text (or either)?
-Barry
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On Sep 16, 2010, at 11:45 PM, Terry Reedy wrote:
> Based on the discussion so far, I think you should go ahead and
> implement the API agreed on by the mail sig, both because it *has*
> been agreed on (and thinking about the wsgi discussion, that seems to
> be a major achievement) and because it seems sensible to me also, as
> far as I understand it. The proof of the API will be in the testing.
> As long as you *think* it covers all intended use cases, I am not sure
> that abstract discussion can go much further.

+1

> I do have a thought about space and duplication. For normal messages,
> it is not an issue. For megabyte (or in the future, gigabyte?)
> attachments, it is. So if possible, there should only be one extracted
> blob for both bytes and string versions of parsed messages. Or even
> make the extraction from the raw stream lazy, when specifically
> requested.

This has been discussed in the email-sig. Many people have asked for an
API where message payloads can be stored on-disk instead of in-memory.
Headers, I don't think, will ever practically be so big as to not be
storable in-memory. But if your message has a huge mp3, the parser
should have the option to leave the bytes of that payload in a disk
cache and transparently load it when necessary. I think we should keep
that in mind, but it's way down on the list of "gotta haves" for
email6.

-Barry
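[A minimal sketch of the on-disk payload idea (my illustration, not an email6 design): headers stay in memory while a large payload lives in a temp file and is read back only on access.]

```python
import os
import tempfile

class DiskPayload:
    """Keep a large payload on disk; load the bytes only when asked."""

    def __init__(self, data: bytes):
        fd, self._path = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as f:
            f.write(data)

    def get_bytes(self) -> bytes:
        # Transparently read the cached payload back from disk.
        with open(self._path, "rb") as f:
            return f.read()

    def close(self):
        os.unlink(self._path)
```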
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2010-09-10 - 2010-09-17)
Python tracker at http://bugs.python.org/
To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues stats:
open    2541 (+42)
closed 19128 (+69)
total  21669 (+65)
Open issues with patches: 1060
Issues opened (42)
==================
#9824: SimpleCookie should escape commas and semi-colons
http://bugs.python.org/issue9824 opened by spookylukey
#9831: test_distutils fails on MacOSX 10.6
http://bugs.python.org/issue9831 opened by cartman
#9838: Inadequate C-API to Python 3 I/O objects
http://bugs.python.org/issue9838 opened by pv
#9841: sysconfig and distutils.sysconfig differ in subtle ways
http://bugs.python.org/issue9841 opened by eric.araujo
#9842: Document ... used in recursive repr of containers
http://bugs.python.org/issue9842 opened by eric.araujo
#9844: calling nonexisting function under __INSURE__
http://bugs.python.org/issue9844 opened by eli.bendersky
#9845: Allow changing the method in urllib.request.Request
http://bugs.python.org/issue9845 opened by tarek
#9846: ZipExtFile provides no mechanism for closing the underlying fi
http://bugs.python.org/issue9846 opened by john.admanski
#9849: Argparse needs better error handling for nargs
http://bugs.python.org/issue9849 opened by Jason.Baker
#9850: obsolete macpath module dangerously broken and should be remov
http://bugs.python.org/issue9850 opened by ned.deily
#9851: multiprocessing socket timeout will break client
http://bugs.python.org/issue9851 opened by hume
#9852: test_ctypes fail with clang
http://bugs.python.org/issue9852 opened by cartman
#9854: SocketIO should return None on EWOULDBLOCK
http://bugs.python.org/issue9854 opened by pitrou
#9856: Change object.__format__(s) where s is non-empty to a Deprecat
http://bugs.python.org/issue9856 opened by eric.smith
#9857: SkipTest in tearDown is reported an as an error
http://bugs.python.org/issue9857 opened by pitrou
#9858: Python and C implementations of io are out of sync
http://bugs.python.org/issue9858 opened by pitrou
#9859: Add tests to verify API match of modules with 2 implementation
http://bugs.python.org/issue9859 opened by stutzbach
#9860: Building python outside of source directory fails
http://bugs.python.org/issue9860 opened by belopolsky
#9861: subprocess module changed exposed attributes
http://bugs.python.org/issue9861 opened by pclinch
#9862: test_subprocess hangs on AIX
http://bugs.python.org/issue9862 opened by sable
#9864: email.utils.{parsedate,parsedate_tz} should have better return
http://bugs.python.org/issue9864 opened by pitrou
#9865: OrderedDict doesn't implement __sizeof__
http://bugs.python.org/issue9865 opened by pitrou
#9866: Inconsistencies in tracing list comprehensions
http://bugs.python.org/issue9866 opened by belopolsky
#9867: Interrupted system calls are not retried
http://bugs.python.org/issue9867 opened by aronacher
#9868: test_locale leaves locale changed
http://bugs.python.org/issue9868 opened by ocean-city
#9869: long_subtype_new segfault in pure-Python code
http://bugs.python.org/issue9869 opened by cwitty
#9871: IDLE dies when using some regex
http://bugs.python.org/issue9871 opened by Popa.Claudiu
#9873: Allow bytes in some APIs that use string literals internally
http://bugs.python.org/issue9873 opened by ncoghlan
#9874: Message.attach() loses empty attachments
http://bugs.python.org/issue9874 opened by [email protected]
#9875: Garbage output when running setup.py on Windows
http://bugs.python.org/issue9875 opened by exarkun
#9876: ConfigParser can't interpolate values from other sections
http://bugs.python.org/issue9876 opened by asolovyov
#9877: Expose sysconfig._get_makefile_filename() in public API
http://bugs.python.org/issue9877 opened by barry
#9878: Avoid parsing pyconfig.h and Makefile by autogenerating extens
http://bugs.python.org/issue9878 opened by barry
#9880: Python 2.7 Won't Build: SystemError: unknown opcode
http://bugs.python.org/issue9880 opened by Tom.Browder
#9882: abspath from directory
http://bugs.python.org/issue9882 opened by ipatrol
#9883: minidom: AttributeError: DocumentFragment instance has no attr
http://bugs.python.org/issue9883 opened by Aubrey.Barnard
#9884: The 4th parameter of method always None or 0 on x64 Windows.
http://bugs.python.org/issue9884 opened by J2.NETe
#9886: Make operator.itemgetter/attrgetter/methodcaller easier to dis
http://bugs.python.org/issue9886 opened by ncoghlan
#9887: distutil's build_scripts doesn't read utf-8 in all locales
http://bugs.python.org/issue9887 opened by hagen
#460474: codecs.StreamWriter: reset() on close()
http://bugs.python.org/issue460474 reopened by r.david.murray
#767645: incorrect os.path.supports_unicode_filenames
http://bugs.python.org/issue767645 reopened by haypo
#1076515: shutil.move clobbers read-only files.
http://bugs.python.org/issue1076515 reopened by brian.curtin
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On 16/09/2010 23:05, Antoine Pitrou wrote:
On Thu, 16 Sep 2010 16:51:58 -0400
"R. David Murray" wrote:
What do we store in the model? We could say that the model is always
text. But then we lose information about the original bytes message,
and we can't reproduce it. For various reasons (mailman being a big one),
this is not acceptable. So we could say that the model is always bytes.
But we want access to (for example) the header values as text, so header
lookup should take string keys and return string values[2].
Why can't you have both in a single class? If you create the class
using a bytes source (a raw message sent by SMTP, for example), the
class automatically parses and decodes it to unicode strings; if you
create the class using an unicode source (the text body of the e-mail
message and the list of recipients, for example), the class
automatically creates the bytes representation.
I think something like this would be great for WSGI. Rather than focus
on whether bytes *or* text should be used, use a higher level object
that provides a bytes view, and (where possible/appropriate) a unicode
view too.
Michael
(of course all processing can be done lazily for performance reasons)
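[A tiny sketch of the two-view idea for a single header (illustrative only): the raw bytes stay authoritative and the text view is decoded lazily, so data you never touch is never normalized.]

```python
class HeaderViews:
    """One header: authoritative bytes plus a lazily decoded text view."""

    def __init__(self, raw: bytes):
        self._raw = raw
        self._text = None

    @property
    def as_bytes(self) -> bytes:
        return self._raw          # always exact; never re-encoded

    @property
    def as_text(self) -> str:
        if self._text is None:    # decode only on first access
            self._text = self._raw.decode("ascii", "surrogateescape")
        return self._text
```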
What about email files on disk? They could be bytes, or they could be,
effectively, text (for example, utf-8 encoded).
Such a file can be two things:
- the raw encoding of a whole message (including headers, etc.), then
it should be fed as a bytes object
- the single text body of a hypothetical message, then it should be fed
as a unicode object
I don't see any possible middle-ground.
On disk, using utf-8,
one might store the text representation of the message, rather than
the wire-format (ASCII encoded) version. We might want to write such
messages from scratch.
But then the user knows the encoding (by "user" I mean what/whoever
calls the email API) and mentions it to the email package.
What I'm having an issue with is that you are talking about a bytes
representation and an unicode representation of a message. But they
aren't representations of the same things:
- if it's a bytes representation, it will be the whole, raw message
including envelope / headers (also, MIME sections etc.)
- if it's an unicode representation, it will only be a section of the
message decodable as such (a text/plain MIME section, for example;
or a decoded header value; or even a single e-mail address part of a
decoded header)
So, there doesn't seem to be any reason for having both a BytesMessage
and an UnicodeMessage at the same abstraction level. They are both
representing different things at different abstraction levels. I don't
see any potential for confusion: raw assembled e-mail message = bytes;
decoded text section of a message = unicode.
As for the problem of potential "bogus" raw e-mail data
(e.g., undecodable headers), well, I guess the library has to make a
choice between purity and practicality, or perhaps let the user choose
themselves. For example, through a `strict` flag. If `strict` is true,
raise an error as soon as a non-decodable byte appears in a header; if
`strict` is false, decode it through a default (encoding, errors)
convention which can be overridden by the user (a sensible possibility
being "utf-8, surrogateescape" to allow for lossless round-tripping).
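[The strict-flag idea as a sketch (hypothetical function; defaults chosen per the suggestion above):]

```python
def decode_header(raw: bytes, strict: bool = False,
                  encoding: str = "utf-8",
                  errors: str = "surrogateescape") -> str:
    """Decode one raw header line.

    strict=True: refuse any non-ASCII byte outright.
    strict=False: decode with an overridable (encoding, errors) pair;
    the utf-8/surrogateescape default round-trips losslessly.
    """
    if strict:
        return raw.decode("ascii")   # raises UnicodeDecodeError on junk
    return raw.decode(encoding, errors)
```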
As I said above, we could insist that files on
disk be in wire-format, and for many applications that would work fine,
but I think people would get mad at us if didn't support text files[3].
Again, this simply seems to be two different abstraction levels:
pre-generated raw email messages including headers, or a single text
waiting to be embedded in an actual e-mail.
Anyway, what polymorphism means in email is that if you put in bytes,
you get a BytesMessage, if you put in strings you get a StringMessage,
and if you want the other one you convert.
And then you have two separate worlds while ultimately the same
concepts are underlying. A library accepting BytesMessage will crash
when a program wants to give a StringMessage and vice-versa. That
doesn't sound very practical.
[1] Now that surrogateescape exists, one might suppose that strings
could be used as an 8bit channel, but that only works if you don't need
to *parse* the non-ASCII data, just transmit it.
Well, you can parse it, precisely. Not only, but it round-trips if you
unparse it again:
>>> header_bytes = b"From: bogus\xFFname"
>>> name, value = header_bytes.decode("utf-8", "surrogateescape").split(":")
>>> name
'From'
>>> value
' bogus\udcffname'
>>> "{0}:{1}".format(name, value).encode("utf-8", "surrogateescape")
b'From: bogus\xffname'
In the end, what I would call a polymorphic best practice is "try to
avoid bytes/str polymorphism if your domain is well-defined
enough" (which I admit URLs aren't necessarily; but there's no
question a single text/XXX e-mail section is text, and a whole
assembled e-mail message is bytes).
Regards
Antoine.
__
Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
On Fri, Sep 17, 2010 at 3:25 PM, Michael Foord wrote:
> On 16/09/2010 23:05, Antoine Pitrou wrote:
>> On Thu, 16 Sep 2010 16:51:58 -0400
>> "R. David Murray" wrote:
>>> What do we store in the model? We could say that the model is always
>>> text. But then we lose information about the original bytes message,
>>> and we can't reproduce it. For various reasons (mailman being a big
>>> one), this is not acceptable. So we could say that the model is
>>> always bytes. But we want access to (for example) the header values
>>> as text, so header lookup should take string keys and return string
>>> values[2].
>> Why can't you have both in a single class? If you create the class
>> using a bytes source (a raw message sent by SMTP, for example), the
>> class automatically parses and decodes it to unicode strings; if you
>> create the class using a unicode source (the text body of the e-mail
>> message and the list of recipients, for example), the class
>> automatically creates the bytes representation.
> I think something like this would be great for WSGI. Rather than focus
> on whether bytes *or* text should be used, use a higher level object
> that provides a bytes view, and (where possible/appropriate) a unicode
> view too.

This is what WebOb does; e.g., there is only a bytes version of a POST
body, and a view on that body that does decoding and encoding. If you
don't touch something, it is never decoded or encoded.

I only vaguely understand the specifics here, and I suspect the
specifics matter, but this seems applicable in this case too -- if you
have an incoming email with a smattering of bytes, inline (2047)
encoding, other encoding declarations, and then orthogonal systems like
quoted-printable, you don't want to touch that stuff if you don't need
to, as handling unicode objects implies you are normalizing the
content, and that might have subtle impacts you don't know about, or
don't want to know about, or maybe just doesn't fit into the unicode
model (like a string with two character sets).

Note that WebOb does not have two views; it has only one view --
unicode viewing bytes. I'm not sure I could keep two views straight. I
*think* Antoine is describing two possible canonical data types
(unicode or bytes) and two views. That sounds hard.
--
Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] [Catalog-sig] egg_info in PyPI
On Fri, Sep 17, 2010 at 10:02 PM, Jannis Leidel wrote: > On 17.09.2010, at 20:43, Martin v. Löwis wrote: > >> Here at the DZUG conference, we are planning to integrate explicit access to >> setuptools metadata to the package index. >> >> The current idea is to store the contents of the egg_info directory, >> giving remote access to specific files. By default, PyPI will extract, >> per release, data from the egg that may get uploaded (using the first >> one if multiple eggs get uploaded). If no egg gets uploaded, a VM >> based build service will generate it from a source distribution. >> Tools like setuptools or distribute could also directly upload this >> information, e.g. as part of the register command. >> >> Any opinions? > > I'm confused, wouldn't that basically be a slap in the face for the people > having worked on PEP 345 and distutils2, especially during the Summer of Code? > > Also, and I understand enthusiasm tends to build up during conferences, but > wouldn't supporting setuptools' egg-info directory again be a step backwards > after all those months of discussion about the direction of Python packaging? Yeah, we worked on a new standard that was accepted: PEP 345. As a matter of fact, PyPI is currently publishing PEP 345 info - I did the patch, and there's one package that already uses it: http://pypi.python.org/pypi/Distutils2 (no deps on this one, but other stuff like links...). I am about to release in distutils2 a first beta that includes all the work we did during GSoC. Now you want to publish another metadata format at PyPI? If PyPI takes that direction and adopts, promotes and publishes a standard that is not the one we worked on in the past year, it will be harder for us to push the new format so it is adopted by the tools, then by the community. People will just get confused because they will find two competing metadata formats. That's exactly the situation we were in before, and that's exactly where I don't want to go back. 
I don't even understand the benefit of doing this, since an egg_info directory is generated at *build* time and can differ from one machine to another; publishing it seems pretty useless to me. The whole point of PEP 345 is to extend our metadata to statically provide dependencies at PyPI, thanks to a micro-language that lets you describe dependencies for any platform. We worked hard to build some standards, but if PyPI doesn't help us here, everything we did and are doing is useless. Tarek -- Tarek Ziadé | http://ziade.org
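For concreteness, the kind of static, platform-conditional dependency PEP 345 makes possible looks roughly like this (a hypothetical PKG-INFO fragment; the distribution names are illustrative, but the `Requires-Dist` syntax with environment markers is the micro-language from the PEP):

```
Metadata-Version: 1.2
Name: example-dist
Version: 0.1
Requires-Dist: virtualenv; python_version < '3.0'
Requires-Dist: pywin32 (>=1.0); sys.platform == 'win32'
```

Because the markers are evaluated on the consumer's machine, the same published metadata describes the dependencies correctly on every platform -- which is exactly what a build-time egg_info snapshot cannot do.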
[Python-Dev] Some news from my sandbox project
Hi,
I'm still developing my sandbox project irregularly since last June. pysandbox
is a sandbox to execute untrusted Python code. It is able to execute unmodified
Python code with low overhead. I consider it stable and secure.
http://github.com/haypo/pysandbox/
Today, the biggest problem is the creation of a read-only view of the
__builtins__ dictionary. I tried to create my own object with the dict API,
but I quickly got a segfault. I realized that ceval.c is hardcoded to use
PyDict functions on __builtins__ (the LOAD_GLOBAL instruction). So I created a
subclass of dict and replaced the mutating methods (__setitem__, update, clear,
...).
I would like to know if you would agree to modify ceval.c (and maybe some other
functions) to support a __builtins__ of a type other than dict. I mean, add a
fast check (PyDict_CheckExact) on the type. If you agree, I will open an issue
with a patch.
The last two vulnerabilities came from this problem: it was possible to use
dict methods directly on __builtins__, e.g. dict.update(__builtins__, {...}) and
dict.__init__(__builtins__, {...}). Because of that, pysandbox removes all
dict methods able to modify a dict, and so "d={...}; d.update(...)" raises an
error (d has no update attribute) :-/
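A rough sketch of the dict-subclass approach described above (the class name is mine, not pysandbox's actual code). The overridden methods block normal mutation, but calling the unbound base-class method, e.g. dict.update(safe, ...), still bypasses them -- which is exactly the vulnerability mentioned above, and why both a PyDict_CheckExact-based hook in ceval.c and the removal of dict's mutating methods matter:

```python
class ReadOnlyBuiltins(dict):
    """A dict subclass whose mutating methods all raise."""

    def _blocked(self, *args, **kwargs):
        raise ValueError("read-only dictionary")

    __setitem__ = __delitem__ = _blocked
    clear = pop = popitem = setdefault = update = _blocked


safe = ReadOnlyBuiltins({"len": len, "abs": abs})
print(safe["len"]([1, 2, 3]))    # reading still works

try:
    safe["open"] = open          # normal mutation is blocked
except ValueError as exc:
    print(exc)

# The remaining hole: the unbound base-class method ignores the overrides.
dict.update(safe, {"open": open})
print("open" in safe)            # now True -- hence the extra hardening
```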
---
If you would like to test pysandbox, just join the ##fschfsch channel on the
Freenode IRC server and talk to fschfsch. It's an IRC bot using pysandbox to
evaluate Python expressions. It is also on the #python-fr and #python channels,
but please use ##fschfsch for tests.
http://github.com/haypo/pysandbox/wiki/fschfsch
Or you can run pysandbox on your own computer. Download the latest git version
(GitHub provides tarballs if you don't have the git program), install it and
run: python interpreter.py. You have to compile _sandbox, a C module required
to modify some Python internals.
The latest git version is compatible with Python 2.5, 2.6 and 2.7. It works on
3.1 and 3.2 after a conversion with 2to3 and a small hack in sandbox/proxy.py:
replace "elif isinstance(value, OBJECT_TYPES):" with "else:" (and remove the
existing else clause). I'm not sure that this hack is safe, so I haven't
committed it yet.
--
Victor Stinner
http://www.haypocalc.com/
Re: [Python-Dev] [Catalog-sig] egg_info in PyPI
On 2010-09-17, at 4:04 PM, Tarek Ziadé wrote: > I am not even understanding what's the benefit of doing this since an > egg_info directory is obtained at *build* time and can differ from a > machine to another, so it seems pretty useless for me to publish this. I am in full agreement with Tarek here. At ActiveState, we maintain our own index that differs from PyPI in two ways (among others): - use setuptools.package_index to scrape sdists for packages that don't upload them to PyPI - PKG-INFO and requires.txt are extracted from each sdist (generated with the egg_info command if they don't exist), and our index then provides the full metadata - with internal links to sdists - as a sqlite db for the builder processes on each platform. The problem with extracting PKG-INFO and requires.txt on the index server is that the contents of requires.txt sometimes differ based on the platform and Python version on which the egg_info command was run. E.g., the "tox" project depends[1] on the "virtualenv" package if it is run using Python 2, but not on Python 3. > The whole point of PEP 345 is to extend our metadata to statically > provide dependencies at PyPI, thanks to a micro-language that allows > you to describe dependencies for any platform. Static metadata would allow packages like "tox" to declare Python version / platform specific dependencies without resorting to runtime checks. > We worked hard to build some standards, but if PyPI doesn't help us > here, everything we did and are doing is useless. Ideally, in the future I should be able to query static metadata (with environment markers[2] and such) for *any* package from PyPI. And this static metadata is simply a DIST-INFO file (instead of being a directory with a bunch of files in it). I don't really see a point in providing access to, say, the list of entry points of each package. 
As far as package managers are concerned, the only things that matter are a) the list of package names and versions, b) a source tarball for each release, c) and the corresponding metadata with dependency information. -srid [1] http://code.google.com/p/pytox/source/browse/setup.py#30 [2] http://www.python.org/dev/peps/pep-0345/#environment-markers
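Why the markers must be evaluated on the consumer's machine rather than baked in at build time can be sketched with a toy evaluator for a small subset of the PEP 345 marker language. This is a simplified illustration, not the real implementation: it handles only a single `name op 'value'` comparison (no `and`/`or`), and it compares version strings lexically:

```python
import os
import sys


def marker_matches(marker):
    """Evaluate one "name op 'value'" PEP 345-style marker, e.g.
    "sys.platform == 'win32'", against the running interpreter."""
    env = {
        "python_version": "%d.%d" % sys.version_info[:2],
        "sys.platform": sys.platform,
        "os.name": os.name,
    }
    name, op, value = marker.split(None, 2)
    left, right = env[name], value.strip("'\"")
    return {
        "==": left == right, "!=": left != right,
        "<": left < right, "<=": left <= right,
        ">": left > right, ">=": left >= right,
    }[op]


# A builder process keeps only the dependencies whose marker matches
# its own platform and Python version (tox-style hypothetical deps).
deps = [
    ("virtualenv", "python_version < '3.0'"),
    ("pywin32", "sys.platform == 'win32'"),
]
wanted = [name for name, marker in deps if marker_matches(marker)]
print(wanted)
```

Since the same `deps` list yields a different `wanted` on every platform, publishing the list itself (the static metadata) is useful, while publishing one machine's `wanted` (a build-time requires.txt) is not.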
