Re: [Python-Dev] c/ElementTree XML serialisation
On 08/05/12 17:21, Alex Leach wrote:

> The w3c SVG specification / recommendation http://www.w3.org/TR/SVG/script.html allows for <script> and <style> tags, recommending to wrap the text node in a <![CDATA[ … ]]>.

The spec uses a CDATA section in the example, for demonstration purposes only. It's not a recommendation.

CDATA sections are of use for hand-authoring readability, but don't help in machine-serialised documents. You don't get away from the need to encode out-of-band sequences (notably ]]> is still invalid) so it doesn't buy you any simplicity.

> it's definitely a problem when generating SVG

No, not really. Neither XML nor SVG mandate use of CDATA sections here; a normal XML-encoded text node as produced by _serialize_xml is fine, and works with all XML processing tools.

HTML serialisation has custom rules (the two CDATA elements) because the HTML syntax is not XML. XML languages (including SVG and non-legacy served-as-XML XHTML) have no such special cases.

(There are other problems in ElementTree's serialiser that make the output unreflective of the infoset in certain cases, but not here.)

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/
gtalk:chat?jid=bobi...@gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
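The point about plain text nodes can be checked with a small sketch, assuming Python 3's xml.etree.ElementTree. The SVG fragment and the script body are made up for illustration; the script deliberately contains the "]]>" sequence that a CDATA section could not hold.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical script body containing every character that needs care,
# including the "]]>" sequence that cannot appear inside a CDATA section.
script_text = 'if (a < b && c > d) { s = "]]>"; }'

svg = ET.Element("{%s}svg" % SVG_NS)
script = ET.SubElement(svg, "{%s}script" % SVG_NS)
script.text = script_text

# The serialiser escapes the markup characters; no CDATA section is involved.
serialised = ET.tostring(svg, encoding="unicode")

# An XML parser recovers exactly the original text from the escaped form.
round_tripped = ET.fromstring(serialised)[0].text
```

The escaped output is less pleasant to read by hand than a CDATA section would be, but a consuming parser sees the same text either way.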
Re: [Python-Dev] Add a new locale codec?
On 2012-02-08 09:28, Simon Cross wrote:

> I think I'm -1 on a locale encoding because it refers to different actual encodings depending on where and when it's run, which seems surprising, and there's already a more explicit way to achieve the same effect.

I'd agree that this is undesirable, and I don't really want locale-specific behaviour to leak out in other places that accept an encoding name (eg <?xml encoding="locale"?>), but we already have this behaviour with the mbcs encoding on Windows, which refers to the locale-specific 'ANSI' code page.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/
gtalk:chat?jid=bobi...@doxdesk.com
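One reading of the "more explicit way" mentioned in the quote is to ask the locale module for the encoding and then use the concrete codec name. This is a rough sketch under that assumption, not necessarily what either poster had in mind; the sample string is arbitrary.

```python
import codecs
import locale

# Ask the platform which encoding the current locale implies. The answer
# varies between machines and sessions, which is the surprising part.
locale_encoding = locale.getpreferredencoding(False)

# Resolve it to a concrete codec and use that name explicitly from here on,
# so the locale dependence is visible at the call site.
codec_info = codecs.lookup(locale_encoding)
encoded = "plain ascii".encode(codec_info.name)
```

The mbcs codec behaves like an implicit version of this lookup, and it is only available on Windows.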
Re: [Python-Dev] Status of the fix for the hash collision vulnerability
On 2012-01-13 11:20, Lennart Regebro wrote:

> The vulnerability is basically only in the dictionary you keep the form data you get from a request.

I'd have to disagree with this statement. The vulnerability is anywhere that creates a dictionary (or set) from attacker-provided keys. That would include HTTP headers, RFC822-family subheaders and parameters, the environ, input taken from JSON or XML, and so on - and indeed hash collision attacks are not at all web-specific.

The problem with having two dict implementations is that a caller would have to tell libraries that use dictionaries which implementation to use. So for example an argument would have to be passed to json.load[s] to specify whether the input was known-sane or potentially hostile.

Any library that could ever use dictionaries to process untrusted input *or any library that used another library that did* would have to pass such a flag through, which would quickly get very unwieldy indeed... or else they'd have to just always use safedict, in which case we're in pretty much the same position as we are with changing dict anyway.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/
gtalk:chat?jid=bobi...@gmail.com
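To make the flag-threading problem concrete, here is a rough sketch. SafeDict, parse_config and load_request are hypothetical names invented for this example; SafeDict is an ordinary dict subclass standing in for a collision-resistant mapping. The only real API used is the object_pairs_hook parameter of json.loads.

```python
import json


class SafeDict(dict):
    """Hypothetical stand-in for a collision-resistant mapping.

    It behaves exactly like dict; only the type differs, so the example
    can show where the choice of implementation has to be made.
    """


def parse_config(text, mapping_type=dict):
    # A library one level up has to accept the choice and pass it down...
    return json.loads(text, object_pairs_hook=mapping_type)


def load_request(body, mapping_type=dict):
    # ...and so does every caller of that library.
    return parse_config(body, mapping_type=mapping_type)


untrusted = '{"user": {"name": "x"}}'
default_result = load_request(untrusted)
safe_result = load_request(untrusted, mapping_type=SafeDict)
```

Every layer between the untrusted input and the dictionary grows an extra parameter, and a layer that forgets to forward it silently falls back to the plain dict.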
Re: [Python-Dev] PEP 3333: wsgi_string() function
On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote:

> What is this horrible encoding bytes-as-unicode?

It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1 is the encoding specified by the HTTP RFC, as well as having the happy property of preserving every input byte.

> os.environ is supposed to be correctly decoded and contain valid unicode characters.

Nope. It is not possible to ‘correctly’ decode to unicode for os.environ because that decoding happens long before the web application gets a look in. Maybe the web application is using UTF-8, maybe it's using cp1252, but if we let the server/gateway decide and do that decoding before the application can do anything about it, we will get the wrong encoding in *many* cases and the result will be permanent, unrecoverable mangling of non-ASCII characters in submitted headers.

> If WSGI uses another encoding than the locale encoding (which is a bad idea),

It's an absolutely necessary idea. The locale encoding is nothing to do with the web application's encoding. Windows applications need to be able to use UTF-8 (which is never the ANSI code page), and web applications in general need to be deployable to any server without having to worry about the server's locale.

The locale-dependent status quo is that non-ASCII characters in URL paths and other HTTP headers don't work for Python apps. The recoding dances present in wsgiref's CGIHandler for 3.2 are distasteful but completely necessary to normalise differences in encodings used by various servers and platforms to generate their CGI environment.

> it should use os.environb and decodes keys and values using its own encoding.

Well yes, but:

(a) os.environb doesn't exist in Python 3.1 or earlier, making it impossible to implement WSGI before 3.2;

(b) there are also non-HTTP-related environment variables, which may contain native Unicode strings (eg, very commonly, Windows pathnames), so you have to have both environ *and* environb.

The bytes-or-bytes-in-Unicode argument is something that has been bounced around Web-SIG for literally *years*; this is what we ended up with. Although I personally like bytes, frankly, a re-run of this argument *again* whilst WSGI remains in perpetual stalemate does not appeal.

WSGI and wsgiref in Python 3.0-3.1 simply do not work at all. This has been an embarrassing situation for what is supposed to be a leading web language. Let's not perpetuate this sorry story to 3.2 as well.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com
skype:uknrbobince
gtalk:chat?jid=bobi...@gmail.com
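The bytes-as-unicode convention described above can be sketched as follows. The helper name wsgi_to_text is hypothetical, and UTF-8 as the application's encoding is an assumption made for the example.

```python
# Bytes a client might send for the path /café, after the server has
# percent-decoded the request line.
raw_path = "/caf\u00e9".encode("utf-8")

# What the server places in environ under this convention: a str produced
# by decoding those bytes as ISO-8859-1, whatever they really mean.
environ = {"PATH_INFO": raw_path.decode("iso-8859-1")}


def wsgi_to_text(value, encoding="utf-8"):
    """Recover real text from a bytes-as-unicode string (hypothetical helper).

    `encoding` is the application's own choice, assumed to be UTF-8 here.
    """
    return value.encode("iso-8859-1").decode(encoding)


path = wsgi_to_text(environ["PATH_INFO"])
```

The intermediate string looks mangled, but no information has been lost, so the application can still pick the encoding it knows to be right.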
Re: [Python-Dev] PEP 3333: wsgi_string() function
On Tue, 2011-01-04 at 03:44 +0100, Victor Stinner wrote:

> What is this horrible encoding bytes-as-unicode?

It is a unicode string decoded from bytes using ISO-8859-1. ISO-8859-1 is the encoding specified by the HTTP RFC, as well as having the happy property of preserving every input byte. The PEP requires it.

> os.environ is supposed to be correctly decoded and contain valid unicode characters.

It is not possible to ‘correctly’ decode to unicode for os.environ because that decoding happens long before the web application (the only party that knows what encoding should be in use) gets a look in. Maybe the web application is using UTF-8, maybe it's using cp1252, but if we let the server/gateway decide and do that decoding before the application can do anything about it, we will get the wrong encoding in *many* cases and the result will be permanent, unrecoverable mangling of non-ASCII characters in submitted headers.

> If WSGI uses another encoding than the locale encoding (which is a bad idea),

It's an absolutely necessary idea. The locale encoding is nothing to do with the web application's encoding. Windows applications need to be able to use UTF-8 (which is never the ANSI code page), and web applications in general need to be deployable to any server without having to worry about the server's locale.

The locale-dependent status quo is that non-ASCII characters in URL paths and other HTTP headers don't work for Python apps. The recoding dances present in wsgiref's CGIHandler for 3.2 are distasteful but completely necessary to normalise differences in encodings used by various servers and platforms to generate their CGI environment.

> it should use os.environb and decodes keys and values using its own encoding.

Well yes, but:

(a) os.environb doesn't exist in Python 3.1 or earlier, making it impossible to implement WSGI before 3.2;

(b) a byte environment on Windows would have to be encoded from the Unicode environment, with a server-specific encoding, and then what encoding are you going to choose for the variables that contain non-HTTP-sourced native Unicode strings (such as, very commonly, Windows pathnames)?

The bytes-or-bytes-in-Unicode argument is something that has been bounced around Web-SIG for literally *years*; this is what we ended up with. Although I personally like bytes, frankly, a re-run of this argument *again* whilst WSGI remains in perpetual stalemate does not appeal.

WSGI and wsgiref in Python 3.0-3.1 simply do not work. This has long been an embarrassing situation for what is supposed to be a leading web language. Let us not perpetuate this sorry story to 3.2 as well.

--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com
skype:uknrbobince
gtalk:chat?jid=bobi...@gmail.com
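The "preserving every input byte" property, and the mangling that follows from a wrong guess, can be checked directly. This is an illustrative sketch only; the header value and the choice of ASCII as the wrongly guessed encoding are assumptions.

```python
all_bytes = bytes(range(256))

# ISO-8859-1 maps each byte to the code point with the same number, so
# decoding never fails and re-encoding restores the input exactly.
as_text = all_bytes.decode("iso-8859-1")
restored = as_text.encode("iso-8859-1")

# A server that guesses a different encoding can lose information: here
# UTF-8 input is decoded as ASCII with replacement characters.
utf8_header = "na\u00efve".encode("utf-8")
mangled = utf8_header.decode("ascii", errors="replace")
```

Once the replacement characters are in the string, the original bytes cannot be reconstructed, which is the unrecoverable case described above.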
Re: [Python-Dev] minidom and DOM level 2
Jason Orendorff wrote:

> I don't suppose you'd be willing to update it for Python 2.5, would you?

Can do, but at this point I'm not aware of any work having been done on the issues listed there between the 2.3 and 2.5 releases.

The danger is people may be used to the wrong minidom behaviours, given they have been static for so long and are in many cases central to how minidom works.

--
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/
Re: [Python-Dev] minidom and DOM level 2
Jason Orendorff wrote:

> OK, I think this is worthwhile then. :) I'll read the spec and submit a patch.

You're planning to implement EntityReference in minidom? That'll be fun! :-) One of the nastier corners of DOM and XML in general.

> I'd be happy to do some diffing between the implementation, documentation, tests, and the Recommendation

I hacked up an experimental test harness for the W3 DOM Test Suite in order to test my own implementation; you might find it useful:

http://doxdesk.com/software/py/domts.html

The TS is far from definitive, but its tests for Level 1 and 2 are to the best of my knowledge accurate.

Incidentally minidom falls far short of passing even Level 1 Core for more reasons than omission of EntityReference. I noted the main known problems with it here:

http://pyxml.sourceforge.net/topics/compliance.html

Good luck!

--
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/
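For readers unfamiliar with the EntityReference omission, a small sketch shows how minidom handles an internal entity. It assumes a current Python 3 standard library, and the document and entity name are invented for the example.

```python
from xml.dom import Node, minidom

# Hypothetical document declaring one internal general entity.
source = '<!DOCTYPE r [<!ENTITY greeting "hello">]><r>&greeting; world</r>'

doc = minidom.parseString(source)
root = doc.documentElement

# Collect the node types and the text found directly under the root element.
child_types = [child.nodeType for child in root.childNodes]
text = "".join(
    child.data for child in root.childNodes
    if child.nodeType == Node.TEXT_NODE
)
```

The entity arrives already expanded into text, and no node of type ENTITY_REFERENCE_NODE appears in the tree.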
Re: [Python-Dev] Explicit Lexical Scoping (pre-PEP?)
Guido van Rossum [EMAIL PROTECTED] wrote:

> 1. do nothing
> 2. extend global's meaning
> 3. add outer keyword

2.5. extend global syntax to cover both [really global] and [innermost matching scope]. eg.

    global x, y outer    # trailing non-keyword
    global in x, y       # re-use keyword
    not global x         # ceci n'est pas un global
    ...                  # something less ugly?

> Personally it's not a burning need

Agreed. Inability to write as well as read nested scopes is more of an aesthetic wart than a practical one IMO.

--
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/
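The wart under discussion, being able to read but not rebind a name in an enclosing function scope, can be sketched like this. The counter functions are invented for illustration, and the mutable-container trick is the usual workaround rather than anything proposed in the thread.

```python
def make_counter_broken():
    count = 0

    def increment():
        # The assignment makes `count` local to increment(), so reading
        # it first raises UnboundLocalError.
        count += 1
        return count

    return increment


def make_counter_workaround():
    # Common workaround: mutate a container instead of rebinding a name.
    count = [0]

    def increment():
        count[0] += 1
        return count[0]

    return increment


try:
    make_counter_broken()()
    rebinding_failed = False
except UnboundLocalError:
    rebinding_failed = True

counter = make_counter_workaround()
values = [counter(), counter(), counter()]
```

Python 3 later added the nonlocal statement, which allows the rebinding directly.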
Re: [Python-Dev] New-style icons, .desktop file
Morning!

I've done some tweaks to the previously-posted-about icon set, taking note of some of the comments here and on -list.

In particular, amongst more minor changes:

- added egg icon (based on zip)

- flipped pycon to work better with shortcut arrow

- emphasised borders of 32x32 version of pycon, and changed text colour, in order to distinguish more from pyc. On balance I think this is enough of a change, though Nick Coghlan's idea of just using the plus in this case also has merit - indeed this is what happens at 16x16

- built .ico files without the Windows Vista enormo-icons. If the icons *were* to go in the win32 distribution, it probably wouldn't make sense to spend the considerably larger filesize on Vista icons until such time as people are actually using Vista.

- included PNG and SVG versions of icons. The SVG unfortunately doesn't preserve all the effects of the original Xara files, partly because a few effects (feathering, bevels) can't be done in SVG 1.1, and partly because the current conversion process is bobbins (but we're working on that!). So the nice gradients and things are gone, but it should still be of use as a base for anyone wanting to hack on the icons.

- excised file() in favour of open() ;-)

Files and preview here:

http://doxdesk.com/img/software/py/icons2.zip
http://doxdesk.com/img/software/py/icons2.png

Oh, and is the intention to deprecate the purple/green horizontal snake logo previously used for 'Python for Windows' (as well as PSF)? If so, Erik van B's installer graphic could probably do with a quick refresh.

cheers,

--
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/