Re: [Python-Dev] Improved evaluator added to ast module

2012-10-18 Thread Daniel Holth
On Thu, Oct 11, 2012 at 1:36 PM, Vinay Sajip  wrote:
> Daniel Holth  gmail.com> writes:
>
>> How does this compare to the markerlib approach? In markerlib you just
>> make sure all the AST nodes are in a set of allowed nodes, currently
>> (Compare, BoolOp, Attribute, Name, Load, Str, cmpop, boolop), and then
>> use the normal eval(). Is one way more secure / fast / flexible than
>> the other?
>
> I don't think performance is an issue, and the markerlib approach seems just
> as reasonable as the one I've taken, except that it calls eval(), whereas my
> approach doesn't. It boils down to what should be allowed in expressions, and
> what shouldn't be.
>
> ISTM there is a space for a limited evaluator that's less limiting than
> literal_eval(). I do realise that this type of sandboxing is not easy to 
> achieve,
> and I'm not aiming to advance the state of the art here - I just want to close
> the issue in the best way I can.

I bet the literal_eval approach simply predates compile(ast) which is
a Python 2.6 feature. It is also probably slightly faster on CPython
to avoid compile(ast) if you are only evaluating the code once.
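
For reference, the markerlib-style check described above can be sketched roughly as follows. This is only an illustration, not markerlib's actual code: `ALLOWED_NODES` and `restricted_eval` are hypothetical names, and `ast.Constant` stands in for the `Str` node on current Python versions. Whitelisting nodes like this should not be read as a complete sandbox.

```python
import ast

# Node types permitted in an expression (hypothetical whitelist for this sketch).
ALLOWED_NODES = (
    ast.Expression, ast.Compare, ast.BoolOp, ast.Attribute,
    ast.Name, ast.Load, ast.Constant, ast.cmpop, ast.boolop,
)


def restricted_eval(source, names):
    """Parse source, reject any node outside the whitelist, then eval()."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed node: %s" % type(node).__name__)
    code = compile(tree, "<marker>", "eval")
    # Builtins are emptied so only the supplied names are visible.
    return eval(code, {"__builtins__": {}}, dict(names))
```

A call such as `restricted_eval("python_version == '2.7'", {"python_version": "2.7"})` evaluates normally, while an expression containing a function call is rejected before `eval()` runs.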
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Improved evaluator added to ast module

2012-10-18 Thread Georg Brandl
On 10/18/2012 03:16 PM, Daniel Holth wrote:
> On Thu, Oct 11, 2012 at 1:36 PM, Vinay Sajip  wrote:
>> Daniel Holth  gmail.com> writes:
>>
>>> How does this compare to the markerlib approach? In markerlib you just
>>> make sure all the AST nodes are in a set of allowed nodes, currently
>>> (Compare, BoolOp, Attribute, Name, Load, Str, cmpop, boolop), and then
>>> use the normal eval(). Is one way more secure / fast / flexible than
>>> the other?
>>
>> I don't think performance is an issue, and the markerlib approach seems just
>> as reasonable as the one I've taken, except that it calls eval(), whereas my
>> approach doesn't. It boils down to what should be allowed in expressions, and
>> what shouldn't be.
>>
>> ISTM there is a space for a limited evaluator that's less limiting than
>> literal_eval(). I do realise that this type of sandboxing is not easy to 
>> achieve,
>> and I'm not aiming to advance the state of the art here - I just want to 
>> close
>> the issue in the best way I can.
> 
> I bet the literal_eval approach simply predates compile(ast) which is
> a Python 2.6 feature.

Nope. All of ast (in contrast to _ast) is new in 2.6.

> It is also probably slightly faster on CPython
> to avoid compile(ast) if you are only evaluating the code once.

Yes; if you inspect the nodes anyway you can just as well evaluate them
on the way.
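
A rough sketch of what "evaluate them on the way" could look like, without calling eval() at all. The names `evaluate` and `_COMPARE` are made up for this example, only a handful of node types are handled, and boolean operators return plain True/False rather than the last operand.

```python
import ast
import operator

# Comparison operators this sketch is willing to evaluate.
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.In: lambda a, b: a in b,
    ast.NotIn: lambda a, b: a not in b,
}


def evaluate(node, names):
    """Walk the tree and compute the value directly, rejecting unknown nodes."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, names)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in names:
            raise ValueError("unknown name: %s" % node.id)
        return names[node.id]
    if isinstance(node, ast.BoolOp):
        values = (evaluate(v, names) for v in node.values)
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = evaluate(node.left, names)
        for op, right_node in zip(node.ops, node.comparators):
            func = _COMPARE.get(type(op))
            if func is None:
                raise ValueError("disallowed operator: %s" % type(op).__name__)
            right = evaluate(right_node, names)
            if not func(left, right):
                return False
            left = right  # chained comparison: a < b < c
        return True
    raise ValueError("disallowed node: %s" % type(node).__name__)
```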

Georg



[Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
I'd like to submit the Wheel PEPs 425 (filename metadata), 426
(Metadata 1.3), and 427 (wheel itself) for acceptance. The format has
been stable since May and we are preparing a patch to support it in
pip, but we need to earn consensus before including it in the most
widely used installer.

Wheel is a binary packaging format that is mostly based on the
phenomenal work done by 'packaging' as PEP 376 et al. The resulting
packages solve packaging problems by being installable without
setup.py or any variation of distutils. For example, lxml can be installed
in 0.7 seconds, and a simple installer is just "unzip" inside site-packages.
Wheel installers know about the major version number of the spec
itself, and will refuse to install version 2.0 wheels if they do not
understand them, so the format can be changed down the line.

Let me know what I need to do to get it accepted, if anything needs to
be added or revised, or, preferably, that it is awesome and you want
to use it ASAP.

Thanks,

Daniel Holth


[Python-Dev] PEP 427 comment: code signing

2012-10-18 Thread martin

I'm -1 on the usage of ed25519 in PEP 427. While the PEP proposes to use JSON
Web signatures, this algorithm is not supported by the current JWS draft [1].

Instead, I suggest to use the ES256 algorithm from JWS, i.e. ECDSA with the
NIST P-256 curve and SHA-256. This has the advantage of using standard
algorithms [2].

I don't know what the rationale for suggesting ed25519 is; I suppose that
existence of a pure-Python implementation played a role. However:
- ECDSA also has a pure-Python implementation
- ECDSA is well-supported by OpenSSL, i.e. a signature generator may also
  invoke the OpenSSL command line for efficient implementation. I believe
  M2Crypto also exposes enough of OpenSSL to perform ECDSA signing and
  verification.

I'm -0 on the use of JWS; I would prefer a signature format that is already
an established internet standard (such as PGP or S/MIME). However, it does look
that this may become a proper internet standard in the near future, so it's
an ok choice.

If it really must be ed25519, I request that this is registered with IANA
once the PEP is accepted, the RFC is accepted, and the JWS algorithm
registry is open.

Regards,
Martin

[1] http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-06
[2] http://tools.ietf.org/html/draft-ietf-jose-json-web-algorithms-06



Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Benjamin Peterson
2012/10/18 Daniel Holth :
> Let me know what I need to do to get it accepted, if anything needs to
> be added or revised, or, preferably, that it is awesome and you want
> to use it ASAP.

Traditionally, you send the peps to python-dev, so people can bikeshed inline.

-- 
Regards,
Benjamin


[Python-Dev] PEP 425 comment: package names

2012-10-18 Thread martin
ISTM that some important information and some elaboration is missing
from PEP 425.

The PEP is currently silent on how exactly these "compatibility tags"
are combined with the package name, and the file extension. This should
be specified, and preferably some examples be given.

Regards,
Martin





[Python-Dev] PEP 426 comment: field order

2012-10-18 Thread martin
I'd like to request that PEP 426 is extended to talk about the order
of fields. In particular, for the Extension field, is it necessary that
all "additional tags" follow immediately the respective Extension field?

I also request that RFC 2119 terminology is followed strictly. In particular,
the phrasing "Additional tags defined by the extension should be of
the form string/Name:" is unclear - under what "particular circumstances"
can I deviate from that requirement, i.e. use some form other than
string/Name?

Regards,
Martin




Re: [Python-Dev] PEP 426 comment: field order

2012-10-18 Thread Daniel Holth
Will add that the order is not significant. It is essentially a multidict.

On Thu, Oct 18, 2012 at 2:45 PM,   wrote:
> I'd like to request that PEP 426 is extended to talk about the order of
> fields.
> In particular, for the Extension field, is it necessary that all "additional
> tags"
> follow immediately the respective Extension field?
>
> I also request that RFC 2119 terminology is followed strictly. In
> particular,
> the phrasing "Additional tags defined by the extension should be of the form
> string/Name:"
> is unclear - under what "particular circumstances" can I deviate from that
> requirement, i.e. use some form other than string/Name?
>
> Regards,
> Martin
>
>


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Antoine Pitrou
On Thu, 18 Oct 2012 14:35:19 -0400
Benjamin Peterson  wrote:
> 2012/10/18 Daniel Holth :
> > Let me know what I need to do to get it accepted, if anything needs to
> > be added or revised, or, preferably, that it is awesome and you want
> > to use it ASAP.
> 
> Traditionally, you send the peps to python-dev, so people can bikeshed inline.

Or at least send the URLs, it's helpful.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 425 comment: package names

2012-10-18 Thread Daniel Holth
On Thu, Oct 18, 2012 at 2:36 PM,   wrote:
> ISTM that some important information and some elaboration is missing from
> PEP 425.
>
> The PEP is currently silent on how exactly these "compatibility tags" are
> combined
> with the package name, and the file extension. This should be specified, and
> preferably
> some examples be given.

Wheel specifies how it uses the tags. You have to strip the known
extension from the filename. I can include its example "this is how a
particular file format uses the tags" in the pep. It is
name-version-tag-tag-tag.ext with all - folded to _
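
As a small sketch of that naming rule (the helper names `fold` and `tagged_filename` are invented for this example, and folding is applied only to the name and version, on the assumption that the tags themselves never contain a hyphen):

```python
def fold(component):
    """Fold every '-' inside a filename component to '_'."""
    return component.replace("-", "_")


def tagged_filename(name, version, python_tag, abi_tag, platform_tag, ext="whl"):
    """Join name, version and the three tags with '-' and append the extension."""
    parts = [fold(name), fold(version), python_tag, abi_tag, platform_tag]
    return "-".join(parts) + "." + ext
```

With these assumptions, `tagged_filename("my-package", "1.0", "py27", "none", "any")` gives `my_package-1.0-py27-none-any.whl`, so splitting on `-` after stripping the extension recovers the five fields.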


Re: [Python-Dev] PEP 425 comment: package names

2012-10-18 Thread martin


Quoting Daniel Holth :

> On Thu, Oct 18, 2012 at 2:36 PM,   wrote:
>> ISTM that some important information and some elaboration is missing from
>> PEP 425.
>>
>> The PEP is currently silent on how exactly these "compatibility tags" are
>> combined
>> with the package name, and the file extension. This should be specified, and
>> preferably
>> some examples be given.
>
> Wheel specifies how it uses the tags. You have to strip the known
> extension from the filename.


Hmm. The word "extension" doesn't even appear in connection with
file names in the PEP (only in relation to "C extensions").

Does the PEP, or does it not, specify that a dash must be used between
the package-name-and-version? Neither the words "hyphen" nor "dash"
appear in the PEP, except that hyphens are used inside the tag, and that
hyphens and dots in get_platform() results are replaced.

Regards,
Martin




Re: [Python-Dev] PEP 427 comment: code signing

2012-10-18 Thread Daniel Holth
On Thu, Oct 18, 2012 at 2:21 PM,   wrote:
> I'm -1 on the usage of ed25519 in PEP 427. While the PEP proposes to use
> JSON
> Web signatures, this algorithm is not supported by the current JWS draft
> [1].
>
> Instead, I suggest to use the ES256 algorithm from JWS, i.e. ECDSA with the
> NIST P-256 curve and SHA-256. This has the advantage of using standard
> algorithms [2].
>
> I don't know what the rationale for suggesting ed25519 is; I suppose that
> existence of a pure-Python implementation played a role. However:
> - ECDSA also has a pure-Python implementation
> - ECDSA is well-supported by OpenSSL, i.e. a signature generator may also
>   invoke the OpenSSL command line for efficient implementation. I believe
>   M2Crypto also exposes enough of OpenSSL to perform ECDSA signing and
>   verification.
>
> I'm -0 on the use of JWS; I would prefer a signature format that is already
> an established internet standard (such as PGP or S/MIME). However, it does
> look
> that this may become a proper internet standard in the near future, so it's
> an ok choice.
>
> If it really must be ed25519, I request that this is registered with IANA
> once the PEP is accepted, the RFC is accepted, and the JWS algorithm
> registry is open.

I expected ed25519 to be somewhat controversial. I will register it
with IANA when possible. I chose it because ed25519 is fast enough
that you need never consider dis-using it for performance reasons. The
wheel reference implementation includes a reasonably performant
~250-line pure-Python version*. Unlike ECDSA, signature generation
does not consume entropy; this feature of ECDSA broke the Playstation
3's code signing system.
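
To make the entropy point concrete, here is a toy sketch of why reusing an ECDSA nonce is fatal. It uses only the standard signing equation s = k^-1 (z + r*d) mod n. Everything else is a simplification for illustration: a prime modulus `N` stands in for the curve group order, `r` is an arbitrary number rather than a curve point coordinate, and the function names are made up.

```python
# Toy arithmetic only; no real elliptic curve is involved.
N = 2**127 - 1  # a Mersenne prime standing in for the group order


def inv(x):
    """Modular inverse modulo N (three-argument pow, Python 3.8+)."""
    return pow(x % N, -1, N)


def toy_sign(d, k, z, r):
    """ECDSA-style s value for private key d, nonce k, message hash z."""
    return inv(k) * (z + r * d) % N


def recover_key(z1, s1, z2, s2, r):
    """Recover d from two signatures that share the same nonce k (same r)."""
    k = (z1 - z2) * inv(s1 - s2) % N
    return (s1 * k - z1) * inv(r) % N
```

Two messages signed with the same `k` give two equations in the two unknowns `k` and `d`, which is all `recover_key` solves. A scheme that derives its nonce deterministically from the key and message avoids depending on a random source at signing time.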

JWS is likewise tiny to implement, so the wheel reference installer
always can and does verify the internal consistency of every signed
wheel.

S/MIME signatures are allowed as a courtesy to a government contractor
friend but are not implemented.

* https://bitbucket.org/dholth/wheel/src/tip/wheel/signatures
* http://ed25519.cr.yp.to


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
On Thu, Oct 18, 2012 at 3:10 PM, Antoine Pitrou  wrote:
> On Thu, 18 Oct 2012 14:35:19 -0400
> Benjamin Peterson  wrote:
>> 2012/10/18 Daniel Holth :
>> > Let me know what I need to do to get it accepted, if anything needs to
>> > be added or revised, or, preferably, that it is awesome and you want
>> > to use it ASAP.
>>
>> Traditionally, you send the peps to python-dev, so people can bikeshed 
>> inline.
>
> Or at least send the URLs, it's helpful.

The texts seemed a bit long for direct inclusion.

http://www.python.org/dev/peps/pep-0425/
http://www.python.org/dev/peps/pep-0426/
http://www.python.org/dev/peps/pep-0427/


Re: [Python-Dev] PEP 425 comment: package names

2012-10-18 Thread Daniel Holth
On Thu, Oct 18, 2012 at 3:23 PM,   wrote:
>
> Quoting Daniel Holth :
>
>
>> On Thu, Oct 18, 2012 at 2:36 PM,   wrote:
>>>
>>> ISTM that some important information and some elaboration is missing from
>>> PEP 425.
>>>
>>> The PEP is currently silent on how exactly these "compatibility tags" are
>>> combined
>>> with the package name, and the file extension. This should be specified,
>>> and
>>> preferably
>>> some examples be given.
>>
>>
>> Wheel specifies how it uses the tags. You have to strip the known
>> extension from the filename.
>
>
> Hmm. The word "extension" doesn't even appear in connection with
> file names in the PEP (only in relation to "C extensions").
>
> Does the PEP, or does it not, specify that a dash must be used between
> the package-name-and-version? Neither the words "hyphen" nor "dash"
> appear in the PEP, except that hyphens are used inside the tag, and that
> hyphens and dots in get_platform() results are replaced.

The pep was only meant to specify the tag format which is a-b-c or
a.z-b-c , and some typical values of those tags. I agree that it is
confusing not to include an example of a tag being used in the context
of an actual file name.


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Benjamin Peterson
2012/10/18 Daniel Holth :
> On Thu, Oct 18, 2012 at 3:10 PM, Antoine Pitrou  wrote:
>> On Thu, 18 Oct 2012 14:35:19 -0400
>> Benjamin Peterson  wrote:
>>> 2012/10/18 Daniel Holth :
>>> > Let me know what I need to do to get it accepted, if anything needs to
>>> > be added or revised, or, preferably, that it is awesome and you want
>>> > to use it ASAP.
>>>
>>> Traditionally, you send the peps to python-dev, so people can bikeshed 
>>> inline.
>>
>> Or at least send the URLs, it's helpful.
>
> The texts seemed a bit long for direct inclusion.

Those are pretty short as PEPs go. :)


-- 
Regards,
Benjamin


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
PEP: 425
Title: Compatibility Tags for Built Distributions
Version: $Revision$
Last-Modified: 07-Aug-2012
Author: Daniel Holth 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jul-2012
Python-Version: 3.4
Post-History: 8-Aug-2012


Abstract
========

This PEP specifies a tagging system to indicate with which versions of
Python a built or binary distribution is compatible.  A set of three
tags indicate which Python implementation and language version, ABI,
and platform a built distribution requires.  The tags are terse because
they will be included in filenames.


PEP Editor's Note
=================

While the naming scheme described in this PEP will not be supported directly
in the standard library until Python 3.4 at the earliest, draft
implementations may be made available in third party projects.


Rationale
=========

Today "python setup.py bdist" generates the same filename on PyPy
and CPython, but an incompatible archive, making it inconvenient to
share built distributions in the same folder or index.  Instead, built
distributions should have a file naming convention that includes enough
information to decide whether or not a particular archive is compatible
with a particular implementation.

Previous efforts come from a time where CPython was the only important
implementation and the ABI was the same as the Python language release.
This specification improves upon the older schemes by including the Python
implementation, language version, ABI, and platform as a set of tags.

By comparing the tags it supports with the tags listed by the
distribution, an installer can make an educated decision about whether
to download a particular built distribution without having to read its
full metadata.

Overview
========

The tag format is {python tag}-{abi tag}-{platform tag}

python tag
    ‘py27’, ‘cp33’
abi tag
    ‘cp32dmu’, ‘none’
platform tag
    ‘linux_x86_64’, ‘any’

For example, the tag py27-none-any indicates compatible with Python 2.7
(any Python 2.7 implementation) with no abi requirement, on any platform.

Details
=======

Python Tag
----------

The Python tag indicates the implementation and version required by
a distribution.  Major implementations have abbreviated codes, initially:

* py: Generic Python (does not require implementation-specific features)
* cp: CPython
* ip: IronPython
* pp: PyPy
* jy: Jython

Other Python implementations should use `sys.implementation.name`.

The version is `py_version_nodot`.  CPython gets away with no dot,
but if one is needed the underscore `_` is used instead.  PyPy should
probably use its own versions here `pp18`, `pp19`.

The version can be just the major version, `2` or `3`, giving `py2` or
`py3`, for many pure-Python distributions.

Importantly, major-version-only tags like `py2` and `py3` are not
shorthand for `py20` and `py30`.  Instead, these tags mean the packager
intentionally released a cross-version-compatible distribution.

A single-source Python 2/3 compatible distribution can use the compound
tag `py2.py3`.  See `Compressed Tag Sets`, below.
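
A compound tag such as `py2.py3-none-any` can be expanded into the individual tags it stands for. The sketch below assumes that each of the three fields may hold dot-separated alternatives and that the set is the product of those alternatives; `expand_tag` is a hypothetical helper name.

```python
import itertools


def expand_tag(tag):
    """Expand a possibly compound tag into the list of simple tags it covers."""
    python, abi, platform = tag.split("-")
    combos = itertools.product(
        python.split("."), abi.split("."), platform.split(".")
    )
    return ["-".join(combo) for combo in combos]
```

For instance, `expand_tag("py2.py3-none-any")` yields `py2-none-any` and `py3-none-any`, while a simple tag expands to a one-element list.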

ABI Tag
-------

The ABI tag indicates which Python ABI is required by any included
extension modules.  For implementation-specific ABIs, the implementation
is abbreviated in the same way as the Python Tag, e.g. `cp33d` would be
the CPython 3.3 ABI with debugging.

The CPython stable ABI is `abi3` as in the shared library suffix, and
is available starting with Python 3.2.

Implementations with a very unstable ABI may use the first 6 bytes (as
8 base64-encoded characters) of the SHA-256 hash of their source code
revision and compiler flags, etc, but will probably not have a great need
to distribute binary distributions. Each implementation's community may
decide how to best use the ABI tag.

Platform Tag
------------

The platform tag is simply `distutils.util.get_platform()` with all
hyphens `-` and periods `.` replaced with underscore `_`.

* win32
* linux_i386
* linux_x86_64
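
The substitution can be sketched in a couple of lines. Because `distutils` is not available on every current Python, this example uses `sysconfig.get_platform()` as a stand-in for `distutils.util.get_platform()`; that swap and the helper name `platform_tag` are assumptions of the sketch.

```python
import sysconfig


def platform_tag(platform=None):
    """Return the platform string with '-' and '.' replaced by '_'."""
    if platform is None:
        platform = sysconfig.get_platform()
    return platform.replace("-", "_").replace(".", "_")
```

Given `"linux-x86_64"` this returns `linux_x86_64`, matching the list above.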

Use
===

The tags are used by installers to decide which built distribution
(if any) to download from a list of potential built distributions.
The installer maintains a list of (pyver, abi, arch) tuples that it
will support.  If the built distribution's tag is `in` the list, then
it can be installed.

For example, an installer running under CPython 3.3 on a linux_x86_64
system might support::

 1. cp33-cp33m-linux_x86_64
 2. cp33-none-linux_x86_64
 3. cp3-abi3-linux_x86_64
 4. cp33-none-any
 5. cp3-none-any
 6. py33-none-any
 7. py3-none-any

A user could instruct their installer to fall back to building from an
sdist more or less often by configuring this list of tags.
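
A minimal sketch of that selection step, using the example list above. `SUPPORTED_TAGS` and `best_candidate` are hypothetical names, and preferring the earliest entry in the list is how this sketch ranks several usable candidates.

```python
# Tags an installer on CPython 3.3 / linux_x86_64 might accept, best first.
SUPPORTED_TAGS = [
    "cp33-cp33m-linux_x86_64",
    "cp33-none-linux_x86_64",
    "cp3-abi3-linux_x86_64",
    "cp33-none-any",
    "cp3-none-any",
    "py33-none-any",
    "py3-none-any",
]


def best_candidate(candidate_tags, supported=SUPPORTED_TAGS):
    """Return the candidate tag that appears earliest in supported, or None."""
    rank = {tag: index for index, tag in enumerate(supported)}
    usable = [tag for tag in candidate_tags if tag in rank]
    if not usable:
        return None  # nothing installable; fall back to building from sdist
    return min(usable, key=rank.__getitem__)
```

Removing entries from the list makes the installer fall back to an sdist more often; adding entries makes it accept more built distributions.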

Rarely there will be more than one supported built distribution for a
particular version of a package.  For example, a packager could release
a package tagged `cp3-abi3-linux_x86_64` that contains an optional C
extension and the same distribution tagged `py3-none-any` that does not.
The index of the tag in the supported tags list breaks the tie, and the

Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
PEP: 426
Title: Metadata for Python Software Packages 1.3
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Holth 
Discussions-To: Distutils SIG
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30 Aug 2012


Abstract
========

This PEP describes a mechanism for adding metadata to Python distributions.
It includes specifics of the field names, and their semantics and
usage.

This document specifies version 1.3 of the metadata format.
Version 1.0 is specified in PEP 241.
Version 1.1 is specified in PEP 314.
Version 1.2 is specified in PEP 345.

Version 1.3 of the metadata format adds fields designed to make
third-party packaging of Python Software easier and defines a
formal extension mechanism.  The fields are "Setup-Requires-Dist",
"Provides-Extra", and "Extension".  This version also adds the `extra`
variable to the `environment markers` specification.

Metadata Files
==============

The syntax defined in this PEP is for use with Python distribution
metadata files. The file format is a simple UTF-8 encoded Key: value
format with no maximum line length. It is parsable by the ``email``
module with an appropriate ``email.policy.Policy()``.  The field names
listed in the `Fields`_ section are used as the header names.
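
As a rough illustration of reading such a file with the ``email`` package, the sketch below collects every header into a dict of lists, so repeated fields are kept. It uses ``HeaderParser`` with its default policy rather than a custom ``email.policy.Policy()``; that choice, the sample text, and the name ``parse_metadata`` are assumptions of this example.

```python
from email.parser import HeaderParser

SAMPLE = (
    "Metadata-Version: 1.3\n"
    "Name: BeagleVote\n"
    "Version: 1.0a2\n"
    "Platform: ObscureUnix\n"
    "Platform: RareDOS\n"
    "Summary: A module for collecting votes from beagles.\n"
)


def parse_metadata(text):
    """Return {field name: [values...]} preserving multiple-use fields."""
    message = HeaderParser().parsestr(text)
    fields = {}
    for key, value in message.items():
        fields.setdefault(key, []).append(value)
    return fields
```

Here ``parse_metadata(SAMPLE)["Platform"]`` holds both platform values, while single-use fields map to one-element lists.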

There are two standard locations for these metadata files:

* the ``PKG-INFO`` file included in the base directory of Python
  source distribution archives (as created by the distutils ``sdist``
  command)
* the ``dist-info/METADATA`` files in a Python installation database, as
  described in PEP 376.

Other tools involved in Python distribution may choose to record this
metadata in additional tool-specific locations (e.g. as part of a
binary distribution archive format).

Encoding
========

Metadata 1.3 files are UTF-8 with the restriction that keys must be
ASCII. Parser implementations should be aware that older versions of
the Metadata specification do not specify an encoding.

Fields
======

This section specifies the names and semantics of each of the
supported metadata fields.

Fields marked with "(Multiple use)" may be specified multiple
times in a single metadata file.  Other fields may only occur
once in a metadata file.  Fields marked with "(optional)" are
not required to appear in a valid metadata file; all other
fields must be present.

Metadata-Version
::::::::::::::::

Version of the file format; "1.3" is the only legal value.

Example::

    Metadata-Version: 1.3


Name
::::

The name of the distribution.

Example::

    Name: BeagleVote


Version
:::::::

A string containing the distribution's version number.  This
field must be in the format specified in PEP 386.

Example::

    Version: 1.0a2


Platform (multiple use)
:::::::::::::::::::::::

A Platform specification describing an operating system supported by
the distribution which is not listed in the "Operating System" Trove
classifiers.
See "Classifier" below.

Examples::

    Platform: ObscureUnix
    Platform: RareDOS


Supported-Platform (multiple use)
:::::::::::::::::::::::::::::::::

Binary distributions containing a metadata file will use the
Supported-Platform field in their metadata to specify the OS and
CPU for which the binary distribution was compiled.  The semantics of
the Supported-Platform field are not specified in this PEP.

Example::

    Supported-Platform: RedHat 7.2
    Supported-Platform: i386-win32-2791


Summary
:::::::

A one-line summary of what the distribution does.

Example::

    Summary: A module for collecting votes from beagles.


Description (optional)
::::::::::::::::::::::

A longer description of the distribution that can run to several
paragraphs.  Software that deals with metadata should not assume
any maximum size for this field, though people shouldn't include
their instruction manual as the description.

The contents of this field can be written using reStructuredText
markup [1]_.  For programs that work with the metadata, supporting
markup is optional; programs can also display the contents of the
field as-is.  This means that authors should be conservative in
the markup they use.

To support empty lines and lines with indentation with respect to
the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
followed by a pipe ("|") char. As a result, the Description field is
encoded into a folded field that can be interpreted by RFC822
parser [2]_.

Example::

    Description: This project provides powerful math functions
           |For example, you can use `sum()` to sum numbers:
           |
           |Example::
           |
           |    >>> sum(1, 2)
           |    3
           |

This encoding implies that any occurrences of a CRLF followed by 7 spaces
and a pipe char have to be replaced by a single CRLF when the field is unfolded
using a RFC822 reader.
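
The folding rule can be sketched as a pair of string substitutions. This example works on text with plain ``\n`` line endings instead of CRLF, which is a simplification; ``fold_description`` and ``unfold_description`` are hypothetical names.

```python
# Continuation marker: 7 spaces followed by a pipe character.
MARKER = " " * 7 + "|"


def fold_description(text, newline="\n"):
    """Suffix every line break with the marker so the field stays folded."""
    return text.replace(newline, newline + MARKER)


def unfold_description(text, newline="\n"):
    """Replace each line break plus marker with a single line break."""
    return text.replace(newline + MARKER, newline)
```

Folding and then unfolding returns the original text, including empty lines and indentation, as long as the text did not already contain the marker sequence after a line break.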


Keywords (optional)
:::::::::::::::::::

A list of additional keywords to be used to assist searching
for the distribution in a larger catalog.

Example::

    Keywords: dog pup

Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
PEP: 427
Title: The Wheel Binary Package Format 0.1
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Holth 
Discussions-To: 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Sep-2012
Post-History:


Abstract
========

This PEP describes a built-package format for Python called "wheel".

A wheel is a ZIP-format archive with a specially formatted file name
and the ``.whl`` extension.  It contains a single distribution nearly
as it would be installed according to PEP 376 with a particular
installation scheme.  A wheel file may be installed by simply
unpacking into site-packages with the standard 'unzip' tool, while
preserving enough information to spread its contents out onto their
final paths at any later time.


Note
====

This draft PEP describes version 0.1 of the "wheel" format. When the PEP
is accepted, the version will be changed to 1.0.  (The major version
is used to indicate potentially backwards-incompatible changes to the
format.)


Rationale
=========

Python needs a package format that is easier to install than sdist.
Python's sdist packages are defined by and require the distutils and
setuptools build systems, running arbitrary code to build-and-install,
and re-compile, code just so it can be installed into a new
virtualenv.  This system of conflating build-install is slow, hard to
maintain, and hinders innovation in both build systems and installers.

Wheel attempts to remedy these problems by providing a simpler
interface between the build system and the installer.  The wheel
binary package format frees installers from having to know about the
build system, saves time by amortizing compile time over many
installations, and removes the need to install a build system in the
target environment.


Details
=======

Installing a wheel 'distribution-1.0-py32-none-any.whl'
-------------------------------------------------------

Wheel installation notionally consists of two phases:

- Unpack.

  a. Parse ``distribution-1.0.dist-info/WHEEL``.
  b. Check that installer is compatible with Wheel-Version.  Warn if
     minor version is greater, abort if major version is greater.
  c. If Root-Is-Purelib == 'true', unpack archive into purelib
     (site-packages).
  d. Else unpack archive into platlib (site-packages).

- Spread.

  a. Unpacked archive includes ``distribution-1.0.dist-info/`` and (if
     there is data) ``distribution-1.0.data/``.
  b. Move each subtree of ``distribution-1.0.data/`` onto its
     destination path. Each subdirectory of ``distribution-1.0.data/``
     is a key into a dict of destination directories, such as
     ``distribution-1.0.data/(purelib|platlib|headers|scripts|data)``.
     The initially supported paths are taken from
     ``distutils.command.install``.
  c. If applicable, update scripts starting with ``#!python`` to point
     to the correct interpreter.
  d. Update ``distribution-1.0.dist-info/RECORD`` with the installed
     paths.
  e. Remove empty ``distribution-1.0.data`` directory.
  f. Compile any installed .py to .pyc. (Uninstallers should be smart
     enough to remove .pyc even if it is not mentioned in RECORD.)
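
The Wheel-Version check in the unpack phase could look roughly like the sketch below. The supported version ``(1, 0)``, the names ``SUPPORTED_WHEEL_VERSION`` and ``check_wheel_version``, and the choice of ``ValueError`` for the abort case are all assumptions of this example.

```python
import warnings

# Highest Wheel-Version this hypothetical installer understands.
SUPPORTED_WHEEL_VERSION = (1, 0)


def check_wheel_version(wheel_version, supported=SUPPORTED_WHEEL_VERSION):
    """Abort on a newer major version, warn on a newer minor version."""
    major, minor = (int(part) for part in wheel_version.split("."))
    if major > supported[0]:
        raise ValueError(
            "Wheel-Version %s is not supported by this installer" % wheel_version
        )
    if major == supported[0] and minor > supported[1]:
        warnings.warn(
            "Wheel-Version %s is newer than this installer" % wheel_version
        )
```

An installer would call this with the Wheel-Version value read from the ``WHEEL`` file before unpacking anything.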

Recommended installer features
''''''''''''''''''''''''''''''

Rewrite ``#!python``.
    In wheel, scripts are packaged in
    ``{distribution}-{version}.data/scripts/``.  If the first line of
    a file in ``scripts/`` starts with exactly b'#!python', rewrite to
    point to the correct interpreter.  Unix installers may need to add
    the +x bit to these files if the archive was created on Windows.

Generate script wrappers.
    In wheel, scripts packaged on Unix systems will certainly not have
    accompanying .exe wrappers.  Windows installers may want to add them
    during install.
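
A possible shape for the ``#!python`` rewrite, operating on the script's bytes. The helper name ``rewrite_shebang`` is invented here, and this sketch simply replaces the whole first line, so anything after ``#!python`` on that line is dropped.

```python
import os


def rewrite_shebang(script, interpreter):
    """Point a script that starts with b'#!python' at the given interpreter."""
    if not script.startswith(b"#!python"):
        return script  # leave other scripts untouched
    _first_line, newline, rest = script.partition(b"\n")
    return b"#!" + os.fsencode(interpreter) + newline + rest
```

An installer would typically pass ``sys.executable`` as the interpreter and then set the executable bit on Unix.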


File Format
-----------

File name convention
''''''''''''''''''''

The wheel filename is ``{distribution}-{version}(-{build
tag})?-{python tag}-{abi tag}-{platform tag}.whl``.

distribution
    Distribution name, e.g. 'django', 'pyramid'.

version
    PEP-386 compliant version, e.g. 1.0.

build tag
    Optional build number.  Must start with a digit.  A tie breaker if
    two wheels have the same version.  Sort as None if unspecified,
    else sort the initial digits as a number, and the remainder
    lexicographically.

language implementation and version tag
    E.g. 'py27', 'py2', 'py3'.

abi tag
    E.g. 'cp33m', 'abi3', 'none'.

platform tag
    E.g. 'linux_x86_64', 'any'.

For example, ``distribution-1.0-1-py27-none-any.whl`` is the first
build of a package called 'distribution', and is compatible with
Python 2.7 (any Python 2.7 implementation), with no ABI (pure Python),
on any CPU architecture.

The last three components of the filename before the extension are
called "compatibility tags."  The compatibility tags express the
package's basic interpreter requirements and are detailed in PEP 425.
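
The filename convention can be checked with a regular expression. The pattern below is a sketch written for this example, not one taken from the reference implementation; it assumes no component contains a hyphen, and ``WHEEL_NAME`` and ``parse_wheel_filename`` are hypothetical names.

```python
import re

# One group per filename component; the build tag is optional and must
# start with a digit.
WHEEL_NAME = re.compile(
    r"^(?P<distribution>[^-]+)"
    r"-(?P<version>[^-]+)"
    r"(?:-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)"
    r"-(?P<abi>[^-]+)"
    r"-(?P<platform>[^-]+)"
    r"\.whl$"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its components, or raise ValueError."""
    match = WHEEL_NAME.match(filename)
    if match is None:
        raise ValueError("not a wheel filename: %r" % filename)
    return match.groupdict()
```

For ``distribution-1.0-1-py27-none-any.whl`` the build component is ``'1'``; when the build tag is absent it comes back as ``None``.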


File contents
'''''''''''''

#. Wheel files contain a folder {distribution}-{version}.dist-info/
   with the PEP 426 metadata (Meta

[Python-Dev] Why not using the hash when comparing strings?

2012-10-18 Thread Victor Stinner
Hi,

I would like to know if there is a reason for not using the hash of
(bytes or unicode) strings when comparing two objects and the hash of
the two objects has already been computed. Using the hash would speed
up comparison of long strings when the two strings are different.

Something like:

    if ((op == Py_EQ || op == Py_NE)
        && a->ob_shash != -1
        && b->ob_shash != -1
        && a->ob_shash != b->ob_shash) {
        /* strings are not equal */
    }

There are hash collisions, so a->ob_shash == b->ob_shash doesn't mean
that the two strings are equal. But if the two hashes are different,
the two strings are different. Aren't they?
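
The same idea can be modelled in pure Python. This is only an
illustrative sketch, not CPython's implementation: ``CachedHashStr`` is a
made-up wrapper class, and it uses None (rather than -1) to mean "hash not
computed yet".

```python
class CachedHashStr:
    """Wrap a string, cache its hash, and use it to short-circuit ==."""

    __slots__ = ("value", "_hash")

    def __init__(self, value):
        self.value = value
        self._hash = None  # not computed yet

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(self.value)
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, CachedHashStr):
            return NotImplemented
        # Fast path: both hashes are known and differ, so the strings differ.
        if (self._hash is not None
                and other._hash is not None
                and self._hash != other._hash):
            return False
        # Equal or unknown hashes prove nothing; compare the contents.
        return self.value == other.value
```

The fast path only ever answers "not equal"; equal hashes still fall
through to the full comparison because of collisions.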

Victor
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why not using the hash when comparing strings?

2012-10-18 Thread Steven D'Aprano

On 19/10/12 12:03, Victor Stinner wrote:

> Hi,
>
> I would like to know if there is a reason for not using the hash of
> (bytes or unicode) strings when comparing two objects whose hashes
> have already been computed. Using the hash would speed up
> comparison of long strings when the two strings are different.


Assuming the hash has already been compared, then I imagine it would be
faster.


> Something like:
>
>     if ((op == Py_EQ || op == Py_NE)
>         && a->ob_shash != -1
>         && b->ob_shash != -1
>         && a->ob_shash != b->ob_shash) {
>         /* strings are not equal */
>     }
>
> There are hash collisions, so a->ob_shash == b->ob_shash doesn't mean
> that the two strings are equal. But if the two hashes are different,
> the two strings are different. Aren't they?


I would certainly hope so :)


--
Steven


Re: [Python-Dev] Why not using the hash when comparing strings?

2012-10-18 Thread Benjamin Peterson
2012/10/18 Victor Stinner :
> Hi,
>
> I would like to know if there is a reason for not using the hash of
> (bytes or unicode) strings when comparing two objects whose hashes
> have already been computed. Using the hash would speed up
> comparison of long strings when the two strings are different.
>
> Something like:
>
> if ((op == Py_EQ || op == Py_NE)
> && a->ob_shash != -1
> && b->ob_shash != -1
> && a->ob_shash != b->ob_shash) {
> /* strings are not equal */
> }
>
> There are hash collisions, so a->ob_shash == b->ob_shash doesn't mean
> that the two strings are equal. But if the two hashes are different,
> the two strings are different. Aren't they?

It would be interesting to see how common it is for strings which have
their hash computed to be compared.



-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 426 comment: field order

2012-10-18 Thread Daniel Holth
Added some notes about the (lack of) ordering.

The email module provides an ordered multidict interface to the data.
The first tag wins (if you improperly define Name: twice for example),
but the order of everything is preserved. We just don't need it,
except that it might be surprising to see your classifiers randomly
re-ordered.

It is also possible to textwrap.dedent(p['Description']) with p =
email.parser.Parser().parsestr(metadata).

I don't really expect anyone to use email.parser.Parser() so I'm
hesitant to mention it, but it seems to work. I say it's read-only
because the 3.2 Parser()/Generator() can't reserialize it due to
Unicode. The improved Python 3.3 email module would be able to under
the right conditions.
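
A small sketch of what that read-only parse looks like in practice. The
sample metadata below is invented for illustration, and the "first
occurrence wins" result reflects how the email module behaves in
CPython rather than a documented guarantee.

```python
import email.parser

# Invented sample; Name is deliberately (and improperly) given twice.
METADATA = """\
Name: example
Version: 1.0
Classifier: Topic :: B
Classifier: Topic :: A
Name: ignored-duplicate
"""

message = email.parser.Parser().parsestr(METADATA)

# Single-use field: indexing returns the first occurrence.
name = message["Name"]

# Multiple-use field: get_all() keeps every value, in file order.
classifiers = message.get_all("Classifier")

# keys() lists every header name, duplicates included, in file order.
field_order = message.keys()

print(name)
print(classifiers)
print(field_order)
```

The file order is still available through ``keys()`` and ``get_all()``
if a tool wants to avoid re-ordering classifiers when displaying them.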


diff -r 79e95f487a33 -r 4773b6b3e8f2 pep-0426.txt
--- a/pep-0426.txt  Thu Oct 18 08:31:44 2012 +0100
+++ b/pep-0426.txt  Thu Oct 18 21:10:26 2012 -0400
@@ -33,10 +33,14 @@

 The syntax defined in this PEP is for use with Python distribution
 metadata files. The file format is a simple UTF-8 encoded Key: value
-format with no maximum line length. It is parsable by the ``email``
+format with no maximum line length.  It is parseable by the ``email``
 module with an appropriate ``email.policy.Policy()``.  The field names
 listed in the `Fields`_ section are used as the header names.

+In Python 3.2, a serviceable read-only parser is::
+
+    email.parser.Parser().parsestr(metadata)
+
 There are two standard locations for these metadata files:

 * the ``PKG-INFO`` file included in the base directory of Python
@@ -66,7 +70,8 @@
 times in a single metadata file.  Other fields may only occur
 once in a metadata file.  Fields marked with "(optional)" are
 not required to appear in a valid metadata file; all other
-fields must be present.
+fields must be present.  The fields may appear in any order within
+the file.

 Metadata-Version
 ::::::::::::::::
@@ -480,12 +485,17 @@

 An ASCII string, not containing whitespace or the / character, that
 indicates the presence of extended metadata. Additional tags defined by
-the extension should be of the form string/Name::
+an `Extension: Chili` should be of the form `Chili/Name`::

     Extension: Chili
     Chili/Type: Poblano
     Chili/Heat: Mild

+An implementation might iterate over all the declared `Extension:`
+fields to invoke the processors for those extensions.  As the order of
+the fields is not used, the `Extension: Chili` field may appear before
+or after its declared tags `Chili/Type:` etc.
+


Re: [Python-Dev] Why not using the hash when comparing strings?

2012-10-18 Thread MRAB

On 2012-10-19 02:03, Victor Stinner wrote:

> Hi,
>
> I would like to know if there is a reason for not using the hash of
> (bytes or unicode) strings when comparing two objects whose hashes
> have already been computed. Using the hash would speed up
> comparison of long strings when the two strings are different.
>
> Something like:
>
>     if ((op == Py_EQ || op == Py_NE)
>         && a->ob_shash != -1
>         && b->ob_shash != -1
>         && a->ob_shash != b->ob_shash) {
>         /* strings are not equal */
>     }
>
> There are hash collisions, so a->ob_shash == b->ob_shash doesn't mean
> that the two strings are equal. But if the two hashes are different,
> the two strings are different. Aren't they?


Correct. It's true for any hashable type.


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Stephen J. Turnbull
Executive summary:

You probably should include a full ABNF grammar

Daniel Holth writes:

 > To support empty lines and lines with indentation with respect to
 > the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
 > followed by a pipe ("|") char. [...]
 > This encoding implies that any occurrences of a CRLF followed by 7 spaces
 > and a pipe char have to be replaced by a single CRLF when the field
 > is unfolded using a RFC822 reader.

This isn't RFC 822 unfolding at all.  An RFC 822 "reader" will simply
remove the CRLF and optionally "canonicalize" the spaces (the latter
is not allowed by RFC 822, but sometimes it's observed).  This implies
that if you use an RFC 822 reader, you need to replace instances of the
regexp r"\s+\|" with a newline.  (If you have a conforming reader, you
can use the regexp r"\s{7}\|" instead.)  And of course you have to
RFC-2047-encode non-ASCII in an RFC-822 field.
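
To make the two regexps concrete, here is a minimal Python sketch of the
post-processing step. It assumes the field value has already been
unfolded by an RFC 822 style reader (line breaks removed, spaces kept);
the function names are hypothetical.

```python
import re


def restore_newlines_strict(unfolded):
    """For a conforming reader: exactly 7 whitespace characters, then '|'."""
    return re.sub(r"\s{7}\|", "\n", unfolded)


def restore_newlines_lenient(unfolded):
    """For a reader that may have canonicalized the run of spaces."""
    return re.sub(r"\s+\|", "\n", unfolded)


pad = " " * 7
unfolded = "line one" + pad + "|line two" + pad + "|" + pad + "|    indented"
print(restore_newlines_strict(unfolded))
```

Either regexp will also rewrite a pipe that legitimately follows
whitespace in the text itself, which is one more reason a precise grammar
would help.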

So please don't refer to the basic format ("field-name: field-body"
followed by optional continuation lines) as "RFC822".  "Inspired by
RFC 822" maybe.  Better "chosen to resemble the familiar RFC 822
header format used in email and netnews."  (Note that RFC 822 is
actually ambiguous even about the basic format; section 3.4.2 implies
that "name   :body" would be an acceptable field, although section
3.1.2 doesn't seem to allow space before the colon.  Referring to RFC
822 as a standard here is a bad idea.  There is a reason why that
standard gets revised/replaced periodically!)

I don't understand why you specify that the newline is represented by
CRLF *after* unfolding.  Once unfolded, these fields are all what
RFC822 would call "unstructured fields" (in the context of that RFC).
They will contain text followed by a terminating CRLF, but including
no others.  In fact that CRLF is redundant, and may as well be
stripped (and probably will be, in most implementations).

I don't understand why you specify newline as CRLF here, except to
pretend that you're respecting RFC 822.  But all you're using are the
division of a field into field-name and field-body by a colon, and the
convention that a newline followed by folding whitespace is a
continuation line.  These are both trivial to implement, and almost
all implementations will undoubtedly read the file as *text* in
universal newline mode.  I see no reason to specify a binary format.

 > Author-email (optional)
 > :::::::::::::::::::::::
 > 
 > A string containing the author's e-mail address.  It can contain
 > a name and e-mail address in the legal forms for a RFC-822
 > ``From:`` header.

Heavens above, no!  From RFC 822, this:

Wilt . (the  Stilt) [email protected]

is a legal email address, which probably would be represented
conventionally as

"Wilt (the Stilt) Chamberlain" 

However, it's not at all clear that all mail clients, let alone just
plain folks, will interpret the first form correctly.  And there are
worse examples given in that RFC.  Is there a reason why you can't
require these to be in the form recommended by RFC 5322 (ie, the
"conventional representation" above)?  Or you could relax this so that
the quotes are prohibited.

 > License (optional)
 > ::::::::::::::::::
 > 
 > Text indicating the license covering the distribution where the license
 > is not a selection from the "License" Trove classifiers. See
 > "Classifier" below.  This field may also be used to specify a
 > particular version of a licencse which is named via the ``Classifier``
Typo: "licencse" should be "license".

 > field, or to indicate a variation or exception to such a license.

This won't do as is.  It doesn't exclude the possibility of including
a complete license, and if that is intentional, this field needs to be
in the same format as "Distribution".  Licenses are complex documents,
needing at least some of the power of something like ReST.  You may as
well give them all of it.

 > Project-URL (multiple-use)
 > Provides-Extra (multiple use)

Hyphen or no hyphen?  Consistency is good.



[Python-Dev] Rejecting PEPs 407 and 413?

2012-10-18 Thread Nick Coghlan
With the 3.4 release PEP published using a traditional schedule,
perhaps MvL would care to do the honours as BDFL-Delegate in rejecting
the two "faster release cycle for the standard library" PEPs?

(I know I asked to hold off on that when MvL last brought it up, but
I've since realised that "do the first alpha early" isn't a
stand-alone PEP, it's just a matter of convincing Larry it's
worthwhile and negotiating timing with the release team after there
are some release-worthy features on trunk)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] accept the wheel PEPs 425, 426, 427

2012-10-18 Thread Daniel Holth
On Thu, Oct 18, 2012 at 10:55 PM, Stephen J. Turnbull
 wrote:
>  > Project-URL (multiple-use)
>  > Provides-Extra (multiple use)
>
> Hyphen or no hyphen?  Consistency is good.

I will include or remove the hyphen.

Your other comments are also true of the predecessor Metadata 1.2.

The | folding discussion could probably die. Personally I do not
respect RFC822 at all (in this format). I rather expect the pragmatic
implementer to write, more or less, [line.split(':', 1) for line in
open('METADATA') if line[0].isalpha()]. The fields that matter at
runtime (Name, Version, Requires-Dist, Provides-Extra) are all
single-line only. Basically everything else is a curiosity for the
human reader.
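
Spelled out as a runnable sketch (the function name is made up, and it
deliberately ignores continuation lines, so it only makes sense for the
single-line fields):

```python
def parse_single_line_fields(lines):
    """Return (name, value) pairs for lines that look like 'Name: value'."""
    fields = []
    for line in lines:
        # Continuation and blank lines do not start with a letter; skip them.
        if line[:1].isalpha() and ":" in line:
            name, value = line.split(":", 1)
            fields.append((name.strip(), value.strip()))
    return fields


sample = [
    "Name: example\n",
    "Version: 1.0\n",
    "Description: first line\n",
    "        continuation that this parser skips\n",
    "Requires-Dist: foo (>=1.0)\n",
]
print(parse_single_line_fields(sample))
```

Any iterable of lines works, including an open METADATA file object.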

The .dist-info (PEP 376) or the wheel spec should gain a well-known
file package-1.0.dist-info/LICENSE. Many open source licenses require
that you include the license with every copy of the program.

Thanks,

Da