Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-28 Thread Daniel Holth
On Thu, Feb 28, 2013 at 12:54 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On Thu, Feb 28, 2013 at 7:59 AM, Daniel Holth dho...@gmail.com wrote:
 My aim is to provide a hook mechanism that specifically does not say
 anything about the way the cache is stored or even whether the hook
 produces a cache at all. It will just run when pip is done.

 How does the following idea sound?

 New metadata field: Post-Install
 Format: a *single* callable reference in entry-points format (i.e.
 module.name:callable.name)
 Call signature:

 def post_install_hook(metadata, extras, previous_version=None):
     ...

 extras would be a tuple indicating which extras were installed.

 For an upgrade, previous_version would be set to the version that
 was previously installed. For a clean installation, it would either be
 None or omitted entirely.
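As a rough illustration only, a hook with this call signature might look like the sketch below. The cache_dir keyword and the JSON file it writes are assumptions made for the example; the proposal itself only fixes the three arguments (metadata, extras, previous_version).

```python
import json
import os
import tempfile


def post_install_hook(metadata, extras, previous_version=None, cache_dir=None):
    """Hypothetical hook: record what was just installed in a small JSON file.

    cache_dir is an extra keyword added for this sketch only; the proposed
    signature is (metadata, extras, previous_version=None).
    """
    cache_dir = cache_dir or tempfile.gettempdir()
    record = {
        "name": metadata["name"],
        "version": metadata["version"],
        "extras": sorted(extras),
        # None means a clean installation rather than an upgrade
        "upgraded_from": previous_version,
    }
    path = os.path.join(cache_dir, metadata["name"] + "-install.json")
    with open(path, "w") as f:
        json.dump(record, f)
    return path
```

An installer would call it once after the files are in place, passing previous_version only when it replaced an older release.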

 The metadata argument would be the PEP 426 metadata, reformatted as
 JSON-compatible structured metadata. I had planned to postpone
 defining the algorithm for that conversion until after PEP 426
 acceptance, but if we're going to add a post-install hook mechanism to
 PEP 426, I think it makes more sense to define it up front:

 1. The top level is a mapping, with lowercase versions of all PEP 426
 fields as keys.
 2. All multiple-use fields other than requires-python are pluralised
 (that one is only multiple use so you can depend on a different
 version of Python given different environment markers - for example,
 supporting Python 2.6 everywhere, but requiring Python 2.7 on
 Windows. Aside from those cases, you can collapse an arbitrarily
 complex version specifier down to a single line)
 3. Every mandatory field is present, with a string value
 4. If present, the keywords field references a list of keywords
 (created via str.split)
 5. If present, the description is always stored under the
 description key, even if provided in the PEP 426 metadata payload
 6. If any other optional field is present, it references a string value
 7. If present, the project-urls key references a mapping of labels to URLs.
 8. If present, the extensions key references a mapping of extension
 names to the extension's embedded JSON metadata. (Note: this is the
 key reason for my planned change to the extension format from
 arbitrary subfields to allowing only a single json subfield - it
 greatly simplifies this aspect of the translation to structured
 metadata, *and* makes it more flexible and powerful at the same time)
 9. For any multi-use field that is present and supports environment
 markers, it is a reference to a mapping where each key is a
 whitespace-normalized (i.e. every sequence of whitespace converted to
 a single space) environment marker string that references a list of
 string values. The unqualified fields are referenced by the string
 "always". This breakdown allows each unique environment marker to be
 evaluated only once to determine whether or not it is applicable,
 regardless of how many times it was originally used.
 10. If any other multi-use field is present, it references a list of
 string values.

 For example:

 Metadata-Version: 2.0
 Name: BeagleVote
 Version: 1.0a2
 Summary: A module for collecting votes from beagles.
 Keywords: dog puppy voting election
 Project-URL: Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/
 Requires-Dist: pkginfo
 Requires-Dist: PasteDeploy
 Requires-Dist: zope.interface (3.5.0)
 Extension: Chili
 Chili/json: {
     "Type": "Poblano",
     "Heat": "Mild"
 }

 Apparently, these beagles like their chili. (This is not a helpful
 description)

 Would become:

 {
   "metadata-version": "2.0",
   "name": "BeagleVote",
   "version": "1.0a2",
   "summary": "A module for collecting votes from beagles.",
   "description": "Apparently, these beagles like their chili. (This is not a helpful description)",
   "keywords": ["dog", "puppy", "voting", "election"],
   "project-urls": {
     "Bug, Issue Tracker": "http://bitbucket.org/tarek/distribute/issues/"
   },
   "requires-dists": {
     "always": ["pkginfo", "PasteDeploy", "zope.interface (3.5.0)"]
   },
   "extensions": {
     "Chili": {
       "Type": "Poblano",
       "Heat": "Mild"
     }
   }
 }
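
The conversion rules listed earlier can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function name to_structured, the (field, value) pair input, the "entry; marker" split on a semicolon, and the three field sets are all assumptions chosen for the example rather than anything the PEP defines.

```python
import json

# Illustrative field sets only; the PEP's real lists are longer.
MARKER_FIELDS = {"requires-dist", "provides-dist", "obsoletes-dist",
                 "requires-python"}
LIST_FIELDS = {"classifier", "platform"}
UNPLURALISED = {"requires-python"}


def to_structured(fields, description=None):
    """Convert (field, value) pairs into a JSON-compatible mapping."""
    result = {}
    for field, value in fields:
        key = field.lower()
        if key == "keywords":
            result["keywords"] = value.split()
        elif key == "project-url":
            # The label is everything before the last comma
            label, url = value.rsplit(",", 1)
            result.setdefault("project-urls", {})[label.strip()] = url.strip()
        elif key == "extension":
            result.setdefault("extensions", {}).setdefault(value, {})
        elif key.endswith("/json"):
            name = field[:-len("/json")]
            result.setdefault("extensions", {})[name] = json.loads(value)
        elif key in MARKER_FIELDS:
            entry, _, marker = value.partition(";")
            # Normalise whitespace so equivalent markers share one key;
            # entries without a marker are grouped under "always"
            marker = " ".join(marker.split()) or "always"
            plural = key if key in UNPLURALISED else key + "s"
            result.setdefault(plural, {}).setdefault(marker, []).append(
                entry.strip())
        elif key in LIST_FIELDS:
            result.setdefault(key + "s", []).append(value)
        else:
            result[key] = value
    if description is not None:
        # The description always lands under the "description" key
        result["description"] = description
    return result
```

Grouping by normalised marker string is what lets a consumer evaluate each distinct marker once, however many entries share it.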

 An apparently simpler alternative would be to rely on PEP 376 to
 retrieve the full metadata and only provide the distribution name and
 version to the hook:

 def post_install_hook(distname, current_version, previous_version=None):
     ...

 The key disadvantage of that seemingly simpler approach is it *only*
 works for post install and pre uninstall hooks, *and* requires that
 the post-install hook have the tools needed to read the PEP 376
 metadata. If we later want to add pre-install, build or archiving
 hooks, they would need the structured metadata format anyway, as
 relying on PEP 376 isn't an option for software that hasn't been
 installed yet. This simpler alternative 

Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-28 Thread Nick Coghlan
On Fri, Mar 1, 2013 at 12:00 AM, Daniel Holth dho...@gmail.com wrote:
 We will probably wind up with some JSON very much like that. I like
 just exposing it as an ordered multidict with the same key names as
 mentioned in the PEP.

A multidict is not really JSON-compatible - making sure there's an
unambiguous mapping to an ordinary dictionary is highly desirable.
Also, it's handy to pre-split and group the entries conditioned on the
environment markers.

 IMO the environment marker for "always" is just
 "" (empty string).

I initially had that, but it looked weird in the case where there
weren't any conditional entries, and it also looks weird when
accessing the data structure. By contrast, "always" is a
self-describing key.

 My hook would be a literal Entry-Point. You would install a package
 twisted.plugins that would register its interest in installation
 changes by declaring the entry point [packaging.hooks]
 post_install=twisted.plugins:hook. Afterwards, every time you install
 or uninstall another package, twisted.plugins.hook() would be called.
 It would iterate over all installed distributions using some API like
 pkg_resources.working_set or distlib's database and do whatever it
 needed to do. It could be called once per pip invocation instead of
 once per individual package.

 The hook is not guaranteed to run. If you do not run the hook, you
 should expect Twisted's plugin discovery process to take longer just
 like it does today. In fact the packages available on sys.path are not
 guaranteed to have been installed at all.

This is *not* the same kind of hook at all. The proposed hook is only
run when *Twisted* is installed to replace some current legitimate
customisation of ./setup.py install behaviour, not when an arbitrary
package is installed to let Twisted know about it. Your suggestion
would indeed be more appropriately part of an installer-specific entry
point (but one made much easier by the standard including an algorithm
for conversion to structured metadata).
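
For comparison, the installer-specific variant (hooks registered by other packages and called once per installer run) might be sketched like this. The helper names resolve and run_hooks are hypothetical, and how an installer discovers the registered references is left out entirely; the only thing taken from the thread is the module.name:callable.name reference format and the point that the hook is not guaranteed to run.

```python
import importlib


def resolve(reference):
    """Resolve a 'module.name:callable.name' reference to an object."""
    module_name, _, attr_path = reference.partition(":")
    obj = importlib.import_module(module_name)
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return obj


def run_hooks(references, installed):
    """Call each registered hook once per installer run.

    `installed` is whatever the installer chooses to pass, here a list of
    distribution names. Hooks are best-effort: a failure is recorded and
    skipped, because nothing may rely on the hook having run.
    """
    failures = []
    for reference in references:
        try:
            resolve(reference)(installed)
        except Exception as exc:
            failures.append((reference, exc))
    return failures
```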

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-28 Thread Daniel Holth
On Thu, Feb 28, 2013 at 10:04 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On Fri, Mar 1, 2013 at 12:00 AM, Daniel Holth dho...@gmail.com wrote:
 We will probably wind up with some JSON very much like that. I like
 just exposing it as an ordered multidict with the same key names as
 mentioned in the PEP.

 A multidict is not really JSON-compatible - making sure there's an
 unambiguous mapping to an ordinary dictionary is highly desirable.
 Also, it's handy to pre-split and group the entries conditioned on the
 environment markers.

Sure, nothing wrong with it. Just don't bother pluralizing the names.
"Goose": "gander" becomes "geese": {}? No thanks.

 IMO the environment marker for "always" is just
 "" (empty string).

 I initially had that, but it looked weird in the case where there
 weren't any conditional entries, and it also looks weird when
 accessing the data structure. By contrast, "always" is a
 self-describing key.

Or True, or an environment-marker tautology...

 My hook would be a literal Entry-Point. You would install a package
 twisted.plugins that would register its interest in installation
 changes by declaring the entry point [packaging.hooks]
 post_install=twisted.plugins:hook. Afterwards, every time you install
 or uninstall another package, twisted.plugins.hook() would be called.
 It would iterate over all installed distributions using some API like
 pkg_resources.working_set or distlib's database and do whatever it
 needed to do. It could be called once per pip invocation instead of
 once per individual package.

 The hook is not guaranteed to run. If you do not run the hook, you
 should expect Twisted's plugin discovery process to take longer just
 like it does today. In fact the packages available on sys.path are not
 guaranteed to have been installed at all.

 This is *not* the same kind of hook at all. The proposed hook is only

That is why this conversation has been so confusing :-)

 run when *Twisted* is installed to replace some current legitimate
 customisation of ./setup.py install behaviour, not when an arbitrary
 package is installed to let Twisted know about it. Your suggestion
 would indeed be more appropriately part of an installer-specific entry
 point (but one made much easier by the standard including an algorithm
 for conversion to structured metadata).


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Vinay Sajip
Nick Coghlan ncoghlan at gmail.com writes:

 I'm not a fan of post-install hooks - that way lies setup.py. If
 people want to run arbitrary code at install time, they can publish a
 platform specific installer.
 
 *Maybe* we can go down that path in the Python 3.5 timeframe, but for now, no.

I'm concerned that this might affect adoption: there are a lot of projects that
have non-trivial custom code in setup.py - often doing mundane stuff like
copying files around before the actual setup() call. Having hooks will enable
easier migration for such projects (which include, for example, Twisted,
Cython, NumPy). I don't believe it's realistic to expect them all to create
platform-specific installers; they'll just carry on using setuptools/distribute.
If we want to move things forward in packaging, surely we have to make
migration easier? IMO this was one of the things that distutils2/packaging also
did not address sufficiently.

Just to clarify: when I say hooks, what I mean is setuptools-style entry
points that the installer looks for, which are used to customise the
installation process. I believe it is possible to provide limited
extensibility using hooks without it leading to the complete ad-hocery that
setup.py entails.

Regards,

Vinay Sajip



Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Nick Coghlan
On Wed, Feb 27, 2013 at 8:52 PM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:
 Just to clarify: when I say hooks, what I mean is setuptools-style entry
 points that the installer looks for, which are used to customise the
 installation process.

The command to create a wheel from a source archive is currently still
./setup.py bdist_wheel. This may be executed on an appropriate build
system rather than the target system, but aside from that everything
in setup.py should still execute normally. This is the major
difference between the current attempt and distutils2: du2 made moving
from setup.py to setup.cfg a requirement to generate the new metadata
format. By contrast, I want at least distribute, as well as the Python
3.4 distutils, to be able to generate wheels (including the new
metadata) from current setup.py files.

 I believe it is possible to provide limited extensibility using hooks
 without it leading to the complete ad-hocery that setup.py entails.

For version 1.0, the only install-time modification that all wheel
installers must implement is fanning files out to their target
locations based on sysconfig directories and rewriting script shebang
lines (they may also want to generate parallel Windows executables,
but with the Windows launcher, that's less necessary).
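
A minimal sketch of those two install-time steps is below. The helper names target_for and rewrite_shebang are hypothetical; the layout assumed is the wheel convention of a distname-1.0.data/ directory whose first-level subdirectories are named after sysconfig path keys (scripts, purelib, and so on), with scripts carrying a placeholder "#!python" first line.

```python
import os
import sys
import sysconfig


def target_for(data_relpath, paths=None):
    """Map 'scripts/foo' or 'purelib/pkg/mod.py' to an absolute target path.

    The first path component is treated as a sysconfig path key.
    """
    paths = paths or sysconfig.get_paths()
    key, _, rest = data_relpath.partition("/")
    return os.path.join(paths[key], *rest.split("/"))


def rewrite_shebang(script_bytes, interpreter=None):
    """Replace a leading '#!python' line with the real interpreter path."""
    interpreter = interpreter or sys.executable
    first, sep, rest = script_bytes.partition(b"\n")
    if first.startswith(b"#!python"):
        first = b"#!" + interpreter.encode(sys.getfilesystemencoding())
    return first + sep + rest
```

Everything else an installer does (unpacking, recording what it wrote) is bookkeeping around these two transformations, which is why no project code needs to run on the target system.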

If a project needs more than that, they cannot ship wheels at this
time, and will need to continue shipping source distributions that can
execute arbitrary code at install time. Alternatively (and
preferably), such a project could split out a support library that is
wheel compatible, and have a separate component that must be installed
from source and is able to make arbitrary changes to the target
system.

*Incremental* change, and explicitly leaving some use cases to source
distribution and ./setup.py for the moment is the key to creating a
distribution format that is as simple as we can make it while still
supporting a wide variety of use cases. Will we eventually get
pre-install and post-install hooks ala RPM and other platform specific
systems? Quite possibly. But let's see how far we can get without them
first - in particular, I want to focus people's initial efforts on
putting the smarts into the wheel *creation* process rather than
delaying decisions until install time.

The initial problem I believe we need to solve is the one of arcane
build systems for key dependencies, and the simple fact that most
Windows users aren't equipped to build software written in C in the
first place. Eggs tried to tackle that problem years ago, but ignored
things like the Filesystem Hierarchy Standard and the interests of OS
distributions and system administrators, limiting its adoption to
those developers that were happy with the idea of storing *everything*
inside a single directory (the various legitimate concerns with the
default behaviour of easy_install also didn't help). Wheel is designed
to integrate more cleanly with platform specific conventions,
hopefully overcoming some of those past objections to the egg format.

This preliminary approach also integrates well with centralised system
management tools like Puppet, Chef and Salt - for those, the states
and configurations of services and other components are handled
through the management infrastructure, and the language specific
package management tools are just a way to get the application code
onto the target systems in a controlled fashion.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Daniel Holth
On Wed, Feb 27, 2013 at 6:45 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On Wed, Feb 27, 2013 at 8:52 PM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:
 Just to clarify: when I say hooks, what I mean is setuptools-style entry
 points that the installer looks for, which are used to customise the
 installation process.

 The command to create a wheel from a source archive is currently still
 ./setup.py bdist_wheel. This may be executed on an appropriate build
 system rather than the target system, but aside from that everything
 in setup.py should still execute normally. This is the major
 difference between the current attempt and distutils2: du2 made moving
 from setup.py to setup.cfg a requirement to generate the new metadata
 format. By contrast, I want at least distribute, as well as the Python
 3.4 distutils, to be able to generate wheels (including the new
 metadata) from current setup.py files.

Vinay's distlib has taken the wheel spec at its word, runs an
unmodified install command with all the various paths set to
wheel-compatible distname-1.0.data/scripts etc., and converts the
.egg-info directory to .dist-info the same as bdist_wheel's final
step.

All wheel does is it takes a basic assumption of distutils2 (avoid
running setup.py), rearranges it slightly (avoid running setup.py at
install time) and magically people seem to like it. I wanted lxml to
compile faster and wound up with a distutils escape hatch. Now I think
that avoiding running *distutils* at install time is much more
important than avoiding setup.py.

 I believe it is possible to provide limited extensibility using hooks
 without it leading to the complete ad-hocery that setup.py entails.

 For version 1.0, the only install-time modification that all wheel
 installers must implement is fanning files out to their target
 locations based on sysconfig directories and rewriting script shebang
 lines (they may also want to generate parallel Windows executables,
 but with the Windows launcher, that's less necessary).

 If a project needs more than that, they cannot ship wheels at this
 time, and will need to continue shipping source distributions that can
 execute arbitrary code at install time. Alternatively (and
 preferably), such a project could split out a support library that is
 wheel compatible, and have a separate component that must be installed
 from source and is able to make arbitrary changes to the target
 system.

 *Incremental* change, and explicitly leaving some use cases to source
 distribution and ./setup.py for the moment is the key to creating a
 distribution format that is as simple as we can make it while still
 supporting a wide variety of use cases. Will we eventually get
 pre-install and post-install hooks ala RPM and other platform specific
 systems? Quite possibly. But let's see how far we can get without them
 first - in particular, I want to focus people's initial efforts on
 putting the smarts into the wheel *creation* process rather than
 delaying decisions until install time.

It's just the 1.0 release. There's no hurry to write the document
entitled "PEP 376 is now the/a standard *interchange* format for
distribution metadata; here's how you can experiment with caching
runtime introspection". Other tasks such as "create the simplest
possible useful packaging system for the stdlib [by only including the
install feature]" and "create an ecosystem of interoperable
third-party products to do everything else" are higher up on the "Grand
Python Packaging Plan" or GP3 (tm) to-do list.

 The initial problem I believe we need to solve is the one of arcane
 build systems for key dependencies, and the simple fact that most
 Windows users aren't equipped to build software written in C in the
 first place. Eggs tried to tackle that problem years ago, but ignored
 things like the Filesystem Hierarchy Standard and the interests of OS
 distributions and system administrators, limiting its adoption to
 those developers that were happy with the idea of storing *everything*
 inside a single directory (the various legitimate concerns with the
 default behaviour of easy_install also didn't help). Wheel is designed
 to integrate more cleanly with platform specific conventions,
 hopefully overcoming some of those past objections to the egg format.

It's designed to make binary packaging generally interesting, even if
you don't have C extensions, or even if you do have a C compiler. This
will hopefully be a benefit to our Windows community as well.

 This preliminary approach also integrates well with centralised system
 management tools like Puppet, Chef and Salt - for those, the states
 and configurations of services and other components are handled
 through the management infrastructure, and the language specific
 package management tools are just a way to get the application code
 onto the target systems in a controlled fashion.

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
 

Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Vinay Sajip
Daniel Holth dholth at gmail.com writes:

 Vinay's distlib has taken the wheel spec at its word, runs an
 unmodified install command with all the various paths set to
 wheel-compatible distname-1.0.data/scripts etc., and converts the
 .egg-info directory to .dist-info the same as bdist_wheel's final
 step.

Right, except there's no conversion of .egg-info to .dist-info in distlib
itself. That's done by the separate wheeler.py demonstration script, which
uses vanilla pip to install to a holding location, converts the .egg-info to
.dist-info and then builds the wheel from that.

At installation time, the wheel's .dist-info contents are moved to the
installation site's site-packages, except for WHEEL, which is omitted,
and RECORD which is recreated.
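
Recreating RECORD could look something like the sketch below. It assumes the wheel-style RECORD layout (one CSV row per file: path, sha256 digest in unpadded URL-safe base64, size in bytes, with RECORD itself listed without hash or size); the helper names and the in-memory files mapping are made up for the example and are not distlib's API.

```python
import base64
import csv
import hashlib
import io


def record_row(path, data):
    """Build one RECORD row for a file's relative path and its bytes."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return (path, "sha256=" + encoded, str(len(data)))


def build_record(files, record_path):
    """files maps installed relative paths to their bytes; WHEEL is skipped."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for path in sorted(files):
        if path.endswith(".dist-info/WHEEL"):
            continue  # omitted from the installed .dist-info directory
        writer.writerow(record_row(path, files[path]))
    # RECORD cannot contain its own hash, so its row has empty fields
    writer.writerow((record_path, "", ""))
    return out.getvalue()
```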

 All wheel does is it takes a basic assumption of distutils2 (avoid
 running setup.py), rearranges it slightly (avoid running setup.py at
 install time) and magically people seem to like it. I wanted lxml to
 compile faster and wound up with a distutils escape hatch. Now I think

A happy accident, then!

 that avoiding running *distutils* at install time is much more
 important than avoiding setup.py.
 

 It's just the 1.0 release. There's no hurry to write the document
 entitled PEP 376 is now the/a standard *interchange* format for
 [snip]
 third-party products to do everything else are higher up on the Grand
 Python Packaging Plan or GP3 (tm) to-do list.

I suppose you're right, but I want to make as much progress as I can while I
still have the time I can spend on this, and while the grey cells haven't
succumbed to packaging fatigue ...  :-)

Regards,

Vinay Sajip




Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Daniel Holth
On Wed, Feb 27, 2013 at 10:08 AM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:
 Daniel Holth dholth at gmail.com writes:

 Vinay's distlib has taken the wheel spec at its word, runs an
 unmodified install command with all the various paths set to
 wheel-compatible distname-1.0.data/scripts etc., and converts the
 .egg-info directory to .dist-info the same as bdist_wheel's final
 step.

 Right, except there's no conversion of .egg-info to .dist-info in distlib
 itself. That's done by the separate wheeler.py demonstration script, which
 uses vanilla pip to install to a holding location, converts the .egg-info to
 .dist-info and then builds the wheel from that.

 At installation time, the wheel's .dist-info contents are moved to the
 installation site's site-packages, except for WHEEL, which is omitted,
 and RECORD which is recreated.

 All wheel does is it takes a basic assumption of distutils2 (avoid
 running setup.py), rearranges it slightly (avoid running setup.py at
 install time) and magically people seem to like it. I wanted lxml to
 compile faster and wound up with a distutils escape hatch. Now I think

 A happy accident, then!

 that avoiding running *distutils* at install time is much more
 important than avoiding setup.py.


 It's just the 1.0 release. There's no hurry to write the document
 entitled PEP 376 is now the/a standard *interchange* format for
 [snip]
 third-party products to do everything else are higher up on the Grand
 Python Packaging Plan or GP3 (tm) to-do list.

 I suppose you're right, but I want to make as much progress as I can while I
 still have the time I can spend on this, and while the grey cells haven't
 succumbed to packaging fatigue ...  :-)

Luckily parts of your brain are red and black. I'm amazed at the
effort you've put forth so far. The idea isn't to limit the amount of
progress but simply to have a good separation between a smaller number
of things we need to agree on and probably put in the stdlib (for example
dependency declarations and a basic binary format) and the things we
don't have to or are very unlikely to agree on that will probably be
outside the stdlib (for example a not-likely-forthcoming universal
build system, and perhaps the best way to cache .dist-info assuming
the feature is even beneficial at all).

Anyway, Nick has been describing a different thing (a "numpy or
package-specific post-install hook") than the proposal ("some way to run
code that is intended to cache .dist-info directories at install time
without patching every installer").


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Glyph

On Feb 27, 2013, at 2:52 AM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:

 Nick Coghlan ncoghlan at gmail.com writes:
 
 I'm not a fan of post-install hooks - that way lies setup.py. If
 people want to run arbitrary code at install time, they can publish a
 platform specific installer.
 
 *Maybe* we can go down that path in the Python 3.5 timeframe, but for now, no.

 I'm concerned that this might affect adoption: there are a lot of projects
 that have non-trivial custom code in setup.py - often doing mundane stuff like
 copying files around before the actual setup() call. Having hooks will enable
 easier migration for such projects (which include, for example, Twisted,
 Cython, NumPy). I don't believe it's realistic to expect them all to create
 platform-specific installers; they'll just carry on using setuptools/distribute.

Quite so.

Post-install hooks are a requirement for Twisted and for many projects which 
depend on Twisted.  The hook is always the same on every platform, so it's not 
a platform-specific installer issue.

Frankly, a big appeal of some next-generation package distribution system is 
the introduction of a proper set of events we can hook into, instead of 
assuming that by some accident of timing we can work out when the software is 
being installed and call some random function from the bottom of setup.py 
with a bunch of state scooped out of distutils' internals.  The current 
situation is a total mess.

-glyph



Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 1:47 PM, Glyph wrote:
 
 On Feb 27, 2013, at 2:52 AM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:
  Nick Coghlan ncoghlan at gmail.com writes:

   I'm not a fan of post-install hooks - that way lies setup.py. If
   people want to run arbitrary code at install time, they can publish a
   platform specific installer.

   *Maybe* we can go down that path in the Python 3.5 timeframe, but for
   now, no.

  I'm concerned that this might affect adoption: there are a lot of projects
  that have non-trivial custom code in setup.py - often doing mundane stuff
  like copying files around before the actual setup() call. Having hooks will
  enable easier migration for such projects (which include, for example,
  Twisted, Cython, NumPy). I don't believe it's realistic to expect them all
  to create platform-specific installers; they'll just carry on using
  setuptools/distribute.
 Quite so.
 
 Post-install hooks are a requirement for Twisted and for many projects which 
 depend on Twisted.  The hook is always the same on every platform, so it's 
 not a platform-specific installer issue.
 
 Frankly, a big appeal of some next-generation package distribution system is 
 the introduction of a proper set of events we can hook into, instead of 
 assuming that by some accident of timing we can work out when the software is 
 being installed and call some random function from the bottom of setup.py 
 with a bunch of state scooped out of distutils' internals.  The current 
 situation is a total mess.
 
 -glyph
 
 
 

I'm generally +1 on hooks, the failure of setup.py isn't particularly that
it's executable, it's that you can't access the metadata without executing
it. In general hooks also allow people to easily disable them during install
if they don't wish for that (of course packages have no reason to support that
if they don't want to).


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Glyph
On Feb 27, 2013, at 11:04 AM, Daniel Holth dho...@gmail.com wrote:
 What does it have to do in the hook?
 

This: https://twistedmatrix.com/documents/current/core/howto/plugin.html#auto3

While this is theoretically optional - Twisted will behave mostly correctly 
without it - it noticeably improves the start-up performance of Twisted-based 
command-line tools, like 'twistd' and 'trial'.

-glyph

Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Glyph
On Feb 27, 2013, at 10:49 AM, Donald Stufft donald.stu...@gmail.com wrote:

 I'm generally +1 on hooks, the failure of setup.py isn't particularly that
 it's executable, it's that you can't access the metadata without executing
 it. In general hooks also allow people to easily disable them during install
 if they don't wish for that (of course packages have no reason to support that
 if they don't want to).

I pretty much agree.  I'd be happy – enthusiastic, even – for Twisted to update 
to some static metadata expression system.

-glyph



Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread PJ Eby
On Mon, Feb 25, 2013 at 9:39 AM, Nick Coghlan ncogh...@gmail.com wrote:
 (This probably belongs in a successor to PEP 376, but I'll leave it
 under the PEP 426 umbrella for now)

 One of the points raised regarding PEP 426's integrated metadata
 format is the potential for runtime issues with pkg_resources as it
 reads and processes the metadata during startup, particularly if it
 needs to process any environment markers. While I acknowledge the
 suggestions I have received that we should really be moving away from
 the current filesystem based distributed installation information to a
 real database that properly handles import hooks, I'm looking for
 something simpler that will make it easier for setuptools and
 distribute to consume the new metadata format (and thus hopefully make
 them more amenable to generating it as well)

 Assuming we add an Entry-Points field as I have proposed in another
 message, I'd like to propose that installers generate three additional
 cache files as part of the installation process:

 dist-info-dir/__cache__/version.txt
 dist-info-dir/__cache__/requires-dist.txt
 dist-info-dir/__cache__/entry-points.txt

 version.txt would just be the version of the installed distribution
 (no need to parse the main metadata file just to read the version
 field)

 requires-dist.txt would be similar to the pkg_resources requires.txt
 format, but use PEP 426 version specifiers. It would:
 - only contain runtime requirements where the environment markers
 match the current system
 - be split into sections based on the extras definition needed to
 get the environment marker to pass

 entry-points.txt would be the same format as the pkg_resources
 entry_points.txt
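
To make the three cache files concrete, here is one way an installer might write them. This is only a sketch under stated assumptions: write_cache is a hypothetical helper, the requirements are assumed to have had their environment markers evaluated already, and the section syntax simply imitates the bracketed headers used by the pkg_resources text files.

```python
import os


def write_cache(dist_info_dir, version, requires, entry_points):
    """Write version.txt, requires-dist.txt and entry-points.txt.

    requires maps an extra name ('' for unconditional) to requirement
    strings whose markers already passed; entry_points maps a group name
    to 'name = module:attr' lines.
    """
    cache_dir = os.path.join(dist_info_dir, "__cache__")
    os.makedirs(cache_dir, exist_ok=True)

    def section_text(sections):
        lines = []
        for name in sorted(sections):  # '' sorts first: no header needed
            if name:
                lines.append("[%s]" % name)
            lines.extend(sections[name])
            lines.append("")
        return "\n".join(lines)

    files = {
        "version.txt": version + "\n",
        "requires-dist.txt": section_text(requires),
        "entry-points.txt": section_text(entry_points),
    }
    for filename, text in files.items():
        with open(os.path.join(cache_dir, filename), "w") as f:
            f.write(text)
    return cache_dir
```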

 Cheers,
 Nick.

Since this isn't going to be backwards-compatible anyway, may I suggest that:

1. The caching algorithm be fixed and defined as part of the extension machinery
2. The caching consists of simply copying the data to a file, whose
name is programmatically based on the extension/field name.
3. Environment markers are not processed - that's up to the tool
consuming the cached data

This way, if e.g. entry points are defined as an extension, then the
Builder making a wheel doesn't need to understand entry points, it
just has to copy fields to a file.  It allows other resource types
(like i18n/l10n resources) to be defined in the metadata and cached
for runtime use, without needing a metadata version upgrade or any
tool rewrites.  And not processing environment markers means that
pure-Python wheels can still be used by just placing them on sys.path.
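
As a rough sketch of suggestion 2 (an illustration, not a proposed implementation): the tool parses the key-value metadata and copies each requested field verbatim into its own file. The function name, the `__cache__` location (borrowed from Nick's proposal), and the one-value-per-line layout are all assumptions made for the example.

```python
import os
from email.parser import Parser


def write_field_caches(metadata_text, dist_info_dir, fields):
    """Copy the raw values of the named fields into per-field cache files."""
    msg = Parser().parsestr(metadata_text)
    cache_dir = os.path.join(dist_info_dir, "__cache__")
    os.makedirs(cache_dir, exist_ok=True)
    written = []
    for field in fields:
        values = msg.get_all(field)
        if not values:
            continue  # field absent: nothing to cache
        # File name derived mechanically from the field name.
        path = os.path.join(cache_dir, field.lower() + ".txt")
        with open(path, "w", encoding="utf-8") as f:
            # Values are copied as-is; environment markers are left
            # untouched for the consuming tool to evaluate.
            f.write("\n".join(values) + "\n")
        written.append(path)
    return written
```

Because the loop never interprets a value, the same code would serve any field or extension without knowing what it means.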
___
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Daniel Holth
On Wed, Feb 27, 2013 at 4:48 PM, PJ Eby p...@telecommunity.com wrote:
 On Mon, Feb 25, 2013 at 9:39 AM, Nick Coghlan ncogh...@gmail.com wrote:
 (This probably belongs in a successor to PEP 376, but I'll leave it
 under the PEP 426 umbrella for now)

 One of the points raised regarding PEP 426's integrated metadata
 format is the potential for runtime issues with pkg_resources as it
 reads and processes the metadata during startup, particularly if it
 needs to process any environment markers. While I acknowledge the
 suggestions I have received that we should really be moving away from
 the current filesystem-based distributed installation information to a
 real database that properly handles import hooks, I'm looking for
 something simpler that will make it easier for setuptools and
 distribute to consume the new metadata format (and thus hopefully make
 them more amenable to generating it as well)

 Assuming we add an Entry-Points field as I have proposed in another
 message, I'd like to propose that installers generate three additional
 cache files as part of the installation process:

 dist-info-dir/__cache__/version.txt
 dist-info-dir/__cache__/requires-dist.txt
 dist-info-dir/__cache__/entry-points.txt

 version.txt would just be the version of the installed distribution
 (no need to parse the main metadata file just to read the version
 field)

 requires-dist.txt would be similar to the pkg_resources requires.txt
 format, but use PEP 426 version specifiers. It would:
 - only contain runtime requirements where the environment markers
 match the current system
 - be split into sections based on the extras definition needed to
 get the environment marker to pass

 entry-points.txt would be the same format as the pkg_resources 
 entry_points.txt

 Cheers,
 Nick.

 Since this isn't going to be backwards-compatible anyway, may I suggest that:

 1. The caching algorithm be fixed and defined as part of the extension 
 machinery
 2. The caching consists of simply copying the data to a file, whose
 name is programmatically based on the extension/field name.
 3. Environment markers are not processed - that's up to the tool
 consuming the cached data

 This way, if e.g. entry points are defined as an extension, then the
 Builder making a wheel doesn't need to understand entry points, it
 just has to copy fields to a file.  It allows other resource types
 (like i18n/l10n resources) to be defined in the metadata and cached
 for runtime use, without needing a metadata version upgrade or any
 tool rewrites.  And not processing environment markers means that
 pure-Python wheels can still be used by just placing them on sys.path.

My aim is to provide a hook mechanism that specifically does not say
anything about the way the cache is stored or even whether the hook
produces a cache at all. It will just run when pip is done.


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-27 Thread Nick Coghlan
On Thu, Feb 28, 2013 at 7:59 AM, Daniel Holth dho...@gmail.com wrote:
 My aim is to provide a hook mechanism that specifically does not say
 anything about the way the cache is stored or even whether the hook
 produces a cache at all. It will just run when pip is done.

How does the following idea sound?

New metadata field: Post-Install
Format: a *single* callable reference in entry-points format (i.e.
module.name:callable.name)
Call signature:

def post_install_hook(metadata, extras, previous_version=None):
    ...

extras would be a tuple indicating which extras were installed.

For an upgrade, previous_version would be set to the version that
was previously installed. For a clean installation, it would either be
None or omitted entirely.
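
To make the calling convention concrete, here is a minimal sketch of how an installer might resolve a module.name:callable.name reference and invoke it with the signature above. The helper names resolve_callable and run_post_install are hypothetical, and error handling is left out.

```python
import importlib
from functools import reduce


def resolve_callable(reference):
    """Turn 'module.name:callable.name' into the object it refers to."""
    module_name, _, attr_path = reference.partition(":")
    module = importlib.import_module(module_name)
    # Walk dotted attribute paths such as 'SomeClass.method'.
    return reduce(getattr, attr_path.split("."), module)


def run_post_install(reference, metadata, extras, previous_version=None):
    """Call the hook; previous_version stays None for a clean install."""
    hook = resolve_callable(reference)
    return hook(metadata, tuple(extras), previous_version=previous_version)
```

For example, resolve_callable("json:dumps") returns the standard library's json.dumps function.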

The metadata argument would be the PEP 426 metadata, reformatted as
JSON-compatible structured metadata. I had planned to postpone
defining the algorithm for that conversion until after PEP 426
acceptance, but if we're going to add a post-install hook mechanism to
PEP 426, I think it makes more sense to define it up front:

1. The top level is a mapping, with lowercase versions of all PEP 426
fields as keys.
2. All multiple-use fields other than requires-python are pluralised
(that one is only multiple use so you can depend on a different
version of Python given different environment markers - for example,
supporting Python 2.6 everywhere, but requiring Python 2.7 on
Windows. Aside from those cases, you can collapse an arbitrarily
complex version specifier down to a single line).
3. Every mandatory field is present, with a string value
4. If present, the keywords field references a list of keywords
(created via str.split)
5. If present, the description is always stored under the
description key, even if provided in the PEP 426 metadata payload
6. If any other optional field is present, it references a string value
7. If present, the project-urls key references a mapping of labels to URLs.
8. If present, the extensions key references a mapping of extension
names to the extension's embedded JSON metadata. (Note: this is the
key reason for my planned change to the extension format from
arbitrary subfields to allowing only a single json subfield - it
greatly simplifies this aspect of the translation to structured
metadata, *and* makes it more flexible and powerful at the same time)
9. For any multi-use field that is present and supports environment
markers, it references a mapping where each key is a
whitespace-normalized (i.e. every sequence of whitespace converted to
a single space) environment marker string that references a list of
string values. Unqualified entries are stored under the key "always".
This breakdown allows each unique environment marker to be
evaluated only once to determine whether or not it is applicable,
regardless of how many times it was originally used.
10. If any other multi-use field is present, it references a list of
string values.

For example:

Metadata-Version: 2.0
Name: BeagleVote
Version: 1.0a2
Summary: A module for collecting votes from beagles.
Keywords: dog puppy voting election
Project-URL: Bug, Issue Tracker,
    http://bitbucket.org/tarek/distribute/issues/
Requires-Dist: pkginfo
Requires-Dist: PasteDeploy
Requires-Dist: zope.interface (3.5.0)
Extension: Chili
Chili/json: {
    "Type": "Poblano",
    "Heat": "Mild"
    }

   Apparently, these beagles like their chili. (This is not a helpful
description)

Would become:

{
    "metadata-version": "2.0",
    "name": "BeagleVote",
    "version": "1.0a2",
    "summary": "A module for collecting votes from beagles.",
    "description": "Apparently, these beagles like their chili. (This is not a helpful description)",
    "keywords": ["dog", "puppy", "voting", "election"],
    "project-urls": {
        "Bug, Issue Tracker": "http://bitbucket.org/tarek/distribute/issues/"
    },
    "requires-dists": {"always": ["pkginfo", "PasteDeploy",
                                  "zope.interface (3.5.0)"]},
    "extensions": {
        "Chili": {
            "Type": "Poblano",
            "Heat": "Mild"
        }
    }
}
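
The conversion rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: which fields count as marker-aware or multi-use (MARKER_FIELDS, LIST_FIELDS), the use of ";" to separate a requirement from its environment marker, and the function name to_structured are my choices for the example, not part of the proposal. Continuation lines must be indented for the email parser to treat them as part of the field.

```python
import json
from email.parser import Parser

# Assumptions for this sketch: only these fields are handled as multi-use.
MARKER_FIELDS = {"requires-dist"}   # multi-use, may carry "; marker"
LIST_FIELDS = {"classifier"}        # multi-use, no markers


def to_structured(text):
    msg = Parser().parsestr(text)
    data = {}
    for name, raw in msg.items():
        key = name.lower()
        value = " ".join(raw.split())  # unfold and normalize whitespace
        if key == "keywords":
            data["keywords"] = value.split()
        elif key == "project-url":
            # Label is everything before the last comma.
            label, _, url = value.rpartition(",")
            data.setdefault("project-urls", {})[label.strip()] = url.strip()
        elif key == "extension":
            data.setdefault("extensions", {}).setdefault(value, {})
        elif key.endswith("/json"):
            ext = name[: -len("/json")]
            data.setdefault("extensions", {})[ext] = json.loads(raw)
        elif key in MARKER_FIELDS:
            spec, _, marker = value.partition(";")
            marker = " ".join(marker.split()) or "always"
            data.setdefault(key + "s", {}).setdefault(marker, []).append(spec.strip())
        elif key in LIST_FIELDS:
            data.setdefault(key + "s", []).append(value)
        else:
            data[key] = value  # single-use field: plain string
    body = msg.get_payload().strip()
    if body:
        data["description"] = body  # payload always lands under this key
    return data
```

Grouping requirements by marker string is what lets a consumer evaluate each distinct marker once, as described in point 9.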

An apparently simpler alternative would be to rely on PEP 376 to
retrieve the full metadata and only provide the distribution name and
version to the hook:

def post_install_hook(distname, current_version, previous_version=None):
    ...

The key disadvantage of that seemingly simpler approach is that it *only*
works for post-install and pre-uninstall hooks, *and* requires that
the post-install hook have the tools needed to read the PEP 376
metadata. If we later want to add pre-install, build or archiving
hooks, they would need the structured metadata format anyway, as
relying on PEP 376 isn't an option for software that hasn't been
installed yet. This simpler alternative also won't work for
eventually decoupling the installation database from a particular
filesystem layout (e.g. adding metadata support to import hooks or
tunnelling the 

Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-25 Thread Paul Moore
On 25 February 2013 14:39, Nick Coghlan ncogh...@gmail.com wrote:
 (This probably belongs in a successor to PEP 376, but I'll leave it
 under the PEP 426 umbrella for now)

 One of the points raised regarding PEP 426's integrated metadata
 format is the potential for runtime issues with pkg_resources as it
 reads and processes the metadata during startup, particularly if it
 needs to process any environment markers. While I acknowledge the
 suggestions I have received that we should really be moving away from
 the current filesystem-based distributed installation information to a
 real database that properly handles import hooks, I'm looking for
 something simpler that will make it easier for setuptools and
 distribute to consume the new metadata format (and thus hopefully make
 them more amenable to generating it as well)

 Assuming we add an Entry-Points field as I have proposed in another
 message, I'd like to propose that installers generate three additional
 cache files as part of the installation process:

 dist-info-dir/__cache__/version.txt
 dist-info-dir/__cache__/requires-dist.txt
 dist-info-dir/__cache__/entry-points.txt

 version.txt would just be the version of the installed distribution
 (no need to parse the main metadata file just to read the version
 field)

 requires-dist.txt would be similar to the pkg_resources requires.txt
 format, but use PEP 426 version specifiers. It would:
 - only contain runtime requirements where the environment markers
 match the current system
 - be split into sections based on the extras definition needed to
 get the environment marker to pass

 entry-points.txt would be the same format as the pkg_resources 
 entry_points.txt

Why a __cache__ subdirectory? Is this purely an easier-to-process copy
of what's in the METADATA file? If so, I'd prefer to simply take the
information out of the METADATA file and have it in a single separate
file in the first place. IIUC, that's what Daniel is suggesting as
well.

We don't really need everything to be in a single file, surely?

Paul.


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-25 Thread Nick Coghlan
On Tue, Feb 26, 2013 at 12:45 AM, Paul Moore p.f.mo...@gmail.com wrote:
 We don't really need everything to be in a single file, surely?

Yes, I want the metadata to map cleanly to a single data structure so
it can be more easily managed through things that *aren't* file
systems (such as finally getting the installation database to support
import hooks and also for potential metadata publication through TUF).

However, decomposing it for efficient runtime access and backwards
compatibility reasons makes sense.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Distutils] PEP 426: proposed metadata caching convention

2013-02-25 Thread Daniel Holth
Post-install hooks are different from setup.py because they are installed
first and then run for all packages, and are not requested by the installed
dist. They are more like rewriting a script's #!python shebang line.

May I humbly suggest deleting things from this PEP until it is acceptable
and not the other way around?
On Feb 25, 2013 11:54 AM, Paul Moore p.f.mo...@gmail.com wrote:

 On 25 February 2013 15:10, Nick Coghlan ncogh...@gmail.com wrote:
  entry-points.txt is pure backwards compatibility, though. The only
  reason I didn't suggest reusing the setuptools name for the file is
  because I want the __cache__ in the name to clearly identify the files
  the installer derives from METADATA rather than the ones defined in
  PEP 376 or installed as part of the distribution.

 One thing I *would* like to suggest is that the cached versions of the
 data should be optional. My specific reason for this is that as things
 stand, many wheels are usable without installation, simply by putting
 them on sys.path. As wheels are a distribution format, they won't have
 the cached data, and I'd be unhappy if that fact broke the ability to
 use them as zips.
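
One way a consumer could treat the cache as optional is to fall back to METADATA when the cache file is missing. The sketch below assumes an unpacked dist-info directory on the filesystem (reading from inside a zip is not covered) and uses a hypothetical helper name, installed_version.

```python
import os
from email.parser import Parser


def installed_version(dist_info_dir):
    """Return the version, preferring the cache file when it exists."""
    cached = os.path.join(dist_info_dir, "__cache__", "version.txt")
    try:
        with open(cached, encoding="utf-8") as f:
            return f.read().strip()
    except OSError:
        pass  # no cache, e.g. files unpacked without running an installer
    # Fall back to parsing only the headers of the main metadata file.
    with open(os.path.join(dist_info_dir, "METADATA"), encoding="utf-8") as f:
        return Parser().parse(f, headersonly=True)["Version"]
```

With this shape the cache is purely an optimisation: deleting it changes how fast the answer arrives, not what the answer is.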

 Paul.
